Geometry-Aware Neural Rendering

Understanding the 3-dimensional structure of the world is a core challenge in
computer vision and robotics. Neural rendering approaches learn an implicit 3D
model by predicting what a camera would see from an arbitrary viewpoint. We
extend existing neural rendering to more complex, higher-dimensional scenes
than previously possible. We propose Epipolar Cross Attention (ECA), an
attention mechanism that leverages the geometry of the scene to perform
efficient non-local operations, requiring only $O(n)$ comparisons per spatial
dimension instead of $O(n^2)$. We introduce three new simulated datasets
inspired by real-world robotics and demonstrate that ECA significantly improves
the quantitative and qualitative performance of Generative Query Networks
(GQN).
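
To illustrate where the saving in comparisons comes from, the following is a minimal sketch and not the authors' implementation. It assumes the fundamental matrix `F` relating the target view to a context view is known from the camera poses, and it uses the standard epipolar constraint: a target pixel can only correspond to context pixels on the line `l = F p`. The function names `epipolar_line` and `sample_line`, the example matrix, and the 8x8 feature-map size are hypothetical choices made for illustration.

```python
# Sketch under stated assumptions: F is a known 3x3 fundamental matrix,
# pixels are in homogeneous coordinates (x, y, 1). Not the paper's code.

def epipolar_line(F, x, y):
    """Return (a, b, c) such that a*u + b*v + c = 0 is the epipolar line
    in the context image for target pixel (x, y)."""
    p = (x, y, 1.0)
    return tuple(sum(F[i][j] * p[j] for j in range(3)) for i in range(3))

def sample_line(line, width, height):
    """Sample at most max(width, height) points on the line inside the image."""
    a, b, c = line
    if a == 0 and b == 0:
        return []  # degenerate line, nothing to sample
    points = []
    if abs(b) >= abs(a):
        # Mostly horizontal line: one sample per column.
        for u in range(width):
            v = -(a * u + c) / b
            if 0 <= v <= height - 1:
                points.append((u, v))
    else:
        # Mostly vertical line: one sample per row.
        for v in range(height):
            u = -(b * v + c) / a
            if 0 <= u <= width - 1:
                points.append((u, v))
    return points

# Hypothetical example: pure horizontal camera translation (rectified pair),
# for which the epipolar line of (x, y) is the image row v = y.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
width = height = 8

candidates = sample_line(epipolar_line(F, 2, 3), width, height)
full_comparisons = width * height      # compare against every context pixel
epipolar_comparisons = len(candidates) # compare only along the epipolar line
```

In this toy setting each query pixel is compared against 8 candidates rather than all 64 context pixels, which is the kind of reduction from quadratic to linear cost that the abstract describes. An attention layer would then compute weights only over features gathered at these sampled locations.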