<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta name="generator" content="jemdoc, see http://jemdoc.jaboc.net/" />
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<link rel="stylesheet" href="jemdoc.css" type="text/css" />
</head>
<div id="layout-content">
<h1 align="center">A Survey on Deep Generative 3D-aware Image Synthesis</h1>
<p align="center"> ACM Computing Surveys, 2023 <br /> <a href="https://weihaox.github.io/"><strong>Weihao Xia</strong></a> · <a href="http://www.homepages.ucl.ac.uk/~ucakjxu/"><strong>Jing-Hao Xue</strong></a> </p>
<p align="center"> <a href='https://arxiv.org/abs/2210.14267'> <img src='https://img.shields.io/badge/Paper-Paper-green?style=flat&logo=arxiv&logoColor=green' alt='arXiv Paper'> </a> <a href='https://weihaox.github.io/3D-aware-Gen/' style='padding-left: 0.5rem;'> <img src='https://img.shields.io/badge/Project-Page-blue?style=flat&logo=Google%20chrome&logoColor=blue' alt='Project Page'> </a> <a href='https://dl.acm.org/doi/10.1145/3626193' style='padding-left: 0.5rem;'> <img src='https://img.shields.io/badge/CSUR-Paper-red?style=flat&logoColor=red' alt='CSUR Paper'> </a> </p>

Introduction

This project lists representative papers, codes, and datasets on deep 3D-aware image synthesis. In addition to the 3D-aware generative models (GANs and diffusion models) discussed in the survey, it also covers novel view synthesis studies, especially those based on implicit neural representations such as NeRF.
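For readers new to implicit neural representations, the core mechanism behind NeRF-style novel view synthesis can be sketched in a few lines. The snippet below is an illustrative sketch, not code from any of the listed papers: it implements the standard volume-rendering quadrature that composites per-sample colors and densities along a camera ray into a single pixel color (in practice, the colors and densities at each sample point come from a learned MLP).

```python
import numpy as np

def volume_render(colors, sigmas, deltas):
    """Composite per-sample colors along one ray with the standard
    NeRF volume-rendering quadrature.

    colors: (N, 3) RGB at each sample along the ray
    sigmas: (N,)   volume density at each sample
    deltas: (N,)   distance between adjacent samples
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)       # opacity of each segment
    trans = np.cumprod(1.0 - alphas + 1e-10)      # accumulated transmittance
    trans = np.concatenate([[1.0], trans[:-1]])   # shift so T_1 = 1
    weights = alphas * trans                      # contribution of sample i
    return (weights[:, None] * colors).sum(axis=0)  # final pixel RGB

# Toy example: empty space followed by a nearly opaque red sample.
colors = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
sigmas = np.array([0.0, 50.0])
deltas = np.array([0.1, 0.1])
pixel = volume_render(colors, sigmas, deltas)  # close to pure red
```

The occlusion-aware weights `alphas * trans` are what make the rendering differentiable end to end, which is why this formulation underlies both the NeRF papers and the 3D-aware GANs and diffusion models listed below.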

We aim to keep this list up to date with the latest relevant papers and to help the community track the topic. Contributions are welcome: please feel free to join us, and do not hesitate to reach out with any questions or suggestions.

Survey paper

3D Control of 2D GANs

3D Control Latent Directions

For 3D control over diffusion models similar to that over GANs, please refer to semantic manipulation in diffusion latent spaces.

3D Parameters as Controls

3D Prior Knowledge as Constraints

3D-aware GANs for a Single Image Category

Unconditional 3D Generative Models

Conditional 3D Generative Models

3D-aware Diffusion Models for a Single Image Category

3D-aware Generative Models on ImageNet

3D-aware Video Synthesis

INR-based 3D Novel View Synthesis

Neural Scene Representations

Acceleration

From Constrained to In-the-wild Conditions

Few Images

Pose-free

Varying Appearance

Large-scale Scene

Dynamic Scene

The following papers are not directly related to 3D-aware image synthesis, but they are worth attention. For example, in our survey, inverse rendering is not classified as 3D-aware image synthesis because it is not deliberately designed for this purpose, yet photorealistic images can be rendered from its inferred intrinsic components. Likewise, 3D reconstruction models geometry only, with no appearance information, and therefore cannot render images with photorealistic textures; however, such geometric representations have been combined with a textural representation (e.g., Texture Field) for 3D-aware image synthesis.

3D Representations

Neural Inverse Rendering (Neural De-rendering)

Inverse rendering infers the underlying intrinsic components of a scene from rendered 2D images. These properties include shape (surface, depth, normal), material (albedo, reflectivity, shininess), and lighting (direction, intensity), which can in turn be used to render photorealistic images.
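As a toy illustration of the re-rendering step, the sketch below (illustrative only, not taken from any listed paper) shades per-pixel albedo and normals under a single directional light with a simple Lambertian model, I = albedo · intensity · max(0, n · l); real inverse-rendering pipelines use richer material and lighting models.

```python
import numpy as np

def lambertian_rerender(albedo, normals, light_dir, light_intensity=1.0):
    """Re-render an image from inferred intrinsic components under a
    Lambertian (diffuse) shading model.

    albedo:    (H, W, 3) per-pixel reflectance
    normals:   (H, W, 3) unit surface normals
    light_dir: (3,)      unit direction toward the light
    """
    shading = np.clip(normals @ light_dir, 0.0, None)     # (H, W) n . l
    return albedo * light_intensity * shading[..., None]  # (H, W, 3)

# Toy example: a flat mid-gray surface facing the light head-on.
H, W = 2, 2
albedo = np.full((H, W, 3), 0.5)
normals = np.zeros((H, W, 3))
normals[..., 2] = 1.0                                     # all facing +z
image = lambertian_rerender(albedo, normals, np.array([0.0, 0.0, 1.0]))
```

Because each intrinsic component enters the shading equation explicitly, editing any one of them (relighting, retexturing) and re-rendering is straightforward, which is what makes inverse rendering useful for photorealistic image synthesis.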

Neural Rerendering

Datasets

Summary of popular 3D-aware image synthesis datasets.

Multi-view image collections

The images are rendered or collected under different experimental settings: the Synthetic-NeRF, DTU, and Tanks and Temples datasets for general purposes; the crowded Phototourism dataset for varying lighting conditions; the Blender Forward Facing (BLEFF) dataset for benchmarking camera parameter estimation and novel view synthesis quality; and the San Francisco Alamo Square dataset for large-scale scenes.

Examples of multi-view image datasets.

<table>
  <tr><th>dataset</th><th>published in</th><th># scenes</th><th># samples per scene</th><th>range (m × m)</th><th>resolution</th><th>keyword</th></tr>
  <tr><td>DeepVoxels</td><td>CVPR 2019</td><td>4 simple objects</td><td>479 / 1,000</td><td>N/A</td><td>512 × 512</td><td>synthetic, 360 degree</td></tr>
  <tr><td>NeRF Synthetics</td><td>ECCV 2020</td><td>8 complex objects</td><td>100 / 200</td><td>N/A</td><td>800 × 800</td><td>synthetic, 360 degree</td></tr>
  <tr><td>NeRF Captured</td><td>ECCV 2020</td><td>8 complex scenes</td><td>20-62</td><td>a few</td><td>1,008 × 756</td><td>real, forward-facing</td></tr>
  <tr><td>DTU</td><td>CVPR 2014</td><td>124 scenes</td><td>49 or 64</td><td>a few to thousand</td><td>1,600 × 1,200</td><td>often used in few-views</td></tr>
  <tr><td>Tanks and Temples</td><td>CVPR 2015</td><td>14 objects and scenes</td><td>4,395 - 21,871</td><td>dozen to thousand</td><td>8-megapixel</td><td>real, large-scale</td></tr>
  <tr><td>Phototourism</td><td>IJCV 2021</td><td>6 landmarks</td><td>763-2,000</td><td>dozen to thousand</td><td>564-1,417 megapixel</td><td>varying illumination</td></tr>
  <tr><td>Alamo Square</td><td>CVPR 2022</td><td>San Francisco</td><td>2,818,745</td><td>570 × 960</td><td>1,200 × 900</td><td>real, large-scale</td></tr>
</table>

Single-view image collections

Summary of popular single-view image datasets organized by their major categories and sorted by their popularity.

<table>
  <tr><th>dataset</th><th>published in</th><th>category</th><th># samples</th><th>resolution</th><th>keyword</th></tr>
  <tr><td>FFHQ</td><td>CVPR 2019</td><td>Human Face</td><td>70k</td><td>1024 × 1024</td><td>single, simple-shape</td></tr>
  <tr><td>AFHQ</td><td>CVPR 2020</td><td>Cat, Dog, and Wildlife</td><td>15k</td><td>512 × 512</td><td>single, simple-shape</td></tr>
  <tr><td>CompCars</td><td>CVPR 2015</td><td>Real Car</td><td>136K</td><td>256 × 256</td><td>single, simple-shape</td></tr>
  <tr><td>CARLA</td><td>CoRL 2017</td><td>Synthetic Car</td><td>10k</td><td>128 × 128</td><td>single, simple-shape</td></tr>
  <tr><td>CLEVR</td><td>CVPR 2017</td><td>Objects</td><td>100k</td><td>256 × 256</td><td>multiple, simple-shape</td></tr>
  <tr><td>LSUN</td><td>2015</td><td>Bedroom</td><td>300K</td><td>256 × 256</td><td>single, simple-shape</td></tr>
  <tr><td>CelebA</td><td>ICCV 2015</td><td>Human Face</td><td>200k</td><td>178 × 218</td><td>single, simple-shape</td></tr>
  <tr><td>CelebA-HQ</td><td>ICLR 2018</td><td>Human Face</td><td>30k</td><td>1024 × 1024</td><td>single, simple-shape</td></tr>
  <tr><td>MetFaces</td><td>NeurIPS 2020</td><td>Art Face</td><td>1,336</td><td>1024 × 1024</td><td>single, simple-shape</td></tr>
  <tr><td>M-Plants</td><td>NeurIPS 2022</td><td>Variable-Shape</td><td>141,824</td><td>256 × 256</td><td>single, variable-shape</td></tr>
  <tr><td>M-Food</td><td>NeurIPS 2022</td><td>Variable-Shape</td><td>25,472</td><td>256 × 256</td><td>single, variable-shape</td></tr>
</table>

Citation

If this repository benefits your research, please consider citing our paper.

  @article{xia2023survey,
    title={A Survey on Deep Generative 3D-aware Image Synthesis},
    author={Xia, Weihao and Xue, Jing-Hao},
    journal={ACM Computing Surveys (CSUR)},
    year={2023}
  }

License

<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</a>.