3D-aware GANs offer new capabilities for view synthesis while preserving the editing functionalities of their 2D counterparts. GAN inversion is a crucial step that seeks the latent code to reconstruct input images or videos, subsequently enabling diverse editing tasks through manipulation of this latent code. However, a model pre-trained on a particular dataset (e.g., FFHQ) often has difficulty reconstructing images with out-of-distribution (OOD) objects such as faces with heavy make-up or occluding objects. We address this issue by explicitly modeling OOD objects from the input in 3D-aware GANs. Our core idea is to represent the image using two individual neural radiance fields: one for the in-distribution content and the other for the out-of-distribution object. The final reconstruction is achieved by optimizing the composition of these two radiance fields with carefully designed regularization. We demonstrate that our explicit decomposition alleviates the inherent trade-off between reconstruction fidelity and editability. We evaluate the reconstruction accuracy and editability of our method on challenging real face images and videos and show favorable results against other baselines.
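Concretely, the two fields can be composited along each camera ray before volume rendering. Below is a minimal PyTorch sketch of one plausible composition scheme: densities add, colors are blended by relative density, and the result is integrated with the standard quadrature. The exact composition and regularization in our paper differ in detail.

import torch

def composite(sigma_in, rgb_in, sigma_ood, rgb_ood, deltas):
    # sigma_*: (rays, samples) densities; rgb_*: (rays, samples, 3) colors;
    # deltas: (rays, samples) distances between consecutive samples.
    sigma = sigma_in + sigma_ood                     # densities add
    w_ood = sigma_ood / (sigma + 1e-8)               # relative OOD weight
    rgb = (1 - w_ood)[..., None] * rgb_in + w_ood[..., None] * rgb_ood

    alpha = 1 - torch.exp(-sigma * deltas)           # per-sample opacity
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1 - alpha[:, :-1]], dim=-1),
        dim=-1,
    )                                                # accumulated transmittance
    weights = alpha * trans
    return (weights[..., None] * rgb).sum(dim=1)     # (rays, 3) pixel colors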
Our method can be applied to diverse images from the Internet. We show reconstruction results from single images below.
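Reconstruction is posed as GAN inversion: we optimize a latent code (and, in our method, the OOD radiance field jointly) so that the rendered image matches the input. A minimal sketch of that optimization loop, assuming hypothetical generator, lpips_loss, target, cam, and w_avg handles rather than the released API:

import torch

# Assumed handles (not the released API): generator(w, cam) renders an image
# from latent w at camera cam; lpips_loss is a perceptual loss (e.g., the
# lpips package); target is the input image; w_avg is the average latent.
w = w_avg.clone().requires_grad_(True)
opt = torch.optim.Adam([w], lr=1e-2)

for step in range(500):
    pred = generator(w, cam)                    # render current estimate
    loss = (pred - target).abs().mean() + lpips_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()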
We can edit the reconstruction with off-the-shelf semantic editing methods, e.g., InterfaceGAN and StyleCLIP; shown here is an example that adds eyeglasses.
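InterfaceGAN-style edits move the inverted latent along a precomputed attribute direction (e.g., the normal of a linear eyeglasses classifier). A minimal sketch, assuming w, eyeglasses_direction, generator, and cam from the inversion step above:

def edit_latent(w, direction, alpha):
    # Move along the attribute direction; alpha sets strength and sign.
    return w + alpha * direction

w_glasses = edit_latent(w, eyeglasses_direction, alpha=2.0)
edited = generator(w_glasses, cam)              # re-render the edited latent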
Unlike HFGI3D, which warps the input and computes a visibility map, resulting in a time-consuming optimization, our method relies only on a depth map from MiDaS and can still synthesize faithful novel views.
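The depth map comes straight from off-the-shelf MiDaS via torch.hub, following its official usage; the prediction is relative inverse depth, which suffices as a geometry cue:

import cv2
import torch

midas = torch.hub.load("intel-isl/MiDaS", "DPT_Large")
midas.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")

img = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2RGB)
batch = transforms.dpt_transform(img)           # resize + normalize to a batch

with torch.no_grad():
    pred = midas(batch)
    depth = torch.nn.functional.interpolate(    # back to input resolution
        pred.unsqueeze(1), size=img.shape[:2],
        mode="bicubic", align_corners=False,
    ).squeeze()                                 # (H, W) relative inverse depth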
Our method can be applied to diverse videos from the Internet. We show reconstruction results below.
As with images, we can edit the reconstruction with semantic editing methods, e.g., InterfaceGAN and StyleCLIP; shown here is an example that adds eyeglasses.
After reconstruction, we can render novel views.
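Since the scene is represented volumetrically, novel views only require re-rendering the recovered latent under new camera poses. A sketch with hypothetical make_camera and save_image helpers:

import numpy as np

for yaw in np.linspace(-0.4, 0.4, 9):           # radians about the frontal pose
    cam = make_camera(yaw=yaw, pitch=0.0)       # assumed pose-to-camera helper
    frame = generator(w, cam)                   # same latent, new viewpoint
    save_image(frame, f"view_{yaw:+.2f}.png")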
By setting the blending weight of the OOD radiance field to zero, we can remove the OOD object from the rendering.
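In terms of the composition sketch above, removal amounts to zeroing the OOD density, which drives its blending weight to zero:

rgb_clean = composite(
    sigma_in, rgb_in,
    torch.zeros_like(sigma_ood),                # OOD contribution removed
    rgb_ood, deltas,
)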
Our method is built upon prior work. We share some useful links below.
@inproceedings{xu2024innout,
  author    = {Xu, Yiran and Shu, Zhixin and Smith, Cameron and Oh, Seoung Wug and Huang, Jia-Bin},
  title     = {In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing},
  booktitle = {CVPR},
  year      = {2024},
}