arxiv:2211.11448

Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint

Published on Nov 21, 2022

Abstract

GAN inversion and editing method using contrastive learning to improve latent code precision in the foundation space W, followed by cross-attention encoding to enhance both reconstruction fidelity and editability in extended spaces W⁺ and F.

AI-generated summary

GAN inversion and editing via StyleGAN maps an input image into the embedding spaces (W, W^+, and F) to maintain both image fidelity and meaningful manipulation. From latent space W to extended latent space W^+ to feature space F in StyleGAN, the editability of GAN inversion decreases while its reconstruction quality increases. Recent GAN inversion methods typically explore W^+ and F rather than W to improve reconstruction fidelity while maintaining editability. As W^+ and F are derived from W, which is essentially the foundation latent space of StyleGAN, GAN inversion methods that focus on the W^+ and F spaces could be improved by stepping back to W. In this work, we propose to first obtain the precise latent code in the foundation latent space W. We introduce contrastive learning to align W and the image space for precise latent code discovery. Then, we leverage a cross-attention encoder to transform the obtained latent code in W into W^+ and F, accordingly. Our experiments show that our exploration of the foundation latent space W improves the representation ability of latent codes in W^+ and features in F, which yields state-of-the-art reconstruction fidelity and editability results on the standard benchmarks. Project page: https://kumapowerliu.github.io/CLCAE.
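
The abstract's idea of using contrastive learning to align W with the image space can be illustrated with a generic symmetric InfoNCE loss over paired embeddings. This is only a minimal sketch under stated assumptions, not the paper's implementation: the function names, the temperature of 0.07, and the use of plain Python lists in place of encoder outputs are all hypothetical choices made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_info_nce(image_embs, latent_embs, temperature=0.07):
    """Generic symmetric InfoNCE loss (an assumption, not the paper's exact loss).

    image_embs[i] and latent_embs[i] are treated as a positive pair;
    every other combination in the batch is treated as a negative.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    lats = [l2_normalize(v) for v in latent_embs]
    n = len(imgs)

    # Cosine similarity between every image and every latent code, scaled.
    logits = [
        [sum(a * b for a, b in zip(imgs[i], lats[j])) / temperature
         for j in range(n)]
        for i in range(n)
    ]

    # Image-to-latent direction: each row should peak on its diagonal entry.
    loss_i2w = -sum(log_softmax(logits[i])[i] for i in range(n)) / n

    # Latent-to-image direction: same check on the transposed matrix.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_w2i = -sum(log_softmax(columns[j])[j] for j in range(n)) / n

    return 0.5 * (loss_i2w + loss_w2i)


# Toy batch: two image embeddings and two latent-code embeddings.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned = symmetric_info_nce(images, [[1.0, 0.0], [0.0, 1.0]])
shuffled = symmetric_info_nce(images, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is close to zero when each latent code points in the same direction as its paired image embedding, and large when the pairs are swapped. That is the behavior an alignment objective of this kind rewards.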
