# StyleGAN2-ADA Pipeline for Image Projection

This guide provides a step-by-step explanation of how to align a face image, project it into the latent space of StyleGAN2-ADA, and visualize the results.
## Requirements

### Dependencies

- Python 3.7+
- PyTorch
- Required libraries installed via `requirements.txt` in the repository
- Kaggle environment with internet access enabled

### Models and Methods Used

- **Face Alignment:** `align_images.py` uses the `shape_predictor_68_face_landmarks.dat` model from DLib for precise facial alignment.
- **Image Projection:** `projector.py` projects an aligned image into the latent space of StyleGAN2 using a pre-trained model (`ffhq.pkl` from NVIDIA Labs).
- **Pre-trained Models:**
  - Face landmark model: `shape_predictor_68_face_landmarks.dat`
  - StyleGAN2-ADA pre-trained weights: `ffhq.pkl`

---
## Step-by-Step Execution

### 1. Clone the Repository

Clone the StyleGAN2-ADA repository:

```bash
!git clone https://github.com/rkuo2000/stylegan2-ada-pytorch.git
%cd stylegan2-ada-pytorch
```
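After cloning, it is worth confirming that the two scripts this guide relies on are actually present in the checkout. The helper below is an illustrative sketch, not part of the repository:

```python
from pathlib import Path

def missing_scripts(repo_dir="."):
    """Return which of the pipeline's entry points are absent from repo_dir."""
    required = ["align_images.py", "projector.py"]
    repo = Path(repo_dir)
    return [name for name in required if not (repo / name).is_file()]

# After `%cd stylegan2-ada-pytorch`, this should print an empty list
print(missing_scripts())
```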
### 2. Prepare the Raw Images

Create a directory for raw images and copy the desired file into it:

```bash
!mkdir -p raw
!cp /kaggle/input/test-notebook-images/profile-image.jpg raw/example.jpg
```

Verify the file:

```bash
!ls raw
```
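Beyond `ls`, a lightweight check that the copied file is actually a JPEG can catch path mix-ups early. The helper below is not part of the repository; it simply inspects the file's magic bytes:

```python
from pathlib import Path

def looks_like_jpeg(path):
    """Return True if path exists and starts with the JPEG SOI marker (FF D8)."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as f:
        return f.read(2) == b"\xff\xd8"

print(looks_like_jpeg("raw/example.jpg"))
```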
### 3. Align the Face Image

Run the face alignment script:

```bash
!python align_images.py raw aligned
```

- **Input:** `raw/example.jpg`
- **Output:** Aligned image saved as `aligned/example_01.png`

Note: `align_images.py` needs the DLib landmark model listed above; depending on the fork, it may download the file automatically on first run.
### 4. Verify Alignment

List the aligned directory to confirm the output:

```bash
!ls aligned
```
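The `_01` suffix comes from the alignment script writing one numbered PNG per detected face. Assuming that convention (inferred from the `example.jpg` → `example_01.png` mapping above), the expected output names can be derived like this:

```python
from pathlib import Path

def expected_aligned_names(raw_name, n_faces=1):
    # One PNG per detected face, numbered from 01
    stem = Path(raw_name).stem
    return [f"{stem}_{i:02d}.png" for i in range(1, n_faces + 1)]

print(expected_aligned_names("example.jpg"))  # ['example_01.png']
```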
### 5. Project the Image into Latent Space

Run the projection script (it runs an optimization loop, so it is much faster on a GPU):

```bash
!python projector.py --outdir=out --target=aligned/example_01.png \
    --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl
```

- **Output:**
  - Projection results saved in the `out/` directory, including the projected image (`proj.png`) and the optimized latent code (`projected_w.npz`)
  - A video (`proj.mp4`) showing optimization progress
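NVIDIA's `projector.py` stores the optimized latent in `out/projected_w.npz` under the key `'w'`. A minimal sketch for reloading it later, e.g. for latent editing (the `(1, num_ws, 512)` shape is what the FFHQ model typically produces):

```python
import numpy as np

def load_projected_w(path="out/projected_w.npz"):
    """Load the latent code saved by projector.py alongside proj.png and proj.mp4."""
    with np.load(path) as data:
        return data["w"]  # typically shaped (1, num_ws, 512) for ffhq.pkl

# Only attempt the load once the projection step has finished:
# print(load_projected_w().shape)
```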
---

## Viewing Results

### 1. Inline Video Playback

Use the following code to view the progress video inline:

```python
from IPython.display import Video

Video('out/proj.mp4', embed=True)
```

### 2. Download the Video

To download the video file, use:

```python
from IPython.display import FileLink

FileLink('out/proj.mp4')
```

Click the generated link to download `proj.mp4` to your local machine.
---

## Adding Gradio for Runtime Image Upload

You can integrate Gradio to let users upload a photo and generate the GAN output (image and video) at runtime. Here is how to modify the pipeline:

### Install Gradio

```bash
!pip install gradio
```
### Update the Code

Add the following Python script to create a Gradio interface:

```python
import os
import subprocess

import gradio as gr
from PIL import Image


def process_image(input_image):
    # Save the uploaded image into the raw/ directory expected by align_images.py
    os.makedirs("raw", exist_ok=True)
    input_path = "raw/input_image.jpg"
    input_image.save(input_path)

    # Align the face (writes aligned/input_image_01.png)
    subprocess.run(["python", "align_images.py", "raw", "aligned"], check=True)

    # Project the aligned image into the StyleGAN2-ADA latent space
    subprocess.run([
        "python", "projector.py", "--outdir=out",
        "--target=aligned/input_image_01.png",
        "--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl",
    ], check=True)

    # Load the generated image and return it together with the progress video
    output_image = Image.open("out/proj.png")  # adjust if necessary
    return output_image, "out/proj.mp4"


# Gradio interface
demo = gr.Interface(
    fn=process_image,
    inputs=gr.Image(type="pil", label="Upload an Image"),
    outputs=[
        gr.Image(type="pil", label="Generated Image"),
        gr.Video(label="Projection Video"),
    ],
    title="StyleGAN2-ADA Image Projection",
    description="Upload a face image to generate GAN output and projection video.",
)

demo.launch()
```
### Running the Gradio Interface

Save the above script and run it in your environment. A Gradio web interface will open, allowing users to upload images and view the generated results once projection completes. In hosted notebooks such as Kaggle, you may need `demo.launch(share=True)` to get a publicly reachable link.
---

## Notes

1. Ensure internet access is enabled in your Kaggle notebook so the required models can be downloaded.
2. Verify that input paths match your dataset and file structure.
3. Outputs are saved in the following structure:
   - `raw/`: Original images
   - `aligned/`: Aligned face images
   - `out/`: Projection results and video
---

## Acknowledgments

- StyleGAN2-ADA by NVIDIA Labs: [GitHub Repository](https://github.com/NVlabs/stylegan2-ada-pytorch)
- DLib for face alignment: [DLib Library](http://dlib.net/)