# StyleGAN2-ADA Pipeline for Image Projection

This guide provides a step-by-step explanation of how to align a face image, project it into the latent space of StyleGAN2-ADA, and visualize the results.

## Requirements

### Dependencies
- Python 3.7+
- PyTorch
- Required libraries installed via `requirements.txt` in the repository
- Kaggle environment with internet enabled

### Models and Methods Used
- **Face Alignment:** `align_images.py` uses the `shape_predictor_68_face_landmarks.dat` model from DLib for precise facial alignment.
- **Image Projection:** `projector.py` projects an aligned image into the latent space of StyleGAN2 using a pre-trained model (`ffhq.pkl` from NVIDIA Labs).
- **Pre-trained Models:**
  - Face landmark model: `shape_predictor_68_face_landmarks.dat`
  - StyleGAN2-ADA pre-trained weights: `ffhq.pkl`
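
For intuition, the aligner maps DLib's 68 landmarks to a canonical pose: it estimates the in-plane rotation of the face from the eye positions, then rotates and crops accordingly. A minimal sketch of that angle computation (the coordinates below are hypothetical, not taken from the repository):

```python
import math

def eye_rotation_angle(left_eye, right_eye):
    """Angle (degrees) needed to make the line between the eyes horizontal."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

# Hypothetical eye centroids: the right eye sits 20 px lower than the left,
# so the face must be rotated by this angle to level the eyes.
angle = eye_rotation_angle((100, 120), (180, 140))
```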

---

## Step-by-Step Execution

### 1. Clone the Repository
Clone the repository for StyleGAN2-ADA:
```bash
!git clone https://github.com/rkuo2000/stylegan2-ada-pytorch.git
%cd stylegan2-ada-pytorch
```

### 2. Prepare the Raw Images
Create a directory for raw images and copy the desired file:
```bash
!mkdir -p raw
!cp /kaggle/input/test-notebook-images/profile-image.jpg raw/example.jpg
```
Verify the file:
```bash
!ls raw
```

### 3. Align the Face Image
Run the face alignment script:
```bash
!python align_images.py raw aligned
```
- **Input:** `raw/example.jpg`
- **Output:** Aligned image saved as `aligned/example_01.png`
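
The `_01` suffix comes from the aligner writing one output per detected face, so face N in `raw/<name>.jpg` becomes `aligned/<name>_0N.png`. A small helper to predict the output path (assuming this naming convention holds for your copy of `align_images.py`):

```python
from pathlib import Path

def aligned_path(raw_path, face_index=1, aligned_dir="aligned"):
    """Predict where the aligner writes face number `face_index`."""
    stem = Path(raw_path).stem
    return (Path(aligned_dir) / f"{stem}_{face_index:02d}.png").as_posix()

print(aligned_path("raw/example.jpg"))  # aligned/example_01.png
```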

### 4. Verify Alignment
List the aligned directory to confirm output:
```bash
!ls aligned
```

### 5. Project the Image into Latent Space
Run the projection script:
```bash
!python projector.py --outdir=out --target=aligned/example_01.png \
    --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl
```
- **Output:**
  - Latent space projection results saved in the `out/` directory
  - A video (`proj.mp4`) showing optimization progress
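
Under the hood, `projector.py` performs gradient descent in latent space: starting from an average latent, it repeatedly nudges `w` to shrink a perceptual distance between the generated image `G(w)` and the target. A toy one-dimensional analogue of that loop (purely illustrative; the real projector optimizes a 512-dimensional `w` against VGG features):

```python
def toy_project(target, steps=200, lr=0.05):
    """Gradient-descend a scalar latent w so that g(w) matches target."""
    g = lambda w: 2.0 * w + 1.0               # stand-in "generator"
    w = 0.0                                    # start from an "average" latent
    for _ in range(steps):
        grad = 2.0 * (g(w) - target) * 2.0     # d/dw of (g(w) - target)**2
        w -= lr * grad
    return w

w_star = toy_project(5.0)  # g(w_star) ends up close to the target 5.0
```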

---

## Viewing Results

### 1. Inline Video Playback
Use the following command to view the progress video inline:
```python
from IPython.display import Video

Video('out/proj.mp4', embed=True)
```

### 2. Download the Video
To download the video file, use:
```python
from IPython.display import FileLink

FileLink('out/proj.mp4')
```
Click the generated link to download `proj.mp4` to your local machine.

---

## Adding Gradio for Runtime Image Upload
You can integrate Gradio so that users can upload a photo and generate the GAN output (image and video) at runtime. Here is how to modify the pipeline:

### Install Gradio
```bash
!pip install gradio
```

### Update the Code
Add the following Python script to create a Gradio interface:

```python
import subprocess

import gradio as gr
from PIL import Image


def process_image(input_image):
    # Save the uploaded image into the raw directory
    input_path = "raw/input_image.jpg"
    input_image.save(input_path)

    # Align the face (writes aligned/input_image_01.png)
    subprocess.run(["python", "align_images.py", "raw", "aligned"], check=True)

    # Project the aligned image into latent space
    subprocess.run([
        "python", "projector.py", "--outdir=out",
        "--target=aligned/input_image_01.png",
        "--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl",
    ], check=True)

    # Load the generated image and return it alongside the progress video
    output_image = Image.open("out/proj.png")  # adjust the filename if necessary
    output_video_path = "out/proj.mp4"

    return output_image, output_video_path


# Gradio interface
demo = gr.Interface(
    fn=process_image,
    inputs=[gr.Image(type="pil", label="Upload an Image")],
    outputs=[
        gr.Image(type="pil", label="Generated Image"),
        gr.Video(label="Projection Video"),
    ],
    title="StyleGAN2-ADA Image Projection",
    description="Upload a face image to generate GAN output and projection video.",
)

demo.launch()
```

### Running the Gradio Interface
Save the above script and run it in your environment. Gradio starts a web interface where users can upload an image and view the generated results; in hosted environments such as Kaggle, you may need `demo.launch(share=True)` to obtain a reachable public link.
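
One caveat with repeated uploads: every run reuses `raw/input_image.jpg` and `aligned/input_image_01.png`, so if alignment fails on a new upload, stale files from a previous run can be projected instead. A defensive option (a hypothetical helper, not part of the repository) is to empty the working directories at the start of each request:

```python
import shutil
from pathlib import Path

def reset_dirs(*dirs):
    """Delete and recreate each directory so a run starts from a clean slate."""
    for d in dirs:
        shutil.rmtree(d, ignore_errors=True)
        Path(d).mkdir(parents=True, exist_ok=True)

# Call at the top of process_image:
# reset_dirs("raw", "aligned", "out")
```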

---

## Notes
1. Ensure internet access is enabled in your Kaggle notebook so the required models can be downloaded.
2. Verify that input paths match your dataset and file structure.
3. Outputs are saved in the following structure:
   - `raw/`: Original images
   - `aligned/`: Aligned face images
   - `out/`: Projection results and video

---

## Acknowledgments
- StyleGAN2-ADA by NVIDIA Labs: [GitHub Repository](https://github.com/NVlabs/stylegan2-ada-pytorch)
- DLib for face alignment: [DLib Library](http://dlib.net/)