hash-map commited on
Commit
cd5ce2a
·
verified ·
1 Parent(s): 6f3ce53

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +7 -25
README.md CHANGED
@@ -41,8 +41,7 @@ A higher emphasis is given to **L1 loss**, ensuring that overall brightness and
41
  - Output: single-channel IR image
42
 
43
  ### ⚔️ Discriminator
44
- - PatchGAN (70×70 receptive field)
45
- - Evaluates realism of local patches for fine detail learning
46
 
47
  ---
48
 
@@ -55,7 +54,7 @@ A higher emphasis is given to **L1 loss**, ensuring that overall brightness and
55
  | **Batch Size** | 4 |
56
  | **Optimizer** | Adam (β₁ = 0.5, β₂ = 0.999) |
57
  | **Learning Rate** | 2e-4 |
58
- | **Precision** | Mixed (FP16/32) |
59
  | **Hardware** | NVIDIA T4 (Kaggle GPU Runtime) |
60
 
61
  ---
@@ -94,7 +93,10 @@ L_{G} = \lambda_{L1} L_{L1} + \lambda_{perc} L_{perc} + \lambda_{adv} L_{adv} +
94
  | **Combined GAN** | ![GAN Architecture Combined](gan_architecture_combined.png) |
95
 
96
  ---
 
97
 
 
 
98
 
99
 
100
  ## 🖼️ Visual Results
@@ -141,13 +143,13 @@ All training metrics are logged in:
141
  - The model **captures IR brightness and object distinction**, but early epochs show slight blur due to L1-dominant stages.
142
  - **Contrast and edge sharpness improve** after ~70 epochs as adversarial and perceptual losses gain weight.
143
  - Background variations in LLVIP introduce challenges; future fine-tuning on domain-aligned subsets can further improve realism.
144
-
 
145
  ---
146
 
147
  ## 🚀 Future Work
148
 
149
  - Apply **feature matching loss** for smoother discriminator gradients
150
- - Introduce **spectral normalization** for training stability
151
  - Add **temporal or sequence consistency** for video IR translation
152
  - Adaptive loss balancing with epoch-based dynamic weighting
153
 
@@ -160,23 +162,3 @@ TensorFlow and VGG-19 for perceptual feature extraction
160
 
161
  Kaggle GPU for high-performance model training
162
 
163
- ## 📜 License
164
-
165
- **MIT License © 2025**
166
- Author: **Sai Sumanth Appala**
167
-
168
- ---
169
-
170
- ## 🧾 Citation
171
-
172
- If you use this work, please cite:
173
-
174
- ```bibtex
175
- @misc{appala2025visible2ir,
176
- author = {Appala, Sai Sumanth},
177
- title = {Conditional GAN for Visible-to-Infrared Translation with Multi-Loss Training},
178
- year = {2025},
179
- license = {MIT},
180
- dataset = {UserNae3/LLVIP},
181
- framework = {TensorFlow},
182
- }
 
41
  - Output: single-channel IR image
42
 
43
  ### ⚔️ Discriminator
44
+ - Evaluates realism for fine detail learning
 
45
 
46
  ---
47
 
 
54
  | **Batch Size** | 4 |
55
  | **Optimizer** | Adam (β₁ = 0.5, β₂ = 0.999) |
56
  | **Learning Rate** | 2e-4 |
57
+ | **Precision** | Mixed (32) |
58
  | **Hardware** | NVIDIA T4 (Kaggle GPU Runtime) |
59
 
60
  ---
 
93
  | **Combined GAN** | ![GAN Architecture Combined](gan_architecture_combined.png) |
94
 
95
  ---
96
+ Data Exploration
97
 
98
+ We analysed the LLVIP dataset and found that ~70% of image pairs are captured at < 50 lux lighting and ~30% at 50-200 lux.
99
+ The average pedestrian height in IR channel was X pixels; outliers with <20 pixels height were excluded.
100
 
101
 
102
  ## 🖼️ Visual Results
 
143
  - The model **captures IR brightness and object distinction**, but early epochs show slight blur due to L1-dominant stages.
144
  - **Contrast and edge sharpness improve** after ~70 epochs as adversarial and perceptual losses gain weight.
145
  - Background variations in LLVIP introduce challenges; future fine-tuning on domain-aligned subsets can further improve realism.
146
+ - We compared three variants: (i) U-Net regression (L1 only) → SSIM = 0.80;
147
+ - (ii) cGAN with L1+adv → SSIM = 0.83; (iii) cGAN with L1+adv+perc+edge (our final) → SSIM = 0.8386
148
  ---
149
 
150
  ## 🚀 Future Work
151
 
152
  - Apply **feature matching loss** for smoother discriminator gradients
 
153
  - Add **temporal or sequence consistency** for video IR translation
154
  - Adaptive loss balancing with epoch-based dynamic weighting
155
 
 
162
 
163
  Kaggle GPU for high-performance model training
164