Update README.md
README.md
CHANGED
|
@@ -41,8 +41,7 @@ A higher emphasis is given to **L1 loss**, ensuring that overall brightness and
|
|
| 41 |
- Output: single-channel IR image
|
| 42 |
|
| 43 |
### ⚔️ Discriminator
|
| 44 |
-
-
|
| 45 |
-
- Evaluates realism of local patches for fine detail learning
|
| 46 |
|
| 47 |
---
|
| 48 |
|
|
@@ -55,7 +54,7 @@ A higher emphasis is given to **L1 loss**, ensuring that overall brightness and
|
|
| 55 |
| **Batch Size** | 4 |
|
| 56 |
| **Optimizer** | Adam (β₁ = 0.5, β₂ = 0.999) |
|
| 57 |
| **Learning Rate** | 2e-4 |
|
| 58 |
-
| **Precision** | Mixed (
|
| 59 |
| **Hardware** | NVIDIA T4 (Kaggle GPU Runtime) |
|
| 60 |
|
| 61 |
---
|
|
@@ -94,7 +93,10 @@ L_{G} = \lambda_{L1} L_{L1} + \lambda_{perc} L_{perc} + \lambda_{adv} L_{adv} +
|
|
| 94 |
| **Combined GAN** |  |
|
| 95 |
|
| 96 |
---
|
|
|
|
| 97 |
|
|
|
|
|
|
|
| 98 |
|
| 99 |
|
| 100 |
## 🖼️ Visual Results
|
|
@@ -141,13 +143,13 @@ All training metrics are logged in:
|
|
| 141 |
- The model **captures IR brightness and object distinction**, but early epochs show slight blur due to L1-dominant stages.
|
| 142 |
- **Contrast and edge sharpness improve** after ~70 epochs as adversarial and perceptual losses gain weight.
|
| 143 |
- Background variations in LLVIP introduce challenges; future fine-tuning on domain-aligned subsets can further improve realism.
|
| 144 |
-
|
|
|
|
| 145 |
---
|
| 146 |
|
| 147 |
## 🚀 Future Work
|
| 148 |
|
| 149 |
- Apply **feature matching loss** for smoother discriminator gradients
|
| 150 |
-
- Introduce **spectral normalization** for training stability
|
| 151 |
- Add **temporal or sequence consistency** for video IR translation
|
| 152 |
- Adaptive loss balancing with epoch-based dynamic weighting
|
| 153 |
|
|
@@ -160,23 +162,3 @@ TensorFlow and VGG-19 for perceptual feature extraction
|
|
| 160 |
|
| 161 |
Kaggle GPU for high-performance model training
|
| 162 |
|
| 163 |
-
## 📜 License
|
| 164 |
-
|
| 165 |
-
**MIT License © 2025**
|
| 166 |
-
Author: **Sai Sumanth Appala**
|
| 167 |
-
|
| 168 |
-
---
|
| 169 |
-
|
| 170 |
-
## 🧾 Citation
|
| 171 |
-
|
| 172 |
-
If you use this work, please cite:
|
| 173 |
-
|
| 174 |
-
```bibtex
|
| 175 |
-
@misc{appala2025visible2ir,
|
| 176 |
-
author = {Appala, Sai Sumanth},
|
| 177 |
-
title = {Conditional GAN for Visible-to-Infrared Translation with Multi-Loss Training},
|
| 178 |
-
year = {2025},
|
| 179 |
-
license = {MIT},
|
| 180 |
-
dataset = {UserNae3/LLVIP},
|
| 181 |
-
framework = {TensorFlow},
|
| 182 |
-
}
|
|
|
|
| 41 |
- Output: single-channel IR image
|
| 42 |
|
| 43 |
### ⚔️ Discriminator
|
| 44 |
+
- Evaluates realism for fine detail learning
|
|
|
|
| 45 |
|
| 46 |
---
|
| 47 |
|
|
|
|
| 54 |
| **Batch Size** | 4 |
|
| 55 |
| **Optimizer** | Adam (β₁ = 0.5, β₂ = 0.999) |
|
| 56 |
| **Learning Rate** | 2e-4 |
|
| 57 |
+
| **Precision** | Mixed (32) |
|
| 58 |
| **Hardware** | NVIDIA T4 (Kaggle GPU Runtime) |
|
| 59 |
|
| 60 |
---
|
|
|
|
| 93 |
| **Combined GAN** |  |
|
| 94 |
|
| 95 |
---
|
| 96 |
+
## Data Exploration
|
| 97 |
|
| 98 |
+
We analysed the LLVIP dataset and found that ~70% of image pairs were captured at under 50 lux and ~30% at 50-200 lux.
|
| 99 |
+
The average pedestrian height in the IR channel was X pixels; outliers under 20 pixels in height were excluded.
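
The height filter described above could look roughly like the sketch below. It is only an illustration: the `Box` layout, the helper names, and the sample coordinates are assumptions, and the README does not say whether exclusion happens per box or per image pair, or whether the average is taken before or after filtering. Only the 20-pixel threshold comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned pedestrian box in pixel coordinates (hypothetical layout)."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    @property
    def height(self) -> int:
        return self.y_max - self.y_min


MIN_HEIGHT_PX = 20  # threshold quoted in the README text


def split_by_height(boxes, min_height=MIN_HEIGHT_PX):
    """Return (kept, dropped); boxes shorter than min_height are dropped."""
    kept = [b for b in boxes if b.height >= min_height]
    dropped = [b for b in boxes if b.height < min_height]
    return kept, dropped


def mean_height(boxes):
    """Average box height in pixels, or None for an empty list."""
    if not boxes:
        return None
    return sum(b.height for b in boxes) / len(boxes)


# Made-up sample boxes with heights 160, 15 and 180 pixels.
boxes = [Box(10, 40, 60, 200), Box(300, 100, 310, 115), Box(500, 80, 560, 260)]
kept, dropped = split_by_height(boxes)
print(len(kept), len(dropped), mean_height(kept))
```

Here the average is computed on the boxes that survive the filter; computing it on the raw annotations instead would only require calling `mean_height(boxes)`.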
|
| 100 |
|
| 101 |
|
| 102 |
## 🖼️ Visual Results
|
|
|
|
| 143 |
- The model **captures IR brightness and object distinction**, but early epochs show slight blur due to L1-dominant stages.
|
| 144 |
- **Contrast and edge sharpness improve** after ~70 epochs as adversarial and perceptual losses gain weight.
|
| 145 |
- Background variations in LLVIP introduce challenges; future fine-tuning on domain-aligned subsets can further improve realism.
|
| 146 |
+
- We compared three variants: (i) U-Net regression (L1 only) → SSIM = 0.80;
|
| 147 |
+
- (ii) cGAN with L1+adv → SSIM = 0.83; (iii) cGAN with L1+adv+perc+edge (our final) → SSIM = 0.8386
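
To put the three SSIM figures above on a common footing, the short sketch below computes each variant's absolute and relative gain over the L1-only baseline. The scores are the ones quoted in the bullets; the dictionary keys and the helper name are illustrative only.

```python
# SSIM scores as reported in the comparison above.
SSIM_SCORES = {
    "unet_l1": 0.80,
    "cgan_l1_adv": 0.83,
    "cgan_full": 0.8386,
}


def gains_over_baseline(scores, baseline_key):
    """Map each non-baseline variant to (absolute gain, relative gain in %)."""
    base = scores[baseline_key]
    return {
        name: (score - base, 100.0 * (score - base) / base)
        for name, score in scores.items()
        if name != baseline_key
    }


gains = gains_over_baseline(SSIM_SCORES, "unet_l1")
for name, (abs_gain, rel_gain) in gains.items():
    print(f"{name}: +{abs_gain:.4f} SSIM ({rel_gain:.2f}% relative)")
```

On these numbers, moving from (i) to (ii) is a relative gain of roughly 3.8%, and moving from (ii) to (iii) adds about one more percentage point relative to the same baseline.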
|
| 148 |
---
|
| 149 |
|
| 150 |
## 🚀 Future Work
|
| 151 |
|
| 152 |
- Apply **feature matching loss** for smoother discriminator gradients
|
|
|
|
| 153 |
- Add **temporal or sequence consistency** for video IR translation
|
| 154 |
- Adaptive loss balancing with epoch-based dynamic weighting
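
As a rough idea of what epoch-based dynamic weighting might look like, the sketch below ramps each loss weight linearly between two epochs. Everything numeric here is hypothetical: the ramp end at epoch 70 merely echoes the "~70 epochs" remark in the results, the weight values are placeholders, and only the L1, perceptual, and adversarial terms are included, so any further terms in the generator loss would need their own entries.

```python
def ramp(epoch, start, end, w_start, w_end):
    """Linearly interpolate a weight between two epochs, clamped outside them."""
    if end <= start:
        raise ValueError("end must be greater than start")
    t = (epoch - start) / (end - start)
    t = min(1.0, max(0.0, t))
    return w_start + t * (w_end - w_start)


def loss_weights(epoch):
    """Hypothetical schedule: L1 dominates early, adv/perc gain weight later."""
    return {
        "l1": ramp(epoch, 0, 70, 100.0, 50.0),
        "perc": ramp(epoch, 0, 70, 0.0, 10.0),
        "adv": ramp(epoch, 0, 70, 0.0, 1.0),
    }


def generator_loss(terms, weights):
    """Weighted sum of the individual loss terms for one training step."""
    return sum(weights[name] * terms[name] for name in weights)


# Example with made-up loss values for a single step.
terms = {"l1": 0.2, "perc": 1.0, "adv": 0.5}
for epoch in (0, 35, 70):
    print(epoch, loss_weights(epoch), generator_loss(terms, loss_weights(epoch)))
```

A linear ramp is only one choice; a step schedule or a weighting driven by validation metrics would fit the same interface.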
|
| 155 |
|
|
|
|
| 162 |
|
| 163 |
Kaggle GPU for high-performance model training
|
| 164 |
|