---
license: apache-2.0
language:
- en
tags:
- image-enhancement
- real-estate
- photo-enhancement
- nafnet
- image-restoration
- pytorch
- onnx
- coreml
- ios
pipeline_tag: image-to-image
library_name: pytorch
datasets:
- custom
metrics:
- psnr
- ssim
model-index:
- name: NAFNet Real Estate Enhancement
  results:
  - task:
      type: image-enhancement
      name: Image Enhancement
    metrics:
    - type: psnr
      value: 21.69
      name: PSNR
    - type: ssim
      value: 0.8968
      name: SSIM
---
# NAFNet Real Estate Enhancement
A NAFNet model fine-tuned to enhance real estate photographs. It was trained on 577 before/after image pairs to improve lighting, color, and overall image quality.
## Model Details
| Property | Value |
|--------|-------|
| **Architecture** | NAFNet (width=32) |
| **Parameters** | 29.2 million |
| **Model Size** | 111 MiB (FP32) / 56 MiB (FP16) |
| **Training Time** | 5 hours |
| **Training Images** | 577 pairs |
| **Final PSNR** | 21.69 dB |
| **Final SSIM** | 0.8968 |
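
For reference, PSNR is derived from the mean squared error between the enhanced output and the ground-truth "after" image. The sketch below uses the standard definition on flat sequences of 8-bit pixel values. The card does not show the evaluation script, so the color space, cropping, and averaging behind the 21.69 dB figure are unknown; the `psnr` helper is a hypothetical illustration, not the code used for evaluation.

```python
import math

def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equal-length pixel sequences."""
    if len(reference) != len(enhanced) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixel values
    mse = sum((a - b) ** 2 for a, b in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)

# Example: two tiny "images" flattened to lists of pixel values
print(psnr([52, 55, 61, 59], [50, 56, 60, 62]))
```

Higher values mean the output is closer to the reference; identical images give infinity.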
## Available Formats
| Format | File | Size | Use Case |
|--------|------|------|----------|
| PyTorch | `nafnet_realestate.pth` | 117 MB | Training, fine-tuning |
| ONNX | `nafnet_realestate.onnx` | 117 MB | Cross-platform deployment |
| Core ML | Convert from ONNX | ~56 MB | iOS/macOS apps |
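
The sizes in the two tables above line up with the parameter count once decimal megabytes (MB) and binary mebibytes (MiB) are distinguished. The following is a quick arithmetic check under the assumption that the weights dominate the file size; it is not a measurement of the actual files, and the `size` helper is hypothetical.

```python
PARAMS = 29_200_000  # "29.2 million" from the Model Details table

def size(params, bytes_per_param):
    """Return the weight size as (decimal MB, binary MiB)."""
    total_bytes = params * bytes_per_param
    return total_bytes / 1e6, total_bytes / 2**20

for label, bytes_per_param in (("FP32", 4), ("FP16", 2)):
    mb, mib = size(PARAMS, bytes_per_param)
    print(f"{label}: {mb:.1f} MB = {mib:.1f} MiB")
```

Under this reading, the 117 MB file sizes and the 111 figure for FP32 describe the same quantity in different units.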
## Performance Benchmarks
Tested on 100 high-resolution real estate images (avg 7.25 megapixels):
### Timing
| Metric | Value |
|--------|-------|
| Average per image | 4.0 seconds |
| Throughput | 0.25 images/second |
| Megapixels/second | 1.81 MP/s |
### Memory Usage
| Resource | Usage |
|----------|-------|
| **RAM** | 581 MB total |
| **GPU VRAM** | 8.3 GB peak |
### Scaling by Resolution
| Resolution | RAM | GPU | Time |
|------------|-----|-----|------|
| 1080p (2.1 MP) | 150-250 MB | ~2.5 GB | ~1.2s |
| 1440p (3.7 MP) | 250-400 MB | ~4.3 GB | ~2.0s |
| 3K (7.3 MP) | 500-800 MB | ~8.3 GB | ~4.0s |
| 4K (8.3 MP) | 600-900 MB | ~9.5 GB | ~4.6s |
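
The time and GPU columns above scale roughly linearly with pixel count. As a rough planning aid, the sketch below extrapolates from the measured 1.81 MP/s throughput and the 3K row (about 8.3 GB at 7.3 MP). The linear model and the `estimate` helper are assumptions for illustration; actual numbers depend on hardware that this card does not specify.

```python
# Constants taken from the benchmark tables in this card
MEGAPIXELS_PER_SECOND = 1.81
GPU_GB_PER_MEGAPIXEL = 8.3 / 7.3  # 3K row: ~8.3 GB at 7.3 MP

def estimate(width, height):
    """Linear estimate of inference time and peak GPU memory for one image."""
    megapixels = width * height / 1e6
    return {
        "megapixels": megapixels,
        "seconds": megapixels / MEGAPIXELS_PER_SECOND,
        "gpu_gb": megapixels * GPU_GB_PER_MEGAPIXEL,
    }

for name, (w, h) in {"1080p": (1920, 1080), "1440p": (2560, 1440), "4K": (3840, 2160)}.items():
    est = estimate(w, h)
    print(f"{name}: {est['megapixels']:.1f} MP, ~{est['seconds']:.1f} s, ~{est['gpu_gb']:.1f} GB GPU")
```

The estimates land close to the table rows, which suggests the table itself follows a near-linear relationship.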
## Usage
### PyTorch
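```python
import numpy as np
import torch
from PIL import Image

# NAFNet is defined in the original megvii-research/NAFNet repository;
# this import assumes that repository is on your Python path.
from basicsr.models.archs.NAFNet_arch import NAFNet

# Load model
model = NAFNet(img_channel=3, width=32, middle_blk_num=12,
               enc_blk_nums=[2, 2, 4, 8], dec_blk_nums=[2, 2, 2, 2])
checkpoint = torch.load("nafnet_realestate.pth", map_location="cpu")
model.load_state_dict(checkpoint["params"])
model.eval()

# Process image (force 3-channel RGB so grayscale/RGBA inputs also work)
img = Image.open("input.jpg").convert("RGB")
img_tensor = torch.from_numpy(np.array(img)).permute(2, 0, 1).unsqueeze(0).float() / 255.0
with torch.no_grad():
    output = model(img_tensor)

# Clamp to [0, 1] before converting to 8-bit to avoid wrap-around artifacts
output = output.clamp(0, 1)
output_img = (output.squeeze(0).permute(1, 2, 0).numpy() * 255).round().astype(np.uint8)
Image.fromarray(output_img).save("enhanced.jpg")
```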
### ONNX Runtime
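```python
import numpy as np
import onnxruntime as ort
from PIL import Image

sess = ort.InferenceSession("nafnet_realestate.onnx")
# Look up the input name from the model instead of hard-coding it
input_name = sess.get_inputs()[0].name

# Load as RGB, scale to [0, 1], and convert HWC -> NCHW
img = np.array(Image.open("input.jpg").convert("RGB")).astype(np.float32) / 255.0
img = img.transpose(2, 0, 1)[np.newaxis, ...]
output = sess.run(None, {input_name: img})[0]

# Clip to [0, 1] before converting to 8-bit to avoid wrap-around artifacts
output_img = (np.clip(output[0], 0, 1).transpose(1, 2, 0) * 255).round().astype(np.uint8)
Image.fromarray(output_img).save("enhanced.jpg")
```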
## Mobile Deployment (iOS)
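The host RAM figures above (under 1 GB at every tested resolution) fit within typical mobile RAM budgets (3-4 GB). GPU memory use is much higher at large resolutions, so large images may need downscaling or tiling on device. To deploy on iOS: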
1. Convert ONNX to Core ML on macOS:
```bash
pip install coremltools
python convert_on_mac.py
```
2. Add the generated `.mlpackage` to your Xcode project
3. Use the Vision framework for inference
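
One common way to keep peak memory bounded on constrained devices is to process an image in overlapping tiles rather than all at once. The sketch below only computes the tile coordinates. Tiling is not described by the author, the `tile_boxes` helper and its default `tile` and `overlap` values are hypothetical, and tiled output may differ from full-image output, so treat this as a starting point to validate rather than a recommended pipeline.

```python
def tile_boxes(width, height, tile=512, overlap=32):
    """Return (left, top, right, bottom) boxes that together cover the image."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    step = tile - overlap

    def starts(length):
        # An image side no longer than one tile needs a single tile at 0.
        if length <= tile:
            return [0]
        # Smallest tile count whose spacing keeps at least `overlap` pixels shared
        count = -(-(length - overlap) // step)  # ceiling division
        # Spread tiles evenly; the last one ends exactly at the image edge
        return [i * (length - tile) // (count - 1) for i in range(count)]

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]

boxes = tile_boxes(1920, 1080)
print(len(boxes), boxes[0], boxes[-1])
```

Each box can be cropped, run through the model, and pasted back, with the overlapping borders blended or discarded to hide seams.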
## Training
- **Framework**: BasicSR + PyTorch
- **Base Model**: NAFNet-SIDD-width32 (pretrained on denoising)
- **Loss**: L1 + Perceptual (VGG19)
- **Optimizer**: AdamW (lr=1e-3)
- **Iterations**: 12,000
## License
Apache 2.0
## Citation
```bibtex
@article{chen2022simple,
title={Simple Baselines for Image Restoration},
author={Chen, Liangyu and Chu, Xiaojie and Zhang, Xiangyu and Sun, Jian},
journal={arXiv preprint arXiv:2204.04676},
year={2022}
}
```
## Links
- **GitHub**: [SebRincon/pixel-sorcery](https://github.com/SebRincon/pixel-sorcery/tree/sebastian/nafnet-realestate)
- **Original NAFNet**: [megvii-research/NAFNet](https://github.com/megvii-research/NAFNet)