---
title: HAT Super-Resolution for Satellite Images
emoji: 🛰️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.46.1
app_file: app.py
pinned: false
---

# HATSAT - Super-Resolution for Satellite Images

This Hugging Face Space demonstrates a fine-tuned **Hybrid Attention Transformer (HAT)** model for satellite image super-resolution. The model performs 4x upscaling of satellite imagery, enhancing the resolution while preserving important geographical and structural details.

## Model Details

- **Architecture**: HAT (Hybrid Attention Transformer)
- **Upscaling Factor**: 4x
- **Input Channels**: 3 (RGB)
- **Training**: Fine-tuned on a satellite imagery dataset
- **Base Model**: ImageNet-pretrained HAT

## Model Configuration

- **Window Size**: 16
- **Embed Dimension**: 180
- **Depths**: [6, 6, 6, 6, 6, 6]
- **Number of Heads**: [6, 6, 6, 6, 6, 6]
- **Compress Ratio**: 3
- **Squeeze Factor**: 30
- **Overlap Ratio**: 0.5
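
These values map onto the keyword arguments of the `HAT` architecture class in the official XPixelGroup/HAT repository; a sketch of the configuration as a plain dict (the argument names follow that codebase and are an assumption — verify them against your checkout):

```python
# Hyperparameters from the list above; keys mirror the HAT
# constructor arguments in the official repo (assumed names).
hat_config = {
    "upscale": 4,            # 4x super-resolution
    "in_chans": 3,           # RGB input
    "window_size": 16,
    "embed_dim": 180,
    "depths": [6, 6, 6, 6, 6, 6],
    "num_heads": [6, 6, 6, 6, 6, 6],
    "compress_ratio": 3,
    "squeeze_factor": 30,
    "overlap_ratio": 0.5,
}

# With the repo installed, the model would be built roughly as:
# from hat.archs.hat_arch import HAT
# model = HAT(**hat_config)
```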

## Usage

1. Upload a satellite image (RGB format)
2. The model will automatically upscale it by 4x
3. Download the enhanced high-resolution result
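
At tensor level the flow is simply: RGB image → `(3, H, W)` tensor → model forward → `(3, 4H, 4W)` tensor. A shape-level sketch, with bicubic interpolation standing in for the real HAT forward pass (the `upscale_4x` helper is illustrative, not the Space's actual code):

```python
import torch
import torch.nn.functional as F

def upscale_4x(img: torch.Tensor) -> torch.Tensor:
    """Upscale a (3, H, W) RGB tensor in [0, 1] to (3, 4H, 4W).

    Bicubic interpolation is a stand-in here; the Space runs the
    fine-tuned HAT network instead.
    """
    batched = img.unsqueeze(0)                       # (1, 3, H, W)
    out = F.interpolate(batched, scale_factor=4,
                        mode="bicubic", align_corners=False)
    return out.squeeze(0).clamp(0.0, 1.0)            # (3, 4H, 4W)

lr = torch.rand(3, 64, 64)        # a fake low-res satellite tile
sr = upscale_4x(lr)
print(sr.shape)                   # torch.Size([3, 256, 256])
```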

## Training Details

The model was fine-tuned using:
- **Loss Function**: L1Loss
- **Optimizer**: Adam (lr=2e-5)
- **Training Iterations**: 20,000
- **Scheduler**: MultiStepLR with milestones at [10000, 50000, 100000, 130000, 140000] (only the first milestone falls within the 20,000-iteration run)
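
In PyTorch terms the setup looks like the following. The decay factor `gamma=0.5` and the placeholder network are assumptions — the exact gamma used for this fine-tune is not stated above:

```python
import torch

model = torch.nn.Conv2d(3, 3, 3, padding=1)   # placeholder for the HAT network
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer,
    milestones=[10_000, 50_000, 100_000, 130_000, 140_000],
    gamma=0.5,  # assumed decay factor
)

for it in range(20_000):          # 20,000 training iterations
    # ... forward pass, L1 loss, optimizer.step() would go here ...
    scheduler.step()

# Only the 10,000 milestone fires within 20,000 iterations,
# so the learning rate is halved exactly once: 2e-5 -> 1e-5.
final_lr = optimizer.param_groups[0]["lr"]
```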

## Applications

This model is particularly useful for:
- Enhancing low-resolution satellite imagery
- Geographic analysis and mapping
- Environmental monitoring
- Urban planning and development
- Agricultural monitoring

## Technical Implementation

The model implements several key architectural components:
- **Hybrid Attention Blocks (HAB)**: Combine window-based self-attention with channel attention
- **Overlapping Cross-Attention Blocks (OCAB)**: Strengthen feature interaction across window boundaries
- **Residual Hybrid Attention Groups (RHAG)**: Stacked attention layers with residual connections
- **Channel Attention Blocks (CAB)**: Channel-wise feature refinement
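
To illustrate the last component: channel attention squeezes the channel dimension by `squeeze_factor` (30 above), re-expands it, and gates each channel with a sigmoid weight. A minimal sketch in the HAT style, not the repository's exact implementation:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style gating as used inside HAT's CAB."""

    def __init__(self, num_feat: int = 180, squeeze_factor: int = 30):
        super().__init__()
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                           # (B, C, 1, 1) summary
            nn.Conv2d(num_feat, num_feat // squeeze_factor, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(num_feat // squeeze_factor, num_feat, 1),
            nn.Sigmoid(),                                      # per-channel gate in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.attention(x)   # rescale each channel by its gate

x = torch.rand(1, 180, 16, 16)
y = ChannelAttention()(x)
print(y.shape)   # same shape as the input: torch.Size([1, 180, 16, 16])
```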

## Performance

The model was trained for 20,000 iterations, with PSNR and SSIM monitored throughout on satellite imagery validation data.
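
PSNR, one of the two monitored metrics, follows directly from the mean squared error between the super-resolved and ground-truth images: for images scaled to [0, 1], PSNR = 10 · log10(1 / MSE). A reference implementation:

```python
import numpy as np

def psnr(pred: np.ndarray, target: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10((max_val ** 2) / mse)

target = np.zeros((8, 8))
pred = np.full((8, 8), 0.1)   # constant error of 0.1 -> MSE = 0.01
print(psnr(pred, target))     # 10 * log10(1 / 0.01) = 20.0 dB
```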

## Acknowledgments

This model is a fine-tuned version of **HAT (Hybrid Attention Transformer)**, trained on the **SEN2NAIPv2** dataset.

### Base Model: HAT

- **GitHub Repository**: [https://github.com/XPixelGroup/HAT](https://github.com/XPixelGroup/HAT)
- **Paper**: [Activating More Pixels in Image Super-Resolution Transformer](https://arxiv.org/abs/2205.04437)
- **Authors**: Xiangyu Chen, Xintao Wang, Jiantao Zhou, Yu Qiao, Chao Dong

### Training Dataset: SEN2NAIPv2

- **HuggingFace Dataset**: [https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2](https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2)
- **Description**: High-resolution satellite imagery dataset for super-resolution tasks

## Citation

If you use this model in your research, please cite both the original HAT paper and the SEN2NAIPv2 dataset:

```bibtex
@article{chen2023hat,
  title={Activating More Pixels in Image Super-Resolution Transformer},
  author={Chen, Xiangyu and Wang, Xintao and Zhou, Jiantao and Qiao, Yu and Dong, Chao},
  journal={arXiv preprint arXiv:2205.04437},
  year={2022}
}

@misc{sen2naipv2,
  title={SEN2NAIPv2: A Large-Scale Dataset for Satellite Image Super-Resolution},
  author={TACO Foundation},
  year={2024},
  url={https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2}
}
```