---
title: HAT Super-Resolution for Satellite Images
emoji: 🛰️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.46.1
app_file: app.py
pinned: false
---

# HATSAT - Super-Resolution for Satellite Images

This Hugging Face Space demonstrates a fine-tuned **Hybrid Attention Transformer (HAT)** model for satellite image super-resolution. The model performs 4x upscaling of satellite imagery, enhancing resolution while preserving important geographical and structural details.

## Model Details

- **Architecture**: HAT (Hybrid Attention Transformer)
- **Upscaling Factor**: 4x
- **Input Channels**: 3 (RGB)
- **Training**: Fine-tuned on a satellite imagery dataset
- **Base Model**: Pre-trained HAT model (ImageNet pre-training)

## Model Configuration

- **Window Size**: 16
- **Embed Dimension**: 180
- **Depths**: [6, 6, 6, 6, 6, 6]
- **Number of Heads**: [6, 6, 6, 6, 6, 6]
- **Compress Ratio**: 3
- **Squeeze Factor**: 30
- **Overlap Ratio**: 0.5

## Usage

1. Upload a satellite image (RGB format)
2. The model will automatically upscale it by 4x
3. Download the enhanced high-resolution result

## Training Details

The model was fine-tuned using:

- **Loss Function**: L1Loss
- **Optimizer**: Adam (lr=2e-5)
- **Training Iterations**: 20,000
- **Scheduler**: MultiStepLR with milestones at [10000, 50000, 100000, 130000, 140000]

## Applications

This model is particularly useful for:

- Enhancing low-resolution satellite imagery
- Geographic analysis and mapping
- Environmental monitoring
- Urban planning and development
- Agricultural monitoring

## Technical Implementation

The model implements several key architectural components:

- **Hybrid Attention Blocks (HAB)**: Combine window-based self-attention with channel attention
- **Overlapping Cross-Attention Blocks (OCAB)**: Enhance feature interaction across window boundaries
- **Residual Hybrid Attention Groups (RHAG)**: Stacked attention layers with residual connections
- **Channel Attention Blocks (CAB)**: Refine features channel-wise

## Performance

The model was trained for 20,000 iterations, with PSNR and SSIM monitored on satellite imagery validation data.

## Acknowledgments

This model is a fine-tuned version of **HAT (Hybrid Attention Transformer)**, trained on the **SEN2NAIPv2** dataset.
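The MultiStepLR schedule from the training details above can be sketched as a small helper that returns the learning rate in effect at a given iteration. Note that the decay factor `gamma` is not stated in this README; `gamma=0.5` is assumed here for illustration, and only the base learning rate and milestones come from the configuration above:

```python
def lr_at(iteration,
          base_lr=2e-5,
          milestones=(10000, 50000, 100000, 130000, 140000),
          gamma=0.5):
    """Learning rate MultiStepLR would yield at `iteration`.

    The base rate is multiplied by `gamma` once for every milestone
    already reached. NOTE: gamma=0.5 is an assumption; the README
    states only base_lr and the milestones.
    """
    passed = sum(1 for m in milestones if iteration >= m)
    return base_lr * (gamma ** passed)

# Under these assumptions, training starts at 2e-5 and halves at
# iteration 10,000, so the 20,000-iteration run uses two LR values.
```

Since training stops at 20,000 iterations, only the first milestone (10,000) actually takes effect; the later milestones appear to be carried over from a longer base training config.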
### Base Model: HAT

- **GitHub Repository**: [https://github.com/XPixelGroup/HAT](https://github.com/XPixelGroup/HAT)
- **Paper**: [Activating More Pixels in Image Super-Resolution Transformer](https://arxiv.org/abs/2205.04437)
- **Authors**: Xiangyu Chen, Xintao Wang, Jiantao Zhou, Yu Qiao, Chao Dong

### Training Dataset: SEN2NAIPv2

- **HuggingFace Dataset**: [https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2](https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2)
- **Description**: High-resolution satellite imagery dataset for super-resolution tasks

## Citation

If you use this model in your research, please cite both the original HAT paper and the SEN2NAIPv2 dataset:

```bibtex
@article{chen2023hat,
  title={Activating More Pixels in Image Super-Resolution Transformer},
  author={Chen, Xiangyu and Wang, Xintao and Zhou, Jiantao and Qiao, Yu and Dong, Chao},
  journal={arXiv preprint arXiv:2205.04437},
  year={2022}
}

@misc{sen2naipv2,
  title={SEN2NAIPv2: A Large-Scale Dataset for Satellite Image Super-Resolution},
  author={TACO Foundation},
  year={2024},
  url={https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2}
}
```
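For intuition on the window-based attention used by HAT (Window Size 16 in the configuration above), the partitioning step can be sketched in pure Python. The function name and the nested-list representation are illustrative only, not HAT's actual implementation, which operates on batched feature tensors:

```python
def window_partition(pixels, window_size=16):
    """Split a 2-D grid (list of rows) into non-overlapping square windows.

    Illustrative only: in HAT, self-attention is computed independently
    inside each such window, while OCAB's overlapping windows (Overlap
    Ratio 0.5 above) let information cross these window boundaries.
    """
    h, w = len(pixels), len(pixels[0])
    assert h % window_size == 0 and w % window_size == 0, "grid must tile evenly"
    windows = []
    for top in range(0, h, window_size):
        for left in range(0, w, window_size):
            windows.append([row[left:left + window_size]
                            for row in pixels[top:top + window_size]])
    return windows

# A 32x32 grid yields four 16x16 windows, each attended to independently.
grid = [[(r, c) for c in range(32)] for r in range(32)]
wins = window_partition(grid)
```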