---
title: HAT Super-Resolution for Satellite Images
emoji: 🛰️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.46.1
app_file: app.py
pinned: false
---

# HATSAT - Super-Resolution for Satellite Images

This Hugging Face Space demonstrates a fine-tuned **Hybrid Attention Transformer (HAT)** model for satellite image super-resolution. The model performs 4x upscaling of satellite imagery, enhancing resolution while preserving important geographical and structural details.

## Model Details

- **Architecture**: HAT (Hybrid Attention Transformer)
- **Upscaling Factor**: 4x
- **Input Channels**: 3 (RGB)
- **Training**: Fine-tuned on a satellite imagery dataset
- **Base Model**: Pre-trained HAT model (ImageNet pre-training)

## Model Configuration

- **Window Size**: 16
- **Embed Dimension**: 180
- **Depths**: [6, 6, 6, 6, 6, 6]
- **Number of Heads**: [6, 6, 6, 6, 6, 6]
- **Compress Ratio**: 3
- **Squeeze Factor**: 30
- **Overlap Ratio**: 0.5

## Usage

1. Upload a satellite image (RGB format)
2. The model will automatically upscale it by 4x
3. Download the enhanced high-resolution result

## Training Details

The model was fine-tuned using:

- **Loss Function**: L1Loss
- **Optimizer**: Adam (lr=2e-5)
- **Training Iterations**: 20,000
- **Scheduler**: MultiStepLR with milestones at [10000, 50000, 100000, 130000, 140000]

## Applications

This model is particularly useful for:

- Enhancing low-resolution satellite imagery
- Geographic analysis and mapping
- Environmental monitoring
- Urban planning and development
- Agricultural monitoring

## Technical Implementation

The model implements several key architectural components:

- **Hybrid Attention Blocks (HAB)**: Combine window-based self-attention with channel attention
- **Overlapping Cross-Attention Blocks (OCAB)**: Enhance feature interaction across window boundaries
- **Residual Hybrid Attention Groups (RHAG)**: Stacked attention layers with residual connections
- **Channel Attention Blocks (CAB)**: Refine features channel-wise

## Performance

The model was trained for 20,000 iterations, with PSNR and SSIM monitored on satellite imagery validation data.

## Acknowledgments

This model is a fine-tuned version of **HAT (Hybrid Attention Transformer)**, trained on the **SEN2NAIPv2** dataset.
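The MultiStepLR schedule from the training details above can be sketched as a small helper that returns the learning rate in effect at a given iteration. Note that the decay factor `gamma` is not stated in this README; `gamma=0.5` is assumed here for illustration, and only the base learning rate and milestones come from the configuration above:

```python
def lr_at(iteration,
          base_lr=2e-5,
          milestones=(10000, 50000, 100000, 130000, 140000),
          gamma=0.5):
    """Learning rate MultiStepLR would yield at `iteration`.

    The base rate is multiplied by `gamma` once for every milestone
    already reached. NOTE: gamma=0.5 is an assumption; the README
    states only base_lr and the milestones.
    """
    passed = sum(1 for m in milestones if iteration >= m)
    return base_lr * (gamma ** passed)

# Under these assumptions, training starts at 2e-5 and halves at
# iteration 10,000, so the 20,000-iteration run uses two LR values.
```

Since training stops at 20,000 iterations, only the first milestone (10,000) actually takes effect; the later milestones appear to be carried over from a longer base training config.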
### Base Model: HAT

- **GitHub Repository**: [https://github.com/XPixelGroup/HAT](https://github.com/XPixelGroup/HAT)
- **Paper**: [Activating More Pixels in Image Super-Resolution Transformer](https://arxiv.org/abs/2205.04437)
- **Authors**: Xiangyu Chen, Xintao Wang, Jiantao Zhou, Yu Qiao, Chao Dong

### Training Dataset: SEN2NAIPv2

- **HuggingFace Dataset**: [https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2](https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2)
- **Description**: High-resolution satellite imagery dataset for super-resolution tasks

## Citation

If you use this model in your research, please cite both the original HAT paper and the SEN2NAIPv2 dataset:

```bibtex
@article{chen2023hat,
  title={Activating More Pixels in Image Super-Resolution Transformer},
  author={Chen, Xiangyu and Wang, Xintao and Zhou, Jiantao and Qiao, Yu and Dong, Chao},
  journal={arXiv preprint arXiv:2205.04437},
  year={2022}
}

@misc{sen2naipv2,
  title={SEN2NAIPv2: A Large-Scale Dataset for Satellite Image Super-Resolution},
  author={TACO Foundation},
  year={2024},
  url={https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2}
}
```
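For intuition on the window-based attention used by HAT (Window Size 16 in the configuration above), the partitioning step can be sketched in pure Python. The function name and the nested-list representation are illustrative only, not HAT's actual implementation, which operates on batched feature tensors:

```python
def window_partition(pixels, window_size=16):
    """Split a 2-D grid (list of rows) into non-overlapping square windows.

    Illustrative only: in HAT, self-attention is computed independently
    inside each such window, while OCAB's overlapping windows (Overlap
    Ratio 0.5 above) let information cross these window boundaries.
    """
    h, w = len(pixels), len(pixels[0])
    assert h % window_size == 0 and w % window_size == 0, "grid must tile evenly"
    windows = []
    for top in range(0, h, window_size):
        for left in range(0, w, window_size):
            windows.append([row[left:left + window_size]
                            for row in pixels[top:top + window_size]])
    return windows

# A 32x32 grid yields four 16x16 windows, each attended to independently.
grid = [[(r, c) for c in range(32)] for r in range(32)]
wins = window_partition(grid)
```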