---
title: HAT Super-Resolution for Satellite Images
emoji: 🛰️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.46.1
app_file: app.py
pinned: false
---

# HATSAT - Super-Resolution for Satellite Images

This Hugging Face Space demonstrates a fine-tuned **Hybrid Attention Transformer (HAT)** model for satellite image super-resolution. The model performs 4x upscaling of satellite imagery, enhancing the resolution while preserving important geographical and structural details.

## Model Details

- **Architecture**: HAT (Hybrid Attention Transformer)
- **Upscaling Factor**: 4x
- **Input Channels**: 3 (RGB)
- **Training**: Fine-tuned on a satellite imagery dataset
- **Base Model**: ImageNet-pretrained HAT

## Model Configuration

- **Window Size**: 16
- **Embed Dimension**: 180
- **Depths**: [6, 6, 6, 6, 6, 6]
- **Number of Heads**: [6, 6, 6, 6, 6, 6]
- **Compress Ratio**: 3
- **Squeeze Factor**: 30
- **Overlap Ratio**: 0.5
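
These values map onto the keyword arguments of the `HAT` architecture class in the official XPixelGroup/HAT repository; a sketch of the configuration as a plain dict (the argument names follow that codebase and are an assumption — verify them against your checkout):

```python
# Hyperparameters from the list above; keys mirror the HAT
# constructor arguments in the official repo (assumed names).
hat_config = {
    "upscale": 4,            # 4x super-resolution
    "in_chans": 3,           # RGB input
    "window_size": 16,
    "embed_dim": 180,
    "depths": [6, 6, 6, 6, 6, 6],
    "num_heads": [6, 6, 6, 6, 6, 6],
    "compress_ratio": 3,
    "squeeze_factor": 30,
    "overlap_ratio": 0.5,
}

# With the repo installed, the model would be built roughly as:
# from hat.archs.hat_arch import HAT
# model = HAT(**hat_config)
```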

## Usage

1. Upload a satellite image (RGB format)
2. The model will automatically upscale it by 4x
3. Download the enhanced high-resolution result
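
At tensor level the flow is simply: RGB image → `(3, H, W)` tensor → model forward → `(3, 4H, 4W)` tensor. A shape-level sketch, with bicubic interpolation standing in for the real HAT forward pass (the `upscale_4x` helper is illustrative, not the Space's actual code):

```python
import torch
import torch.nn.functional as F

def upscale_4x(img: torch.Tensor) -> torch.Tensor:
    """Upscale a (3, H, W) RGB tensor in [0, 1] to (3, 4H, 4W).

    Bicubic interpolation is a stand-in here; the Space runs the
    fine-tuned HAT network instead.
    """
    batched = img.unsqueeze(0)                       # (1, 3, H, W)
    out = F.interpolate(batched, scale_factor=4,
                        mode="bicubic", align_corners=False)
    return out.squeeze(0).clamp(0.0, 1.0)            # (3, 4H, 4W)

lr = torch.rand(3, 64, 64)        # a fake low-res satellite tile
sr = upscale_4x(lr)
print(sr.shape)                   # torch.Size([3, 256, 256])
```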

## Training Details

The model was fine-tuned using:
- **Loss Function**: L1Loss
- **Optimizer**: Adam (lr=2e-5)
- **Training Iterations**: 20,000
- **Scheduler**: MultiStepLR with milestones at [10000, 50000, 100000, 130000, 140000] (only the first milestone falls within the 20,000-iteration run)
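
In PyTorch terms the setup looks like the following. The decay factor `gamma=0.5` and the placeholder network are assumptions — the exact gamma used for this fine-tune is not stated above:

```python
import torch

model = torch.nn.Conv2d(3, 3, 3, padding=1)   # placeholder for the HAT network
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer,
    milestones=[10_000, 50_000, 100_000, 130_000, 140_000],
    gamma=0.5,  # assumed decay factor
)

for it in range(20_000):          # 20,000 training iterations
    # ... forward pass, L1 loss, optimizer.step() would go here ...
    scheduler.step()

# Only the 10,000 milestone fires within 20,000 iterations,
# so the learning rate is halved exactly once: 2e-5 -> 1e-5.
final_lr = optimizer.param_groups[0]["lr"]
```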

## Applications

This model is particularly useful for:
- Enhancing low-resolution satellite imagery
- Geographic analysis and mapping
- Environmental monitoring
- Urban planning and development
- Agricultural monitoring

## Technical Implementation

The model implements several key architectural components:
- **Hybrid Attention Blocks (HAB)**: Combine window-based self-attention with channel attention
- **Overlapping Cross-Attention Blocks (OCAB)**: Strengthen feature interaction across window boundaries
- **Residual Hybrid Attention Groups (RHAG)**: Stacked attention layers with residual connections
- **Channel Attention Blocks (CAB)**: Channel-wise feature refinement
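
To illustrate the last component: channel attention squeezes the channel dimension by `squeeze_factor` (30 above), re-expands it, and gates each channel with a sigmoid weight. A minimal sketch in the HAT style, not the repository's exact implementation:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style gating as used inside HAT's CAB."""

    def __init__(self, num_feat: int = 180, squeeze_factor: int = 30):
        super().__init__()
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                           # (B, C, 1, 1) summary
            nn.Conv2d(num_feat, num_feat // squeeze_factor, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(num_feat // squeeze_factor, num_feat, 1),
            nn.Sigmoid(),                                      # per-channel gate in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.attention(x)   # rescale each channel by its gate

x = torch.rand(1, 180, 16, 16)
y = ChannelAttention()(x)
print(y.shape)   # same shape as the input: torch.Size([1, 180, 16, 16])
```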

## Performance

The model was trained for 20,000 iterations, with PSNR and SSIM monitored throughout on satellite imagery validation data.
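
PSNR, one of the two monitored metrics, follows directly from the mean squared error between the super-resolved and ground-truth images: for images scaled to [0, 1], PSNR = 10 · log10(1 / MSE). A reference implementation:

```python
import numpy as np

def psnr(pred: np.ndarray, target: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10((max_val ** 2) / mse)

target = np.zeros((8, 8))
pred = np.full((8, 8), 0.1)   # constant error of 0.1 -> MSE = 0.01
print(psnr(pred, target))     # 10 * log10(1 / 0.01) = 20.0 dB
```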

## Acknowledgments

This model is a fine-tuned version of **HAT (Hybrid Attention Transformer)**, trained on the **SEN2NAIPv2** dataset.

### Base Model: HAT

- **GitHub Repository**: [https://github.com/XPixelGroup/HAT](https://github.com/XPixelGroup/HAT)
- **Paper**: [Activating More Pixels in Image Super-Resolution Transformer](https://arxiv.org/abs/2205.04437)
- **Authors**: Xiangyu Chen, Xintao Wang, Jiantao Zhou, Yu Qiao, Chao Dong

### Training Dataset: SEN2NAIPv2

- **HuggingFace Dataset**: [https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2](https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2)
- **Description**: High-resolution satellite imagery dataset for super-resolution tasks

## Citation

If you use this model in your research, please cite both the original HAT paper and the SEN2NAIPv2 dataset:

```bibtex
@article{chen2023hat,
  title={Activating More Pixels in Image Super-Resolution Transformer},
  author={Chen, Xiangyu and Wang, Xintao and Zhou, Jiantao and Qiao, Yu and Dong, Chao},
  journal={arXiv preprint arXiv:2205.04437},
  year={2022}
}

@misc{sen2naipv2,
  title={SEN2NAIPv2: A Large-Scale Dataset for Satellite Image Super-Resolution},
  author={TACO Foundation},
  year={2024},
  url={https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2}
}
```