RSP-Swin-T
Swin Transformer Tiny model for remote sensing scene classification (51 classes).
Usage
from transformers import AutoModelForImageClassification
import torch
# Load model
model = AutoModelForImageClassification.from_pretrained(
    "BiliSakura/RSP-Swin-T",
    trust_remote_code=True
)
# Inference
model.eval()
input_image = torch.randn(1, 3, 224, 224)  # dummy input: (batch, channels, height, width)
with torch.no_grad():
    outputs = model(pixel_values=input_image)
logits = outputs["logits"] # Shape: (1, 51)
predicted_class = logits.argmax(dim=-1).item()
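The argmax above returns a single class index only. As a minimal sketch, the logits can also be turned into softmax probabilities and a ranked top-k list. `top_k_predictions` is a hypothetical helper, not part of the model's API, and it assumes one row of logits has already been converted to a plain Python list, for example with `logits[0].tolist()`.

```python
import math

def top_k_predictions(logit_row, k=5):
    """Return the k most probable (class_index, probability) pairs."""
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logit_row)
    exps = [math.exp(x - peak) for x in logit_row]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]
```

A call such as `top_k_predictions(logits[0].tolist(), k=3)` would give the three most likely class indices with their probabilities. This card does not list the index-to-label mapping, so the indices still need to be matched against the label order used in training.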
Model Details
- Architecture: Swin Transformer Tiny
- Input size: 224×224×3
- Number of classes: 51
- Parameters: ~27.6M
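
The figures above can be related to the architecture with a little arithmetic. The sketch below assumes the standard Swin-T configuration (patch size 4, embedding dimension 96, four stages with patch merging between them, and a single linear head on 768-dimensional features). This card does not confirm those values, and `stage_shapes` and `head_parameters` are hypothetical helpers for illustration.

```python
# Assumed standard Swin-T hyperparameters (not stated in this model card).
PATCH_SIZE = 4
EMBED_DIM = 96
NUM_STAGES = 4

def stage_shapes(image_size=224):
    """Return (side_length, channels) of the feature map at each stage."""
    side = image_size // PATCH_SIZE
    channels = EMBED_DIM
    shapes = []
    for stage in range(NUM_STAGES):
        shapes.append((side, channels))
        if stage < NUM_STAGES - 1:
            # Patch merging halves the resolution and doubles the channels.
            side //= 2
            channels *= 2
    return shapes

def head_parameters(num_classes=51, feature_dim=768):
    """Weights plus biases of a single linear classification head."""
    return feature_dim * num_classes + num_classes

print(stage_shapes())     # [(56, 96), (28, 192), (14, 384), (7, 768)]
print(head_parameters())  # 39219
```

Under these assumptions the 51-class head accounts for well under 1% of the ~27.6M parameters; nearly all of them sit in the backbone.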
Citation
If you use this model, please cite the original RSP paper:
@ARTICLE{rsp,
author={Wang, Di and Zhang, Jing and Du, Bo and Xia, Gui-Song and Tao, Dacheng},
journal={IEEE Transactions on Geoscience and Remote Sensing},
title={An Empirical Study of Remote Sensing Pretraining},
year={2023},
volume={61},
number={},
pages={1-20},
doi={10.1109/TGRS.2022.3176603}
}
Original Repository: ViTAE-Transformer/RSP