🌌 deit-gravit-s1
🔭 This model is part of GraViT: Transfer Learning with Vision Transformers and MLP-Mixer for Strong Gravitational Lens Discovery
🔗 GitHub Repository: https://github.com/parlange/gravit
🛰️ Model Details
💻 Quick Start
import torch
import timm

# Load the fine-tuned weights from the Hugging Face Hub
model = timm.create_model(
    'hf-hub:parlange/deit-gravit-s1',
    pretrained=True
)
model.eval()

# Dummy batch: one 3-channel 224x224 image
dummy_input = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    output = model(dummy_input)
    predictions = torch.softmax(output, dim=1)

# Index 1 holds the lens-class probability
print(f"Lens probability: {predictions[0][1]:.4f}")
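The last step of the quick start converts the model's two logits into a lens probability. The sketch below reproduces that step in plain Python so the arithmetic is visible. It assumes index 1 is the lens class (as the print statement above implies); the helper names and the 0.5 decision threshold are hypothetical and not part of the released model.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def lens_probability(logits, lens_index=1):
    """Probability assigned to the lens class (index 1 is assumed)."""
    return softmax(logits)[lens_index]

def is_lens_candidate(logits, threshold=0.5):
    """Hypothetical decision rule: flag as lens if probability >= threshold."""
    return lens_probability(logits) >= threshold

# Equal logits give equal class probabilities
print(lens_probability([0.0, 0.0]))  # 0.5
```

In a survey setting the threshold would typically be tuned on validation data rather than fixed at 0.5, since lenses are rare compared with non-lenses.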
⚡️ Training Configuration
Training Dataset: C21 (Cañameras et al. 2021)
Fine-tuning Strategy: classification-head
| 🔧 Parameter | 📝 Value |
|---|---|
| Batch Size | 192 |
| Learning Rate | AdamW with ReduceLROnPlateau |
| Epochs | 100 |
| Patience | 10 |
| Optimizer | AdamW |
| Scheduler | ReduceLROnPlateau |
| Image Size | 224x224 |
| Fine Tune Mode | classification_head |
| Stochastic Depth Probability | 0.1 |
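The configuration above describes classification-head fine-tuning with AdamW and ReduceLROnPlateau. The following is a minimal sketch of that pattern, not the authors' training script. `TinyClassifier` is a hypothetical stand-in for the pretrained network, the learning rate of 1e-3 is an assumed value, and mapping the table's patience of 10 onto the scheduler is also an assumption (it may refer to early stopping instead). The training loss stands in for a validation loss to keep the example short.

```python
import torch
from torch import nn

class TinyClassifier(nn.Module):
    """Hypothetical stand-in: a 'backbone' plus a 2-class 'head'."""
    def __init__(self, in_features=16, hidden=8, num_classes=2):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_features, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):
        return self.head(self.backbone(x))

torch.manual_seed(0)
model = TinyClassifier()

# Classification-head mode: freeze every parameter, then unfreeze the head
for p in model.parameters():
    p.requires_grad = False
for p in model.head.parameters():
    p.requires_grad = True

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-3)  # lr is an assumed value
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", patience=10  # patience mapping is an assumption
)
criterion = nn.CrossEntropyLoss()

# One random batch of 192 samples, matching the batch size in the table
x = torch.randn(192, 16)
y = torch.randint(0, 2, (192,))

backbone_before = [p.clone() for p in model.backbone.parameters()]
for epoch in range(3):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step(loss.item())  # stand-in for the validation loss
```

With a real timm model the same freezing loop would apply, with the classifier layer usually reachable through `model.get_classifier()`.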
📈 Training Curves

🏁 Final Epoch Training Metrics
| Metric | Training | Validation |
|---|---|---|
| 📉 Loss | 0.2024 | 0.2813 |
| 🎯 Accuracy | 0.9211 | 0.9000 |
| 📊 AUC-ROC | 0.9761 | 0.9566 |
| ⚖️ F1 Score | 0.9212 | 0.8996 |
☑️ Evaluation Results
ROC Curves and Confusion Matrices
Performance across all test datasets (a through l) in the Common Test Sample (More et al. 2024):

📋 Performance Summary
Average performance across 12 test datasets from the Common Test Sample (More et al. 2024):
| Metric | Value |
|---|---|
| 🎯 Average Accuracy | 0.8133 |
| 📈 Average AUC-ROC | 0.8334 |
| ⚖️ Average F1-Score | 0.5173 |
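To illustrate how per-dataset scores might be combined into averages like these, here is a small sketch that computes accuracy and F1 from confusion counts and takes an unweighted mean across datasets. Whether the reported averages are unweighted is an assumption, the function names are hypothetical, and the counts are made-up illustrative numbers rather than results from this model. AUC-ROC is omitted because it requires the full score distribution, not just confusion counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy and F1 for the positive (lens) class from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1}

def average_metrics(per_dataset):
    """Unweighted mean of each metric across datasets (an assumption)."""
    names = next(iter(per_dataset.values())).keys()
    n = len(per_dataset)
    return {m: sum(d[m] for d in per_dataset.values()) / n for m in names}

# Made-up counts (tp, fp, tn, fn) for two hypothetical datasets
counts = {"a": (50, 50, 850, 50), "b": (80, 20, 880, 20)}
per_dataset = {name: binary_metrics(*c) for name, c in counts.items()}
summary = average_metrics(per_dataset)
print(summary)
```

In the made-up dataset "a", accuracy is 0.90 while F1 is only 0.50, because the many true negatives raise accuracy but do not enter the F1 score. A gap of this kind between accuracy and F1 is common when the positive class is rare.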
📘 Citation
If you use this model in your research, please cite:
@misc{parlange2025gravit,
  title={GraViT: Transfer Learning with Vision Transformers and MLP-Mixer for Strong Gravitational Lens Discovery},
  author={René Parlange and Juan C. Cuevas-Tello and Octavio Valenzuela and Omar de J. Cabrera-Rosas and Tomás Verdugo and Anupreeta More and Anton T. Jaelani},
  year={2025},
  eprint={2509.00226},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2509.00226},
}
Model Card Contact
For questions about this model, please contact the author through: https://github.com/parlange/