STING-BEE 7B
STING-BEE is the first domain-aware visual AI assistant for X-ray baggage security. It unifies scene comprehension, referring threat localization, visual grounding, and visual question answering (VQA), establishing new benchmarks for multimodal learning in X-ray security research. It also demonstrates state-of-the-art generalization in cross-domain settings, outperforming existing models in real-world threat detection scenarios. It is trained on our public multimodal dataset, STCray, which features image-text pairs across 21 threat categories, including complex concealment and novel threat types such as IEDs and 3D-printed firearms.
Model Sources
- Repository: https://github.com/Divs1159/STING-BEE
- Website: https://divs1159.github.io/STING-BEE/
- Dataset: https://huggingface.co/datasets/Naoufel555/STCray-Dataset
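
As a convenience, the dataset link above can be turned into a local copy with the `huggingface_hub` client. The following is a minimal sketch, not an official loading script: the helper names `dataset_repo_id` and `download_stcray` and the `STCray` target directory are hypothetical, and the layout of the downloaded files is not described in this card, so inspect the folder after downloading.

```python
from urllib.parse import urlparse

# Dataset link as listed under Model Sources.
DATASET_URL = "https://huggingface.co/datasets/Naoufel555/STCray-Dataset"


def dataset_repo_id(url: str) -> str:
    """Extract '<owner>/<name>' from a Hugging Face dataset URL (hypothetical helper)."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    return "/".join(parts[1:3])


def download_stcray(local_dir: str = "STCray") -> str:
    """Download a snapshot of the dataset repo and return its local path.

    Assumes `pip install huggingface_hub`; the import is lazy so that
    dataset_repo_id() stays usable without the dependency installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=dataset_repo_id(DATASET_URL),
        repo_type="dataset",  # the link points to a dataset repo, not a model repo
        local_dir=local_dir,  # hypothetical target directory
    )
```

Calling `download_stcray()` fetches the repository contents over the network. Nothing is downloaded until the function is invoked.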
BibTeX
@article{velayudhan2025stingbee,
title={STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection},
author={Divya Velayudhan and Abdelfatah Ahmed and Mohamad Alansari and Neha Gour and Abderaouf Behouch and Taimur Hassan and Syed Talal Wasim and Nabil Maalej and Muzammal Naseer and Juergen Gall and Mohammed Bennamoun and Ernesto Damiani and Naoufel Werghi},
journal={CVPR},
year={2025}
}