aadex/faster-rcnn-rope-vit-tiny-coco

+---
+license: apache-2.0
+library_name: mmdetection
+tags:
+  - object-detection
+  - vision-transformer
+  - mmdetection
+  - pytorch
+  - faster-rcnn
+datasets:
+  - coco
+metrics:
+  - map
+---
+# Faster R-CNN with RoPE-ViT Backbone for Object Detection
+This is a Faster R-CNN object detector with a RoPE-ViT (Vision Transformer with Rotary Position Embeddings) backbone, trained on the COCO dataset.
+## Model Description
+- **Architecture:** Faster R-CNN
+- **Backbone:** RoPE-ViT Tiny
+- **Dataset:** COCO
+- **Task:** Object Detection
+- **Framework:** MMDetection
+## Training Results
+| Metric | Value |
+|--------|-------|
+| bbox_mAP | 0.0680 |
+| bbox_mAP_50 | 0.1510 |
+| bbox_mAP_75 | 0.0530 |
+| bbox_mAP_s (small) | 0.0360 |
+| bbox_mAP_m (medium) | 0.1260 |
+| bbox_mAP_l (large) | 0.0640 |
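To make the metric names concrete: in COCO-style evaluation, `bbox_mAP_50` and `bbox_mAP_75` count a prediction as a match when its IoU with a ground-truth box reaches 0.5 or 0.75, `bbox_mAP` averages over IoU thresholds from 0.5 to 0.95, and the small/medium/large rows bucket objects by area (below 32², between 32² and 96², above 96² pixels). The snippet below is a minimal sketch of those two definitions, not the evaluator that MMDetection runs. The helper names `box_iou` and `coco_size_bucket` are made up for this example, and box area is used as a stand-in for the annotated object area.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def coco_size_bucket(area):
    """Map an area in pixels to the COCO small / medium / large split."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


# Hypothetical ground-truth and predicted boxes, for illustration only.
gt_box = (0, 0, 10, 10)
pred_box = (2, 0, 12, 10)
iou = box_iou(gt_box, pred_box)

matches_at_50 = iou >= 0.5   # counted as a hit for bbox_mAP_50
matches_at_75 = iou >= 0.75  # counted as a hit for bbox_mAP_75
```

In this made-up case the prediction would count toward `bbox_mAP_50` but not `bbox_mAP_75`, which is the kind of localization gap that separates the two rows in the table.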
+## Usage
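+```python
+import torch
+from mmdet.apis import init_detector, inference_detector
+
+# Local paths to the model config and checkpoint
+config_file = 'faster_rcnn_rope_vit_tiny_coco.py'
+checkpoint_file = 'best_coco_bbox_mAP_epoch_12.pth'
+
+# Initialize the model, falling back to CPU when no GPU is available
+device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
+model = init_detector(config_file, checkpoint_file, device=device)
+
+# Run inference on a single image
+result = inference_detector(model, 'demo.jpg')
+```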
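What `inference_detector` returns depends on the MMDetection version. Assuming MMDetection 3.x, `result.pred_instances` carries `bboxes`, `scores`, and `labels` tensors, while 2.x returns a list of per-class arrays instead. The sketch below shows one way to keep only confident detections. `filter_detections` and the 0.3 threshold are illustrative choices rather than part of this repository, and the sample values are made up so the snippet runs without a model.

```python
def filter_detections(bboxes, scores, labels, score_thr=0.3):
    """Keep detections whose score is at least score_thr.

    bboxes: sequence of [x1, y1, x2, y2]; scores and labels: parallel sequences.
    """
    kept = []
    for box, score, label in zip(bboxes, scores, labels):
        if float(score) >= score_thr:
            kept.append({
                "bbox": [float(v) for v in box],
                "score": float(score),
                "label": int(label),
            })
    return kept


# With MMDetection 3.x (an assumption), the inputs could come from:
#   pred = result.pred_instances
#   detections = filter_detections(
#       pred.bboxes.tolist(), pred.scores.tolist(), pred.labels.tolist())

# Made-up values standing in for real model output.
sample = filter_detections(
    bboxes=[[10, 20, 110, 220], [5, 5, 50, 50]],
    scores=[0.82, 0.12],
    labels=[0, 17],
)
print(sample)
```

The threshold is worth tuning per use case: a lower value trades more false positives for higher recall.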
+## Training Configuration
+The model was trained with the following configuration:
+- Input size: 512x512
+- Training epochs: 12
+- Optimizer: SGD with momentum
+- Learning rate scheduler: Step decay
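For readers who want to see how the settings above might look in an MMDetection 3.x style config, here is a hedged sketch. Only the 512x512 input size, the 12 epochs, SGD with momentum, and step decay come from this card. The learning rate, momentum, weight decay, decay milestones, `keep_ratio`, and `val_interval` are placeholders borrowed from common MMDetection 1x schedules and are not confirmed for this model. `step_lr` is a hypothetical helper that only illustrates the decay rule.

```python
# Values marked "assumed" are placeholders, not taken from this model's config.
train_cfg = dict(type="EpochBasedTrainLoop", max_epochs=12, val_interval=1)  # val_interval assumed

optim_wrapper = dict(
    type="OptimWrapper",
    optimizer=dict(
        type="SGD",
        lr=0.02,              # assumed
        momentum=0.9,         # assumed
        weight_decay=0.0001,  # assumed
    ),
)

param_scheduler = [
    dict(
        type="MultiStepLR",
        begin=0,
        end=12,
        by_epoch=True,
        milestones=[8, 11],   # assumed
        gamma=0.1,            # assumed
    ),
]

# Resize step of the data pipeline; keep_ratio is assumed.
resize = dict(type="Resize", scale=(512, 512), keep_ratio=True)


def step_lr(epoch, base_lr=0.02, milestones=(8, 11), gamma=0.1):
    """Step decay: multiply the rate by gamma once for each milestone reached."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


# Learning rate per epoch under the assumed schedule.
schedule = [step_lr(e) for e in range(train_cfg["max_epochs"])]
```

Under these assumed values the rate stays at its base value for epochs 0 to 7, drops tenfold at epoch 8, and drops tenfold again at epoch 11.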
+## Citation
+If you use this model, please cite:
+```bibtex
+@misc{rope-vit-detection,
+  author = {VLG IITR},
+  title = {Faster R-CNN with RoPE-ViT for Object Detection},
+  year = {2026},
+  publisher = {Hugging Face},
+}
+```
+## License
+This model is released under the Apache 2.0 license.