---
license: apache-2.0
language: en
tags:
- object-detection
- yolo
- yolo11
- rtdetr
- rfdetr
- pytorch
- urban-traffic
datasets:
- BMD-45
pipeline_tag: object-detection
pretty_name: BMD-45
---
<div align="center">
<!-- Banner Image -->
<img width="50%" src="banner.png" alt="BMD-45 Banner">
<div align="center">
<a href="https://arxiv.org/abs/2511.02563" ><img src="assets/arxiv-logomark-small.svg" height="16" width="11.96" style="display: inline-block; vertical-align: middle; margin: 2px;"> <b style="display: inline-block;"> ArXiv </b></a> |
<a href="https://huggingface.co/datasets/iisc-aim/BMD-45"><img src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;"> Dataset </b></a>
</div>
<!-- Performance Graphic -->
<img width="95%" src="assets/Figure-3.png" alt="Performance on BMD-45">
<p><em>
Models trained on BMD-45 deliver up to <b>2.5x</b> performance improvement over UA-DETRAC baselines
</em></p>
</div>
# BMD-45 Vehicle Detection Models — AIM@IISc
High-quality object detection models built for **Indian road traffic** — where vehicle appearance, traffic density, and scene complexity differ significantly from Western datasets like COCO.
These models are trained on the **BMD-45 dataset**, featuring:
- 13 road-relevant vehicle categories
- Real urban environments across India
- Diverse viewpoints, lighting, occlusion & density variations
- Multi-user labeled data with consensus filtering (MV / ST variants)
We currently release **five SOTA detector variants** across four model families, all trained on the dataset:
| Model Family | Sizes | Strengths |
| ------------- | ----- | ------------------------------------------------------- |
| **YOLOv12** | S, X | Fast + lightweight deployment |
| **RT-DETRv2** | X | High-accuracy, transformer-based real-time detection |
| **RF-DETR** | X | Region-focused DETR with strong small-object detection |
| **D-FINE** | X | Fine-grained detection with iterative refinement |
> Designed for Indian mobility — adaptable to real city surveillance, roadside cameras, safety monitoring, and ITS applications.
Dataset: [iisc-aim/BMD-45](https://huggingface.co/datasets/iisc-aim/BMD-45)
---
## Attribution
<!-- More technical details about the dataset and models are available in our [Technical Report available on arXiv](https://arxiv.org/abs/2511.02563).
If you use these datasets or models, kindly cite the following:
**The BMD-45 Dataset and Models: Towards Image Annotations and Accurate Vision Models for Indian Traffic, Dataset Release, BMD-45-v1.0**,
Akash Sharma, Chinmay Mhatre, Sankalp Gawali, Ruthvik Bokkasam, Brij Kishore, Vishwajeet Pattanaik, Tarun Rambha, Abdul R. Pinjari, Vijay Kovvali, Anirban Chakraborty, Punit Rathore, Raghu Krishnapuram and Yogesh Simmhan,
*Technical Report, Indian Institute of Science*, [arXiv:2511.02563](https://arxiv.org/abs/2511.02563), Nov, 2025. -->
```bibtex
to be added
```
---
## Repository Structure
- **README.md** – This file
- **bmd_classes.txt** – 13 object classes (one per line)
- **configs/** – Model configuration files
  - **YOLOv12-S/**
    - `config.yaml` – Training hyperparameters
    - `data.yaml` – Dataset paths and class names
  - **YOLOv12-X/**
    - `config.yaml` – Training hyperparameters
    - `data.yaml` – Dataset paths and class names
  - **RT-DETRv2/**
    - `bmd-45-dataset.yaml` – Dataset configuration
    - `rtdetrv2_r101vd_6x_bmd-45.yaml` – Model + training configuration
  - **RF-DETR/**
    - `config.yaml` – Training hyperparameters
  - **D-FINE/**
    - `bmd-45-dataset.yaml` – Dataset configuration
    - `dfine_hgnetv2_x_bmd-45.yaml` – Model + training configuration
- **weights/** – Trained model weights
  - **YOLOv12-S/**`best.pt`
  - **YOLOv12-X/**`best.pt`
  - **RT-DETRv2/**`best.pth`
  - **RF-DETR/**`checkpoint_best_total.pth`
  - **D-FINE/**`best_stg1.pth`
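
As a convenience, the weight locations listed above can be captured in a small lookup. The following is a minimal sketch, not part of the repository: `WEIGHT_FILES` and `weight_path` are hypothetical names introduced here, and the code assumes the repository has been downloaded locally with the layout shown. It only builds paths; loading a checkpoint still requires the corresponding framework.

```python
from pathlib import Path

# Relative checkpoint paths, copied from the repository layout listed above.
WEIGHT_FILES = {
    "YOLOv12-S": "weights/YOLOv12-S/best.pt",
    "YOLOv12-X": "weights/YOLOv12-X/best.pt",
    "RT-DETRv2": "weights/RT-DETRv2/best.pth",
    "RF-DETR": "weights/RF-DETR/checkpoint_best_total.pth",
    "D-FINE": "weights/D-FINE/best_stg1.pth",
}


def weight_path(model: str, repo_root: str = ".") -> Path:
    """Return the checkpoint path for `model` (hypothetical helper).

    `repo_root` is assumed to point at a local copy of this repository.
    """
    try:
        relative = WEIGHT_FILES[model]
    except KeyError:
        known = ", ".join(sorted(WEIGHT_FILES))
        raise ValueError(f"Unknown model {model!r}; expected one of: {known}") from None
    return Path(repo_root) / relative


# Example: build the RF-DETR checkpoint path relative to the current directory.
print(weight_path("RF-DETR").as_posix())
```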
---
## Classes
The file `bmd_classes.txt` lists all **13 object categories**, one per line:
| ID | Class Name | Description |
| --- | --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| 1 | Hatchback | Small passenger cars without a protruding rear boot (“dickey”). |
| 2 | Sedan | Passenger cars with a low-slung design and a separate protruding rear boot (“dickey”). |
| 3 | SUV | Car-like vehicles with high ground clearance, a sturdy body, and no protruding boot. |
| 4 | MUV | Large vehicles with three seating rows, combining passenger and cargo functionality. |
| 5 | Bus | Large passenger vehicles used for public or private transport, including office shuttles and intercity buses. |
| 6 | Truck | Heavy goods carriers with a front cabin and a rear cargo compartment. |
| 7 | Three-wheeler | Compact vehicles with one front wheel and two rear wheels, featuring a covered passenger cabin. |
| 8 | Two-wheeler | Motorbikes and scooters for single or double riders. Bounding boxes include both vehicle and rider. |
| 9 | LCV | Lightweight goods carriers used for short- to medium-distance transport. |
| 10 | Mini-bus | Shorter, compact buses with fewer seats; larger than a Tempo Traveller, often featuring a flat front. |
| 11 | Tempo-traveller | Medium-sized passenger vans with tall roofs and side windows; larger than vans but smaller than minibuses, with a protruding front. |
| 12 | Bicycle | Non-motorized, manually pedalled vehicles including geared, non-geared, women’s, and children’s cycles. Bounding boxes include both vehicle and rider. |
| 13 | Van | Medium-sized vehicles for transporting goods or people, typically with a flat front and sliding side doors; smaller than Tempo Travellers. |
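
The class file can be turned into an ID-to-name mapping with a few lines of Python. This is a minimal sketch under two assumptions that the README does not confirm: that the line order in `bmd_classes.txt` matches the ID order in the table above, and that IDs start at 1 as in the table. Many detection frameworks index classes from 0, so the hypothetical `first_id` parameter lets you shift the numbering. The function names are illustrative, not part of the repository.

```python
from pathlib import Path


def parse_class_names(text: str, first_id: int = 1) -> dict[int, str]:
    """Map class IDs to names from one-name-per-line text, skipping blank lines."""
    names = [line.strip() for line in text.splitlines() if line.strip()]
    return {first_id + i: name for i, name in enumerate(names)}


def load_class_names(path: str = "bmd_classes.txt", first_id: int = 1) -> dict[int, str]:
    """Read the class file and return an ID-to-name mapping (assumed file order = table order)."""
    return parse_class_names(Path(path).read_text(encoding="utf-8"), first_id)


# Example with inline text; pass first_id=0 for zero-based framework indices.
print(parse_class_names("Hatchback\nSedan\nSUV\n"))
```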
---
## Training Hyperparameters and Architecture
_All models were trained on the BMD-45 dataset with the same batch size (16), epoch count (100), and optimizer (AdamW) for fair comparison; learning-rate and augmentation settings vary by model family, as listed below._
| Setting | YOLOv12-S | YOLOv12-X | RT-DETRv2-X | D-FINE-X | RF-DETR-X |
| --- | --- | --- | --- | --- | --- |
| **Batch Size** | 16 | 16 | 16 | 16 | 16 |
| **Epochs** | 100 | 100 | 100 | 100 | 100 |
| **Learning Rate** | 0.01 | 0.01 | 1×10⁻⁴ | 2.5×10⁻⁴ | 1×10⁻⁴ |
| **Optimizer** | AdamW | AdamW | AdamW | AdamW | AdamW |
| **Weight Decay** | 5×10⁻⁴ | 5×10⁻⁴ | 1×10⁻⁴ | 1.25×10⁻⁴ | 1×10⁻⁴ |
| **AdamW Betas** | (0.937, 0.999) | (0.937, 0.999) | (0.9, 0.999) | (0.9, 0.999) | (0.9, 0.999) |
| **LR Policy** | Cosine | Cosine | MultiStep | MultiStep | Step LR |
| **Warmup** | 3 epochs | 3 epochs | 2000-iteration linear warmup | 500-step linear warmup | None |
| **Warmup Details** | momentum=0.8; bias LR=0.1 | momentum=0.8; bias LR=0.1 | momentum untouched; uniform LR ramp | no bias/momentum overrides | warmup disabled |
| **Augmentation Summary** | HSV, translate=0.1, scale=0.5, flip=0.5, erase=0.4; no mosaic/mixup | HSV, translate=0.1, scale=0.5, flip=0.5, erase=0.4; no mosaic/mixup | Photometric, ZoomOut, IoU crop; ops disabled after epoch 151 | Photometric, ZoomOut, IoU crop, flip, sanitize, resize | Flip + multi-scale RandomResize/Crop + normalize |
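
To make the "Cosine" policy with a 3-epoch warmup from the YOLOv12 columns more concrete, here is a generic epoch-level sketch of linear warmup followed by cosine decay. It is an illustration only and not the training framework's actual scheduler: real implementations typically warm up per iteration and treat bias parameters separately (see the Warmup Details row). The base learning rate, epoch count, and warmup length come from the table; `final_fraction` is a hypothetical parameter, since the table does not state the final learning rate.

```python
import math


def lr_at_epoch(
    epoch: int,
    base_lr: float = 0.01,        # from the table (YOLOv12 columns)
    total_epochs: int = 100,      # from the table
    warmup_epochs: int = 3,       # from the table
    final_fraction: float = 0.0,  # assumption: not stated in the table
) -> float:
    """Learning rate for a zero-based epoch: linear warmup, then cosine decay."""
    if epoch < warmup_epochs:
        # Ramp linearly so the last warmup epoch reaches base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the post-warmup schedule completed, clamped to [0, 1].
    progress = min(1.0, (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return base_lr * (final_fraction + (1.0 - final_fraction) * cosine)


# Example: print the learning rate at a few epochs of the 100-epoch run.
for e in (0, 2, 3, 50, 99):
    print(e, lr_at_epoch(e))
```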
---
## License
- This repository (models, weights, configs) is released under the **Apache License 2.0**.
- _Note:_ The underlying YOLO-family models (e.g., YOLOv12) from Ultralytics are distributed under the **GNU AGPL v3.0** (or newer) license.
---