| --- |
| license: apache-2.0 |
| language: en |
| tags: |
| - object-detection |
| - yolo |
| - yolo12 |
| - rtdetr |
| - rfdetr |
| - pytorch |
| - urban-traffic |
| datasets: |
| - BMD-45 |
| pipeline_tag: object-detection |
| pretty_name: BMD-45 |
| --- |
| |
| <div align="center"> |
|
|
| <!-- Banner Image --> |
| <img width="50%" src="banner.png" alt="BMD-45 Banner"> |
|
|
| <div align="center"> |
| <a href="https://arxiv.org/abs/2511.02563" ><img src="assets/arxiv-logomark-small.svg" height="16" width="11.96" style="display: inline-block; vertical-align: middle; margin: 2px;"> <b style="display: inline-block;"> ArXiv </b></a> | |
| <a href="https://huggingface.co/datasets/iisc-aim/BMD-45"><img src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" height="16" width="16" style="display: inline-block; vertical-align: middle; margin: 2px;"><b style="display: inline-block;"> Dataset </b></a> |
| </div> |
| |
| <!-- Performance Graphic --> |
| <img width="95%" src="assets/Figure-3.png" alt="Performance on BMD-45"> |
|
|
| <p><em> |
| Models trained on BMD-45 deliver up to <b>2.5x</b> performance improvement over UA-DETRAC baselines |
| </em></p> |
| |
| </div> |
|
|
| # BMD-45 Vehicle Detection Models — AIM@IISc |
|
|
| High-quality object detection models built for **Indian road traffic** — where vehicle appearance, traffic density, and scene complexity differ significantly from Western datasets like COCO. |
|
|
| These models are trained on the **BMD-45 dataset**, featuring: |
|
|
| - 13 road-relevant vehicle categories |
| - Real urban environments across India |
| - Diverse viewpoints, lighting, occlusion & density variations |
| - Multi-user labeled data with consensus filtering (MV / ST variants) |
|
|
| We currently release **five state-of-the-art detector variants** trained on the dataset: |
|
|
| | Model Family | Sizes | Strengths | |
| | ------------- | ----- | ------------------------------------------------------- | |
| | **YOLOv12** | S, X | Fast + lightweight deployment | |
| | **RT-DETRv2** | X | High-accuracy, transformer-based real-time detection | |
| **RF-DETR** | X | Roboflow's real-time DETR with strong small-object detection | |
| | **D-FINE** | X | Fine-grained detection with iterative refinement | |
|
|
| > Designed for Indian mobility — adaptable to real city surveillance, roadside cameras, safety monitoring, and ITS applications. |
|
|
| Training dataset: [https://huggingface.co/datasets/iisc-aim/BMD-45](https://huggingface.co/datasets/iisc-aim/BMD-45) |
|
|
| --- |
|
|
| ## Attribution |
|
|
| <!-- More technical details about the dataset and models are available in our [Technical Report available on arXiv](https://arxiv.org/abs/2511.02563). |
| If you use these datasets or models, kindly cite the following: |
| **The BMD-45 Dataset and Models: Towards Image Annotations and Accurate Vision Models for Indian Traffic, Dataset Release, BMD-45-v1.0**, |
| Akash Sharma, Chinmay Mhatre, Sankalp Gawali, Ruthvik Bokkasam, Brij Kishore, Vishwajeet Pattanaik, Tarun Rambha, Abdul R. Pinjari, Vijay Kovvali, Anirban Chakraborty, Punit Rathore, Raghu Krishnapuram and Yogesh Simmhan, |
| *Technical Report, Indian Institute of Science*, [arXiv:2511.02563](https://arxiv.org/abs/2511.02563), Nov, 2025. --> |
|
|
|
|
| ```bibtex |
| to be added |
| ``` |
|
|
| --- |
|
|
| ## Repository Structure |
|
|
| - **README.md** – This file |
| - **bmd_classes.txt** – 13 object classes (one per line) |
| - **configs/** – Model configuration files |
|   - **YOLOv12-S/** |
|     - `config.yaml` – Training hyperparameters |
|     - `data.yaml` – Dataset paths and class names |
|   - **YOLOv12-X/** |
|     - `config.yaml` – Training hyperparameters |
|     - `data.yaml` – Dataset paths and class names |
|   - **RT-DETRv2/** |
|     - `bmd-45-dataset.yaml` – Dataset configuration |
|     - `rtdetrv2_r101vd_6x_bmd-45.yaml` – Model + training configuration |
|   - **RF-DETR/** |
|     - `config.yaml` – Training hyperparameters |
|   - **D-FINE/** |
|     - `bmd-45-dataset.yaml` – Dataset configuration |
|     - `dfine_hgnetv2_x_bmd-45.yaml` – Model + training configuration |
| - **weights/** – Trained model weights |
|   - **YOLOv12-S/** – `best.pt` |
|   - **YOLOv12-X/** – `best.pt` |
|   - **RT-DETRv2/** – `best.pth` |
|   - **RF-DETR/** – `checkpoint_best_total.pth` |
|   - **D-FINE/** – `best_stg1.pth` |
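
The layout above can be turned into a small lookup for locating checkpoints after cloning or downloading the repository. This is a minimal sketch using only the standard library; the helper names `weight_path` and `missing_weights` are hypothetical and not part of this repository, and the default `repo_root` of `"."` assumes the script runs from the repository root.

```python
from pathlib import Path

# Checkpoint file name for each released model, as listed under weights/.
WEIGHT_FILES = {
    "YOLOv12-S": "best.pt",
    "YOLOv12-X": "best.pt",
    "RT-DETRv2": "best.pth",
    "RF-DETR": "checkpoint_best_total.pth",
    "D-FINE": "best_stg1.pth",
}


def weight_path(model: str, repo_root: str = ".") -> Path:
    """Return the expected checkpoint path for `model` (hypothetical helper)."""
    if model not in WEIGHT_FILES:
        raise KeyError(f"unknown model {model!r}; expected one of {sorted(WEIGHT_FILES)}")
    return Path(repo_root) / "weights" / model / WEIGHT_FILES[model]


def missing_weights(repo_root: str = ".") -> list:
    """List the models whose checkpoint file is not present under repo_root."""
    return [m for m in WEIGHT_FILES if not weight_path(m, repo_root).is_file()]
```

Calling `missing_weights()` before inference is a quick way to confirm that a download completed; loading each checkpoint is then done with the corresponding framework's own tooling.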
|
|
|
|
| --- |
|
|
| ## Classes |
|
|
| The file `bmd_classes.txt` lists all **13 object categories**, one per line: |
|
|
| | ID | Class Name | Description | |
| | --- | --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | |
| | 1 | Hatchback | Small passenger cars without a protruding rear boot (“dickey”). | |
| | 2 | Sedan | Passenger cars with a low-slung design and a separate protruding rear boot (“dickey”). | |
| | 3 | SUV | Car-like vehicles with high ground clearance, a sturdy body, and no protruding boot. | |
| | 4 | MUV | Large vehicles with three seating rows, combining passenger and cargo functionality. | |
| | 5 | Bus | Large passenger vehicles used for public or private transport, including office shuttles and intercity buses. | |
| | 6 | Truck | Heavy goods carriers with a front cabin and a rear cargo compartment. | |
| | 7 | Three-wheeler | Compact vehicles with one front wheel and two rear wheels, featuring a covered passenger cabin. | |
| | 8 | Two-wheeler | Motorbikes and scooters for single or double riders. Bounding boxes include both vehicle and rider. | |
| | 9 | LCV | Lightweight goods carriers used for short- to medium-distance transport. | |
| | 10 | Mini-bus | Shorter, compact buses with fewer seats; larger than a Tempo Traveller, often featuring a flat front. | |
| | 11 | Tempo-traveller | Medium-sized passenger vans with tall roofs and side windows; larger than vans but smaller than minibuses, with a protruding front. | |
| | 12 | Bicycle | Non-motorized, manually pedalled vehicles including geared, non-geared, women’s, and children’s cycles. Bounding boxes include both vehicle and rider. | |
| | 13 | Van | Medium-sized vehicles for transporting goods or people, typically with a flat front and sliding side doors; smaller than Tempo Travellers. | |
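
For post-processing predictions, the table above can be mirrored as an ID-to-name mapping. The sketch below assumes that the order of lines in `bmd_classes.txt` matches the table's ID order and that the spellings are identical; both are assumptions to verify against the file itself. `load_classes` is a hypothetical helper, not part of this repository. Many detection frameworks index classes from 0, so subtract 1 from the table ID where that applies.

```python
from pathlib import Path

# Class names in table order; IDs are 1-based, matching the table above.
BMD_CLASSES = [
    "Hatchback", "Sedan", "SUV", "MUV", "Bus", "Truck",
    "Three-wheeler", "Two-wheeler", "LCV", "Mini-bus",
    "Tempo-traveller", "Bicycle", "Van",
]

# 1-based lookup, e.g. ID_TO_NAME[5] is "Bus".
ID_TO_NAME = {i: name for i, name in enumerate(BMD_CLASSES, start=1)}


def load_classes(path):
    """Read one class name per line from a text file, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```

Comparing `load_classes("bmd_classes.txt")` with `BMD_CLASSES` is a cheap sanity check before mapping model outputs to labels.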
|
|
| --- |
|
|
| ## Training Hyperparameters and Architecture |
|
|
| _All models were trained on the BMD-45 dataset with the same batch size (16) and epoch budget (100); optimizer settings, LR schedules, and augmentations follow each framework's configuration, as listed in the table._ |
|
|
| | Setting | YOLOv12-S | YOLOv12-X | RT-DETRv2-X | D-FINE-X | RF-DETR-X | |
| | --- | --- | --- | --- | --- | --- | |
| | **Batch Size** | 16 | 16 | 16 | 16 | 16 | |
| | **Epochs** | 100 | 100 | 100 | 100 | 100 | |
| | **Learning Rate** | 0.01 | 0.01 | 1×10⁻⁴ | 2.5×10⁻⁴ | 1×10⁻⁴ | |
| | **Optimizer** | AdamW | AdamW | AdamW | AdamW | AdamW | |
| | **Weight Decay** | 5×10⁻⁴ | 5×10⁻⁴ | 1×10⁻⁴ | 1.25×10⁻⁴ | 1×10⁻⁴ | |
| | **AdamW Betas** | (0.937, 0.999) | (0.937, 0.999) | (0.9, 0.999) | (0.9, 0.999) | (0.9, 0.999) | |
| | **LR Policy** | Cosine | Cosine | MultiStep | MultiStep | Step LR | |
| | **Warmup** | 3 epochs | 3 epochs | 2000-iteration linear warmup | 500-step linear warmup | None | |
| | **Warmup Details** | momentum=0.8; bias LR=0.1 | momentum=0.8; bias LR=0.1 | momentum untouched; uniform LR ramp | no bias/momentum overrides | warmup disabled | |
| | **Augmentation Summary** | HSV, translate=0.1, scale=0.5, flip=0.5, erase=0.4; no mosaic/mixup | HSV, translate=0.1, scale=0.5, flip=0.5, erase=0.4; no mosaic/mixup | Photometric, ZoomOut, IoU crop; ops disabled after epoch 151 | Photometric, ZoomOut, IoU crop, flip, sanitize, resize | Flip + multi-scale RandomResize/Crop + normalize | |
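
To illustrate the YOLOv12 columns (base LR 0.01, 3 warmup epochs, cosine policy, 100 epochs), here is a simplified sketch of a linear-warmup plus cosine-decay schedule. It is not the Ultralytics implementation: it ignores the separate bias-LR and momentum warmup listed in the table, and `final_fraction` (the ratio of final LR to base LR) is a hypothetical value that the table does not state.

```python
import math


def yolo_style_lr(epoch: float, base_lr: float = 0.01, epochs: int = 100,
                  warmup_epochs: float = 3.0, final_fraction: float = 0.01) -> float:
    """Linear warmup followed by cosine decay (simplified sketch).

    final_fraction is an assumed value; the table does not specify it.
    """
    if epoch < warmup_epochs:
        # Ramp linearly from 0 up to base_lr over the warmup period.
        return base_lr * epoch / warmup_epochs
    # Cosine decay from base_lr (end of warmup) to base_lr * final_fraction.
    progress = (epoch - warmup_epochs) / (epochs - warmup_epochs)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return base_lr * (final_fraction + (1.0 - final_fraction) * cosine)
```

Under these assumptions the rate peaks at the base value when warmup ends and decays smoothly to `base_lr * final_fraction` at the last epoch. The DETR-family columns use step-based schedules instead and would need a different function.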
|
|
| --- |
|
|
| ## License |
|
|
| - This repository (models, weights, configs) is released under the **Apache License 2.0**. |
| - _Note:_ The underlying YOLO-family models (e.g., YOLOv12) from Ultralytics are distributed under the **GNU AGPL v3.0** (or newer) license. |
|
|
| --- |