---
license: apache-2.0
gated: true
tags:
- video-classification
- collision-prediction
- ego-centric
- autonomous-driving
- adas
- dashcam
- v-jepa2
library_name: pytorch
pipeline_tag: video-classification
extra_gated_fields:
  First Name: text
  Last Name: text
  Title: text
  Company Email: text
  Company: text
  Topic of Interest:
    type: select
    options:
    - Research
    - Commercial
    - Education
    - Other
  I agree to use this model for lawful purposes only and acknowledge the terms and conditions: checkbox
extra_gated_prompt: >-
  By accessing this model, you agree to use it responsibly and in compliance
  with applicable laws. This model is intended for research and development
  purposes in driver assistance systems.
datasets:
- nexar-ai/nexar_collision_prediction
base_model:
- facebook/vjepa2-vitl-fpc16-256-ssv2
---

# BADAS-Open: Ego-Centric Collision Prediction Model

<div align="center">
<img src="https://img.shields.io/badge/State--of--the--Art-Collision%20Prediction-blue" alt="SOTA">
<img src="https://img.shields.io/badge/V--JEPA2-Foundation%20Model-green" alt="V-JEPA2">
<img src="https://img.shields.io/badge/License-Apache%202.0-orange" alt="License">
</div>

## Overview

**BADAS-Open** (V-JEPA2 Based Advanced Driver Assistance System) is a state-of-the-art collision prediction model specifically designed for **ego-centric** threat detection in real-world driving scenarios. Unlike traditional methods that detect any visible accident, BADAS focuses exclusively on collisions and near-misses that directly threaten the recording vehicle, dramatically reducing false alarms in real-world deployment.

<img src="assets/BADAS_Demo.gif" alt="BADAS Collision Prediction Example" width="750">

### Key Features

- **Ego-Centric Focus**: Distinguishes threats to the ego vehicle from accidents that do not involve it
- **Real-World Trained**: Trained on 1,500 real dashcam videos from actual driving scenarios
- **State-of-the-Art Performance**: Outperforms academic methods and commercial ADAS
- **Foundation Model Based**: Built on V-JEPA2 for superior temporal understanding
- **Near-Miss Learning**: Includes emergency maneuver scenarios for richer training signals

## Performance

BADAS-Open achieves state-of-the-art results on four benchmarks when evaluated on ego-vehicle-involved collisions:

<p align="center">
<img src="assets/performance.png" alt="Performance Comparison" width="600">
</p>

| Dataset | AP ↑ | AUC ↑ | mTTA (s) | Test Size |
|---------|------|-------|----------|-----------|
| **Nexar** | **0.86** | **0.88** | 4.9 | 1,344 |
| **DoTA** | **0.94** | **0.70** | 4.0 | 367 |
| **DADA-2000** | **0.87** | **0.77** | 4.3 | 113 |
| **DAD** | **0.66** | **0.87** | 2.7 | 116 |

*For comparison, the baseline methods (DSTA, UString) reach AP scores of 0.06-0.53.*

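The mTTA column is commonly read as a mean time-to-accident: how many seconds before the collision the model first raises an alert, averaged over videos. The sketch below illustrates that idea with a single fixed threshold. The function names, the 0.5 threshold, and the handling of videos without an alert are assumptions made for illustration; the evaluation protocol behind the table may differ.

```python
from typing import Optional, Sequence


def time_to_accident(probs: Sequence[float], accident_frame: int,
                     fps: float, threshold: float = 0.5) -> Optional[float]:
    """Seconds between the first alert and the accident, or None if no alert fired in time."""
    # Only frames up to and including the accident can produce a useful alert.
    for frame_idx, prob in enumerate(probs[:accident_frame + 1]):
        if prob >= threshold:
            return (accident_frame - frame_idx) / fps
    return None


def mean_time_to_accident(ttas: Sequence[Optional[float]]) -> float:
    """Average TTA over the videos where an alert was raised before the accident."""
    valid = [t for t in ttas if t is not None]
    return sum(valid) / len(valid) if valid else 0.0


# Toy example: 10 fps clip, accident at frame 50, first alert at frame 20.
probs = [0.1] * 20 + [0.9] * 40
print(time_to_accident(probs, accident_frame=50, fps=10.0))  # 3.0
```

A larger value means an earlier warning, but it is only meaningful alongside AP and AUC, since a model that alerts constantly would also score a long time-to-accident.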
### What Makes This Different?

Traditional collision prediction models are trained to detect **any accident in the camera's view**, leading to excessive false alarms from irrelevant incidents (e.g., accidents in adjacent lanes). BADAS solves this by:

1. **Ego-Centric Reformulation**: Only predicting collisions that directly threaten the ego vehicle
2. **Real-World Data**: Trained on actual dashcam footage, not synthetic or staged scenarios
3. **Consensus-Based Timing**: Alert times validated by 10 certified defensive driving experts
4. **Near-Miss Inclusion**: Learning from successfully avoided dangerous situations

<p align="center">
<img src="assets/example.png" alt="Prediction Example" width="750">
<br>
<em>Example: BADAS prediction on real dashcam footage</em>
</p>

## Quick Start

### Installation
```bash
pip install torch torchvision transformers huggingface_hub opencv-python numpy pillow albumentations
```

### Basic Usage
```python
import os
import sys

from huggingface_hub import hf_hub_download

# The repo is gated: request access on the Hub and authenticate first
# (for example with `huggingface-cli login`).

# Download the model loader script
loader_path = hf_hub_download(
    repo_id="nexar-ai/badas-open",
    filename="badas_loader.py"
)
# Make the downloaded script importable
sys.path.insert(0, os.path.dirname(loader_path))

# Load model
from badas_loader import load_badas_model
model = load_badas_model()

# Predict on video
predictions = model.predict("dashcam_video.mp4")

# Get collision probability over time
for frame_idx, prob in enumerate(predictions):
    if prob > 0.8:  # High risk threshold
        print(f"⚠️ Collision risk at frame {frame_idx}: {prob:.2%}")
```

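Printing every frame above the threshold can be noisy when the probability flickers. As a minimal sketch, assuming `predictions` behaves like a sequence of per-frame floats as in the loop above, the hypothetical helper below groups consecutive high-risk frames into time intervals. The `alert_intervals` name, the `min_frames` debounce, and the frame rate are illustrative assumptions and not part of `badas_loader`.

```python
from typing import List, Sequence, Tuple


def alert_intervals(probs: Sequence[float], fps: float,
                    threshold: float = 0.8, min_frames: int = 3
                    ) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) spans where prob stays above threshold for >= min_frames.

    end_s is exclusive: it is the time of the first frame back below the threshold.
    """
    intervals = []
    start = None  # index of the first frame of the current high-risk run
    for idx, prob in enumerate(probs):
        if prob > threshold:
            if start is None:
                start = idx
        elif start is not None:
            # The run just ended; keep it only if it was long enough.
            if idx - start >= min_frames:
                intervals.append((start / fps, idx / fps))
            start = None
    # Handle a run that lasts until the end of the clip.
    if start is not None and len(probs) - start >= min_frames:
        intervals.append((start / fps, len(probs) / fps))
    return intervals


probs = [0.1, 0.9, 0.2, 0.85, 0.9, 0.95, 0.9, 0.3]
print(alert_intervals(probs, fps=2.0))  # [(1.5, 3.5)]
```

The single spike at index 1 is dropped because it lasts fewer than `min_frames` frames, while the four-frame run starting at index 3 is reported as one interval.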
## Model Architecture
```
Input Video (16 frames @ 256×256)
    ↓
V-JEPA2 Encoder (ViT-L)
    ↓ (2048 patches × 1024 dim)
Attentive Probe Aggregation (12 queries × 64 dim)
    ↓
3-Layer MLP Prediction Head
    ↓
Collision Probability [0, 1]
```

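The figure of 2048 patches in the diagram can be reproduced with simple arithmetic. The sketch below assumes the encoder splits the clip into tubelets of 2 frames × 16 × 16 pixels, which is the usual V-JEPA2 ViT-L setting but is not stated in this card; `num_video_tokens` is a hypothetical helper name.

```python
def num_video_tokens(frames: int = 16, height: int = 256, width: int = 256,
                     tubelet: int = 2, patch: int = 16) -> int:
    """Number of spatio-temporal patches for one clip (assumed 2x16x16 tubelets)."""
    return (frames // tubelet) * (height // patch) * (width // patch)


tokens = num_video_tokens()
print(tokens)         # 2048
print(tokens * 1024)  # 2097152 feature values per clip at 1024 dims per patch
```

Under these assumptions, 8 temporal positions times a 16 × 16 spatial grid gives the 2048 tokens that the attentive probe then aggregates.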
<p align="center">
<img src="assets/arch.png" alt="BADAS Architecture" width="650">
<br>
<em>BADAS architecture: V-JEPA2 backbone with attentive probe aggregation and MLP head</em>
</p>

**Key Components:**
- **Backbone**: V-JEPA2 (Vision Joint-Embedding Predictive Architecture v2)
- **Patch Aggregation**: Attentive probe with 12 learned queries
- **Head**: 3-layer MLP with GELU activation, LayerNorm, dropout 0.1
- **Training**: End-to-end fine-tuning on Nexar dataset (1.5k videos)

## Requirements

- Python ≥ 3.8
- PyTorch ≥ 2.0
- torchvision ≥ 0.15
- transformers ≥ 4.30
- opencv-python ≥ 4.8
- huggingface_hub ≥ 0.16

## Model Files
```
badas-open/
├── model.safetensors      # Model weights
├── config.json            # Model configuration
├── badas_loader.py        # Loading utilities
├── preprocessing.py       # Video preprocessing
└── README.md              # This file
```

## Training Details

- **Dataset**: Nexar Dashcam Collision Prediction Dataset (1,500 videos)
  - 750 collision/near-miss events
  - 750 normal driving samples
- **Optimizer**: AdamW (lr=1e-5, weight decay=1e-4)
- **Loss**: Binary Cross-Entropy
- **Augmentations**: Weather simulation (rain, snow), lighting variations, motion blur
- **Training Time**: ~8 hours on 4× A100 GPUs

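
To make the loss term concrete, here is a minimal pure-Python sketch of binary cross-entropy. It assumes one probability per clip and a label of 1 for collision/near-miss clips and 0 for normal driving; that labeling convention and the `binary_cross_entropy` helper are illustrative assumptions, not the actual training code.

```python
import math
from typing import Sequence


def binary_cross_entropy(probs: Sequence[float], labels: Sequence[int],
                         eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


# One collision clip predicted at 0.9 and one normal clip predicted at 0.2.
loss = binary_cross_entropy([0.9, 0.2], [1, 0])
print(round(loss, 4))  # 0.1643
```

Confident correct predictions contribute little to the loss, while a confident wrong prediction is penalized heavily.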
## Use Cases

### ✅ Recommended Uses
- Driver assistance system research
- Autonomous vehicle safety testing
- Dashcam collision warning applications
- Traffic safety analysis
- Dataset for training improved models

### ⚠️ Limitations
- **Long-tail events**: Performance drops on rare categories (animals, motorcycles)
- **Monocular only**: Requires single-camera dashcam input
- **Processing delay**: 16-frame buffer introduces latency
- **Not for critical safety**: Research model, not certified for production deployment

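
The latency from the 16-frame buffer depends on how frames are sampled. As a rough sketch, the hypothetical helper below estimates how much video must be captured before the first prediction is possible; the 30 fps camera rate and the sampling stride are assumptions, since this card does not state them.

```python
def buffer_latency_s(num_frames: int = 16, fps: float = 30.0, stride: int = 1) -> float:
    """Seconds of video that must be captured before the first prediction."""
    # Frames spanned from the first to the last sampled frame, inclusive.
    span = (num_frames - 1) * stride + 1
    return span / fps


print(round(buffer_latency_s(), 3))          # 0.533 (16 consecutive frames at 30 fps)
print(round(buffer_latency_s(stride=4), 3))  # 2.033 (every 4th frame at 30 fps)
```

This covers only the buffering delay; model inference time adds to it and depends on the hardware.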
## Citation

If you use this model in your research, please cite:
```bibtex
@article{goldshmidt2025badas,
  title={BADAS: Context Aware Collision Prediction Using Real-World Dashcam Data},
  author={Goldshmidt, Roni and Scott, Hamish and Niccolini, Lorenzo and
          Zhu, Shizhan and Moura, Daniel and Zvitia, Orly},
  journal={arXiv preprint},
  year={2025}
}
```

## Resources

- **Paper**: [arXiv](https://arxiv.org) *(coming soon)*
- **Dataset**: [Nexar Collision Prediction Dataset](https://huggingface.co/datasets/nexar-ai/nexar_collision_prediction)
- **Challenge**: [Kaggle Competition](https://www.kaggle.com/competitions/nexar-collision-prediction)
- **Website**: [www.nexar-ai.com](https://www.nexar-ai.com)
- **Code**: [GitHub Repository](https://github.com/getnexar/BADAS-Open)

## Model Variants

- **BADAS-Open** (this model): Trained on 1.5k public videos
- **BADAS-Pro**: Commercial variant trained on 40k proprietary videos
  - Higher performance (AP: 0.91 on Nexar)
  - Better edge-case handling
  - Contact Nexar for licensing

## License & Terms

This model is released under the **Apache 2.0 License** with the following conditions:

1. ✅ Free for research and commercial use
2. ✅ Modification and redistribution permitted
3. ⚠️ No warranty provided; use at your own risk
4. ⚠️ Not certified for safety-critical applications
5. Attribution must be provided when using the model

**Responsible AI Notice**: This model is intended to assist human drivers, not replace them. Always maintain full attention while driving.

## Acknowledgments

- V-JEPA2 foundation model by Meta AI Research
- Nexar's driver community for dataset contribution
- Re-annotations of DAD, DADA-2000, and DoTA benchmarks

---

<div align="center">
<b>Questions or Issues?</b><br>
Open an issue on our <a href="https://github.com/nexar-ai/badas/issues">GitHub</a> or contact us at research@nexar.com
</div>