Add model card and metadata for D2-V2X (#1)
- Add model card and metadata for D2-V2X (3f2b6720a87ed2ae486e13b2f5c267d602d47d5a)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md CHANGED
@@ -1,3 +1,66 @@
---
license: mit
library_name: peft
pipeline_tag: image-text-to-text
---

# D2-V2X: Depth-Driven Cooperative V2X Reasoning for Autonomous Driving

This repository contains the adapter weights for **D2-V2X**, a spatially-aware Question-Rationale-Answer (QRA) framework designed for cooperative autonomous driving.

[**Paper (arXiv)**](https://arxiv.org/abs/2605.24098) | [**GitHub**](https://github.com/KevinRichard1/D2-V2X) | [**Dataset**](https://huggingface.co/datasets/kr301/d2v2x-qra)

## Overview

D2-V2X addresses the sensor occlusions that limit single-vehicle Vision-Language Models (VLMs) by establishing a benchmark for cooperative reasoning over multimodal vehicle and infrastructure (V2X) sensors. Its baseline aligns 3D LiDAR features with the VLM's latent space and enforces Chain-of-Thought (CoT) rationales that articulate spatial relations explicitly.

## Usage

For environment setup and data preparation, refer to the [official GitHub repository](https://github.com/KevinRichard1/D2-V2X).

### Training

To train the model using the provided pipeline:

```bash
# Replace the placeholder paths before running. --mode and --stage are left
# blank in this card; consult the GitHub repository for valid values.
python train.py \
  --qwen_path="/path/to/qwen/model" \
  --train_path="/path/to/train/dataset" \
  --val_path="/path/to/val/dataset" \
  --img_path="/path/to/images" \
  --train_feature_path="/path/to/train/lidar/features" \
  --val_feature_path="/path/to/val/lidar/features" \
  --output_path="/checkpoint/path" \
  --mode="" \
  --stage="" \
  --lr=2e-5 \
  --epochs=3 \
  --batch_size=1 \
  --accum_steps=64
```
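
The `--batch_size` and `--accum_steps` flags in the training command interact. Assuming `--accum_steps` denotes gradient accumulation steps (the card does not state this explicitly), a batch size of 1 accumulated over 64 steps gives an effective batch of 64 samples per optimizer update. The sketch below works through that arithmetic. The function names and `num_samples` are hypothetical, chosen only for illustration, and `train.py` may treat a trailing partial accumulation window differently.

```python
import math


def effective_batch_size(batch_size: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples that contribute to one optimizer update under gradient accumulation."""
    return batch_size * accum_steps * num_devices


def optimizer_steps(num_samples: int, batch_size: int, accum_steps: int, epochs: int) -> int:
    """Optimizer updates over a run.

    Assumption: a trailing partial accumulation window still triggers an update.
    """
    micro_batches = math.ceil(num_samples / batch_size)
    return math.ceil(micro_batches / accum_steps) * epochs


# Values taken from the training command above.
eff = effective_batch_size(batch_size=1, accum_steps=64)

# Hypothetical dataset size, for illustration only.
steps = optimizer_steps(num_samples=10_000, batch_size=1, accum_steps=64, epochs=3)

print(f"effective batch size: {eff}, optimizer steps: {steps}")
```

Under these assumptions, the learning rate of `2e-5` applies to updates averaged over 64 samples rather than over single samples, which matters when changing either flag.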

### Evaluation

To evaluate the model:

```bash
# Replace the placeholder paths before running. --mode is left blank in this
# card; consult the GitHub repository for valid values.
python evaluate.py \
  --qwen_path="/path/to/qwen/model" \
  --checkpoint_path="/checkpoint/path" \
  --inference \
  --evaluate \
  --mode="" \
  --json_path="/path/to/test/dataset" \
  --img_path="/path/to/images" \
  --test_feature_path="/path/to/test/lidar/features" \
  --inference_save_path="results.json"
```
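
When evaluating several checkpoints, it can be convenient to assemble the command programmatically. The following is a minimal sketch that uses only the flags shown in the evaluation command. The helper `build_eval_command` and its argument names are hypothetical and not part of the repository, the paths are the same placeholders, and the sketch assumes that `evaluate.py` does not depend on the order of its flags.

```python
import shlex


def build_eval_command(options: dict, switches: list) -> list:
    """Assemble an argv list for evaluate.py from valued options and bare switches."""
    argv = ["python", "evaluate.py"]
    for name, value in options.items():
        argv.append(f"--{name}={value}")
    for name in switches:
        argv.append(f"--{name}")
    return argv


# Placeholder values copied from the evaluation command above.
options = {
    "qwen_path": "/path/to/qwen/model",
    "checkpoint_path": "/checkpoint/path",
    "mode": "",  # left blank, as in the card
    "json_path": "/path/to/test/dataset",
    "img_path": "/path/to/images",
    "test_feature_path": "/path/to/test/lidar/features",
    "inference_save_path": "results.json",
}

argv = build_eval_command(options, switches=["inference", "evaluate"])

# Print a shell-quoted form for inspection; passing argv to
# subprocess.run(argv, check=True) would launch the script.
print(shlex.join(argv))
```

Passing a list rather than a single string avoids shell-quoting problems when paths contain spaces.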

## Citation

If you find this work useful, please cite:

```bibtex
@misc{richard2026d2v2xdepthdrivencooperativev2x,
  title={D2-V2X: Depth-Driven Cooperative V2X Reasoning for Autonomous Driving},
  author={Kevin Richard and Alphin Varghese and Colin Pham and David Oh and Srijan Das},
  year={2026},
  eprint={2605.24098},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2605.24098},
}
```