---
license: mit
library_name: peft
pipeline_tag: image-text-to-text
---
# D2-V2X: Depth-Driven Cooperative V2X Reasoning for Autonomous Driving

This repository contains the adapter weights for **D2-V2X**, a spatially aware Question-Rationale-Answer (QRA) framework for cooperative autonomous driving.

[**Paper (arXiv)**](https://arxiv.org/abs/2605.24098) | [**GitHub**](https://github.com/KevinRichard1/D2-V2X) | [**Dataset**](https://huggingface.co/datasets/kr301/d2v2x-qra)

## Overview

D2-V2X addresses the sensor occlusions that limit single-vehicle Vision-Language Models (VLMs) by establishing a benchmark for cooperative reasoning over multimodal vehicle and infrastructure (V2X) sensors. It also provides a baseline that aligns 3D LiDAR features with the VLM's latent space and enforces Chain-of-Thought (CoT) rationales so that spatial relations are articulated explicitly.
## Usage

For environment setup and data preparation, please refer to the [official GitHub repository](https://github.com/KevinRichard1/D2-V2X).

### Training

To train the model using the provided pipeline:

```bash
python train.py \
    --qwen_path="/path/to/qwen/model" \
    --train_path="/path/to/train/dataset" \
    --val_path="/path/to/val/dataset" \
    --img_path="/path/to/images" \
    --train_feature_path="/path/to/train/lidar/features" \
    --val_feature_path="/path/to/val/lidar/features" \
    --output_path="/checkpoint/path" \
    --mode="" \
    --stage="" \
    --lr=2e-5 \
    --epochs=3 \
    --batch_size=1 \
    --accum_steps=64
```
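The training command pairs `--batch_size=1` with `--accum_steps=64`. Assuming `--accum_steps` denotes gradient accumulation steps (the usual meaning, though this card does not confirm it), the following minimal sketch shows how the two values combine into an effective batch size and a number of optimizer updates per epoch. The helper names and the dataset size are hypothetical and are not part of the D2-V2X codebase.

```python
import math


def effective_batch_size(batch_size: int, accum_steps: int) -> int:
    """Samples contributing to one optimizer update on a single device.

    Assumes gradients are accumulated over `accum_steps` micro-batches
    before each optimizer step.
    """
    return batch_size * accum_steps


def optimizer_steps_per_epoch(num_samples: int, batch_size: int, accum_steps: int) -> int:
    """Optimizer updates per epoch.

    Assumes a final, partially filled accumulation window still triggers a step.
    """
    micro_batches = math.ceil(num_samples / batch_size)
    return math.ceil(micro_batches / accum_steps)


if __name__ == "__main__":
    # batch_size and accum_steps are taken from the training command above.
    print(effective_batch_size(batch_size=1, accum_steps=64))  # 64
    # 10_000 is a hypothetical dataset size used only for illustration.
    print(optimizer_steps_per_epoch(10_000, batch_size=1, accum_steps=64))  # 157
```

Under that assumption, lowering `--accum_steps` while raising `--batch_size` by the same factor keeps the effective batch size unchanged, which may help when more GPU memory is available.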
### Evaluation

To evaluate the model:

```bash
python evaluate.py \
    --qwen_path="/path/to/qwen/model" \
    --checkpoint_path="/checkpoint/path" \
    --inference \
    --evaluate \
    --mode="" \
    --json_path="/path/to/test/dataset" \
    --img_path="/path/to/images" \
    --test_feature_path="/path/to/test/lidar/features" \
    --inference_save_path="results.json"
```
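When several checkpoints need to be evaluated, it can be convenient to assemble the command above programmatically. The sketch below is one possible way to do that in Python; `build_eval_command` is a hypothetical helper, not part of the repository, and it only reuses the flags shown in the command above. The paths are the same placeholders, and `mode` is left blank as in the original command.

```python
import shlex


def build_eval_command(options: dict, flags: tuple = ("inference", "evaluate")) -> list:
    """Assemble the evaluate.py argument list.

    `options` maps flag names to values (emitted as --name=value);
    `flags` lists bare switches (emitted as --name).
    """
    command = ["python", "evaluate.py"]
    for name, value in options.items():
        command.append(f"--{name}={value}")
    for flag in flags:
        command.append(f"--{flag}")
    return command


if __name__ == "__main__":
    options = {
        "qwen_path": "/path/to/qwen/model",
        "checkpoint_path": "/checkpoint/path",
        "mode": "",  # left blank, as in the command above
        "json_path": "/path/to/test/dataset",
        "img_path": "/path/to/images",
        "test_feature_path": "/path/to/test/lidar/features",
        "inference_save_path": "results.json",
    }
    command = build_eval_command(options)
    # Print a shell-safe rendering for inspection.
    print(shlex.join(command))
    # To launch it, pass the list to subprocess.run(command, check=True).
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.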
## Citation

If you find this work useful, please cite:

```bibtex
@misc{richard2026d2v2xdepthdrivencooperativev2x,
  title={D2-V2X: Depth-Driven Cooperative V2X Reasoning for Autonomous Driving},
  author={Kevin Richard and Alphin Varghese and Colin Pham and David Oh and Srijan Das},
  year={2026},
  eprint={2605.24098},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2605.24098},
}
```