license: mit
library_name: peft
pipeline_tag: image-text-to-text
D2-V2X: Depth-Driven Cooperative V2X Reasoning for Autonomous Driving
This repository contains the model weights (adapter) for D2-V2X, a spatially-aware Question-Rationale-Answer (QRA) framework designed for cooperative autonomous driving.
Paper (arXiv) | GitHub | Dataset
Overview
D2-V2X addresses sensor occlusion in single-vehicle Vision-Language Models (VLMs) by introducing a benchmark for cooperative reasoning over multimodal vehicle and infrastructure (V2X) sensor data. Its baseline aligns 3D LiDAR features with the VLM's latent space and enforces Chain-of-Thought (CoT) rationales that state spatial relations explicitly.
Usage
For environment setup and data preparation, please refer to the official GitHub repository.
Training
To train the model using the provided pipeline:
python train.py \
--qwen_path="/path/to/qwen/model" \
--train_path="/path/to/train/dataset" \
--val_path="/path/to/val/dataset" \
--img_path="/path/to/images" \
--train_feature_path="/path/to/train/lidar/features" \
--val_feature_path="/path/to/val/lidar/features" \
--output_path="/checkpoint/path" \
--mode="" \
--stage="" \
--lr=2e-5 \
--epochs=3 \
--batch_size=1 \
--accum_steps=64
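The command above pairs `--batch_size=1` with `--accum_steps=64`. Assuming `--accum_steps` means gradient accumulation steps (the flag name suggests this, but it is not confirmed here), the optimizer would see an effective batch of 64 samples per update. The sketch below works through that arithmetic; the helper names, `num_devices`, and `num_samples` are hypothetical and not part of `train.py`.

```python
import math


def effective_batch_size(batch_size: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update.

    Assumes gradients are accumulated over `accum_steps` micro-batches
    on each of `num_devices` devices before a single optimizer step.
    """
    return batch_size * accum_steps * num_devices


def optimizer_steps_per_epoch(num_samples: int, batch_size: int, accum_steps: int) -> int:
    """Optimizer updates in one pass over the data.

    Assumes a trailing, partially filled accumulation window still
    triggers an update; a real training loop may drop it instead.
    """
    micro_batches = math.ceil(num_samples / batch_size)
    return math.ceil(micro_batches / accum_steps)


# Values taken from the training command above.
print(effective_batch_size(batch_size=1, accum_steps=64))  # 64
```

If memory allows a larger `--batch_size`, lowering `--accum_steps` by the same factor keeps the effective batch unchanged under this assumption.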
Evaluation
To evaluate the model:
python evaluate.py \
--qwen_path="/path/to/qwen/model" \
--checkpoint_path="/checkpoint/path" \
--inference \
--evaluate \
--mode="" \
--json_path="/path/to/test/dataset" \
--img_path="/path/to/images" \
--test_feature_path="/path/to/test/lidar/features" \
--inference_save_path="results.json"
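After the run, the file given to `--inference_save_path` can be sanity-checked before any further analysis. The schema of `results.json` is not documented here, so the sketch below assumes only that the file is valid JSON and reports its top-level shape; `summarize_results` is a hypothetical helper, not part of the repository.

```python
import json
from pathlib import Path


def summarize_results(path: str) -> dict:
    """Report the top-level structure of a JSON results file.

    Makes no assumption about field names; it only distinguishes
    a list of entries, a dict keyed by something, or a scalar.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, list):
        return {"type": "list", "entries": len(data)}
    if isinstance(data, dict):
        # Show at most ten keys so large files stay readable.
        return {"type": "dict", "entries": len(data), "keys": sorted(data)[:10]}
    return {"type": type(data).__name__, "entries": 1}
```

Calling `summarize_results("results.json")` after evaluation gives a quick check that the file was written and is not empty.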
Citation
If you find this work useful, please cite:
@misc{richard2026d2v2xdepthdrivencooperativev2x,
title={D2-V2X: Depth-Driven Cooperative V2X Reasoning for Autonomous Driving},
author={Kevin Richard and Alphin Varghese and Colin Pham and David Oh and Srijan Das},
year={2026},
eprint={2605.24098},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2605.24098},
}