D2-V2X: Depth-Driven Cooperative V2X Reasoning for Autonomous Driving

This repository contains the model weights (adapter) for D2-V2X, a spatially-aware Question-Rationale-Answer (QRA) framework designed for cooperative autonomous driving.

Paper (arXiv) | GitHub | Dataset

Overview

Single-vehicle Vision-Language Models (VLMs) are limited by sensor occlusions. D2-V2X addresses this by introducing a benchmark for cooperative reasoning over multimodal vehicle and infrastructure sensors (V2X), together with a baseline that aligns 3D LiDAR features with the VLM's latent space and enforces Chain-of-Thought (CoT) rationales so that spatial relations are stated explicitly.
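
To make the idea of aligning LiDAR features with a VLM's latent space more concrete, here is a minimal, dependency-free sketch of one common approach: a linear projector maps each LiDAR feature vector to the model's hidden size, and the projected vectors are prepended to the text token embeddings. This is an illustration under assumptions, not the D2-V2X implementation. The function names, the dimensions (`LIDAR_DIM`, `HIDDEN_DIM`), and the choice of a single linear layer are all hypothetical.

```python
import random

def make_projector(in_dim, out_dim, seed=0):
    """Return a randomly initialised weight matrix (out_dim x in_dim) and a zero bias."""
    rng = random.Random(seed)
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias

def project(features, weight, bias):
    """Map each feature vector of length in_dim to a vector of length out_dim."""
    return [
        [sum(w * x for w, x in zip(row, feat)) + b for row, b in zip(weight, bias)]
        for feat in features
    ]

def fuse(lidar_tokens, text_tokens):
    """Prepend the projected LiDAR tokens to the text token embeddings."""
    return lidar_tokens + text_tokens

# Hypothetical sizes chosen only for illustration.
LIDAR_DIM, HIDDEN_DIM = 8, 16
weight, bias = make_projector(LIDAR_DIM, HIDDEN_DIM)
lidar_features = [[0.5] * LIDAR_DIM for _ in range(4)]    # 4 LiDAR feature vectors
text_embeddings = [[0.0] * HIDDEN_DIM for _ in range(6)]  # 6 text token embeddings
sequence = fuse(project(lidar_features, weight, bias), text_embeddings)
print(len(sequence), len(sequence[0]))  # 10 16
```

After projection, every element of the fused sequence has the same width, so a language model can attend over LiDAR and text tokens together.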

Usage

For environment setup and data preparation, please refer to the official GitHub repository.

Training

To train the model using the provided pipeline:

python train.py \
    --qwen_path="/path/to/qwen/model" \
    --train_path="/path/to/train/dataset" \
    --val_path="/path/to/val/dataset" \
    --img_path="/path/to/images" \
    --train_feature_path="/path/to/train/lidar/features" \
    --val_feature_path="/path/to/val/lidar/features" \
    --output_path="/checkpoint/path" \
    --mode="" \
    --stage="" \
    --lr=2e-5 \
    --epochs=3 \
    --batch_size=1 \
    --accum_steps=64
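
The command above pairs `--batch_size=1` with `--accum_steps=64`. Assuming `--accum_steps` denotes gradient accumulation steps (the usual meaning, but not confirmed here), the effective batch size per optimizer update is their product. The sketch below works through that arithmetic. The helper names and the dataset size of 10,000 samples are hypothetical, and the step count assumes that a final partial accumulation window still triggers an update.

```python
import math

def effective_batch_size(batch_size, accum_steps):
    """Samples contributing to each optimizer update."""
    return batch_size * accum_steps

def optimizer_steps(num_samples, batch_size, accum_steps, epochs):
    """Optimizer updates over training, counting a final partial window as one update."""
    micro_batches = math.ceil(num_samples / batch_size)
    return epochs * math.ceil(micro_batches / accum_steps)

print(effective_batch_size(1, 64))        # 64
print(optimizer_steps(10_000, 1, 64, 3))  # 471
```

If you change `--batch_size` to fit a different GPU memory budget, scaling `--accum_steps` inversely keeps the effective batch size the same.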

Evaluation

To evaluate the model:

python evaluate.py \
    --qwen_path="/path/to/qwen/model" \
    --checkpoint_path="/checkpoint/path" \
    --inference \
    --evaluate \
    --mode="" \
    --json_path="/path/to/test/dataset" \
    --img_path="/path/to/images" \
    --test_feature_path="/path/to/test/lidar/features" \
    --inference_save_path="results.json"
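
Once the file given to `--inference_save_path` has been written, a quick sanity check can confirm that it parses and show how many entries it contains. The sketch below deliberately makes no assumption about the schema of `results.json`, since it is not documented here. The helper `summarize_json` is hypothetical and not part of the repository.

```python
import json
from pathlib import Path

def summarize_json(path):
    """Return (top-level type name, number of entries) for a JSON file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    size = len(data) if isinstance(data, (list, dict)) else 1
    return type(data).__name__, size

if __name__ == "__main__":
    results = Path("results.json")
    if results.exists():
        kind, size = summarize_json(results)
        print(f"{results}: top-level {kind} with {size} entries")
    else:
        print(f"{results} not found")
```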

Citation

If you find this work useful, please cite:

@misc{richard2026d2v2xdepthdrivencooperativev2x,
      title={D2-V2X: Depth-Driven Cooperative V2X Reasoning for Autonomous Driving}, 
      author={Kevin Richard and Alphin Varghese and Colin Pham and David Oh and Srijan Das},
      year={2026},
      eprint={2605.24098},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2605.24098}, 
}
