kr301/d2v2x-adapter

Image-Text-to-Text

	---
	license: mit
	library_name: peft
	pipeline_tag: image-text-to-text
	---

	# D2-V2X: Depth-Driven Cooperative V2X Reasoning for Autonomous Driving

	This repository contains the model weights (adapter) for D2-V2X, a spatially-aware Question-Rationale-Answer (QRA) framework designed for cooperative autonomous driving.

	[Paper (arXiv)](https://arxiv.org/abs/2605.24098) \| [GitHub](https://github.com/KevinRichard1/D2-V2X) \| [Dataset](https://huggingface.co/datasets/kr301/d2v2x-qra)

	## Overview
	D2-V2X addresses the sensor occlusions that limit single-vehicle Vision-Language Models (VLMs). It introduces a benchmark for cooperative reasoning over multimodal vehicle and infrastructure (V2X) sensors, together with a baseline that aligns 3D LiDAR features with the VLM's latent space and enforces Chain-of-Thought (CoT) rationales so that spatial relations are articulated explicitly.
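
To make the idea of aligning LiDAR features with a VLM's latent space concrete, here is a minimal sketch under stated assumptions. It is not the D2-V2X architecture: the single linear projection, the function name `project_lidar_tokens`, the token counts, and the dimensions are all hypothetical and chosen only for illustration.

```python
import numpy as np


def project_lidar_tokens(lidar_feats, weight, bias):
    """Map LiDAR tokens of shape (n_tokens, d_lidar) to (n_tokens, d_model)."""
    return lidar_feats @ weight + bias


rng = np.random.default_rng(0)
d_lidar, d_model = 256, 1024  # hypothetical feature sizes, not taken from the paper

lidar_feats = rng.normal(size=(8, d_lidar))    # 8 hypothetical LiDAR feature tokens
text_embeds = rng.normal(size=(12, d_model))   # 12 hypothetical text-token embeddings

# A learned projection would normally be trained; here it is randomly initialised.
weight = rng.normal(scale=0.02, size=(d_lidar, d_model))
bias = np.zeros(d_model)

# Project the LiDAR tokens into the language model's embedding width,
# then prepend them to the text embeddings as one input sequence.
lidar_tokens = project_lidar_tokens(lidar_feats, weight, bias)
fused = np.concatenate([lidar_tokens, text_embeds], axis=0)

print(fused.shape)  # (20, 1024)
```

Once both modalities share the same embedding width, the language model can attend over LiDAR and text tokens in a single sequence. Refer to the paper and the GitHub repository for the actual alignment module.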

	## Usage

	For environment setup and data preparation, please refer to the [official GitHub repository](https://github.com/KevinRichard1/D2-V2X).

	### Training
	To train the model with the provided pipeline, replace the placeholder paths and set `--mode` and `--stage`, which are left empty here:
	```bash
	python train.py \
	--qwen_path="/path/to/qwen/model" \
	--train_path="/path/to/train/dataset" \
	--val_path="/path/to/val/dataset" \
	--img_path="/path/to/images" \
	--train_feature_path="/path/to/train/lidar/features" \
	--val_feature_path="/path/to/val/lidar/features" \
	--output_path="/checkpoint/path" \
	--mode="" \
	--stage="" \
	--lr=2e-5 \
	--epochs=3 \
	--batch_size=1 \
	--accum_steps=64
	```
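
The command above uses `--batch_size=1` with `--accum_steps=64`. Assuming `--accum_steps` denotes gradient accumulation steps (an assumption, since the flag is not described here), the arithmetic below sketches the resulting effective batch size. The helper names, the `num_devices` parameter, and the sample count of 10,000 are hypothetical.

```python
import math


def effective_batch_size(batch_size: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step under gradient accumulation."""
    return batch_size * accum_steps * num_devices


def optimizer_steps_per_epoch(num_samples: int, batch_size: int,
                              accum_steps: int, num_devices: int = 1) -> int:
    """Upper bound on optimizer steps per epoch.

    Assumes a partial final accumulation window still triggers a step;
    the real training loop may instead drop it.
    """
    per_step = effective_batch_size(batch_size, accum_steps, num_devices)
    return math.ceil(num_samples / per_step)


# Values from the training command: batch_size=1, accum_steps=64.
print(effective_batch_size(1, 64))               # 64
# 10_000 is a hypothetical dataset size used only for illustration.
print(optimizer_steps_per_epoch(10_000, 1, 64))  # 157
```

Under this assumption, each optimizer update averages gradients over 64 samples even though only one sample is held in memory at a time.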

	### Evaluation
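	To run inference and evaluation on a trained checkpoint, replace the placeholder paths and set `--mode`, which is left empty here: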
	```bash
	python evaluate.py \
	--qwen_path="/path/to/qwen/model" \
	--checkpoint_path="/checkpoint/path" \
	--inference \
	--evaluate \
	--mode="" \
	--json_path="/path/to/test/dataset" \
	--img_path="/path/to/images" \
	--test_feature_path="/path/to/test/lidar/features" \
	--inference_save_path="results.json"
	```
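
After the evaluation run, a quick sanity check on the file passed to `--inference_save_path` can confirm that predictions were written. The schema of `results.json` is not documented here, so this sketch makes no assumption about its fields and only reports the top-level JSON type and entry count. The helper name `summarize_results` is hypothetical.

```python
import json
from pathlib import Path


def summarize_results(path):
    """Return (top-level JSON type name, number of entries) for a results file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, (list, dict)):
        return type(data).__name__, len(data)
    # A bare scalar counts as a single entry.
    return type(data).__name__, 1


results_path = Path("results.json")  # matches --inference_save_path above
if results_path.exists():
    kind, count = summarize_results(results_path)
    print(f"{results_path}: top-level {kind} with {count} entries")
else:
    print(f"{results_path} not found; run evaluate.py first")
```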

	## Citation
	If you find this work useful, please cite:
	```bibtex
	@misc{richard2026d2v2xdepthdrivencooperativev2x,
	title={D2-V2X: Depth-Driven Cooperative V2X Reasoning for Autonomous Driving},
	author={Kevin Richard and Alphin Varghese and Colin Pham and David Oh and Srijan Das},
	year={2026},
	eprint={2605.24098},
	archivePrefix={arXiv},
	primaryClass={cs.CV},
	url={https://arxiv.org/abs/2605.24098},
	}
	```