Aleton / Autopilot-qwen3-vl

	---
	license: mit
	language:
	- en
	tags:
	- autonomous-driving
	- vision-language-model
	- computer-vision
	- regression
	base_model: Qwen/Qwen3-VL-2B-Instruct
	datasets:
	- jHaselberger/SADC-Situation-Awareness-for-Driver-Centric-Driving-Style-Adaptation
	---

	# 🚗 Autopilot-Qwen3-VL

	Autopilot-Qwen3-VL is an end-to-end autonomous driving model built on the `Qwen3-VL-2B-Instruct` Vision-Language Model. It takes a single road or dashcam image as input and directly predicts two continuous control values for the vehicle: target speed (km/h) and steering torque (N).

	## 🎥 Simulation Demo

	![Simulation Demo](https://huggingface.co/Aleton/Autopilot-qwen3-vl/resolve/main/demo.gif)

	---

	## 🧠 Model Details

	The model adds a custom regression head on top of the Qwen3-VL base. The base weights stay frozen, and training uses Parameter-Efficient Fine-Tuning (PEFT/LoRA), so only about 0.225% of all parameters are updated.

	- Base Model: `Qwen/Qwen3-VL-2B-Instruct`
	- Total Parameters: ~2.13B (2,132,320,770)
	- Trainable Parameters: ~4.78M (4,788,738 / 0.225%)
	- Architecture Type: Vision-Language Model + Dual Regression Head
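
The trainable share quoted above can be re-derived from the two parameter counts. A minimal arithmetic check, using only the numbers listed in this card (the variable names are illustrative):

```python
# Parameter counts as stated in this model card.
total_params = 2_132_320_770
trainable_params = 4_788_738

# Share of parameters that receive gradient updates, in percent.
trainable_pct = 100 * trainable_params / total_params

# Everything else stays frozen during fine-tuning.
frozen_params = total_params - trainable_params

print(f"Trainable: {trainable_pct:.3f}% of all parameters")
print(f"Frozen:    {frozen_params:,} parameters")
```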

	![Architecture Diagram](architecture.jpg)

	## 📊 Dataset & Output Format

	Trained on the [SADC Situation Awareness Dataset](https://huggingface.co/datasets/jHaselberger/SADC-Situation-Awareness-for-Driver-Centric-Driving-Style-Adaptation).

	### ⚠️ Important Note on Steering Values
	Steering values follow the dataset's coordinate system, which uses the common automotive sign convention (counterclockwise is positive):
	- Negative values (`-`) = Steering RIGHT
	- Positive values (`+`) = Steering LEFT
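
Because the sign is easy to get backwards, it can help to wrap the convention in a small helper. This is a sketch only: `steering_direction` and its `deadband` parameter are hypothetical and not part of this repository.

```python
def steering_direction(steering: float, deadband: float = 0.0) -> str:
    """Map a signed steering value to a direction label.

    Sign convention from this model card: negative = right, positive = left.
    `deadband` is a hypothetical tolerance (not part of the model); values
    whose magnitude is at or below it are treated as straight.
    """
    if abs(steering) <= deadband:
        return "straight"
    return "left" if steering > 0 else "right"


# Made-up steering values for illustration.
for value in (-0.8, 0.0, 0.35):
    print(f"{value:+.2f} -> {steering_direction(value)}")
```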

	---

	## 🚀 Usage

	To run inference, you need the custom `autopilot_inference.py` script provided in this repository.

	### 1. Download the inference script
	```python
	from huggingface_hub import hf_hub_download

	hf_hub_download(
	    repo_id="Aleton/Autopilot-qwen3-vl",
	    filename="autopilot_inference.py",
	    local_dir=".",
	)
	```

	### 2. Run Inference
	```python
	from autopilot_inference import AutopilotInference
	from PIL import Image

	# 1. Load the model (downloads weights automatically)
	autopilot = AutopilotInference.from_pretrained("Aleton/Autopilot-qwen3-vl")

	# 2. Load a dashcam image
	image = Image.open("road.jpg")

	# 3. Get predictions
	result = autopilot.predict(image)

	print(f"Target Speed: {result['speed_kmh']:.1f} km/h")
	print(f"Steering Torque: {result['steering_N']:.3f} N")
	```
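
Downstream code such as a simulator loop often wants SI units rather than the raw dictionary. The sketch below post-processes a `predict` result, assuming only the two keys shown above (`speed_kmh` and `steering_N`). The `ControlCommand` dataclass and the `to_command` helper are hypothetical names, not part of `autopilot_inference.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlCommand:
    """Hypothetical container for one prediction (not part of the repo)."""
    speed_mps: float  # target speed in metres per second
    steering: float   # raw steering value; positive = left, negative = right


def to_command(result: dict) -> ControlCommand:
    """Convert a `predict` result dict into SI speed plus raw steering."""
    return ControlCommand(
        speed_mps=result["speed_kmh"] / 3.6,  # 1 m/s = 3.6 km/h
        steering=result["steering_N"],
    )


# Made-up values for illustration, not real model output.
example = {"speed_kmh": 36.0, "steering_N": -0.25}
command = to_command(example)
print(command)
```

With the real model loaded, the equivalent call would be `to_command(autopilot.predict(image))`.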

	## ⚠️ Disclaimer
	This model is built for educational and research purposes only. It is not designed, tested, or certified for use in real-world autonomous vehicles. Never rely on this model to control a real car.