---
license: mit
language:
- en
tags:
- autonomous-driving
- vision-language-model
- computer-vision
- regression
base_model: Qwen/Qwen3-VL-2B-Instruct
datasets:
- jHaselberger/SADC-Situation-Awareness-for-Driver-Centric-Driving-Style-Adaptation
---
# 🚗 Autopilot-Qwen3-VL
**Autopilot-Qwen3-VL** is an end-to-end autonomous driving model built on the `Qwen3-VL-2B-Instruct` Vision-Language Model. It takes a single road/dashcam image as input and directly predicts two continuous control values for the vehicle: **target speed (km/h)** and **steering torque (N)**.
## 🎥 Simulation Demo
![Simulation Demo](https://huggingface.co/Aleton/Autopilot-qwen3-vl/resolve/main/demo.gif)
---
## 🧠 Model Details
The model adds a custom regression head on top of the Qwen3-VL base. The base weights stay frozen and training uses Parameter-Efficient Fine-Tuning (PEFT/LoRA), so only a small fraction of the parameters is updated.
- **Base Model:** `Qwen/Qwen3-VL-2B-Instruct`
- **Total Parameters:** ~2.13B (2,132,320,770)
- **Trainable Parameters:** ~4.78M (4,788,738 / 0.225%)
- **Architecture Type:** Vision-Language Model + Dual Regression Head
![Architecture Diagram](architecture.jpg)
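As a quick sanity check, the trainable share quoted above can be recomputed from the two parameter counts. This is only an arithmetic sketch; the helper name `trainable_percent` is hypothetical and not part of the repository.

```python
# Parameter counts copied from the list above.
TOTAL_PARAMS = 2_132_320_770
TRAINABLE_PARAMS = 4_788_738


def trainable_percent(trainable: int, total: int) -> float:
    """Return the trainable parameters as a percentage of all parameters."""
    return 100.0 * trainable / total


percent = trainable_percent(TRAINABLE_PARAMS, TOTAL_PARAMS)
print(f"Trainable share: {percent:.3f}%")  # prints "Trainable share: 0.225%"
```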
## 📊 Dataset & Output Format
Trained on the [SADC Situation Awareness Dataset](https://huggingface.co/datasets/jHaselberger/SADC-Situation-Awareness-for-Driver-Centric-Driving-Style-Adaptation).
### ⚠️ Important Note on Steering Values
Based on the dataset's coordinate system (standard automotive physics):
- **Negative values (`-`)** = Steering **RIGHT**
- **Positive values (`+`)** = Steering **LEFT**
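The sign convention above can be wrapped in a small helper when logging or visualizing predictions. This is a minimal sketch: the function `steering_direction` and its `deadband` threshold are hypothetical illustrations, not part of the repository, and the threshold value is arbitrary.

```python
def steering_direction(steering: float, deadband: float = 0.05) -> str:
    """Map a signed steering value to a direction label.

    Positive values mean LEFT and negative values mean RIGHT, as stated above.
    `deadband` is an assumed tolerance around zero that is treated as straight.
    """
    if steering > deadband:
        return "left"
    if steering < -deadband:
        return "right"
    return "straight"


print(steering_direction(0.4))   # left
print(steering_direction(-0.4))  # right
print(steering_direction(0.0))   # straight
```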
---
## 🚀 Usage
To run inference, you need the custom `autopilot_inference.py` script provided in this repository.
### 1. Download the inference script
```python
from huggingface_hub import hf_hub_download

# Fetch the inference script from the model repository into the current directory
hf_hub_download(
    repo_id="Aleton/Autopilot-qwen3-vl",
    filename="autopilot_inference.py",
    local_dir=".",
)
```
### 2. Run Inference
```python
from autopilot_inference import AutopilotInference
from PIL import Image
# 1. Load the model (downloads weights automatically)
autopilot = AutopilotInference.from_pretrained("Aleton/Autopilot-qwen3-vl")
# 2. Load a dashcam image
image = Image.open("road.jpg")
# 3. Get predictions
result = autopilot.predict(image)
print(f"Target Speed: {result['speed_kmh']:.1f} km/h")
print(f"Steering Torque: {result['steering_N']:.3f} N")
```
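To process several frames, for example from a recorded drive, the single-image call can be placed in a loop. The sketch below assumes only what the example above shows: an object with a `predict` method that returns a dict containing `speed_kmh` and `steering_N`. The helper `predict_frames` is hypothetical, and whether the script offers its own batching is not stated here.

```python
from typing import Any, Dict, Iterable, List


def predict_frames(predictor: Any, frames: Iterable[Any]) -> List[Dict[str, float]]:
    """Call predictor.predict on each frame and collect the two documented outputs."""
    outputs: List[Dict[str, float]] = []
    for frame in frames:
        result = predictor.predict(frame)
        # Keep only the keys used in the usage example, converted to plain floats.
        outputs.append(
            {
                "speed_kmh": float(result["speed_kmh"]),
                "steering_N": float(result["steering_N"]),
            }
        )
    return outputs
```

With the objects from the usage example, this could be called as `predict_frames(autopilot, [Image.open(p) for p in paths])`, where `paths` is a list of image file paths you supply.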
## ⚠️ Disclaimer
This model is built for educational and research purposes only. It is not designed, tested, or certified for use in real-world autonomous vehicles. Never rely on this model to control a real car.