🚗 Autopilot-Qwen3-VL

Autopilot-Qwen3-VL is an end-to-end autonomous driving model built on the Qwen3-VL-2B-Instruct vision-language model. It takes a single road/dashcam image as input and directly predicts two continuous control values for the vehicle: target speed (km/h) and steering torque (N).

🧠 Model Details

The model uses a custom regression head on top of the frozen Qwen3-VL base and is trained with Parameter-Efficient Fine-Tuning (PEFT/LoRA), so only about 0.225% of the parameters are updated during training.

Base Model: Qwen/Qwen3-VL-2B-Instruct
Total Parameters: ~2.13B (2,132,320,770)
Trainable Parameters: ~4.78M (4,788,738 / 0.225%)
Architecture Type: Vision-Language Model + Dual Regression Head
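
As a quick sanity check on the figures above, the following sketch recomputes the trainable fraction from the two parameter counts listed in this card. The variable names are illustrative only and are not part of the repository.

```python
# Parameter counts copied from the model card above.
TOTAL_PARAMS = 2_132_320_770
TRAINABLE_PARAMS = 4_788_738

# Everything that is not trainable stays frozen during fine-tuning.
frozen_params = TOTAL_PARAMS - TRAINABLE_PARAMS

# Share of parameters that receive gradient updates, in percent.
trainable_pct = 100 * TRAINABLE_PARAMS / TOTAL_PARAMS

print(f"Frozen parameters:    {frozen_params:,}")
print(f"Trainable parameters: {TRAINABLE_PARAMS:,} ({trainable_pct:.3f}%)")
```

The percentage agrees with the 0.225% quoted above when rounded to three decimal places.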

📊 Dataset & Output Format

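Trained on the SADC Situation Awareness Dataset. For each input image the model outputs two continuous values: target speed in km/h and steering torque in N, which the inference script returns under the keys speed_kmh and steering_N.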

⚠️ Important Note on Steering Values

Based on the dataset's coordinate system (standard automotive physics):

Negative values (-) = Steering RIGHT
Positive values (+) = Steering LEFT
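
The sign convention above can be sketched as a small helper. This is a minimal illustration, not code from the repository: the function name `steering_direction` and the optional `dead_band` threshold (used to treat very small values as driving straight) are assumptions introduced here.

```python
def steering_direction(steering: float, dead_band: float = 0.0) -> str:
    """Map a signed steering value to a direction label.

    Follows the convention stated in this card: positive = left,
    negative = right. `dead_band` is a hypothetical tolerance; values
    with magnitude at or below it are reported as "straight".
    """
    if steering > dead_band:
        return "left"
    if steering < -dead_band:
        return "right"
    return "straight"


print(steering_direction(0.42))         # positive value
print(steering_direction(-0.42))        # negative value
print(steering_direction(0.01, 0.05))   # inside the dead band
```

A non-zero dead band is only a convenience for display or logging; a suitable threshold depends on the dataset and is not specified in this card.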

🚀 Usage

To run inference, you need the custom autopilot_inference.py script provided in this repository.

1. Download the inference script

from huggingface_hub import hf_hub_download

# Fetch autopilot_inference.py from the model repo into the current directory
hf_hub_download(
    repo_id="Aleton/Autopilot-qwen3-vl",
    filename="autopilot_inference.py",
    local_dir="."
)

2. Run Inference

from autopilot_inference import AutopilotInference
from PIL import Image

# 1. Load the model (downloads weights automatically)
autopilot = AutopilotInference.from_pretrained("Aleton/Autopilot-qwen3-vl")

# 2. Load a dashcam image
image = Image.open("road.jpg")

# 3. Get predictions
result = autopilot.predict(image)

print(f"Target Speed: {result['speed_kmh']:.1f} km/h")
print(f"Steering Torque: {result['steering_N']:.3f} N")
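
For downstream use it can help to check the prediction and convert the speed to SI units. The sketch below assumes only that `result` is a dict with the `speed_kmh` and `steering_N` keys shown above. The helper names `speed_kmh_to_ms` and `summarize_prediction` are hypothetical and not part of `autopilot_inference.py`.

```python
def speed_kmh_to_ms(speed_kmh: float) -> float:
    """Convert km/h to m/s (1 m/s = 3.6 km/h)."""
    return speed_kmh / 3.6


def summarize_prediction(result: dict) -> str:
    """Return a one-line summary of a prediction dict.

    Assumes the keys used in the usage example: 'speed_kmh' and 'steering_N'.
    """
    missing = {"speed_kmh", "steering_N"} - result.keys()
    if missing:
        raise KeyError(f"prediction is missing keys: {sorted(missing)}")
    speed_ms = speed_kmh_to_ms(result["speed_kmh"])
    return (
        f"{result['speed_kmh']:.1f} km/h ({speed_ms:.2f} m/s), "
        f"steering {result['steering_N']:+.3f} N"
    )


# Stand-in values for illustration; a real dict would come from autopilot.predict(image).
example = {"speed_kmh": 36.0, "steering_N": -0.25}
print(summarize_prediction(example))
```

The explicit sign in the steering output makes the left/right convention easier to read in logs.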

⚠️ Disclaimer

This model is built for educational and research purposes only. It is not designed, tested, or certified for use in real-world autonomous vehicles. Never rely on this model to control a real car.
