| --- |
| license: mit |
| language: |
| - en |
| tags: |
| - autonomous-driving |
| - vision-language-model |
| - computer-vision |
| - regression |
| base_model: Qwen/Qwen3-VL-2B-Instruct |
| datasets: |
| - jHaselberger/SADC-Situation-Awareness-for-Driver-Centric-Driving-Style-Adaptation |
| --- |
| |
| # Autopilot-Qwen3-VL |
|
|
| **Autopilot-Qwen3-VL** is an end-to-end autonomous driving model built on the `Qwen3-VL-2B-Instruct` vision-language model. It takes a single road or dashcam image as input and directly predicts two continuous control values: **target speed (km/h)** and **steering torque (N)**. |
|
|
| ## 🎥 Simulation Demo |
|
|
|  |
|
|
| --- |
|
|
| ## 🧠 Model Details |
|
|
| The model adds a custom regression head on top of the Qwen3-VL base. The base weights stay frozen; training updates only the LoRA adapters (Parameter-Efficient Fine-Tuning, PEFT) and the regression head, which keeps the trainable share at about 0.225% of all parameters. |
|
|
| - **Base Model:** `Qwen/Qwen3-VL-2B-Instruct` |
| - **Total Parameters:** ~2.13B (2,132,320,770) |
| - **Trainable Parameters:** ~4.79M (4,788,738, about 0.225% of the total) |
| - **Architecture Type:** Vision-Language Model + Dual Regression Head |
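
The trainable share quoted above can be checked from the two parameter counts. A minimal sketch that uses only the numbers listed in this card; the variable names are illustrative:

```python
# Parameter counts as listed in this model card.
TOTAL_PARAMS = 2_132_320_770
TRAINABLE_PARAMS = 4_788_738

# Share of weights updated during fine-tuning, as a percentage.
trainable_pct = 100 * TRAINABLE_PARAMS / TOTAL_PARAMS

print(
    f"Trainable: {TRAINABLE_PARAMS / 1e6:.2f}M "
    f"of {TOTAL_PARAMS / 1e9:.2f}B ({trainable_pct:.3f}%)"
)
# prints: Trainable: 4.79M of 2.13B (0.225%)
```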
|
|
|  |
|
|
| ## Dataset & Output Format |
|
|
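| The model was trained on the [SADC Situation Awareness Dataset](https://huggingface.co/datasets/jHaselberger/SADC-Situation-Awareness-for-Driver-Centric-Driving-Style-Adaptation). For each input image it outputs two values: target speed in km/h and steering torque in N. |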
|
|
| ### ⚠️ Important Note on Steering Values |
| Steering values follow the dataset's sign convention, in which a counterclockwise (left) turn is positive: |
| - **Negative values (`-`)** = Steering **RIGHT** |
| - **Positive values (`+`)** = Steering **LEFT** |
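
When logging or visualizing predictions, it can help to turn the signed value into a direction label. A minimal sketch of the convention above; the function name `steering_direction` and the `deadband` threshold are hypothetical and not part of `autopilot_inference.py`:

```python
def steering_direction(steering: float, deadband: float = 0.05) -> str:
    """Map a signed steering value to a direction label.

    Follows this card's convention: positive = left, negative = right.
    `deadband` is an illustrative threshold below which the value is
    treated as straight; tune or remove it for your own use.
    """
    if abs(steering) <= deadband:
        return "straight"
    return "left" if steering > 0 else "right"


for value in (-0.8, 0.0, 0.3):
    print(f"{value:+.1f} -> {steering_direction(value)}")
```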
|
|
| --- |
|
|
| ## Usage |
|
|
| To run inference, you need the custom `autopilot_inference.py` script provided in this repository. |
|
|
| ### 1. Download the inference script |
| ```python |
| from huggingface_hub import hf_hub_download |
| |
| # Fetch the script into the current directory so it can be imported directly |
| hf_hub_download( |
|     repo_id="Aleton/Autopilot-qwen3-vl", |
|     filename="autopilot_inference.py", |
|     local_dir=".", |
| ) |
| ``` |
|
|
| ### 2. Run Inference |
| ```python |
| from autopilot_inference import AutopilotInference |
| from PIL import Image |
| |
| # 1. Load the model (downloads weights automatically) |
| autopilot = AutopilotInference.from_pretrained("Aleton/Autopilot-qwen3-vl") |
| |
| # 2. Load a dashcam image |
| image = Image.open("road.jpg") |
| |
| # 3. Get predictions |
| result = autopilot.predict(image) |
| |
| print(f"Target Speed: {result['speed_kmh']:.1f} km/h") |
| print(f"Steering Torque: {result['steering_N']:.3f} N") |
| ``` |
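
To process several frames, the same `predict` call can be wrapped in a loop. A minimal sketch that assumes only the `predict(image)` method and the `speed_kmh` / `steering_N` result keys shown above; `summarize_predictions` is a hypothetical helper, not part of the repository, and the averages are meant for inspection rather than control:

```python
from statistics import mean
from typing import Any, Dict, Iterable, List


def summarize_predictions(predictor: Any, images: Iterable[Any]) -> Dict[str, float]:
    """Run `predictor.predict` on each image and average the two outputs."""
    results: List[Dict[str, float]] = [predictor.predict(img) for img in images]
    if not results:
        raise ValueError("no images given")
    return {
        "frames": len(results),
        "mean_speed_kmh": mean(r["speed_kmh"] for r in results),
        "mean_steering_N": mean(r["steering_N"] for r in results),
    }
```

With the objects from the example above, this could be called as `summarize_predictions(autopilot, [Image.open(p) for p in paths])`, where `paths` is your own list of image files.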
|
|
| ## ⚠️ Disclaimer |
| This model is built for educational and research purposes only. It is not designed, tested, or certified for use in real-world autonomous vehicles. Never rely on this model to control a real car. |