Update README.md
README.md
CHANGED
@@ -25,6 +25,12 @@ tags: []
 
 Built on the **Qwen2.5-VL** foundation, DriveFusion-V0.2 adds specialized MLP heads to fuse physical context with visual features, enabling a comprehensive "world model" for driving.
 
+## 🔗 GitHub Repository
+
+Find the full implementation, training scripts, and preprocessing logic here:
+* **Main Model Code:** [DriveFusion/drivefusion](https://github.com/DriveFusion/drivefusion)
+* **Data Collection:** [DriveFusion/data-collection](https://github.com/DriveFusion/carla-data-collection.git)
+
 ### Core Features
 - **Vision Processing**: Handles images and videos via a 32-layer Vision Transformer.
 - **Context Fusion**: Custom `SpeedMLP` and `GPSTargetPointsMLP` integrate vehicle telemetry.
@@ -102,19 +108,4 @@ print("Target Speeds:", output["target_speeds"])
 
 ## ⚠️ Safety & Limitations
 - **Non-Real-Time Hardware**: This model is optimized for high-accuracy reasoning and may require quantization for low-latency onboard use.
-- **Physical Limits**: While the model predicts trajectories, it does not account for vehicle dynamics (e.g., tire friction) and should be used with a downstream controller.
-
----
-
-## 📜 Citation
-If this model assists your research, please cite the DriveFusion graduation project:
-
-```bibtex
-@article{drivefusion2026v02,
-  title={DriveFusion-V0.2: Multimodal Trajectory Prediction and Reasoning},
-  author={DriveFusion Team},
-  year={2026},
-  publisher={GitHub},
-  url={https://github.com/DriveFusion/drivefusion}
-}
-```
+- **Physical Limits**: While the model predicts trajectories, it does not account for vehicle dynamics (e.g., tire friction) and should be used with a downstream controller.
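The **Physical Limits** bullet in this diff leaves vehicle dynamics to a downstream controller. As a rough illustration only, the sketch below shows one way such a stage might post-process the speeds printed from `output["target_speeds"]`: it rate-limits the speed profile and computes a friction-limited cornering cap from v²/R ≤ μ·g. The function names, the units (m/s), the time step, and every limit value are assumptions made for this example; none of them come from the DriveFusion code.

```python
import math


def limit_target_speeds(target_speeds, current_speed, dt,
                        max_accel=2.0, max_decel=4.0):
    """Rate-limit a predicted speed profile (hypothetical helper).

    Assumes speeds in m/s, one value per time step of length dt seconds,
    and placeholder acceleration/deceleration bounds in m/s^2.
    """
    limited = []
    prev = current_speed
    for v in target_speeds:
        lowest = max(0.0, prev - max_decel * dt)   # hardest allowed braking
        highest = prev + max_accel * dt            # hardest allowed acceleration
        prev = min(max(v, lowest), highest)        # clamp into the feasible band
        limited.append(prev)
    return limited


def friction_speed_cap(curve_radius_m, mu=0.7, g=9.81):
    """Steady-state cornering speed limit from v^2 / R <= mu * g.

    mu is an assumed tire-road friction coefficient, not a measured value.
    """
    return math.sqrt(mu * g * curve_radius_m)


if __name__ == "__main__":
    # Example values chosen for illustration, not model output.
    raw = [10.0, 10.0, 0.0]
    print(limit_target_speeds(raw, current_speed=5.0, dt=0.5))
    print(friction_speed_cap(50.0))
```

A real controller would also need lateral control and a proper vehicle model; this only shows where the model's speed output could be checked against simple physical bounds before actuation.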