+---
+license: apache-2.0
+tags:
+- autonomous-driving
+- planning
+- pytorch
+- navsim
+- transfuser
+- end-to-end-driving
+library_name: pytorch
+---
+# TFv6 NavSim - Autonomous Driving Planning Model
+## Model Description
+TFv6 NavSim is an end-to-end autonomous driving planning model based on the TransFuser architecture. The model predicts future waypoints and vehicle headings for trajectory planning in autonomous driving scenarios.
+**Key Features:**
+- 🚗 End-to-end learning for autonomous driving
+- 📷 Multi-camera input processing (4 cameras)
+- 🎯 Predicts future waypoints and headings
+- 🏎️ Trained on the NavSim dataset
+- ⚡ Efficient inference with mixed precision support
+**Architecture:**
+- Backbone: TransFuser with vision encoder
+- Planning Decoder: GPT-based trajectory prediction
+- Input: RGB images (1600x900), navigation commands, speed, acceleration
+- Output: Future waypoints and heading predictions
+## Quick Start
+### Installation
+```bash
+pip install torch torchvision timm numpy opencv-python jaxtyping beartype omegaconf huggingface_hub
+```
+### Simple Inference
+```python
+from huggingface_hub import hf_hub_download
+# inference.py ships with this repository, so it must be on the import path
+from inference import TFv6NavSimInference
+import numpy as np
+# Download and load model
+model_path = hf_hub_download(repo_id="longpollehn/tfv6_navsim", filename="model_0060.pth")
+model = TFv6NavSimInference(model_path)
+# Prepare input (example with dummy data)
+rgb = np.random.randint(0, 256, (900, 1600, 3), dtype=np.uint8)  # HWC format; upper bound is exclusive
+command = [0, 0, 1, 0]  # [left, right, straight, lanefollow]
+speed = 5.0  # m/s
+acceleration = 0.0  # m/s²
+# Run inference
+result = model.predict(rgb, command, speed, acceleration)
+print(f"Predicted waypoints: {result['waypoints'].shape}")
+print(f"Predicted headings: {result['headings'].shape}")
+```
+### Inference from Image File
+```python
+# Reuses the `model` object created in the Simple Inference example
+result = model.predict_from_image_path(
+    "path/to/image.jpg",
+    command=[0, 0, 1, 0],  # Go straight
+    speed=5.0,
+    acceleration=0.0
+)
+```
+## Detailed Usage
+### Input Format
+**RGB Image:**
+- Shape: `(3, H, W)` or `(H, W, 3)`
+- Expected size: 1600x900 pixels (width x height), i.e. `H=900`, `W=1600`
+- Range: [0, 255] (will be normalized internally)
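Because both channel-first and channel-last arrays are accepted, it can help to check the layout before passing an image in. The sketch below is a hypothetical helper (`detect_layout`, `spatial_size` and `matches_expected` are not part of `inference.py`). It works on a shape tuple such as `rgb.shape` and assumes the channel axis is the one of length 3.

```python
EXPECTED_HW = (900, 1600)  # (height, width) of a 1600x900 frame


def detect_layout(shape):
    """Classify a 3-D image shape as 'HWC' or 'CHW' (hypothetical helper).

    Assumption: the channel axis is the axis of length 3.
    """
    if len(shape) != 3:
        raise ValueError(f"expected a 3-D image, got shape {shape}")
    first_is_channel = shape[0] == 3
    last_is_channel = shape[2] == 3
    if first_is_channel and last_is_channel:
        raise ValueError(f"ambiguous layout for shape {shape}")
    if last_is_channel:
        return "HWC"
    if first_is_channel:
        return "CHW"
    raise ValueError(f"no axis of length 3 in shape {shape}")


def spatial_size(shape):
    """Return (height, width) regardless of layout."""
    if detect_layout(shape) == "HWC":
        return tuple(shape[:2])
    return tuple(shape[1:])


def matches_expected(shape):
    """True if the image has the expected 900 rows and 1600 columns."""
    return spatial_size(shape) == EXPECTED_HW
```

For example, `matches_expected(rgb.shape)` could be used as a cheap guard before calling `model.predict`.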
+**Navigation Command:**
+- 4-element array: `[left, right, straight, lanefollow]`
+- Values typically between 0 and 1
+- Examples:
+  - Turn left: `[1, 0, 0, 0]`
+  - Go straight: `[0, 0, 1, 0]`
+  - Turn right: `[0, 1, 0, 0]`
+  - Lane follow: `[0, 0, 0, 1]`
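The one-hot examples above can be generated from a command name rather than typed by hand. This is a minimal sketch: `make_command` and `COMMAND_ORDER` are illustrative names, not part of the repository's API. It covers only the one-hot case, not soft values between 0 and 1.

```python
COMMAND_ORDER = ("left", "right", "straight", "lanefollow")


def make_command(name):
    """Return a one-hot list in [left, right, straight, lanefollow] order."""
    if name not in COMMAND_ORDER:
        raise ValueError(f"unknown command {name!r}; expected one of {COMMAND_ORDER}")
    return [1 if key == name else 0 for key in COMMAND_ORDER]
```

The result can be passed directly as the `command` argument, e.g. `make_command("straight")`.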
+**Speed:** Current vehicle speed in meters per second (m/s)
+**Acceleration:** Current vehicle acceleration in m/s²
+### Output Format
+Returns a dictionary with:
+- `waypoints`: numpy array of shape `(N, 2)` - predicted (x, y) positions
+- `headings`: numpy array of shape `(N,)` - predicted heading angles
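As a rough illustration of consuming this output, the sketch below computes the length of the predicted path and converts headings to degrees. Both helpers are hypothetical. The sketch assumes waypoints are in meters and headings are in radians, neither of which this model card states, so verify the units against `inference.py` before relying on them.

```python
import math


def path_length(waypoints):
    """Sum of straight-line distances between consecutive (x, y) waypoints."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        total += math.hypot(x1 - x0, y1 - y0)
    return total


def headings_to_degrees(headings):
    """Convert headings to degrees. Assumption: the inputs are radians."""
    return [math.degrees(h) for h in headings]
```

With the real output, the arrays can be converted first, e.g. `path_length(result["waypoints"].tolist())`.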
+## Model Details
+### Training Configuration
+- Dataset: NavSim with a 4-camera setup
+- Batch size: 64
+- Learning rate: 0.0003
+- Mixed precision training: Enabled
+- Input resolution: 1600x900 (per camera)
+- BEV grid: 256x256 pixels (64x64 meters, 4 pixels/meter)
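The BEV grid numbers are consistent with one another: 64 meters at 4 pixels per meter gives 256 pixels per side. The sketch below reproduces that arithmetic and maps a metric position to a grid cell. Placing the ego vehicle at the grid center and mapping x to columns and y to rows are assumptions made for illustration only. The convention the model actually uses is not stated here.

```python
import math

PIXELS_PER_METER = 4
GRID_SIZE_PX = 256


def grid_extent_m():
    """Side length of the grid in meters: 256 px / 4 px per meter."""
    return GRID_SIZE_PX / PIXELS_PER_METER


def meters_to_pixel(x_m, y_m):
    """Map a position in meters to (row, col), or None if outside the grid.

    Assumption: the origin sits at the grid center, x maps to columns
    and y maps to rows.
    """
    half = GRID_SIZE_PX // 2
    col = math.floor(x_m * PIXELS_PER_METER) + half
    row = math.floor(y_m * PIXELS_PER_METER) + half
    if 0 <= row < GRID_SIZE_PX and 0 <= col < GRID_SIZE_PX:
        return row, col
    return None
```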
+### Performance
+- Trained for 61 epochs
+- Checkpoint: `model_0060.pth`
+## Gradio Demo
+A Gradio web interface is available in `app.py`:
+```bash
+pip install gradio
+python app.py
+```
+Then open the URL that Gradio prints to the terminal.
+## Files in this Repository
+- `model_0060.pth` - Model checkpoint weights
+- `config.json` - Model configuration
+- `stand_alone_model.py` - Model architecture implementation
+- `inference.py` - Easy-to-use inference wrapper
+- `app.py` - Gradio web demo
+- `requirements.txt` - Python dependencies
+## Citation
+If you use this model in your research, please cite:
+```bibtex
+@misc{tfv6_navsim,
+  title={TFv6 NavSim - Autonomous Driving Planning Model},
+  author={Long Nguyen},
+  year={2025},
+  url={https://huggingface.co/longpollehn/tfv6_navsim}
+}
+```
+## License
+Apache 2.0
+## Acknowledgments
+This model is based on the TransFuser architecture and trained on the NavSim dataset.