Long Nguyen committed on
Commit
ee22b18
·
verified ·
1 Parent(s): 2ba2ebb

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +155 -0
README.md ADDED
@@ -0,0 +1,155 @@
+ ---
+ license: apache-2.0
+ tags:
+ - autonomous-driving
+ - planning
+ - pytorch
+ - navsim
+ - transfuser
+ - end-to-end-driving
+ library_name: pytorch
+ ---
+
+ # TFv6 NavSim - Autonomous Driving Planning Model
+
+ ## Model Description
+
+ TFv6 NavSim is an end-to-end autonomous driving planning model based on the TransFuser architecture. The model predicts future waypoints and vehicle headings for trajectory planning in autonomous driving scenarios.
+
+ **Key Features:**
+ - 🚗 End-to-end learning for autonomous driving
+ - 📷 Multi-camera input processing (4 cameras)
+ - 🎯 Predicts future waypoints and headings
+ - 🏎️ Trained on NavSim dataset
+ - ⚡ Efficient inference with mixed precision support
+
+ **Architecture:**
+ - Backbone: TransFuser with vision encoder
+ - Planning Decoder: GPT-based trajectory prediction
+ - Input: RGB images (1600x900), navigation commands, speed, acceleration
+ - Output: Future waypoints and heading predictions
+
+ ## Quick Start
+
+ ### Installation
+
+ ```bash
+ pip install torch torchvision timm numpy opencv-python jaxtyping beartype omegaconf huggingface_hub
+ ```
+
+ ### Simple Inference
+
+ ```python
+ from huggingface_hub import hf_hub_download
+ # inference.py from this repository must be importable (e.g. in the working directory)
+ from inference import TFv6NavSimInference
+ import numpy as np
+
+ # Download and load model
+ model_path = hf_hub_download(repo_id="longpollehn/tfv6_navsim", filename="model_0060.pth")
+ model = TFv6NavSimInference(model_path)
+
+ # Prepare input (example with dummy data)
+ # randint's upper bound is exclusive, so 256 covers the full uint8 range
+ rgb = np.random.randint(0, 256, (900, 1600, 3), dtype=np.uint8)  # HWC format
+ command = [0, 0, 1, 0]  # [left, right, straight, lanefollow]
+ speed = 5.0  # m/s
+ acceleration = 0.0  # m/s²
+
+ # Run inference
+ result = model.predict(rgb, command, speed, acceleration)
+ print(f"Predicted waypoints: {result['waypoints'].shape}")
+ print(f"Predicted headings: {result['headings'].shape}")
+ ```
+
+ ### Inference from Image File
+
+ ```python
+ # Reuses the `model` object created in the previous example
+ result = model.predict_from_image_path(
+     "path/to/image.jpg",
+     command=[0, 0, 1, 0],  # Go straight
+     speed=5.0,
+     acceleration=0.0,
+ )
+ ```
+
+ ## Detailed Usage
+
+ ### Input Format
+
+ **RGB Image:**
+ - Shape: `(3, H, W)` or `(H, W, 3)`
+ - Expected size: 1600x900 pixels (width x height), i.e. `H = 900`, `W = 1600`
+ - Range: [0, 255] (will be normalized internally)
+
+ **Navigation Command:**
+ - 4-element array: `[left, right, straight, lanefollow]`
+ - Values typically between 0 and 1
+ - Examples:
+   - Turn left: `[1, 0, 0, 0]`
+   - Go straight: `[0, 0, 1, 0]`
+   - Turn right: `[0, 1, 0, 0]`
+   - Lane follow: `[0, 0, 0, 1]`
+
+ **Speed:** Current vehicle speed in meters per second (m/s)
+
+ **Acceleration:** Current vehicle acceleration in m/s²
+
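The input conventions above can be sketched in plain Python. This is only an illustration: `make_command` and `image_layout` are hypothetical helper names that are not part of `inference.py`. The only things taken from the README are the command order and the two accepted image layouts.

```python
from __future__ import annotations

# Order taken from the README: [left, right, straight, lanefollow].
COMMAND_ORDER = ("left", "right", "straight", "lanefollow")


def make_command(name: str) -> list[int]:
    """Hypothetical helper: build a one-hot navigation command by name."""
    if name not in COMMAND_ORDER:
        raise ValueError(f"unknown command {name!r}; expected one of {COMMAND_ORDER}")
    return [1 if name == entry else 0 for entry in COMMAND_ORDER]


def image_layout(shape: tuple[int, ...]) -> str:
    """Hypothetical helper: classify a 3-D RGB shape as 'HWC' or 'CHW'."""
    if len(shape) != 3:
        raise ValueError(f"expected a 3-D image shape, got {shape}")
    if shape[-1] == 3:
        return "HWC"  # channel-last, e.g. (900, 1600, 3)
    if shape[0] == 3:
        return "CHW"  # channel-first, e.g. (3, 900, 1600)
    raise ValueError(f"no axis of size 3 in first or last position: {shape}")


print(make_command("straight"))
print(image_layout((900, 1600, 3)))
```

Building the command by name avoids mixing up the index order, which is easy to do when typing the 4-element list by hand.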
+ ### Output Format
+
+ Returns a dictionary with:
+ - `waypoints`: numpy array of shape `(N, 2)` - predicted (x, y) positions
+ - `headings`: numpy array of shape `(N,)` - predicted heading angles
+
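A minimal sketch of consuming a dictionary in this format follows. `check_prediction` and `path_length` are hypothetical helpers, not functions from this repository. The README does not state the coordinate frame or units of the waypoints, so the path length is simply in whatever unit the waypoints use. Plain lists stand in for the numpy arrays here; the helpers only iterate over rows, so arrays of shape `(N, 2)` and `(N,)` should work the same way.

```python
import math


def check_prediction(result):
    """Hypothetical check: confirm (N, 2) waypoints and (N,) headings; return N."""
    waypoints = result["waypoints"]
    headings = result["headings"]
    n = len(waypoints)
    if any(len(point) != 2 for point in waypoints):
        raise ValueError("waypoints must have shape (N, 2)")
    if len(headings) != n:
        raise ValueError("headings must have one entry per waypoint")
    return n


def path_length(waypoints):
    """Sum of straight-line distances between consecutive (x, y) waypoints."""
    points = [(float(x), float(y)) for x, y in waypoints]
    return math.fsum(math.dist(a, b) for a, b in zip(points, points[1:]))


# Dummy values for illustration only; these are not model outputs.
example = {
    "waypoints": [[0.0, 0.0], [3.0, 4.0], [3.0, 10.0]],
    "headings": [0.0, 0.1, 0.2],
}
print(check_prediction(example), path_length(example["waypoints"]))
```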
+ ## Model Details
+
+ ### Training Configuration
+ - Dataset: NavSim with 4-camera setup
+ - Batch size: 64
+ - Learning rate: 0.0003
+ - Mixed precision training: Enabled
+ - Input resolution: 1600x900 (per camera)
+ - BEV grid: 256x256 pixels (64x64 meters, 4 pixels/meter)
+
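The BEV grid numbers above are consistent with each other: 64 m at 4 pixels/meter gives 256 pixels per side, or 0.25 m per pixel. The sketch below spells out that arithmetic. `metric_to_pixel` is a hypothetical helper, and placing the origin at the grid center `(128, 128)` with axes aligned to columns and rows is an assumption. The README does not say where the ego vehicle sits in the grid or how the axes are oriented.

```python
import math

PIXELS_PER_METER = 4   # from the README
GRID_SIZE_M = 64       # from the README (64x64 meters)
GRID_SIZE_PX = GRID_SIZE_M * PIXELS_PER_METER  # 256, matching the README
METERS_PER_PIXEL = 1 / PIXELS_PER_METER        # 0.25 m covered by one pixel


def metric_to_pixel(x_m, y_m, origin_px=(GRID_SIZE_PX // 2, GRID_SIZE_PX // 2)):
    """Hypothetical mapping from a metric offset to a (col, row) grid cell.

    ASSUMPTION: the origin is the grid center and +x/+y map to +col/+row.
    Returns None when the point falls outside the 256x256 grid.
    """
    col = math.floor(origin_px[0] + x_m * PIXELS_PER_METER)
    row = math.floor(origin_px[1] + y_m * PIXELS_PER_METER)
    if 0 <= col < GRID_SIZE_PX and 0 <= row < GRID_SIZE_PX:
        return col, row
    return None


print(GRID_SIZE_PX, METERS_PER_PIXEL)
print(metric_to_pixel(10.0, -5.0))
```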
+ ### Performance
+ - Trained for 61 epochs
+ - Checkpoint: `model_0060.pth`
+
+ ## Gradio Demo
+
+ A Gradio web interface is available in `app.py`:
+
+ ```bash
+ pip install gradio
+ python app.py
+ ```
+
+ Then open the provided URL in your browser.
+
+ ## Files in this Repository
+
+ - `model_0060.pth` - Model checkpoint weights
+ - `config.json` - Model configuration
+ - `stand_alone_model.py` - Model architecture implementation
+ - `inference.py` - Easy-to-use inference wrapper
+ - `app.py` - Gradio web demo
+ - `requirements.txt` - Python dependencies
+
+ ## Citation
+
+ If you use this model in your research, please cite:
+
+ ```bibtex
+ @misc{tfv6_navsim,
+   title={TFv6 NavSim - Autonomous Driving Planning Model},
+   author={Long Nguyen},
+   year={2025},
+   url={https://huggingface.co/longpollehn/tfv6_navsim}
+ }
+ ```
+
+ ## License
+
+ Apache 2.0
+
+ ## Acknowledgments
+
+ This model is based on the TransFuser architecture and trained on the NavSim dataset.