Long Nguyen committed
Commit b7c050e · verified · 1 Parent(s): a702328

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +23 -123
README.md CHANGED
@@ -4,152 +4,52 @@ tags:
  - autonomous-driving
  - planning
  - pytorch
- - navsim
- - transfuser
- - end-to-end-driving
- library_name: pytorch
  ---

- # TFv6 NavSim - Autonomous Driving Planning Model
+ # TFv6 NavSim

- ## Model Description
+ Autonomous driving planning model (TransFuser-based). Predicts waypoints and headings.

- TFv6 NavSim is an end-to-end autonomous driving planning model based on the TransFuser architecture. The model predicts future waypoints and vehicle headings for trajectory planning in autonomous driving scenarios.
-
- **Key Features:**
- - 🚗 End-to-end learning for autonomous driving
- - 📷 Multi-camera input processing (4 cameras)
- - 🎯 Predicts future waypoints and headings
- - 🏎️ Trained on NavSim dataset
- - ⚡ Efficient inference with mixed precision support
-
- **Architecture:**
- - Backbone: TransFuser with vision encoder
- - Planning Decoder: GPT-based trajectory prediction
- - Input: RGB images (1600x900), navigation commands, speed, acceleration
- - Output: Future waypoints and heading predictions
-
- ## Quick Start
-
- ### Installation
+ ## Usage

  ```bash
- pip install torch torchvision timm numpy opencv-python jaxtyping beartype omegaconf huggingface_hub
+ pip install torch timm numpy opencv-python jaxtyping beartype omegaconf huggingface_hub
  ```

- ### Simple Inference
+ ### Quick Start

  ```python
- from huggingface_hub import hf_hub_download
  from inference import TFv6NavSimInference
  import numpy as np

- # Download and load model
- model_path = hf_hub_download(repo_id="longpollehn/tfv6_navsim", filename="model_0060.pth")
- model = TFv6NavSimInference(model_path)
+ # Auto-downloads from HuggingFace
+ model = TFv6NavSimInference()

- # Prepare input (example with dummy data)
- rgb = np.random.randint(0, 255, (900, 1600, 3), dtype=np.uint8) # HWC format
+ # Prepare input
+ rgb = np.random.randint(0, 255, (900, 1600, 3), dtype=np.uint8)
  command = [0, 0, 1, 0] # [left, right, straight, lanefollow]
  speed = 5.0 # m/s
  acceleration = 0.0 # m/s²

  # Run inference
  result = model.predict(rgb, command, speed, acceleration)
- print(f"Predicted waypoints: {result['waypoints'].shape}")
- print(f"Predicted headings: {result['headings'].shape}")
- ```
-
- ### Inference from Image File
-
- ```python
- result = model.predict_from_image_path(
- "path/to/image.jpg",
- command=[0, 0, 1, 0], # Go straight
- speed=5.0,
- acceleration=0.0
- )
- ```
-
- ## Detailed Usage
-
- ### Input Format
-
- **RGB Image:**
- - Shape: `(3, H, W)` or `(H, W, 3)`
- - Expected size: 1600x900 pixels
- - Range: [0, 255] (will be normalized internally)
-
- **Navigation Command:**
- - 4-element array: `[left, right, straight, lanefollow]`
- - Values typically between 0 and 1
- - Examples:
- - Turn left: `[1, 0, 0, 0]`
- - Go straight: `[0, 0, 1, 0]`
- - Turn right: `[0, 1, 0, 0]`
- - Lane follow: `[0, 0, 0, 1]`
-
- **Speed:** Current vehicle speed in meters per second (m/s)
-
- **Acceleration:** Current vehicle acceleration in m/s²
-
- ### Output Format
-
- Returns a dictionary with:
- - `waypoints`: numpy array of shape `(N, 2)` - predicted (x, y) positions
- - `headings`: numpy array of shape `(N,)` - predicted heading angles
-
- ## Model Details
-
- ### Training Configuration
- - Dataset: NavSim with 4-camera setup
- - Batch size: 64
- - Learning rate: 0.0003
- - Mixed precision training: Enabled
- - Input resolution: 1600x900 (per camera)
- - BEV grid: 256x256 pixels (64x64 meters, 4 pixels/meter)
-
- ### Performance
- - Trained for 61 epochs
- - Checkpoint: model_0060.pth
-
- ## Gradio Demo
-
- A Gradio web interface is available in `app.py`:
-
- ```bash
- pip install gradio
- python app.py
+ print(result['waypoints'].shape, result['headings'].shape)
  ```

- Then open the provided URL in your browser.
-
- ## Files in this Repository
-
- - `model_0060.pth` - Model checkpoint weights
- - `config.json` - Model configuration
- - `stand_alone_model.py` - Model architecture implementation
- - `inference.py` - Easy-to-use inference wrapper
- - `app.py` - Gradio web demo
- - `requirements.txt` - Python dependencies
-
- ## Citation
-
- If you use this model in your research, please cite:
-
- ```bibtex
- @misc{tfv6_navsim,
- title={TFv6 NavSim - Autonomous Driving Planning Model},
- author={Long Nguyen},
- year={2025},
- url={https://huggingface.co/longpollehn/tfv6_navsim}
- }
- ```
+ ## Input/Output

- ## License
+ **Input:**
+ - RGB: (900, 1600, 3) or (3, 900, 1600), range [0, 255]
+ - Command: [left, right, straight, lanefollow], e.g. [0,0,1,0] for straight
+ - Speed: m/s
+ - Acceleration: m/s²

- Apache 2.0
+ **Output:**
+ - `waypoints`: (N, 2) predicted positions
+ - `headings`: (N,) predicted angles

- ## Acknowledgments
+ ## Details

- This model is based on the TransFuser architecture and trained on the NavSim dataset.
+ - Architecture: TransFuser
+ - Dataset: NavSim (4 cameras)
+ - Checkpoint: Epoch 60
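
Note on the new Input/Output section: the shortened README gives the accepted shapes and the command ordering only as one-line bullets. A minimal pre-flight sketch of those conventions might look like the following. The helper names (`one_hot_command`, `rgb_layout`, `check_output`) are hypothetical and not part of `inference.py`; the check that `waypoints` and `headings` share the same `N` is an assumption drawn from the README using the same symbol for both.

```python
# Hypothetical helpers illustrating the README's stated input/output conventions.
# None of these names exist in the repository; they only mirror the documented shapes.

# Command ordering as documented: [left, right, straight, lanefollow]
COMMANDS = ("left", "right", "straight", "lanefollow")


def one_hot_command(name):
    """Return the 4-element one-hot command list for a named maneuver."""
    if name not in COMMANDS:
        raise ValueError(f"unknown command {name!r}; expected one of {COMMANDS}")
    return [1 if c == name else 0 for c in COMMANDS]


def rgb_layout(shape):
    """Classify an image shape as 'HWC' or 'CHW' per the documented sizes."""
    shape = tuple(shape)
    if shape == (900, 1600, 3):
        return "HWC"
    if shape == (3, 900, 1600):
        return "CHW"
    raise ValueError(f"unexpected RGB shape {shape}; expected (900, 1600, 3) or (3, 900, 1600)")


def check_output(waypoints_shape, headings_shape):
    """Validate the documented output shapes and return N.

    Assumes waypoints is (N, 2) and headings is (N,) with the same N.
    """
    waypoints_shape, headings_shape = tuple(waypoints_shape), tuple(headings_shape)
    if len(waypoints_shape) != 2 or waypoints_shape[1] != 2:
        raise ValueError(f"waypoints should be (N, 2), got {waypoints_shape}")
    if headings_shape != (waypoints_shape[0],):
        raise ValueError(f"headings should be ({waypoints_shape[0]},), got {headings_shape}")
    return waypoints_shape[0]
```

With NumPy arrays these would be called as `rgb_layout(rgb.shape)` before `model.predict(...)` and `check_output(result['waypoints'].shape, result['headings'].shape)` afterwards. Separately, the dummy image in the Quick Start uses `np.random.randint(0, 255, ...)`, whose upper bound is exclusive, so it never produces the value 255; passing 256 would cover the full documented [0, 255] range.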