cuijh26 committed
Commit b016cd2 · verified · 1 Parent(s): a8f0cb9

Upload README.md

Files changed (1)
  1. README.md +140 -3
README.md CHANGED
@@ -1,3 +1,140 @@
- ---
- license: apache-2.0
- ---
+ <h1 align='center'>WAM-Flow: Parallel Coarse-to-Fine Motion Planning via Discrete Flow Matching for Autonomous Driving</h1>
+ <div align='center'>
+ <a href='https://github.com/YoucanBaby' target='_blank'>Yifang Xu</a><sup>1*</sup>&emsp;
+ <a href='https://cuijh26.github.io/' target='_blank'>Jiahao Cui</a><sup>1*</sup>&emsp;
+ <a href='https://github.com/fudan-generative-vision/WAM-Flow' target='_blank'>Feipeng Cai</a><sup>2*</sup>&emsp;
+ <a href='https://github.com/SSSSSSuger' target='_blank'>Zhihao Zhu</a><sup>1</sup>&emsp;
+ <a href='https://github.com/NinoNeumann' target='_blank'>Hanlin Shang</a><sup>1</sup>&emsp;
+ <a href='https://github.com/isan089' target='_blank'>Shan Luan</a><sup>1</sup>&emsp;
+ </div>
+ <div align='center'>
+ <a href='https://github.com/xumingw' target='_blank'>Mingwang Xu</a><sup>1</sup>&emsp;
+ <a href='https://github.com/fudan-generative-vision/WAM-Flow' target='_blank'>Neng Zhang</a><sup>2</sup>&emsp;
+ <a href='https://github.com/fudan-generative-vision/WAM-Flow' target='_blank'>Yaoyi Li</a><sup>2</sup>&emsp;
+ <a href='https://github.com/fudan-generative-vision/WAM-Flow' target='_blank'>Jia Cai</a><sup>2</sup>&emsp;
+ <a href='https://sites.google.com/site/zhusiyucs/home' target='_blank'>Siyu Zhu</a><sup>1</sup>&emsp;
+ </div>
+
+ <div align='center'>
+ <sup>1</sup>Fudan University&emsp; <sup>2</sup>Yinwang Intelligent Technology Co., Ltd&emsp;
+ </div>
+
+ <br>
+ <div align='center'>
+ <a href='https://github.com/fudan-generative-vision/WAM-Flow'><img src='https://img.shields.io/github/stars/fudan-generative-vision/WAM-Flow?style=social'></a>
+ <a href='https://arxiv.org/abs/2512.06112'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a>
+ <a href='https://huggingface.co/fudan-generative-ai/WAM-Flow'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Model-yellow'></a>
+ </div>
+ <br>
+
+
+
+ ## 📰 News
+ - **`2026/02/01`**: 🎉🎉🎉 Released the pretrained models on [Hugging Face](https://huggingface.co/fudan-generative-ai/WAM-Flow).
+ - **`2025/12/06`**: 🎉🎉🎉 Paper submitted to [arXiv](https://arxiv.org/pdf/2512.06112).
+
+
+
+ ## 📅️ Roadmap
+
+ | Status | Milestone | ETA |
+ | :----: | :----------------------------------------------------------------------------------------------------: | :--------: |
+ | ✅ | **[Release the SFT and inference code](https://github.com/fudan-generative-vision/WAM-Flow)** | 2025.12.19 |
+ | ✅ | **[Pretrained models on Huggingface](https://huggingface.co/fudan-generative-ai/WAM-Flow)** | 2026.02.01 |
+ | 🚀 | **[Release the evaluation code](https://huggingface.co/fudan-generative-ai/WAM-Flow)** | TBD |
+ | 🚀 | **[Release the RL code](https://github.com/fudan-generative-vision/WAM-Flow)** | TBD |
+ | 🚀 | **[Release the pre-processed training data](#training)** | TBD |
+
+
+ ## 📸 Showcase
+ ![teaser](assets/Figure_1.png)
+
+ ## 🏆 Qualitative Results on NAVSIM
+ ### NAVSIM-v1 benchmark results
+ <div style="text-align: center;">
+ <img src="assets/navsim-v1.png" alt="navsim-v1" width="70%" />
+ </div>
+
+ ### NAVSIM-v2 benchmark results
+ <div style="text-align: center;">
+ <img src="assets/navsim-v2.png" alt="navsim-v2" width="70%" />
+ </div>
+
+
+
+ ## 🔧️ Framework
+ ![framework](assets/Figure_2.png)
+ Our method takes as input a front-view image, a natural-language navigation command with a system prompt, and the ego-vehicle states, and outputs an 8-waypoint future trajectory spanning 4 seconds through parallel denoising. The model is first trained via supervised fine-tuning to learn accurate trajectory prediction. We then apply simulator-guided GRPO to further optimize closed-loop behavior. The GRPO reward function integrates safety constraints (collision avoidance, drivable-area compliance) with performance objectives (ego-progress, time-to-collision, comfort).
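As a rough illustration of the paragraph above, the sketch below lays out the 8 waypoint timestamps over the 4-second horizon and shows one plausible way to combine the listed reward terms before computing GRPO's group-relative advantages. The even 0.5 s spacing, the multiplicative safety gate, and the 5/5/2 weights (borrowed from NAVSIM's PDM score) are assumptions, as are all function and field names; the README does not specify the exact reward used in training.

```python
from dataclasses import dataclass
from statistics import mean, pstdev

HORIZON_S = 4.0      # planning horizon stated in the README
NUM_WAYPOINTS = 8    # waypoints per trajectory stated in the README


def waypoint_times(horizon_s=HORIZON_S, n=NUM_WAYPOINTS):
    """Evenly spaced timestamps (spacing is an assumption): 0.5 s, ..., 4.0 s."""
    step = horizon_s / n
    return [step * (i + 1) for i in range(n)]


@dataclass
class SubScores:
    """Hypothetical per-rollout simulator scores, each in [0, 1]."""
    no_collision: float
    drivable_area: float
    ego_progress: float
    time_to_collision: float
    comfort: float


def reward(s: SubScores) -> float:
    """Safety terms gate the reward; performance terms are averaged.

    Assumption: PDM-score-style aggregation with weights 5/5/2.
    """
    performance = (5 * s.ego_progress + 5 * s.time_to_collision + 2 * s.comfort) / 12
    return s.no_collision * s.drivable_area * performance


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: standardize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example: three rollouts sampled for the same scene (made-up scores).
group = [
    SubScores(1.0, 1.0, 0.9, 1.0, 1.0),
    SubScores(1.0, 1.0, 0.6, 1.0, 0.5),
    SubScores(0.0, 1.0, 1.0, 1.0, 1.0),  # collision zeroes the reward
]
advantages = group_relative_advantages([reward(s) for s in group])
```

Under this gating, a trajectory that collides or leaves the drivable area gets zero reward regardless of progress, so within a group it receives a negative advantage whenever any sibling rollout is safe.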
+
+
+
+ ## Quick Start
+
+ ### Installation
+
+ Clone the repo:
+
+ ```sh
+ git clone https://github.com/fudan-generative-vision/WAM-Flow.git
+ cd WAM-Flow
+ ```
+
+ Install dependencies:
+
+ ```sh
+ conda create --name wam-flow python=3.10
+ conda activate wam-flow
+ pip install -r requirements.txt
+ ```
+
+
+ ### Model Download
+
+ Download models using huggingface-cli:
+
+ ```sh
+ pip install "huggingface_hub[cli]"
+ huggingface-cli download fudan-generative-ai/WAM-Flow --local-dir ./pretrained_model/wam-flow
+ huggingface-cli download LucasJinWang/FUDOKI --local-dir ./pretrained_model/fudoki
+ ```
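If you would rather fetch the checkpoints from Python, the same two downloads can be sketched with `huggingface_hub.snapshot_download`. The repo IDs and target directories are copied from the commands above; the `MODEL_DIRS` mapping and the `download_all` helper are illustrative names, not part of the WAM-Flow repository.

```python
# Repo IDs and target directories copied from the CLI commands above.
MODEL_DIRS = {
    "fudan-generative-ai/WAM-Flow": "./pretrained_model/wam-flow",
    "LucasJinWang/FUDOKI": "./pretrained_model/fudoki",
}


def download_all(model_dirs=MODEL_DIRS, downloader=None):
    """Download each repo into its local directory and return the local paths.

    `downloader` is a hypothetical hook for swapping in another fetch function.
    """
    if downloader is None:
        # Imported lazily so this module loads without huggingface_hub installed.
        from huggingface_hub import snapshot_download as downloader
    return {
        repo_id: downloader(repo_id=repo_id, local_dir=local_dir)
        for repo_id, local_dir in model_dirs.items()
    }
```

Calling `download_all()` with no arguments performs the actual downloads and needs network access.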
+
+
+
+ ### Inference
+
+ ```sh
+ sh script/infer.sh
+ ```
+
+
+ ### Training
+
+ ```bash
+ sh script/sft_debug.sh
+ ```
+
+
+
+ ## 📝 Citation
+
+ If you find our work useful for your research, please consider citing the paper:
+
+ ```
+ @article{xu2025wam,
+ title={WAM-Flow: Parallel Coarse-to-Fine Motion Planning via Discrete Flow Matching for Autonomous Driving},
+ author={Xu, Yifang and Cui, Jiahao and Cai, Feipeng and Zhu, Zhihao and Shang, Hanlin and Luan, Shan and Xu, Mingwang and Zhang, Neng and Li, Yaoyi and Cai, Jia and others},
+ journal={arXiv preprint arXiv:2512.06112},
+ year={2025}
+ }
+ ```
+
+
+
+ ## ⚠️ Social Risks and Mitigations
+
+ The integration of Vision-Language-Action (VLA) models into autonomous driving introduces ethical challenges, particularly regarding the opacity of neural decision-making and its impact on road safety. To mitigate these risks, it is imperative to implement explainable AI frameworks and robust safety protocols that ensure predictable vehicle behavior in long-tail scenarios. Furthermore, addressing concerns over data privacy and public surveillance requires transparent data governance and rigorous de-identification practices. By prioritizing safety-critical alignment and ethical compliance, this research promotes the responsible development and deployment of VLA-based autonomous systems.
+
+
+
+ ## 🤗 Acknowledgements
+ We gratefully acknowledge the contributors to the [Recogdrive](https://github.com/xiaomi-research/recogdrive), [Janus](https://github.com/deepseek-ai/Janus), [FUDOKI](https://github.com/fudoki-hku/FUDOKI) and [flow_matching](https://github.com/facebookresearch/flow_matching) repositories, whose commitment to open source has provided us with their excellent codebases and pretrained models.