Improve model card with full details and tags

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +424 -5
README.md CHANGED
@@ -1,8 +1,10 @@
1
  ---
2
  license: mit
 
 
3
  ---
4
- # FluidNexus: 3D Fluid Reconstruction and Prediction From a Single Video
5
 
 
6
 
7
  [![arXiv](https://img.shields.io/badge/arXiv-2503.04720-b31b1b)](https://arxiv.org/abs/2503.04720)
8
  [![Paper PDF](https://img.shields.io/badge/Paper-PDF-blue)](https://arxiv.org/pdf/2503.04720)
@@ -11,16 +13,433 @@ license: mit
11
  [![Hugging Face Datasets](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Datasets-orange)](https://huggingface.co/datasets/yuegao/FluidNexusDatasets)
12
  [![Hugging Face Models](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-gold)](https://huggingface.co/yuegao/FluidNexusModels)
13
 
14
- [Yue Gao*](https://yuegao.me/), [Hong-Xing "Koven" Yu*](https://kovenyu.com/), [Bo Zhu](https://faculty.cc.gatech.edu/~bozhu/), [Jiajun Wu](https://jiajunwu.com/)
15
 
16
  [Stanford University](https://svl.stanford.edu/); [Microsoft](https://microsoft.com/); [Georgia Institute of Technology](https://www.gatech.edu/)
17
 
18
  \* denotes equal contribution
19

20
 
21
- ## Citation
22
 
23
- If you find these datasets useful for your research, please cite our paper:
24
 
25
  ```bibtex
26
  @inproceedings{gao2025fluidnexus,
@@ -30,4 +449,4 @@ If you find these datasets useful for your research, please cite our paper:
30
  month = {June},
31
  year = {2025},
32
  }
33
- ```
 
1
  ---
2
  license: mit
3
+ pipeline_tag: image-to-3d
4
+ library_name: diffusers
5
  ---
 
6
 
7
+ # FluidNexus: 3D Fluid Reconstruction and Prediction From a Single Video
8
 
9
  [![arXiv](https://img.shields.io/badge/arXiv-2503.04720-b31b1b)](https://arxiv.org/abs/2503.04720)
10
  [![Paper PDF](https://img.shields.io/badge/Paper-PDF-blue)](https://arxiv.org/pdf/2503.04720)
 
13
  [![Hugging Face Datasets](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Datasets-orange)](https://huggingface.co/datasets/yuegao/FluidNexusDatasets)
14
  [![Hugging Face Models](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-gold)](https://huggingface.co/yuegao/FluidNexusModels)
15
 
16
+ [**Yue Gao**\*](https://yuegao.me/), [**Hong-Xing "Koven" Yu**\*](https://kovenyu.com/), [**Bo Zhu**](https://faculty.cc.gatech.edu/~bozhu/), [**Jiajun Wu**](https://jiajunwu.com/)
17
 
18
  [Stanford University](https://svl.stanford.edu/); [Microsoft](https://microsoft.com/); [Georgia Institute of Technology](https://www.gatech.edu/)
19
 
20
  \* denotes equal contribution
21
 
22
+ ![FluidNexus Teaser](https://github.com/ueoo/FluidNexus/raw/main/assets/teaser_vel.gif)
23
+
24
+ ## Abstract
25
+ We study reconstructing and predicting 3D fluid appearance and velocity from a single video. Current methods require multi-view videos for fluid reconstruction. We present FluidNexus, a novel framework that bridges video generation and physics simulation to tackle this task. Our key insight is to synthesize multiple novel-view videos as references for reconstruction. FluidNexus consists of two key components: (1) a novel-view video synthesizer that combines frame-wise view synthesis with video diffusion refinement for generating realistic videos, and (2) a physics-integrated particle representation coupling differentiable simulation and rendering to simultaneously facilitate 3D fluid reconstruction and prediction. To evaluate our approach, we collect two new real-world fluid datasets featuring textured backgrounds and object interactions. Our method enables dynamic novel view synthesis, future prediction, and interaction simulation from a single fluid video.
26
+
27
+ ## 🚀 Get Started
28
+
29
+ > Don’t forget to update all `/path/to/FluidNexusRoot` to your real path. Find & Replace is your friend!
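The find-and-replace step can also be scripted. The following is a minimal sketch, assuming GNU `grep`, `xargs`, and `sed` are available; `replace_root` is a hypothetical helper written for this example and is not part of the FluidNexus repository.

```shell
# Hypothetical helper: rewrite the placeholder root in every text file under a directory.
# Assumes GNU grep/xargs/sed. The real path must not contain '|', '&', or '\'.
replace_root() {
  real_root="$1"   # your actual root folder
  search_dir="$2"  # directory to scan, e.g. the cloned FluidNexus repo
  # -r: recurse, -l: list matching files, -I: skip binaries, -Z: NUL-separated names
  grep -rlIZ --exclude-dir=.git '/path/to/FluidNexusRoot' "$search_dir" \
    | xargs -0 -r sed -i "s|/path/to/FluidNexusRoot|${real_root}|g"
}

# Example (not executed here):
# replace_root "$HOME/FluidNexusRoot" "$HOME/FluidNexusRoot/FluidNexus"
```

Running the helper a second time is harmless, since the placeholder no longer appears in the rewritten files.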
30
+
31
+ ### Set Up Root Folder and Python Environment
32
+
33
+ ```shell
34
+ mkdir -p /path/to/FluidNexusRoot
35
+
36
+ cd /path/to/FluidNexusRoot
37
+ git clone https://github.com/ueoo/FluidNexus.git
38
+
39
+ cd FluidNexus
40
+ conda env create -f fluid_nexus.yml
41
+
42
+ conda activate fluid_nexus
43
+
44
+ # Install the 3D Gaussian Splatting submodules
45
+ pip install git+https://github.com/graphdeco-inria/diff-gaussian-rasterization.git
46
+ pip install git+https://github.com/facebookresearch/pytorch3d.git@stable
47
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
48
+ pip install submodules/gaussian_rasterization_ch3
49
+ pip install submodules/gaussian_rasterization_ch1
50
+ pip install submodules/simple-knn
51
+ pip install git+https://github.com/openai/CLIP.git
52
+ pip install xformers --index-url https://download.pytorch.org/whl/cu124
53
+ ```
54
+
55
+ ### Download the Datasets
56
+
57
+ Our **FluidNexus-Smoke** and **FluidNexus-Ball** datasets each include 120 scenes. Every scene contains 5 synchronized multi-view videos, with cameras arranged along a horizontal arc of approximately 120°.
58
+
59
+ * **FluidNexusSmoke** and **FluidNexusBall**: Processed datasets containing one example sample used in our paper.
60
+ * **FluidNexusSmokeAll** and **FluidNexusBallAll**: All samples processed into frames, usable within the FluidNexus framework.
61
+ * **FluidNexusSmokeAllRaw** and **FluidNexusBallAllRaw**: Raw videos of all samples as originally captured.
62
+ > For a quick start, download either FluidNexusSmoke or FluidNexusBall. The archives labeled ‘All’ contain every sample we collected; use them only if you want to finetune Zero123 or CogVideoX, or to evaluate on the entire dataset.
63
+
64
+ For **ScalarFlow**, please refer to the original [website](https://ge.in.tum.de/publications/2019-scalarflow-eckert/).
65
+
66
+ ```shell
67
+ cd /path/to/FluidNexusRoot
68
+
69
+ # Download the FluidNexus-Smoke, FluidNexus-Ball, and ScalarReal datasets from Hugging Face
70
+ # To use the full dataset, you can clone it directly from HF:
71
+ # git clone https://huggingface.co/datasets/yuegao/FluidNexusDatasets
72
+ cd FluidNexusDatasets
73
+ # conda install -c conda-forge git-lfs
74
+ # git lfs install
75
+ # git lfs pull
76
+
77
+ # If you only want to download the two without ‘All’, you can do so with:
78
+ # wget -O FluidNexusBall.zip "https://huggingface.co/datasets/yuegao/FluidNexusDatasets/resolve/main/FluidNexusBall.zip?download=true"
79
+ # wget -O FluidNexusSmoke.zip "https://huggingface.co/datasets/yuegao/FluidNexusDatasets/resolve/main/FluidNexusSmoke.zip?download=true"
80
+
81
+
82
+ unzip FluidNexusBall.zip
83
+ # unzip FluidNexusBallAll.zip
84
+ # unzip FluidNexusBallAllRaw.zip
85
+ unzip FluidNexusSmoke.zip
86
+ # unzip FluidNexusSmokeAll.zip
87
+ # unzip FluidNexusSmokeAllRaw.zip
88
+ unzip ScalarReal.zip
89
+
90
+ mv FluidNexusBall /path/to/FluidNexusRoot
91
+ # mv FluidNexusBallAll /path/to/FluidNexusRoot
92
+ # mv FluidNexusBallAllRaw /path/to/FluidNexusRoot
93
+ mv FluidNexusSmoke /path/to/FluidNexusRoot
94
+ # mv FluidNexusSmokeAll /path/to/FluidNexusRoot
95
+ # mv FluidNexusSmokeAllRaw /path/to/FluidNexusRoot
96
+ mv ScalarReal /path/to/FluidNexusRoot
97
+ ```
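Before moving on, it can help to confirm that the dataset folders landed in the root. This is a small sketch under the assumption that the three folder names match the `mv` commands above; `missing_datasets` is a hypothetical helper, not a script shipped with the repository.

```shell
# Hypothetical helper: print the name of each expected dataset folder
# that does not exist under the given root. Prints nothing if all are present.
missing_datasets() {
  root="$1"
  for name in FluidNexusBall FluidNexusSmoke ScalarReal; do
    [ -d "$root/$name" ] || echo "$name"
  done
}

# Example (not executed here):
# missing_datasets /path/to/FluidNexusRoot
```

Empty output means the quick-start datasets are in place.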
98
+
99
+ ### Frame-wise Novel View Synthesis
100
+
101
+ #### 1. Convert the frames to Zero123 input frames and create the cameras
102
+
103
+ ```shell
104
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
105
+
106
+ python convert_original_to_zero123.py
107
+
108
+ # note: update the dataset_name in create_zero123_cams.py first
109
+ python create_zero123_cams.py
110
+ ```
111
+
112
+ #### 2. Download the pretrained Zero123 and CogVideoX models
113
+
114
+ ```shell
115
+ cd /path/to/FluidNexusRoot
116
+
117
+ # Zero123 base models
118
+ mkdir -p zero123_weights
119
+ cd zero123_weights
120
+ wget https://zero123.cs.columbia.edu/assets/zero123-xl.ckpt
121
+
122
+ # CogVideoX base models
123
+ mkdir -p cogvideox-sat
124
+ # Please refer to the CogVideoX repo; we use the 1.0 version
125
+ # https://github.com/THUDM/CogVideo/blob/main/sat/README.md
126
+
127
+ # Our finetuned models
128
+ git clone https://huggingface.co/yuegao/FluidNexusModels
129
+
130
+ cd FluidNexusModels
131
+
132
+ mv zero123_finetune_logs /path/to/FluidNexusRoot
133
+ mv cogvideox_lora_ckpts /path/to/FluidNexusRoot
134
+ ```
135
+
136
+ #### 3. Run inference with the frame-wise novel view synthesis model
137
+
138
+ Taking `FluidNexus-Smoke` as an example, we assume camera 2 is the middle camera and use it as the input view:
139
+
140
+ ```shell
141
+ cd /path/to/FluidNexusRoot/FluidNexus/Zero123
142
+
143
+ python inference/infer_fluid_nexus_smoke.py --tgt_cam 0
144
+ python inference/infer_fluid_nexus_smoke.py --tgt_cam 1
145
+ python inference/infer_fluid_nexus_smoke.py --tgt_cam 3
146
+ python inference/infer_fluid_nexus_smoke.py --tgt_cam 4
147
+ ```
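The four commands above cover every camera except the input view. As a dry-run sketch, the same set can be generated with a loop; the variable names `INPUT_CAM` and `NUM_CAMS` and the helper `list_target_cams` are assumptions made for this example, and the loop only prints the commands rather than running them.

```shell
# Dry-run sketch: print one inference command per target camera, skipping the input view.
INPUT_CAM=2   # middle camera used as the input view
NUM_CAMS=5    # each scene has 5 cameras, indexed 0..4

# Print every camera index except the input camera, one per line.
list_target_cams() {
  cam=0
  while [ "$cam" -lt "$NUM_CAMS" ]; do
    [ "$cam" -ne "$INPUT_CAM" ] && echo "$cam"
    cam=$((cam + 1))
  done
}

for cam in $(list_target_cams); do
  # Remove "echo" to actually run inference from FluidNexus/Zero123.
  echo python inference/infer_fluid_nexus_smoke.py --tgt_cam "$cam"
done
```

If a different camera serves as the input, only `INPUT_CAM` needs to change.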
148
+
149
+ ### Generative Video Refinement
150
+
151
+ #### 1. Convert Zero123 output frames to CogVideoX input frames
152
+
153
+ ```shell
154
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
155
+ python convert_zero123_to_cogvideox.py
156
+ ```
157
+
158
+ #### 2. Run inference with the video generative models
159
+
160
+ ```shell
161
+ cd /path/to/FluidNexusRoot/FluidNexus/CogVideoX
162
+
163
+ bash tools_gen/gen_zero123_pi2v_long_fluid_nexus_smoke.sh
164
+
165
+ bash tools_gen/gen_zero123_pi2v_long_fluid_nexus_ball.sh
166
+
167
+ bash tools_gen/gen_zero123_pi2v_long_scalar_real.sh
168
+ ```
169
+
170
+ #### 3. Convert the video generation output frames to the original frame format
171
+
172
+ ```shell
173
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
174
+
175
+ python convert_cogvideox_to_original.py
176
+ ```
177
+
178
+ ### Fluid Dynamics Reconstruction
179
+
180
+ #### 1. Optimize the background
181
+
182
+ Skip this step for the ScalarReal dataset.
183
+
184
+ ```shell
185
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
186
+
187
+ # For FluidNexus-Smoke
188
+ bash tools_fluid_nexus/smoke_train_background.sh
189
+
190
+ # For FluidNexus-Ball
191
+ bash tools_fluid_nexus/ball_train_background.sh
192
+ ```
193
+
194
+ #### 2. Optimize the physical particles
195
+
196
+ ```shell
197
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
198
+
199
+ # For FluidNexus-Smoke
200
+ bash tools_fluid_nexus/smoke_train_dynamics_physical.sh
201
+
202
+ # For FluidNexus-Ball
203
+ bash tools_fluid_nexus/ball_train_dynamics_physical.sh
204
+
205
+ # For ScalarReal
206
+ bash tools_scalar_real/train_physical_particle.sh
207
+ ```
208
+
209
+ #### 3. Optimize the visual particles
210
+
211
+ ```shell
212
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
213
+
214
+ # For FluidNexus-Smoke
215
+ bash tools_fluid_nexus/smoke_train_dynamics_visual.sh
216
+
217
+ # For FluidNexus-Ball
218
+ bash tools_fluid_nexus/ball_train_dynamics_visual.sh
219
+
220
+ # For ScalarReal
221
+ bash tools_scalar_real/train_visual_particle.sh
222
+ ```
223
+
224
+ 🎊🎊 The results are located in `training_render`! 🎊🎊
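The three reconstruction stages run in a fixed order: background, then physical particles, then visual particles, with ScalarReal skipping the background stage. The sketch below lists the scripts for one dataset in that order as a dry run. The dataset keys (`smoke`, `ball`, `scalar_real`) and the function `reconstruction_steps` are labels chosen for this example; the script paths are the ones given in the steps above.

```shell
# Dry-run sketch: list the reconstruction scripts for one dataset, in order.
reconstruction_steps() {
  case "$1" in
    smoke|ball)
      echo "tools_fluid_nexus/${1}_train_background.sh"
      echo "tools_fluid_nexus/${1}_train_dynamics_physical.sh"
      echo "tools_fluid_nexus/${1}_train_dynamics_visual.sh"
      ;;
    scalar_real)
      # ScalarReal skips background optimization.
      echo "tools_scalar_real/train_physical_particle.sh"
      echo "tools_scalar_real/train_visual_particle.sh"
      ;;
    *)
      echo "unknown dataset: $1" >&2
      return 1
      ;;
  esac
}

for script in $(reconstruction_steps smoke); do
  # Remove "echo" to run the stages from FluidNexus/FluidDynamics.
  echo bash "$script"
done
```

Each stage consumes the output of the previous one, so the order should be kept when the `echo` is removed.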
225
+
226
+ ## 🕰️ Future Prediction
227
+
228
+ ### Physics simulation
229
+
230
+ Physics simulation is used to render rough multi-view future prediction frames.
231
+
232
+ ```shell
233
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
234
+
235
+ # For FluidNexus-Smoke
236
+ bash tools_fluid_nexus/smoke_future_simulation.sh
237
+
238
+ # For FluidNexus-Ball
239
+ bash tools_fluid_nexus/ball_future_simulation.sh
240
+
241
+ # For ScalarReal
242
+ bash tools_scalar_real/future_simulation.sh
243
+ ```
244
+
245
+ ### Convert the simulation results to CogVideoX input format
246
+
247
+ ```shell
248
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
249
+
250
+ # FluidNexus-Smoke
251
+ # update the experiment name first
252
+ python convert_simulation_original_to_cogvideox.py
253
+
254
+ # FluidNexus-Ball
255
+ # update the experiment name first
256
+ python convert_simulation_original_to_cogvideox.py
257
+
258
+ # ScalarReal
259
+ python convert_simulation_original_to_cogvideox_unshift.py
260
+ ```
261
+
262
+ ### Generative video refinement on future prediction
263
+
264
+ Refine the rough multi-view frames.
265
+
266
+ ```shell
267
+ cd /path/to/FluidNexusRoot/FluidNexus/CogVideoX
268
+
269
+ bash tools_gen/gen_future_pi2v_fluid_nexus_smoke.sh
270
+
271
+ bash tools_gen/gen_future_pi2v_fluid_nexus_ball.sh
272
+
273
+ bash tools_gen/gen_future_pi2v_scalar_real.sh
274
+ ```
275
+
276
+ ### Fluid dynamics reconstruction with future prediction
277
+
278
+ #### 1. Optimize the physical particles with future prediction
279
+
280
+ ```shell
281
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
282
+
283
+ # For FluidNexus-Smoke
284
+ bash tools_fluid_nexus/smoke_train_dynamics_physical_future.sh
285
+
286
+ # For FluidNexus-Ball
287
+ bash tools_fluid_nexus/ball_train_dynamics_physical_future.sh
288
+
289
+ # For ScalarReal
290
+ bash tools_scalar_real/train_physical_particle_future.sh
291
+ ```
292
+
293
+ #### 2. Optimize the visual particles with future prediction
294
+
295
+ ```shell
296
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
297
+
298
+ # For FluidNexus-Smoke
299
+ bash tools_fluid_nexus/smoke_train_dynamics_visual_future.sh
300
+
301
+ # For FluidNexus-Ball
302
+ bash tools_fluid_nexus/ball_train_dynamics_visual_future.sh
303
+
304
+ # For ScalarReal
305
+ bash tools_scalar_real/train_visual_particle_future.sh
306
+ ```
307
+
308
+ ## 💨 Counterfactual Interaction Simulation - Wind
309
+
310
+ ### Physics simulation with wind
311
+
312
+ ```shell
313
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
314
+ bash tools_fluid_nexus/smoke_wind_simulation.sh
315
+ ```
316
+
317
+ ### Convert the simulation results to CogVideoX format
318
+
319
+ ```shell
320
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
321
+
322
+ # FluidNexus-Smoke wind interaction
323
+ # update the experiment name first
324
+ python convert_simulation_original_to_cogvideox.py
325
+ ```
326
+
327
+ ### Generative video refinement with wind
328
+
329
+ ```shell
330
+ cd /path/to/FluidNexusRoot/FluidNexus/CogVideoX
331
+
332
+ bash tools_gen/gen_future_pi2v_fluid_nexus_smoke_wind.sh
333
+ ```
334
+
335
+ ### Fluid dynamics reconstruction with wind
336
+
337
+ #### 1. Optimize the physical particles with wind
338
+
339
+ ```shell
340
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
341
+
342
+ bash tools_fluid_nexus/smoke_train_dynamics_physical_wind.sh
343
+ ```
344
+
345
+ #### 2. Optimize the visual particles with wind
346
+
347
+ ```shell
348
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
349
+
350
+ bash tools_fluid_nexus/smoke_train_dynamics_visual_wind.sh
351
+ ```
352
+
353
+ ## 🔮 Counterfactual Interaction Simulation - Object
354
+
355
+ ### Fluid dynamics reconstruction with object
356
+
357
+ #### 1. Optimize the physical particles with object
358
+
359
+ ```shell
360
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
361
+
362
+ bash tools_fluid_nexus/object_train_dynamics_physical.sh
363
+ ```
364
+
365
+ #### 2. Optimize the visual particles with object
366
+
367
+ ```shell
368
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
369
+
370
+ bash tools_fluid_nexus/object_train_dynamics_visual.sh
371
+ ```
372
+
373
+ ## 🚞 Zero123 Finetuning
374
+
375
+ ### Create Zero123 datasets
376
+
377
+ ```shell
378
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
379
+
380
+ # FluidNexus-Smoke
381
+ bash create_zero123_fluid_nexus_smoke.sh
382
+
383
+ # FluidNexus-Ball
384
+ bash create_zero123_fluid_nexus_ball.sh
385
+
386
+ # ScalarFlow
387
+ bash create_zero123_scalar_flow.sh
388
+ ```
389
+
390
+ ### Finetune Zero123 models
391
+
392
+ ```shell
393
+ cd /path/to/FluidNexusRoot/FluidNexus/Zero123
394
+ # FluidNexus-Smoke
395
+ bash tools/train_fluid_nexus_smoke.sh
396
+
397
+ # FluidNexus-Ball
398
+ bash tools/train_fluid_nexus_ball.sh
399
+
400
+ # ScalarFlow
401
+ bash tools/train_scalar_flow.sh
402
+ ```
403
 
404
+ ## 🚂 CogVideoX LoRA Finetuning
405
+
406
+ ### Create CogVideoX datasets
407
+
408
+ ```shell
409
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
410
+ # FluidNexus-Smoke
411
+ bash create_cogvideox_fluid_nexus_smoke.sh
412
+
413
+ # FluidNexus-Ball
414
+ bash create_cogvideox_fluid_nexus_ball.sh
415
+
416
+ # ScalarFlow
417
+ bash create_cogvideox_scalar_flow.sh
418
+ ```
419
 
420
+ ### Finetune CogVideoX models
421
+
422
+ ```shell
423
+ cd /path/to/FluidNexusRoot/FluidNexus/CogVideoX
424
+ # FluidNexus-Smoke
425
+ bash tools_finetune/finetune_pi2v_fluid_nexus_smoke.sh
426
+
427
+ # FluidNexus-Ball
428
+ bash tools_finetune/finetune_pi2v_fluid_nexus_ball.sh
429
+
430
+ # ScalarFlow
431
+ bash tools_finetune/finetune_pi2v_scalar_flow.sh
432
+ ```
433
+
434
+ ## 🌴 Acknowledgements
435
+
436
+ Thanks to these great repositories: [SpacetimeGaussians](https://github.com/oppo-us-research/SpacetimeGaussians), [3DGS](https://github.com/graphdeco-inria/gaussian-splatting), [HyFluid](https://github.com/y-zheng18/HyFluid), [CogVideo](https://github.com/THUDM/CogVideo), [Zero123](https://github.com/cvlab-columbia/zero123), [diffusers](https://github.com/huggingface/diffusers) and many other inspiring works in the community.
437
+
438
+ We sincerely thank the anonymous reviewers of CVPR 2025 for their helpful feedback.
439
+
440
+ ## ⭐️ Citation
441
+
442
+ If you find this code useful for your research, please cite our paper:
443
 
444
  ```bibtex
445
  @inproceedings{gao2025fluidnexus,
 
449
  month = {June},
450
  year = {2025},
451
  }
452
+ ```