nielsr HF Staff committed on
Commit
0714b2e
·
verified ·
1 Parent(s): 7ec0409

Improve model card with full details and tags


This PR significantly expands the model card by incorporating the detailed information from the project's GitHub README. This includes the abstract, comprehensive setup instructions, usage examples for various functionalities (reconstruction, prediction, and simulation), and a more complete acknowledgements section.

It also adds the `image-to-3d` pipeline tag, ensuring the model can be found at https://huggingface.co/models?pipeline_tag=image-to-3d, and correctly identifies `diffusers` as a dependency.
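
As a quick local check of the metadata this PR adds, the sketch below reads keys from a README's YAML front matter with `awk`. The helper name `front_matter_value` and the sample file are illustrative assumptions, not part of the repository or of any Hugging Face tooling, and the helper only handles simple `key: value` lines.

```shell
# Hypothetical helper: print the value of a key from the YAML front matter,
# i.e. the block between the first two '---' lines of the file.
front_matter_value() {
  local key="$1" file="$2"
  awk -v key="$key" '
    /^---[[:space:]]*$/ { fences++; next }          # count the --- delimiters
    fences == 1 && index($0, key ":") == 1 {        # inside the front matter
      sub(/^[^:]*:[[:space:]]*/, "")                # drop "key: "
      print
      exit
    }
    fences >= 2 { exit }                            # past the front matter
  ' "$file"
}

# Sample front matter mirroring the fields added in this PR.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
---
license: mit
pipeline_tag: image-to-3d
library_name: diffusers
---
# FluidNexus
EOF

front_matter_value pipeline_tag "$sample"
front_matter_value library_name "$sample"
rm -f "$sample"
```

For the sample above, the two calls print `image-to-3d` and `diffusers`.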

Files changed (1)
  1. README.md +424 -5
README.md CHANGED
@@ -1,8 +1,10 @@
1
  ---
2
  license: mit
 
 
3
  ---
4
- # FluidNexus: 3D Fluid Reconstruction and Prediction From a Single Video
5
 
 
6
 
7
  [![arXiv](https://img.shields.io/badge/arXiv-2503.04720-b31b1b)](https://arxiv.org/abs/2503.04720)
8
  [![Paper PDF](https://img.shields.io/badge/Paper-PDF-blue)](https://arxiv.org/pdf/2503.04720)
@@ -11,16 +13,433 @@ license: mit
11
  [![Hugging Face Datasets](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Datasets-orange)](https://huggingface.co/datasets/yuegao/FluidNexusDatasets)
12
  [![Hugging Face Models](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-gold)](https://huggingface.co/yuegao/FluidNexusModels)
13
 
14
- [Yue Gao*](https://yuegao.me/), [Hong-Xing "Koven" Yu*](https://kovenyu.com/), [Bo Zhu](https://faculty.cc.gatech.edu/~bozhu/), [Jiajun Wu](https://jiajunwu.com/)
15
 
16
  [Stanford University](https://svl.stanford.edu/); [Microsoft](https://microsoft.com/); [Georgia Institute of Technology](https://www.gatech.edu/)
17
 
18
  \* denotes equal contribution
19
 
20
 
21
- ## Citation
22
 
23
- If you find these datasets useful for your research, please cite our paper:
24
 
25
  ```bibtex
26
  @inproceedings{gao2025fluidnexus,
@@ -30,4 +449,4 @@ If you find these datasets useful for your research, please cite our paper:
30
  month = {June},
31
  year = {2025},
32
  }
33
- ```
 
1
  ---
2
  license: mit
3
+ pipeline_tag: image-to-3d
4
+ library_name: diffusers
5
  ---
 
6
 
7
+ # FluidNexus: 3D Fluid Reconstruction and Prediction From a Single Video
8
 
9
  [![arXiv](https://img.shields.io/badge/arXiv-2503.04720-b31b1b)](https://arxiv.org/abs/2503.04720)
10
  [![Paper PDF](https://img.shields.io/badge/Paper-PDF-blue)](https://arxiv.org/pdf/2503.04720)
 
13
  [![Hugging Face Datasets](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Datasets-orange)](https://huggingface.co/datasets/yuegao/FluidNexusDatasets)
14
  [![Hugging Face Models](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-gold)](https://huggingface.co/yuegao/FluidNexusModels)
15
 
16
+ [**Yue Gao**\*](https://yuegao.me/), [**Hong-Xing "Koven" Yu**\*](https://kovenyu.com/), [**Bo Zhu**](https://faculty.cc.gatech.edu/~bozhu/), [**Jiajun Wu**](https://jiajunwu.com/)
17
 
18
  [Stanford University](https://svl.stanford.edu/); [Microsoft](https://microsoft.com/); [Georgia Institute of Technology](https://www.gatech.edu/)
19
 
20
  \* denotes equal contribution
21
 
22
+ ![FluidNexus Teaser](https://github.com/ueoo/FluidNexus/raw/main/assets/teaser_vel.gif)
23
+
24
+ ## Abstract
25
+ We study reconstructing and predicting 3D fluid appearance and velocity from a single video. Current methods require multi-view videos for fluid reconstruction. We present FluidNexus, a novel framework that bridges video generation and physics simulation to tackle this task. Our key insight is to synthesize multiple novel-view videos as references for reconstruction. FluidNexus consists of two key components: (1) a novel-view video synthesizer that combines frame-wise view synthesis with video diffusion refinement for generating realistic videos, and (2) a physics-integrated particle representation coupling differentiable simulation and rendering to simultaneously facilitate 3D fluid reconstruction and prediction. To evaluate our approach, we collect two new real-world fluid datasets featuring textured backgrounds and object interactions. Our method enables dynamic novel view synthesis, future prediction, and interaction simulation from a single fluid video.
26
+
27
+ ## 🚀 Get Started
28
+
29
+ > Don’t forget to update all `/path/to/FluidNexusRoot` to your real path. Find & Replace is your friend!
30
+
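
One way to avoid hand-editing every command is to substitute the placeholder with `sed`. This is a minimal sketch: the helper name `fill_root` and the example root are hypothetical, and it assumes the real path contains no `|` or `&` characters, which `sed` would treat specially.

```shell
# Hypothetical helper: read commands on stdin and print them with the
# placeholder root replaced by a real path.
fill_root() {
  local real_root="$1"
  # Use '|' as the sed delimiter because paths contain '/'.
  sed "s|/path/to/FluidNexusRoot|${real_root}|g"
}

# Preview a rewritten command without running it.
printf '%s\n' 'cd /path/to/FluidNexusRoot/FluidNexus' | fill_root "$HOME/FluidNexusRoot"
```

Piping a saved copy of the commands through `fill_root` lets you review the result before executing anything.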
31
+ ### Set Up Root Folder and Python Environment
32
+
33
+ ```shell
34
+ mkdir -p /path/to/FluidNexusRoot
35
+
36
+ cd /path/to/FluidNexusRoot
37
+ git clone https://github.com/ueoo/FluidNexus.git
38
+
39
+ cd FluidNexus
40
+ conda env create -f fluid_nexus.yml
41
+
42
+ conda activate fluid_nexus
43
+
44
+ # Install the 3D Gaussian Splatting submodules
45
+ pip install git+https://github.com/graphdeco-inria/diff-gaussian-rasterization.git
46
+ pip install git+https://github.com/facebookresearch/pytorch3d.git@stable
47
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
48
+ pip install submodules/gaussian_rasterization_ch3
49
+ pip install submodules/gaussian_rasterization_ch1
50
+ pip install submodules/simple-knn
51
+ pip install git+https://github.com/openai/CLIP.git
52
+ pip install xformers --index-url https://download.pytorch.org/whl/cu124
53
+ ```
54
+
55
+ ### Download the Datasets
56
+
57
+ Our **FluidNexus-Smoke** and **FluidNexus-Ball** datasets each include 120 scenes. Every scene contains 5 synchronized multi-view videos, with cameras arranged along a horizontal arc of approximately 120°.
58
+
59
+ * **FluidNexusSmoke** and **FluidNexusBall**: Processed datasets containing one example sample used in our paper.
60
+ * **FluidNexusSmokeAll** and **FluidNexusBallAll**: All samples processed into frames, usable within the FluidNexus framework.
61
+ * **FluidNexusSmokeAllRaw** and **FluidNexusBallAllRaw**: Raw videos of all samples as originally captured.
62
+ > For a quick start, just download either FluidNexusSmoke or FluidNexusBall. The ones labeled ‘All’ contain all the datasets we collected; you should use them only if you want to finetune Zero123 or CogVideo-X, or perform a thorough evaluation on the entire dataset.
63
+
64
+ For **ScalarFlow**, please refer to the original [website](https://ge.in.tum.de/publications/2019-scalarflow-eckert/).
65
+
66
+ ```shell
67
+ cd /path/to/FluidNexusRoot
68
+
69
+ # Download the FluidNexus-Smoke, FluidNexus-Ball, and ScalarReal datasets from Hugging Face
70
+ # To use the full dataset, you can clone it directly from HF:
71
+ # git clone https://huggingface.co/datasets/yuegao/FluidNexusDatasets
72
+ cd FluidNexusDatasets
73
+ # conda install -c conda-forge git-lfs
74
+ # git lfs install
75
+ # git lfs pull
76
+
77
+ # If you only want to download the two without ‘All’, you can do so with:
78
+ # wget -O FluidNexusBall.zip "https://huggingface.co/datasets/yuegao/FluidNexusDatasets/resolve/main/FluidNexusBall.zip?download=true"
79
+ # wget -O FluidNexusSmoke.zip "https://huggingface.co/datasets/yuegao/FluidNexusDatasets/resolve/main/FluidNexusSmoke.zip?download=true"
80
+
81
+
82
+ unzip FluidNexusBall.zip
83
+ # unzip FluidNexusBallAll.zip
84
+ # unzip FluidNexusBallAllRaw.zip
85
+ unzip FluidNexusSmoke.zip
86
+ # unzip FluidNexusSmokeAll.zip
87
+ # unzip FluidNexusSmokeAllRaw.zip
88
+ unzip ScalarReal.zip
89
+
90
+ mv FluidNexusBall /path/to/FluidNexusRoot
91
+ # mv FluidNexusBallAll /path/to/FluidNexusRoot
92
+ # mv FluidNexusBallAllRaw /path/to/FluidNexusRoot
93
+ mv FluidNexusSmoke /path/to/FluidNexusRoot
94
+ # mv FluidNexusSmokeAll /path/to/FluidNexusRoot
95
+ # mv FluidNexusSmokeAllRaw /path/to/FluidNexusRoot
96
+ mv ScalarReal /path/to/FluidNexusRoot
97
+ ```
98
+
99
+ ### Frame-wise Novel View Synthesis
100
+
101
+ #### 1. Convert the frames to Zero123 input frames and create the cameras
102
+
103
+ ```shell
104
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
105
+
106
+ python convert_original_to_zero123.py
107
+
108
+ # note: update the dataset_name in create_zero123_cams.py first
109
+ python create_zero123_cams.py
110
+ ```
111
+
112
+ #### 2. Download the pretrained Zero123 and CogVideoX models
113
+
114
+ ```shell
115
+ cd /path/to/FluidNexusRoot
116
+
117
+ # Zero123 base models
118
+ mkdir -p zero123_weights
119
+ cd zero123_weights
120
+ wget https://zero123.cs.columbia.edu/assets/zero123-xl.ckpt
121
+
122
+ # CogVideoX base models
123
+ mkdir -p cogvideox-sat
124
+ # Please refer to the CogVideoX repo; we use the 1.0 version
125
+ # https://github.com/THUDM/CogVideo/blob/main/sat/README.md
126
+
127
+ # Our finetuned models
128
+ git clone https://huggingface.co/yuegao/FluidNexusModels
129
+
130
+ cd FluidNexusModels
131
+
132
+ mv zero123_finetune_logs /path/to/FluidNexusRoot
133
+ mv cogvideox_lora_ckpts /path/to/FluidNexusRoot
134
+ ```
135
+
136
+ #### 3. Run inference with the frame-wise novel view synthesis model
137
+
138
+ Taking `FluidNexus-Smoke` as an example, we assume that camera 2, the middle camera, is the input view, so cameras 0, 1, 3, and 4 are the synthesis targets:
139
+
140
+ ```shell
141
+ cd /path/to/FluidNexusRoot/FluidNexus/Zero123
142
+
143
+ python inference/infer_fluid_nexus_smoke.py --tgt_cam 0
144
+ python inference/infer_fluid_nexus_smoke.py --tgt_cam 1
145
+ python inference/infer_fluid_nexus_smoke.py --tgt_cam 3
146
+ python inference/infer_fluid_nexus_smoke.py --tgt_cam 4
147
+ ```
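
The four invocations above differ only in `--tgt_cam`, so they can be written as a loop that skips the input camera. This is a sketch, not part of the repository: `RUN`, `INPUT_CAM`, and `synthesize_views` are hypothetical names, and the loop only prints the commands unless `RUN` is set to an empty string.

```shell
# Dry-run by default: commands are echoed. Export RUN= (empty) and run from
# the Zero123 directory to actually execute them.
RUN="${RUN-echo}"
INPUT_CAM=2   # middle camera, used as the input view

synthesize_views() {
  local cam
  for cam in 0 1 2 3 4; do
    # Skip the input camera; only the other four views are synthesized.
    [ "$cam" -eq "$INPUT_CAM" ] && continue
    $RUN python inference/infer_fluid_nexus_smoke.py --tgt_cam "$cam"
  done
}

synthesize_views
```

Keeping the input camera in one variable makes it harder to synthesize the input view by mistake if a different camera is chosen as input.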
148
+
149
+ ### Generative Video Refinement
150
+
151
+ #### 1. Convert Zero123 output frames to CogVideoX input frames
152
+
153
+ ```shell
154
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
155
+ python convert_zero123_to_cogvideox.py
156
+ ```
157
+
158
+ #### 2. Run inference with the video generation models
159
+
160
+ ```shell
161
+ cd /path/to/FluidNexusRoot/FluidNexus/CogVideoX
162
+
163
+ bash tools_gen/gen_zero123_pi2v_long_fluid_nexus_smoke.sh
164
+
165
+ bash tools_gen/gen_zero123_pi2v_long_fluid_nexus_ball.sh
166
+
167
+ bash tools_gen/gen_zero123_pi2v_long_scalar_real.sh
168
+ ```
169
+
170
+ #### 3. Convert the video generation output frames to the original frame format
171
+
172
+ ```shell
173
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
174
+
175
+ python convert_cogvideox_to_original.py
176
+ ```
177
+
178
+ ### Fluid Dynamics Reconstruction
179
+
180
+ #### 1. Optimize the background
181
+
182
+ Skip this step for the ScalarReal dataset.
183
+
184
+ ```shell
185
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
186
+
187
+ # For FluidNexus-Smoke
188
+ bash tools_fluid_nexus/smoke_train_background.sh
189
+
190
+ # For FluidNexus-Ball
191
+ bash tools_fluid_nexus/ball_train_background.sh
192
+ ```
193
+
194
+ #### 2. Optimize the physical particles
195
+
196
+ ```shell
197
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
198
+
199
+ # For FluidNexus-Smoke
200
+ bash tools_fluid_nexus/smoke_train_dynamics_physical.sh
201
+
202
+ # For FluidNexus-Ball
203
+ bash tools_fluid_nexus/ball_train_dynamics_physical.sh
204
+
205
+ # For ScalarReal
206
+ bash tools_scalar_real/train_physical_particle.sh
207
+ ```
208
+
209
+ #### 3. Optimize the visual particles
210
+
211
+ ```shell
212
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
213
+
214
+ # For FluidNexus-Smoke
215
+ bash tools_fluid_nexus/smoke_train_dynamics_visual.sh
216
+
217
+ # For FluidNexus-Ball
218
+ bash tools_fluid_nexus/ball_train_dynamics_visual.sh
219
+
220
+ # For ScalarReal
221
+ bash tools_scalar_real/train_visual_particle.sh
222
+ ```
223
+
224
+ 🎊🎊 The results are located in `training_render`! 🎊🎊
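
The refinement and reconstruction steps above appear to run in sequence, each consuming the previous step's output. The sketch below chains the FluidNexus-Smoke commands into one driver, assuming the Zero123 inference step has already been run. The names `FLUID_NEXUS_ROOT`, `DRY_RUN`, `step`, and `reconstruct_smoke` are hypothetical; the script paths are the ones listed above. It only prints the commands unless `DRY_RUN` is set to an empty string.

```shell
# FLUID_NEXUS_ROOT is a hypothetical variable; it defaults to the placeholder.
ROOT="${FLUID_NEXUS_ROOT:-/path/to/FluidNexusRoot}"
# Dry-run by default. Export DRY_RUN= (empty) to execute for real.
DRY_RUN="${DRY_RUN-1}"

# Run one command inside a repo subdirectory, or print it in dry-run mode.
step() {
  local subdir="$1"; shift
  if [ -n "$DRY_RUN" ]; then
    echo "(cd $ROOT/FluidNexus/$subdir && $*)"
  else
    (cd "$ROOT/FluidNexus/$subdir" && "$@") || return 1
  fi
}

# Stop at the first failing step so later stages never see partial output.
reconstruct_smoke() {
  step DataProcessing python convert_zero123_to_cogvideox.py &&
  step CogVideoX bash tools_gen/gen_zero123_pi2v_long_fluid_nexus_smoke.sh &&
  step DataProcessing python convert_cogvideox_to_original.py &&
  step FluidDynamics bash tools_fluid_nexus/smoke_train_background.sh &&
  step FluidDynamics bash tools_fluid_nexus/smoke_train_dynamics_physical.sh &&
  step FluidDynamics bash tools_fluid_nexus/smoke_train_dynamics_visual.sh
}

reconstruct_smoke
```

Each step runs in a subshell, so the caller's working directory is left unchanged.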
225
+
226
+ ## 🕰️ Future Prediction
227
+
228
+ ### Physics simulation
229
+
230
+ Physics simulation is used to render rough multi-view future prediction frames.
231
+
232
+ ```shell
233
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
234
+
235
+ # For FluidNexus-Smoke
236
+ bash tools_fluid_nexus/smoke_future_simulation.sh
237
+
238
+ # For FluidNexus-Ball
239
+ bash tools_fluid_nexus/ball_future_simulation.sh
240
+
241
+ # For ScalarReal
242
+ bash tools_scalar_real/future_simulation.sh
243
+ ```
244
+
245
+ ### Convert the simulation results to CogVideoX input format
246
+
247
+ ```shell
248
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
249
+
250
+ # FluidNexus-Smoke
251
+ # update the experiment name first
252
+ python convert_simulation_original_to_cogvideox.py
253
+
254
+ # FluidNexus-Ball
255
+ # update the experiment name first
256
+ python convert_simulation_original_to_cogvideox.py
257
+
258
+ # ScalarReal
259
+ python convert_simulation_original_to_cogvideox_unshift.py
260
+ ```
261
+
262
+ ### Generative video refinement on future prediction
263
+
264
+ Refine the rough multi-view frames.
265
+
266
+ ```shell
267
+ cd /path/to/FluidNexusRoot/FluidNexus/CogVideoX
268
+
269
+ bash tools_gen/gen_future_pi2v_fluid_nexus_smoke.sh
270
+
271
+ bash tools_gen/gen_future_pi2v_fluid_nexus_ball.sh
272
+
273
+ bash tools_gen/gen_future_pi2v_scalar_real.sh
274
+ ```
275
+
276
+ ### Fluid dynamics reconstruction with future prediction
277
+
278
+ #### 1. Optimize the physical particles with future prediction
279
+
280
+ ```shell
281
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
282
+
283
+ # For FluidNexus-Smoke
284
+ bash tools_fluid_nexus/smoke_train_dynamics_physical_future.sh
285
+
286
+ # For FluidNexus-Ball
287
+ bash tools_fluid_nexus/ball_train_dynamics_physical_future.sh
288
+
289
+ # For ScalarReal
290
+ bash tools_scalar_real/train_physical_particle_future.sh
291
+ ```
292
+
293
+ #### 2. Optimize the visual particles with future prediction
294
+
295
+ ```shell
296
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
297
+
298
+ # For FluidNexus-Smoke
299
+ bash tools_fluid_nexus/smoke_train_dynamics_visual_future.sh
300
+
301
+ # For FluidNexus-Ball
302
+ bash tools_fluid_nexus/ball_train_dynamics_visual_future.sh
303
+
304
+ # For ScalarReal
305
+ bash tools_scalar_real/train_visual_particle_future.sh
306
+ ```
307
+
308
+ ## 💨 Counterfactual Interaction Simulation - Wind
309
+
310
+ ### Physics simulation with wind
311
+
312
+ ```shell
313
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
314
+ bash tools_fluid_nexus/smoke_wind_simulation.sh
315
+ ```
316
+
317
+ ### Convert the simulation results to CogVideoX format
318
+
319
+ ```shell
320
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
321
+
322
+ # FluidNexus-Smoke wind interaction
323
+ # update the experiment name first
324
+ python convert_simulation_original_to_cogvideox.py
325
+ ```
326
+
327
+ ### Generative video refinement with wind
328
+
329
+ ```shell
330
+ cd /path/to/FluidNexusRoot/FluidNexus/CogVideoX
331
+
332
+ bash tools_gen/gen_future_pi2v_fluid_nexus_smoke_wind.sh
333
+ ```
334
+
335
+ ### Fluid dynamics reconstruction with wind
336
+
337
+ #### 1. Optimize the physical particles with wind
338
+
339
+ ```shell
340
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
341
+
342
+ bash tools_fluid_nexus/smoke_train_dynamics_physical_wind.sh
343
+ ```
344
+
345
+ #### 2. Optimize the visual particles with wind
346
+
347
+ ```shell
348
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
349
+
350
+ bash tools_fluid_nexus/smoke_train_dynamics_visual_wind.sh
351
+ ```
352
+
353
+ ## 🔮 Counterfactual Interaction Simulation - Object
354
+
355
+ ### Fluid dynamics reconstruction with object
356
+
357
+ #### 1. Optimize the physical particles with object
358
+
359
+ ```shell
360
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
361
+
362
+ bash tools_fluid_nexus/object_train_dynamics_physical.sh
363
+ ```
364
+
365
+ #### 2. Optimize the visual particles with object
366
+
367
+ ```shell
368
+ cd /path/to/FluidNexusRoot/FluidNexus/FluidDynamics
369
+
370
+ bash tools_fluid_nexus/object_train_dynamics_visual.sh
371
+ ```
372
+
373
+ ## 🚞 Zero123 Finetuning
374
+
375
+ ### Create Zero123 datasets
376
+
377
+ ```shell
378
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
379
+
380
+ # FluidNexus-Smoke
381
+ bash create_zero123_fluid_nexus_smoke.sh
382
+
383
+ # FluidNexus-Ball
384
+ bash create_zero123_fluid_nexus_ball.sh
385
+
386
+ # ScalarFlow
387
+ bash create_zero123_scalar_flow.sh
388
+ ```
389
+
390
+ ### Finetune Zero123 models
391
+
392
+ ```shell
393
+ cd /path/to/FluidNexusRoot/FluidNexus/Zero123
394
+ # FluidNexus-Smoke
395
+ bash tools/train_fluid_nexus_smoke.sh
396
+
397
+ # FluidNexus-Ball
398
+ bash tools/train_fluid_nexus_ball.sh
399
+
400
+ # ScalarFlow
401
+ bash tools/train_scalar_flow.sh
402
+ ```
403
 
404
+ ## 🚂 CogVideoX LoRA Finetuning
405
+
406
+ ### Create CogVideoX datasets
407
+
408
+ ```shell
409
+ cd /path/to/FluidNexusRoot/FluidNexus/DataProcessing
410
+ # FluidNexus-Smoke
411
+ bash create_cogvideox_fluid_nexus_smoke.sh
412
+
413
+ # FluidNexus-Ball
414
+ bash create_cogvideox_fluid_nexus_ball.sh
415
+
416
+ # ScalarFlow
417
+ bash create_cogvideox_scalar_flow.sh
418
+ ```
419
 
420
+ ### Finetune CogVideoX models
421
+
422
+ ```shell
423
+ cd /path/to/FluidNexusRoot/FluidNexus/CogVideoX
424
+ # FluidNexus-Smoke
425
+ bash tools_finetune/finetune_pi2v_fluid_nexus_smoke.sh
426
+
427
+ # FluidNexus-Ball
428
+ bash tools_finetune/finetune_pi2v_fluid_nexus_ball.sh
429
+
430
+ # ScalarFlow
431
+ bash tools_finetune/finetune_pi2v_scalar_flow.sh
432
+ ```
433
+
434
+ ## 🌴 Acknowledgements
435
+
436
+ Thanks to these great repositories: [SpacetimeGaussians](https://github.com/oppo-us-research/SpacetimeGaussians), [3DGS](https://github.com/graphdeco-inria/gaussian-splatting), [HyFluid](https://github.com/y-zheng18/HyFluid), [CogVideo](https://github.com/THUDM/CogVideo), [Zero123](https://github.com/cvlab-columbia/zero123), [diffusers](https://github.com/huggingface/diffusers) and many other inspiring works in the community.
437
+
438
+ We sincerely thank the anonymous reviewers of CVPR 2025 for their helpful feedback.
439
+
440
+ ## ⭐️ Citation
441
+
442
+ If you find this code useful for your research, please cite our paper:
443
 
444
  ```bibtex
445
  @inproceedings{gao2025fluidnexus,
 
449
  month = {June},
450
  year = {2025},
451
  }
452
+ ```