Add model card for ReImagine

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +65 -0
README.md ADDED
@@ -0,0 +1,65 @@
+ ---
+ license: apache-2.0
+ pipeline_tag: image-to-video
+ ---
+
+ # ReImagine: Rethinking Controllable High-Quality Human Video Generation via Image-First Synthesis
+
+ [**Project Page**](https://keruzheng.github.io/ReImagine-Project/) | [**Paper (arXiv)**](https://arxiv.org/abs/2604.19720) | [**Code**](https://github.com/Taited/ReImagine) | [**Demo**](https://taited-reimagine.hf.space/)
+
+ **ReImagine** is a framework for controllable, high-quality human video generation. It takes an image-first approach: high-quality human appearance is learned through image generation and then used as a prior for video synthesis, which decouples appearance modeling from temporal consistency.
+
+ The pipeline is pose- and viewpoint-controllable. It combines a pretrained image backbone with SMPL-X-based motion guidance, followed by a training-free temporal refinement stage based on a pretrained video diffusion model.
+
+ ## Getting Started
+
+ ### Installation
+
+ ```bash
+ conda create -n reimagine python=3.10
+ conda activate reimagine
+ pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124
+ pip install -e .
+ ```
+
+ ### Pretrained Weights
+
+ ReImagine builds on a FLUX.1 Kontext base model and a surface-normal ControlNet, plus ReImagine-specific LoRA weights. Download all three with the Hugging Face CLI:
+
+ ```bash
+ # Download the base FLUX.1 Kontext model (skipping the single-file checkpoint and the VAE folder)
+ hf download black-forest-labs/FLUX.1-Kontext-dev \
+   --local-dir ./models/FLUX.1-Kontext-dev \
+   --exclude "flux1-kontext-dev.safetensors" \
+   --exclude "vae/**"
+
+ # Download the surface-normal ControlNet
+ hf download jasperai/Flux.1-dev-Controlnet-Surface-Normals \
+   --local-dir ./models/Flux.1-dev-Controlnet-Surface-Normals
+
+ # Download the ReImagine LoRA weights
+ hf download taited/ReImagine-Pretrained --local-dir ./models/ReImagine-Pretrained
+ ```
+
+ ## Inference
+
+ To perform image-first synthesis, use the provided inference script:
+
+ ```bash
+ python inference_img.py
+ ```
+
+ The script expects two inputs: a wide reference image showing the front and back views of the subject, and a normal map generated from SMPL-X. For video synthesis, the generated frames are then passed through the temporal refinement stage to keep them consistent across frames.
+
+ ## Citation
+
+ If you find this project useful, please consider citing the paper:
+
+ ```bibtex
+ @article{sun2025rethinking,
+ title={ReImagine: Rethinking Controllable High-Quality Human Video Generation via Image-First Synthesis},
+ author={Sun, Zhengwentai and Zheng, Keru and Li, Chenghong and Liao, Hongjie and Yang, Xihe and Li, Heyuan and Zhi, Yihao and Ning, Shuliang and Cui, Shuguang and Han, Xiaoguang},
+ journal={arXiv preprint arXiv:2604.19720},
+ year={2026},
+ url={https://arxiv.org/abs/2604.19720v1}
+ }
+ ```
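
A note on the "Pretrained Weights" step in the README above: the three `hf download` commands could also be scripted from Python, which makes it easy to check the resulting folder layout before running inference. The sketch below is an illustration and not part of the ReImagine repository. It assumes the `huggingface_hub` package is installed, that `ignore_patterns` behaves like the CLI's `--exclude`, and that the base model may require prior authentication with the Hub. The names `DOWNLOADS`, `download_all`, and `missing_model_dirs` are hypothetical helpers introduced here.

```python
from pathlib import Path

# Repo IDs, target folders, and exclude patterns copied from the README's
# `hf download` commands. The structure itself is an illustrative assumption.
DOWNLOADS = [
    {
        "repo_id": "black-forest-labs/FLUX.1-Kontext-dev",
        "local_dir": "FLUX.1-Kontext-dev",
        "ignore_patterns": ["flux1-kontext-dev.safetensors", "vae/**"],
    },
    {
        "repo_id": "jasperai/Flux.1-dev-Controlnet-Surface-Normals",
        "local_dir": "Flux.1-dev-Controlnet-Surface-Normals",
        "ignore_patterns": None,
    },
    {
        "repo_id": "taited/ReImagine-Pretrained",
        "local_dir": "ReImagine-Pretrained",
        "ignore_patterns": None,
    },
]


def download_all(models_root="./models"):
    """Fetch every repository in DOWNLOADS (needs network access)."""
    # Imported lazily so the offline layout check below works without the package.
    from huggingface_hub import snapshot_download

    for item in DOWNLOADS:
        snapshot_download(
            repo_id=item["repo_id"],
            local_dir=str(Path(models_root) / item["local_dir"]),
            ignore_patterns=item["ignore_patterns"],
        )


def missing_model_dirs(models_root="./models"):
    """Return the expected folders that are absent or empty under models_root."""
    missing = []
    for item in DOWNLOADS:
        folder = Path(models_root) / item["local_dir"]
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(item["local_dir"])
    return missing
```

Calling `missing_model_dirs()` before `python inference_img.py` returns an empty list when all three folders under `./models` are populated. It only checks that the folders exist and are non-empty, not that the downloaded files are complete.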