# Hosting Time-to-Move on Hugging Face
This guide explains how to mirror the Time-to-Move (TTM) codebase on the 🤗 Hub and how to expose an interactive demo through a Space. It assumes you have already read the main `README.md`, understand how to run `run_wan.py`, and have access to the Wan 2.2 weights through your Hugging Face account.
---
## 1. Prerequisites
- Hugging Face account with access to the Wan 2.2 Image-to-Video model (`Wan-AI/Wan2.2-I2V-A14B-Diffusers` at the time of writing).
- Local environment with Git, Git LFS, Python 3.10+, and the `huggingface_hub` CLI.
- GPU-backed hardware both locally (for testing) and on Spaces (A100 or A10 is strongly recommended; CPU-only tiers are too slow for Wan 2.2).
- Optional: organization namespace on the Hugging Face Hub (recommended if you want to publish under a team/org).
Authenticate once locally (this caches an access token on disk, under `~/.huggingface` or `~/.cache/huggingface` depending on your `huggingface_hub` version):
```bash
huggingface-cli login
git lfs install
```
---
## 2. Publish the code as a model repository
1. **Create an empty repo on the Hub.** Example:
```bash
huggingface-cli repo create time-to-move/wan-ttm --type=model --yes
git clone https://huggingface.co/time-to-move/wan-ttm
cd wan-ttm
```
2. **Copy the TTM sources.** From the project root, copy the files that users need to reproduce inference:
```bash
rsync -av \
--exclude ".git/" \
--exclude "outputs/" \
/path/to/TTM/ \
/path/to/wan-ttm/
```
Make sure `pipelines/`, `run_wan.py`, `run_cog.py`, `run_svd.py`, `examples/`, and the new `huggingface_space/` folder are included. Track large binary assets:
```bash
git lfs track "*.mp4" "*.png" "*.gif"
git add .gitattributes
```
3. **Add a model card.** Reuse the main `README.md` or create a shorter version describing:
- What Time-to-Move does.
- How to run `run_wan.py` with the `motion_signal` + `mask`.
- Which base model checkpoint the repo expects (Wan 2.2 I2V A14B).
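A minimal model-card front matter could look like the sketch below. The `license` value and tags are placeholders; set them to match the actual project before publishing:

```yaml
---
license: apache-2.0          # placeholder — use the project's real license
base_model: Wan-AI/Wan2.2-I2V-A14B-Diffusers
pipeline_tag: image-to-video
tags:
  - time-to-move
  - wan2.2
---
```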
4. **Push to the Hub.**
```bash
git add .
git commit -m "Initial commit of Time-to-Move Wan implementation"
git push
```
Users can now do:
```python
from huggingface_hub import snapshot_download
snapshot_download("time-to-move/wan-ttm")
```
---
## 3. Prepare a Hugging Face Space (Gradio)
This repository now contains `huggingface_space/`, a ready-to-use Space template:
```
huggingface_space/
├── README.md # Quickstart instructions
├── app.py # Gradio UI (loads Wan + Time-to-Move logic)
└── requirements.txt # Runtime dependencies
```
### 3.1 Create the Space
```bash
huggingface-cli repo create time-to-move/wan-ttm-demo --type=space --space_sdk=gradio --yes
git clone https://huggingface.co/spaces/time-to-move/wan-ttm-demo
cd wan-ttm-demo
```
Copy everything from `huggingface_space/` into the Space repository (or push the whole repo and set `app_file: huggingface_space/app.py` in the Space's README metadata). Commit and push.
### 3.2 Configure hardware and secrets
- **Hardware:** Select an A100 (preferred) or A10 GPU runtime in the Space settings. Wan 2.2 is too heavy for CPUs.
- **WAN_MODEL_ID:** If you mirrored Wan 2.2 into your organization, set the environment variable to point to it. Otherwise leave the default (`Wan-AI/Wan2.2-I2V-A14B-Diffusers`).
- **HF_TOKEN / WAN_ACCESS_TOKEN:** Add a Space secret only if the Wan checkpoint is private. The Gradio app reads from `HF_TOKEN` automatically when calling `from_pretrained`.
- **PYTORCH_CUDA_ALLOC_CONF:** Recommended value `expandable_segments:True` to reduce CUDA fragmentation.
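The settings above can be resolved inside `app.py` with a small helper along these lines. The function name `resolve_model_config` is illustrative; the actual app may read the variables differently:

```python
import os


def resolve_model_config(env=None):
    """Resolve the Wan checkpoint ID and access token from the environment.

    Hypothetical helper mirroring the Space settings described above.
    """
    env = os.environ if env is None else env
    # WAN_MODEL_ID falls back to the public Wan 2.2 checkpoint.
    model_id = env.get("WAN_MODEL_ID", "Wan-AI/Wan2.2-I2V-A14B-Diffusers")
    # A token is only needed for private mirrors; None means anonymous access.
    token = env.get("HF_TOKEN") or env.get("WAN_ACCESS_TOKEN")
    return model_id, token
```

The resulting `model_id` and `token` can then be passed to `from_pretrained(model_id, token=token)`.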
### 3.3 How the app works
`huggingface_space/app.py` exposes:
- A dropdown of the pre-packaged `examples/cutdrag_wan_*` prompts.
- Optional custom uploads (`first_frame`, `mask.mp4`, `motion_signal.mp4`) following the README workflow.
- Sliders for `tweak-index`, `tstrong-index`, guidance scale, seed, etc.
- Live status messages and a generated MP4 preview using `diffusers.utils.export_to_video`.
The UI lazily loads the `WanImageToVideoTTMPipeline` with tiling/slicing enabled to reduce VRAM usage. All preprocessing matches the logic in `run_wan.py` (the same `compute_hw_from_area` helper is reused).
If you need to customize the experience (e.g., restrict to certain prompts, enforce shorter sequences), edit `huggingface_space/app.py` before pushing.
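For reference, here is a plausible sketch of what a resolution helper like `compute_hw_from_area` computes — the real implementation lives in `run_wan.py` and may differ in its rounding details:

```python
import math


def compute_hw_from_area(orig_h, orig_w, target_area, multiple=16):
    """Rescale (orig_h, orig_w) so height * width approximates target_area,
    preserving the aspect ratio and snapping both sides to a multiple of 16
    (a common constraint for video diffusion latents)."""
    scale = math.sqrt(target_area / (orig_h * orig_w))
    h = max(multiple, round(orig_h * scale / multiple) * multiple)
    w = max(multiple, round(orig_w * scale / multiple) * multiple)
    return h, w
```

For example, a 720×1280 input with a 480×832 pixel budget maps to (480, 848) under this rounding scheme.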
---
## 4. Testing checklist
1. **Local dry-run.**
```bash
pip install -r huggingface_space/requirements.txt
WAN_MODEL_ID=Wan-AI/Wan2.2-I2V-A14B-Diffusers \
python huggingface_space/app.py
```
Ensure you can generate at least one of the bundled examples.
2. **Space smoke test.**
- Open the deployed Space.
- Run the default example (`cutdrag_wan_Monkey`) and confirm you receive a video in ~2–3 minutes on A100 hardware.
- Optionally upload a small custom mask/video pair and verify that `tweak-index`/`tstrong-index` are honored.
3. **Monitor logs.** Use the Space “Logs” tab to confirm:
- The pipeline downloads from the expected `WAN_MODEL_ID`.
- VRAM usage stays within the selected hardware tier.
4. **Freeze dependencies.** When satisfied, pin exact versions in `huggingface_space/requirements.txt` and tag the Space (`v1`, `demo`) so users know which TTM commit it matches.
You now have both a **model repository** (for anyone to clone/run) and a **public Space** for live demos. Feel free to adapt the instructions for the CogVideoX or Stable Video Diffusion pipelines if you plan to expose them as well; start by duplicating the provided Space template and swapping out `run_wan.py` for the relevant runner.