InfiniteTalk - Deployment Guide
Prerequisites
- HuggingFace Account: Sign up at https://huggingface.co
- Git & Git LFS: Install from https://git-scm.com
- HuggingFace CLI (optional but recommended):

```shell
pip install huggingface_hub
huggingface-cli login
```
Deployment Steps
Option 1: Web UI (Easiest)
Create New Space
- Go to https://huggingface.co/new-space
- Space name: infinitetalk (or your choice)
- License: apache-2.0
- SDK: Gradio
- Hardware: ZeroGPU (free tier available!)
- Click "Create Space"
Upload Files
- Click "Files" tab in your new Space
- Upload all files from this directory:
  - README.md (with YAML metadata)
  - app.py
  - requirements.txt
  - packages.txt
  - .gitignore
  - src/ folder
  - wan/ folder
  - utils/ folder
  - assets/ folder (optional)
  - examples/ folder (optional)
  - LICENSE.txt
Wait for Build
- Space will automatically build
- First build takes 5-10 minutes (installing dependencies)
- Check "Logs" tab for build progress
- Watch for any error messages
Test Your Space
- Once built, the Space will show "Running"
- First generation will download models (~2-3 minutes)
- Try with example images/audio
Option 2: Git (Advanced)
Clone Your Space
```shell
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME
```

Copy Files

```shell
# From your local infinitetalk-hf-space directory
cp -r /path/to/infinitetalk-hf-space/* .
```

Commit and Push

```shell
git add .
git commit -m "Initial InfiniteTalk Space deployment"
git push
```

Monitor Build
- Go to your Space URL
- Check "Logs" for build progress
Option 3: CLI Upload
```shell
# From this directory
huggingface-cli upload YOUR_USERNAME/YOUR_SPACE_NAME . --repo-type=space
```
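If you prefer to script uploads, the same operation is available from Python through huggingface_hub's HfApi. A minimal sketch (the `upload_space` wrapper name is ours, not part of this repo):

```python
def upload_space(repo_id: str, folder: str = ".") -> None:
    """Upload a local folder to a HuggingFace Space (Python equivalent of
    `huggingface-cli upload ... --repo-type=space`)."""
    from huggingface_hub import HfApi  # lazy import; pip install huggingface_hub

    api = HfApi()  # picks up the token saved by `huggingface-cli login`
    api.upload_folder(folder_path=folder, repo_id=repo_id, repo_type="space")
```

Call it as `upload_space("YOUR_USERNAME/YOUR_SPACE_NAME")` from the Space directory.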
Troubleshooting
Build Fails with Flash-Attn Error
Symptom: flash-attn compilation fails
Solutions:
- Try adding the following line to requirements.txt:

```
flash-attn==2.7.4.post1 --no-build-isolation
```

- Or use a Dockerfile approach (create Dockerfile):

```dockerfile
FROM nvidia/cuda:12.1.0-devel-ubuntu22.04

RUN apt-get update && apt-get install -y \
    python3.10 python3-pip git ffmpeg build-essential libsndfile1

WORKDIR /app

# Install PyTorch first
RUN pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1

# Install flash-attn with pre-built wheels
RUN pip install flash-attn==2.7.4.post1 --no-build-isolation

# Copy and install requirements
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy application
COPY . .

CMD ["python3", "app.py"]
```
Models Not Downloading
Symptom: "Model download failed" error
Solutions:
- Check HuggingFace is not down: https://status.huggingface.co
- Add HF_TOKEN secret in Space settings (for private models)
- Check model repository IDs in utils/model_loader.py
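The actual repository IDs live in utils/model_loader.py; for debugging download failures it can help to reproduce the download step in isolation. A minimal sketch using huggingface_hub's `snapshot_download` (the `repo_ids` values and the helper name here are placeholders, not the Space's real configuration):

```python
import os

def download_models(repo_ids, local_dir="weights", token=None):
    """Fetch each model repo from the Hub, honoring the HF_TOKEN Space secret
    if one is set (needed for private or gated models)."""
    from huggingface_hub import snapshot_download  # lazy import

    token = token or os.environ.get("HF_TOKEN")
    paths = []
    for repo_id in repo_ids:
        # Each repo is mirrored into its own subdirectory under local_dir
        dest = os.path.join(local_dir, repo_id.split("/")[-1])
        paths.append(snapshot_download(repo_id=repo_id, local_dir=dest, token=token))
    return paths
```

Running this locally with the same repo IDs as the Space quickly tells you whether the failure is network, authentication, or a wrong ID.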
Out of Memory (OOM) Errors
Symptom: "CUDA out of memory"
Solutions:
- Reduce resolution (use 480p instead of 720p)
- Reduce diffusion steps (try 30 instead of 40)
- Process shorter videos
- Check utils/gpu_manager.py settings
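Cleaning up GPU memory between generations is the other half of avoiding OOM. This is a sketch of the kind of cleanup utils/gpu_manager.py is described as performing (the helper name is ours; the calls are standard Python/PyTorch):

```python
import gc

def free_gpu_memory():
    """Release Python references and cached CUDA allocations between runs."""
    gc.collect()  # drop unreachable tensors first so their memory can be freed
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()   # return cached blocks to the driver
            torch.cuda.ipc_collect()   # clean up inter-process CUDA handles
    except ImportError:
        pass  # torch not installed; nothing GPU-side to release
```

Calling this at the end of each generation keeps a long-running Space from accumulating fragmented allocations.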
Space Stuck in "Building"
Symptom: Build takes >15 minutes
Solutions:
- Check "Logs" tab for errors
- Flash-attn compilation can take 10+ minutes
- If timeout, try Dockerfile approach
- Consider pre-built flash-attn wheels
ZeroGPU Quota Exceeded
Symptom: "GPU quota exceeded"
Solutions:
- Free Tier: Wait for quota to refill (1 ZeroGPU second is restored per 30 real seconds)
- Upgrade to PRO: $9/month for 8× quota
- Apply for Grant: Community GPU Grant for innovative projects
- Optimize generation time (reduce steps, use 480p)
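The refill ratio above translates into concrete wait times: at 1 ZeroGPU second restored per 30 real seconds, a fully drained 600-second quota takes 5 hours to recover. A small sketch of the arithmetic (helper name is ours):

```python
REFILL_RATIO = 30  # real seconds elapsed per ZeroGPU second restored (free tier)

def refill_time_minutes(quota_used_s: float) -> float:
    """Real time (in minutes) needed to regain the given ZeroGPU quota."""
    return quota_used_s * REFILL_RATIO / 60

# A fully drained free-tier quota (600 ZeroGPU seconds):
print(refill_time_minutes(600))  # 300.0 minutes, i.e. 5 hours
```

This is why trimming per-generation cost (fewer steps, 480p) pays off twice: you fit more runs into one quota period and you wait less after draining it.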
Post-Deployment
Monitor Usage
- Check "Logs" tab regularly
- Monitor GPU quota in Space settings
- Watch for user error reports in Community tab
Update Space
```shell
# Make changes locally
git add .
git commit -m "Update: [description]"
git push
```
Space will automatically rebuild on push.
Add Examples
Upload example images and audio to examples/ folder to help users get started quickly.
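One way to wire the examples/ folder into the UI is to collect the media file paths and hand them to Gradio's gr.Examples component. A sketch (function name and extension list are assumptions, not the Space's actual code):

```python
import os

def collect_examples(folder="examples", exts=(".png", ".jpg", ".wav", ".mp3")):
    """Return sorted paths of example media files, or [] if the folder is absent.
    The result can be passed as the `examples` argument to gr.Examples."""
    if not os.path.isdir(folder):
        return []
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(exts)
    )
```

Returning an empty list when the folder is missing lets the same app.py run whether or not examples were uploaded.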
Enable Discussions
In Space settings, enable "Discussions" to get user feedback.
Apply for Community GPU Grant
If your Space is popular and useful:
- Go to Space Settings
- Click "Apply for community GPU grant"
- Explain your project's value to the community
Hardware Options
Free ZeroGPU
- Cost: FREE
- Limits: 300s per session, 600s max quota
- Best for: Testing, light usage, demos
- GPU: H200 with 70GB VRAM
PRO ZeroGPU
- Cost: $9/month
- Benefits: 8× quota, priority queue, 10 Spaces
- Best for: Regular usage, public demos
Dedicated GPU (Paid)
- T4 (16GB): $0.60/hour - Too small for InfiniteTalk
- A10G (24GB): $1.05/hour - Minimum viable
- A100 (40GB): $3.00/hour - Overkill but works
- Best for: Private, dedicated instances
Performance Expectations
First Generation
- Model download: 2-3 minutes
- Generation (10s video, 480p): 40 seconds
- Total: ~3-4 minutes
Subsequent Generations
- Generation (10s video, 480p): 35-40 seconds
- Generation (10s video, 720p): 60-70 seconds
Free Tier Usage
- ~3-5 generations per quota period (600s ZeroGPU)
- Quota refills gradually (1 ZeroGPU second per 30 real seconds)
Support
- Issues: File at https://github.com/MeiGen-AI/InfiniteTalk/issues
- Discussions: Use Space's Community tab
- HF Forums: https://discuss.huggingface.co
Success Checklist
- Space builds without errors
- Models download successfully on first run
- Example image-to-video generation works
- Example video dubbing works
- No OOM errors with 480p
- GPU memory is cleaned up between runs
- Gradio UI is responsive
- Examples are loaded and working
- README displays correctly
- Space doesn't crash after multiple uses
Good luck with your deployment! 🚀