InfiniteTalk - Deployment Guide

Prerequisites

  1. HuggingFace Account: Sign up at https://huggingface.co
  2. Git & Git LFS: Install from https://git-scm.com
  3. HuggingFace CLI (optional but recommended):
    pip install huggingface_hub
    huggingface-cli login
    
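
If you want to verify your credentials programmatically before deploying, here is a minimal sketch. It checks the usual places a token lives: the HF_TOKEN / HUGGING_FACE_HUB_TOKEN environment variables and the ~/.cache/huggingface/token file written by huggingface-cli login (find_hf_token is a hypothetical helper, not part of huggingface_hub):

```python
import os
from pathlib import Path

def find_hf_token(env=os.environ,
                  token_file=Path.home() / ".cache/huggingface/token"):
    """Return an HF token from the environment or the CLI token file, else None."""
    token = env.get("HF_TOKEN") or env.get("HUGGING_FACE_HUB_TOKEN")
    if token:
        return token.strip()
    if Path(token_file).is_file():
        return Path(token_file).read_text().strip()
    return None

if __name__ == "__main__":
    print("token found" if find_hf_token()
          else "no token - run `huggingface-cli login`")
```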

Deployment Steps

Option 1: Web UI (Easiest)

  1. Create New Space

    • Go to https://huggingface.co/new-space
    • Space name: infinitetalk (or your choice)
    • License: apache-2.0
    • SDK: Gradio
    • Hardware: ZeroGPU (free tier available!)
    • Click "Create Space"
  2. Upload Files

    • Click "Files" tab in your new Space
    • Upload all files from this directory:
      • README.md (with YAML metadata)
      • app.py
      • requirements.txt
      • packages.txt
      • .gitignore
      • src/ folder
      • wan/ folder
      • utils/ folder
      • assets/ folder (optional)
      • examples/ folder (optional)
      • LICENSE.txt
  3. Wait for Build

    • Space will automatically build
    • First build takes 5-10 minutes (installing dependencies)
    • Check "Logs" tab for build progress
    • Watch for any error messages
  4. Test Your Space

    • Once built, the Space will show "Running"
    • First generation will download models (~2-3 minutes)
    • Try with example images/audio
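
Before uploading through the web UI, you can sanity-check that the required files listed above are all present. A minimal sketch (missing_files is a hypothetical helper; REQUIRED mirrors the upload checklist, minus the optional items):

```python
import os

# Required files and folders from the upload checklist above
REQUIRED = ["README.md", "app.py", "requirements.txt", "packages.txt",
            "src", "wan", "utils"]

def missing_files(root="."):
    """Return the required files/folders not present under root."""
    return [name for name in REQUIRED
            if not os.path.exists(os.path.join(root, name))]

if __name__ == "__main__":
    gaps = missing_files()
    print("ready to upload" if not gaps else f"missing: {', '.join(gaps)}")
```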

Option 2: Git (Advanced)

  1. Clone Your Space

    git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
    cd YOUR_SPACE_NAME
    
  2. Copy Files

    # From your local infinitetalk-hf-space directory
    cp -r /path/to/infinitetalk-hf-space/* .
    
  3. Commit and Push

    git add .
    git commit -m "Initial InfiniteTalk Space deployment"
    git push
    
  4. Monitor Build

    • Go to your Space URL
    • Check "Logs" for build progress

Option 3: CLI Upload

# From this directory
huggingface-cli upload YOUR_USERNAME/YOUR_SPACE_NAME . --repo-type=space
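
The same upload can also be scripted with the huggingface_hub Python API via HfApi.upload_folder. A hedged sketch (build_upload_kwargs and upload_space are hypothetical helpers; the import is deferred so the snippet loads even without huggingface_hub installed):

```python
def build_upload_kwargs(repo_id, folder="."):
    """Arguments for HfApi.upload_folder targeting a Space repository."""
    return {"repo_id": repo_id, "folder_path": folder, "repo_type": "space"}

def upload_space(repo_id, folder="."):
    """Push a local directory to the Space (equivalent to the CLI command above)."""
    from huggingface_hub import HfApi  # deferred: needs `pip install huggingface_hub`
    HfApi().upload_folder(**build_upload_kwargs(repo_id, folder))

# Usage: upload_space("YOUR_USERNAME/YOUR_SPACE_NAME")
```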

Troubleshooting

Build Fails with Flash-Attn Error

Symptom: flash-attn compilation fails

Solutions:

  1. Try pinning flash-attn in requirements.txt:

    flash-attn==2.7.4.post1
    
    Note: pip does not honor per-line options such as --no-build-isolation
    inside a requirements file. If the source build still fails, point
    requirements.txt at a pre-built wheel URL matching your Python, CUDA,
    and torch versions instead.
    
  2. Or use the Dockerfile approach (create a Dockerfile):

    FROM nvidia/cuda:12.1.0-devel-ubuntu22.04
    
    RUN apt-get update && apt-get install -y \
        python3.10 python3-pip git ffmpeg build-essential libsndfile1
    
    WORKDIR /app
    
    # Install PyTorch first (flash-attn needs it at build time)
    RUN pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1
    
    # Build tools flash-attn expects when build isolation is disabled
    RUN pip install packaging ninja
    
    # Build flash-attn against the already-installed torch
    RUN pip install flash-attn==2.7.4.post1 --no-build-isolation
    
    # Copy and install remaining requirements
    COPY requirements.txt .
    RUN pip install -r requirements.txt
    
    # Copy application
    COPY . .
    
    # HF Spaces route traffic to port 7860; Gradio must listen on all interfaces
    ENV GRADIO_SERVER_NAME=0.0.0.0
    EXPOSE 7860
    
    CMD ["python3", "app.py"]
    

Models Not Downloading

Symptom: "Model download failed" error

Solutions:

  1. Check Hugging Face status: https://status.huggingface.co
  2. Add HF_TOKEN secret in Space settings (for private models)
  3. Check model repository IDs in utils/model_loader.py

Out of Memory (OOM) Errors

Symptom: "CUDA out of memory"

Solutions:

  1. Reduce resolution (use 480p instead of 720p)
  2. Reduce diffusion steps (try 30 instead of 40)
  3. Process shorter videos
  4. Check utils/gpu_manager.py settings
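
As a rough rule of thumb, activation memory scales with pixel count, which is why dropping from 720p to 480p helps so much. An illustrative calculation (the exact frame sizes are assumptions, e.g. 1280x720 vs. 832x480; pixel_ratio is a hypothetical helper):

```python
def pixel_ratio(w1, h1, w2, h2):
    """How many times more pixels (and, roughly, activation memory) size 1 needs vs. size 2."""
    return (w1 * h1) / (w2 * h2)

# Illustrative (assumed) frame sizes: 720p ~ 1280x720, 480p ~ 832x480
ratio = pixel_ratio(1280, 720, 832, 480)
print(f"720p uses roughly {ratio:.1f}x the activation memory of 480p")
```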

Space Stuck in "Building"

Symptom: Build takes >15 minutes

Solutions:

  1. Check "Logs" tab for errors
  2. Flash-attn compilation can take 10+ minutes
  3. If timeout, try Dockerfile approach
  4. Consider pre-built flash-attn wheels

ZeroGPU Quota Exceeded

Symptom: "GPU quota exceeded"

Solutions:

  1. Free Tier: Wait for quota to refill (roughly 1 ZeroGPU second regained per 30 real seconds)
  2. Upgrade to PRO: $9/month for 8× quota
  3. Apply for Grant: Community GPU Grant for innovative projects
  4. Optimize generation time (reduce steps, use 480p)
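
The refill rate above translates directly into wall-clock waiting time. A quick sketch (refill_wait_seconds is a hypothetical helper; 40 s approximates one 480p generation):

```python
def refill_wait_seconds(quota_needed_s, real_seconds_per_quota_second=30):
    """Real time to wait for a given amount of ZeroGPU quota to refill."""
    return quota_needed_s * real_seconds_per_quota_second

# Regaining enough quota for one ~40s generation:
wait = refill_wait_seconds(40)
print(f"{wait} s (~{wait / 60:.0f} min)")
```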

Post-Deployment

Monitor Usage

  • Check "Logs" tab regularly
  • Monitor GPU quota in Space settings
  • Watch for user error reports in Community tab

Update Space

# Make changes locally
git add .
git commit -m "Update: [description]"
git push

Space will automatically rebuild on push.

Add Examples

Upload example images and audio to examples/ folder to help users get started quickly.

Enable Discussions

In Space settings, enable "Discussions" to get user feedback.

Apply for Community GPU Grant

If your Space is popular and useful:

  1. Go to Space Settings
  2. Click "Apply for community GPU grant"
  3. Explain your project's value to the community

Hardware Options

Free ZeroGPU

  • Cost: FREE
  • Limits: 300s per session, 600s max quota
  • Best for: Testing, light usage, demos
  • GPU: H200 with 70GB VRAM

PRO ZeroGPU

  • Cost: $9/month
  • Benefits: 8× quota, priority queue, 10 Spaces
  • Best for: Regular usage, public demos

Dedicated GPU (Paid)

  • T4 (16GB): $0.60/hour - Too small for InfiniteTalk
  • A10G (24GB): $1.05/hour - Minimum viable
  • A100 (40GB): $3.00/hour - Overkill but works
  • Best for: Private, dedicated instances
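
A quick break-even check against the rates above (breakeven_hours is a hypothetical helper): if you would run a dedicated A10G for more than about 8.6 hours a month, the $9 PRO plan is the cheaper option, quota limits aside:

```python
def breakeven_hours(monthly_cost, hourly_rate):
    """Hours of dedicated GPU use per month at which the flat plan becomes cheaper."""
    return monthly_cost / hourly_rate

# PRO at $9/month vs. a dedicated A10G at $1.05/hour (rates quoted above)
print(f"break-even at {breakeven_hours(9, 1.05):.1f} h/month")
```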

Performance Expectations

First Generation

  • Model download: 2-3 minutes
  • Generation (10s video, 480p): 40 seconds
  • Total: ~3-4 minutes

Subsequent Generations

  • Generation (10s video, 480p): 35-40 seconds
  • Generation (10s video, 720p): 60-70 seconds

Free Tier Usage

  • ~3-5 generations per quota period (600s ZeroGPU)
  • Quota refills gradually (1 ZeroGPU second per 30 real seconds)
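
Given that refill rate, an empty quota takes a predictable amount of real time to come back. A small sketch (full_refill_hours is a hypothetical helper using the 600 s quota and 30:1 refill rate quoted above):

```python
def full_refill_hours(max_quota_s=600, real_seconds_per_quota_second=30):
    """Real time, in hours, for the full ZeroGPU quota to refill from empty."""
    return max_quota_s * real_seconds_per_quota_second / 3600

print(f"a fully drained quota refills in ~{full_refill_hours():.0f} h")
```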


Success Checklist

  • Space builds without errors
  • Models download successfully on first run
  • Example image-to-video generation works
  • Example video dubbing works
  • No OOM errors with 480p
  • GPU memory is cleaned up between runs
  • Gradio UI is responsive
  • Examples are loaded and working
  • README displays correctly
  • Space doesn't crash after multiple uses

Good luck with your deployment! 🚀