CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
This is a face-preserving video generation system using Alibaba's Wan2.1 VACE model. It creates 512x512 resolution videos at 24fps by interpolating between start and end frames while maintaining the identity of a reference face.
Common Commands
Setup and Installation
# Install dependencies
pip install -r requirements.txt
# Download the model (first time only, ~20GB)
python model_download.py
# Run the application
python app.py
Development Commands
- No formal testing framework is configured
- No linting tools are configured (consider adding ruff or flake8)
- To check that the source files compile (a syntax check only; it does not start the app):
python -m py_compile app.py vace_integration.py config.py model_download.py
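The same syntax check can be scripted from Python, which may be handy if it is later wired into a pre-commit hook or CI job. The following is a minimal sketch that uses only the standard library's py_compile module; the helper name check_syntax is hypothetical and not part of this repository.

```python
import py_compile

# File names are the ones listed in the py_compile command above.
SOURCE_FILES = ["app.py", "vace_integration.py", "config.py", "model_download.py"]


def check_syntax(paths):
    """Byte-compile each file and return a list of (path, error message) pairs.

    Hypothetical helper: an empty list means every file compiled cleanly.
    """
    failures = []
    for path in paths:
        try:
            # doraise=True turns compile errors into PyCompileError exceptions
            # instead of printing them to stderr.
            py_compile.compile(str(path), doraise=True)
        except (py_compile.PyCompileError, OSError) as exc:
            # OSError covers missing or unreadable files.
            failures.append((str(path), str(exc)))
    return failures
```

Calling check_syntax(SOURCE_FILES) from the repository root returns an empty list when all four files parse. Like the command above, it catches syntax errors only and says nothing about import-time or runtime failures.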
Architecture Overview
The codebase follows a clean separation of concerns:
app.py - Gradio web UI layer
- Handles user interactions and file uploads
- Manages temporary file operations
- Calls VACEProcessor for video generation
vace_integration.py - Core processing logic
- VACEProcessor class orchestrates the entire pipeline
- Creates template videos with gray frame interpolation
- Invokes the Wan2.1-VACE model via subprocess
- Handles mask generation and frame processing
config.py - Centralized configuration
- All paths, parameters, and settings
- Model configuration (resolution, fps, frame count)
- Directory management
model_download.py - Model management
- Downloads Wan2.1-VACE-1.3B from Hugging Face
- Validates model integrity
Key Technical Details
- Model: Wan2.1-VACE-1.3B (1.3 billion parameters)
- GPU Requirements: NVIDIA A10 or better (24GB VRAM)
- Processing Flow:
- User uploads 3 images (reference face, start frame, end frame)
- System creates a template video with gray frames between start/end
- VACE model interpolates frames while preserving face identity
- Generated video is saved to the /results/ directory
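The template step in the flow above can be sketched as follows. This is an illustration, not the code in vace_integration.py. It assumes frames are raw RGB byte buffers, that placeholder frames are mid-gray (128), and that the mask marks frames to keep as black (0) and frames to generate as white (255). The function names and the num_frames parameter are hypothetical, and the real frame count lives in config.py.

```python
GRAY = 128           # assumed mid-gray placeholder value
MASK_KEEP = 0        # assumed: black mask frame = keep the supplied frame
MASK_GENERATE = 255  # assumed: white mask frame = let the model fill it in


def solid_frame(width, height, value, channels=3):
    """Return a flat bytes buffer for a single-color frame."""
    return bytes([value]) * (width * height * channels)


def build_template(start_frame, end_frame, num_frames, width=512, height=512):
    """Return (frames, masks) for a start/end interpolation template.

    The first and last frames are the supplied images; every frame in
    between is a gray placeholder paired with a "generate" mask.
    """
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    expected = width * height * 3
    if len(start_frame) != expected or len(end_frame) != expected:
        raise ValueError("frames must be width*height*3 bytes (RGB)")

    gray = solid_frame(width, height, GRAY)
    keep = solid_frame(width, height, MASK_KEEP, channels=1)
    generate = solid_frame(width, height, MASK_GENERATE, channels=1)

    middle = num_frames - 2
    frames = [start_frame] + [gray] * middle + [end_frame]
    masks = [keep] + [generate] * middle + [keep]
    return frames, masks
```

A real pipeline would more likely hold frames as image arrays and write them out as a video file; plain bytes are used here only to keep the sketch dependency-free.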
Important Directories
- /cache/ - Model and framework caches (huggingface, torch, transformers)
- /workspace/ - Temporary processing directory
- /results/ - Output videos
- /examples/ - Sample images for demo
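As a rough picture of how a centralized config module might tie these directories to the video settings, here is a minimal sketch. The resolution and fps values come from the Project Overview; the class and function names are hypothetical, and the root argument is an assumption because this document does not say whether the paths are absolute or relative to the repository. The frame count is left out because it is not stated here.

```python
from dataclasses import dataclass
from pathlib import Path

# Directory names taken from the list above.
DIRECTORY_NAMES = ("cache", "workspace", "results", "examples")


@dataclass(frozen=True)
class VideoSettings:
    """Output settings stated in the Project Overview (hypothetical class)."""
    width: int = 512
    height: int = 512
    fps: int = 24


def resolve_directories(root):
    """Map each directory name to a path under the given root."""
    root = Path(root)
    return {name: root / name for name in DIRECTORY_NAMES}


def ensure_directories(root):
    """Create any missing directories and return the name-to-path mapping."""
    directories = resolve_directories(root)
    for path in directories.values():
        path.mkdir(parents=True, exist_ok=True)
    return directories
```

Keeping path resolution separate from directory creation lets other modules look up paths without touching the filesystem as a side effect.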
Notes for Development
- The project uses subprocess calls to invoke the model, not direct Python imports
- All configuration is centralized in config.py - modify settings there
- The Gradio interface is designed for Hugging Face Spaces deployment
- Japanese documentation files (構築ガイド.md, 作業タスク.md) contain implementation details
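Because the model is invoked through subprocess rather than imported, failures surface as exit codes and stderr output rather than Python exceptions. The sketch below shows one way to wrap such a call with the standard library. The function name run_model_command is hypothetical, and the command shown is a stand-in: the actual script and flags are defined in vace_integration.py and are not reproduced here.

```python
import subprocess
import sys


def run_model_command(command, cwd=None, timeout=None):
    """Run an external command and return its stdout, raising on failure.

    Hypothetical wrapper: `command` is a list of arguments, for example the
    interpreter followed by the model's entry script and its options.
    """
    result = subprocess.run(
        command,
        cwd=cwd,
        capture_output=True,  # collect stdout and stderr for error reporting
        text=True,            # decode output as str instead of bytes
        timeout=timeout,      # raises subprocess.TimeoutExpired if exceeded
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"command failed with exit code {result.returncode}:\n{result.stderr}"
        )
    return result.stdout


# Stand-in command for illustration only; it is not the real model invocation.
output = run_model_command([sys.executable, "-c", "print('ok')"])
```

Passing the command as a list avoids shell quoting problems with file paths, and including stderr in the raised error makes model failures easier to report in the Gradio UI.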
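Voice Notifications
Whenever you ask the user for input, that is, each time your response ends, always use the say command to speak the response message. Keep the wording as short and direct as possible; 5 characters or fewer is recommended to avoid timeouts.
Example: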
say "Done"
say "Error"
say "Ready"