# Alternative OmniAvatar Model Download Guide

## Why You're Getting Only Audio Output

Your app is working correctly but running in **TTS-only mode** because the OmniAvatar-14B models are missing. The app gracefully falls back to audio-only generation when video models aren't available.

## Solutions to Enable Video Generation

### Option 1: Use Git to Download Models (If you have Git LFS)
```powershell
# Create model directories
mkdir pretrained_models\Wan2.1-T2V-14B
mkdir pretrained_models\OmniAvatar-14B
mkdir pretrained_models\wav2vec2-base-960h

# Enable Git LFS once, then clone. `git lfs clone` is deprecated;
# a plain `git clone` fetches the LFS files when Git LFS is installed.
git lfs install
git clone https://huggingface.co/Wan-AI/Wan2.1-T2V-14B pretrained_models/Wan2.1-T2V-14B
git clone https://huggingface.co/OmniAvatar/OmniAvatar-14B pretrained_models/OmniAvatar-14B
git clone https://huggingface.co/facebook/wav2vec2-base-960h pretrained_models/wav2vec2-base-960h
```
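
Before trying Option 1, it can help to confirm that both executables are reachable from your terminal. The following is a minimal Python sketch, not part of the project: `missing_tools` and `REQUIRED_TOOLS` are hypothetical names, and it assumes Git LFS is installed under its usual executable name, `git-lfs`.

```python
import shutil

# Executables Option 1 relies on; "git-lfs" is the binary behind `git lfs`.
REQUIRED_TOOLS = ("git", "git-lfs")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("git and git-lfs found; Option 1 should be usable.")
```

If anything is reported missing, install it first or pick one of the other options.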

### Option 2: Install Python and Run Setup Script

1. **Install Python** (if not already done):
   - Download from: https://python.org/downloads/
   - Or install it from the Microsoft Store
   - Make sure to check "Add to PATH" during installation
2. **Run the setup script**:

   ```bash
   python setup_omniavatar.py
   ```

### Option 3: Manual Download from HuggingFace

Visit these URLs and download the files manually:
- https://huggingface.co/Wan-AI/Wan2.1-T2V-14B
- https://huggingface.co/OmniAvatar/OmniAvatar-14B
- https://huggingface.co/facebook/wav2vec2-base-960h

Place the downloaded files in the matching folders:
- pretrained_models/Wan2.1-T2V-14B/
- pretrained_models/OmniAvatar-14B/
- pretrained_models/wav2vec2-base-960h/
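
If clicking through each repository is tedious, the same files can be fetched from a script. This is only a sketch: it assumes the `huggingface_hub` package is installed (`pip install huggingface_hub`), `MODELS` and `download_all` are hypothetical names, and it is not necessarily what `setup_omniavatar.py` does.

```python
from pathlib import Path

# Repo IDs and target folders taken from the URLs and paths listed above.
MODELS = {
    "Wan-AI/Wan2.1-T2V-14B": "pretrained_models/Wan2.1-T2V-14B",
    "OmniAvatar/OmniAvatar-14B": "pretrained_models/OmniAvatar-14B",
    "facebook/wav2vec2-base-960h": "pretrained_models/wav2vec2-base-960h",
}


def download_all(models=MODELS):
    """Download each repository into its target folder (needs network access)."""
    # Imported lazily so the mapping can be inspected without the package.
    from huggingface_hub import snapshot_download

    for repo_id, target in models.items():
        Path(target).mkdir(parents=True, exist_ok=True)
        snapshot_download(repo_id=repo_id, local_dir=target)
```

Calling `download_all()` from the project root starts the download. Run it only when you have the disk space and bandwidth for the full model set.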

### Option 4: Use Windows Subsystem for Linux (WSL)

If you have WSL installed:
```bash
wsl
cd /mnt/c/path/to/your/project
# Some distributions only provide `python3`; use that if `python` is not found.
python setup_omniavatar.py
```
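
The `/mnt/c/...` path above follows WSL's default convention of mounting each Windows drive under `/mnt/<drive letter>`. As an illustration of that mapping, here is a small Python sketch. `windows_to_wsl` is a hypothetical helper and `C:\Users\me\project` is a made-up example path. Inside WSL, the built-in `wslpath` utility performs the same conversion.

```python
from pathlib import PureWindowsPath


def windows_to_wsl(path: str) -> str:
    """Map a drive-letter Windows path to its default /mnt/<drive> WSL location."""
    win = PureWindowsPath(path)
    drive = win.drive.rstrip(":").lower()
    if len(drive) != 1:
        raise ValueError(f"expected a drive-letter path, got {path!r}")
    # parts[0] is the anchor (e.g. "C:\\"); the rest are directory names.
    return "/".join(["/mnt", drive, *win.parts[1:]])


print(windows_to_wsl(r"C:\Users\me\project"))  # /mnt/c/Users/me/project
```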

## Model Requirements

Total download size: ~30.36 GB
- Wan2.1-T2V-14B: ~28 GB (base text-to-video model)
- OmniAvatar-14B: ~2 GB (avatar animation weights)
- wav2vec2-base-960h: ~360 MB (audio encoder)
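
To see where the total comes from and whether your drive can hold it, the sketch below sums the sizes listed above (28 + 2 + 0.36 = 30.36 GB) and compares the result with the free space reported by the operating system. The function names and the 5 GB headroom are assumptions made for illustration, not project requirements.

```python
import shutil

# Approximate download sizes in GB, as listed above.
MODEL_SIZES_GB = {
    "Wan2.1-T2V-14B": 28.0,
    "OmniAvatar-14B": 2.0,
    "wav2vec2-base-960h": 0.36,
}


def total_size_gb(sizes=MODEL_SIZES_GB):
    """Sum the approximate model sizes, rounded to two decimals."""
    return round(sum(sizes.values()), 2)


def has_enough_space(path=".", required_gb=None, headroom_gb=5.0):
    """True if the drive holding `path` has room for the models plus headroom."""
    if required_gb is None:
        required_gb = total_size_gb()
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb + headroom_gb


if __name__ == "__main__":
    print(f"Need about {total_size_gb()} GB; enough space: {has_enough_space()}")
```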

## Verify Installation

After downloading, restart your app and check:
- The app should show "full functionality enabled" in logs
- API responses should return video URLs instead of just audio
- Gradio interface should show video output component
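
If the app still reports TTS-only mode, a quick look at the folders can rule out a missing or empty download. The sketch below only checks that each expected folder exists and contains at least one entry. `missing_models` is a hypothetical helper, and the app's own detection logic may be stricter than this.

```python
from pathlib import Path

# Folder names from the download options above.
EXPECTED_DIRS = ("Wan2.1-T2V-14B", "OmniAvatar-14B", "wav2vec2-base-960h")


def missing_models(root="pretrained_models"):
    """Return expected model folders that are absent or empty under `root`."""
    missing = []
    for name in EXPECTED_DIRS:
        folder = Path(root) / name
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_models()
    print("All model folders present." if not problems else f"Missing or empty: {problems}")
```

An empty result does not prove the downloads are complete, only that nothing is obviously absent.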

## Current Status

Your setup is working perfectly for TTS! Once the OmniAvatar models are downloaded, you'll get:
- Audio-driven avatar videos
- Adaptive body animation
- Lip-sync accuracy
- 480p video output