Spaces: banao-tech/model-testing


banao-tech committed 16 days ago

Commit e0ab64b (verified) · 1 parent: d4cbec4

Update README.md

README.md +61 -14


@@ -1,14 +1,61 @@
----
-title: Model Testing
-emoji: 📉
-colorFrom: indigo
-colorTo: red
-sdk: gradio
-sdk_version: 6.5.1
-app_file: app.py
-pinned: false
-license: apache-2.0
-short_description: Space for testing new open source AI Avatar models
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# LatentSync on Hugging Face Spaces (T4 GPU)
+This is a working implementation of LatentSync 1.5 for lip-sync generation on Hugging Face Spaces with a T4 GPU.
+## Key Fixes Applied
+1. **Config Path Fixed**: Changed from `configs/unet.yaml` to `configs/unet/stage2.yaml`
+2. **Requirements Reformatted**: `requirements.txt` now lists one package per line
+3. **Python Version**: Uses Python 3.10.13, as specified in `runtime.txt`
+## Files Required
+- `app.py` - Main application (UPDATED with correct config path)
+- `requirements.txt` - Python dependencies (UPDATED with proper formatting)
+- `packages.txt` - System packages (ffmpeg, git)
+- `runtime.txt` - Should contain: `python-3.10.13`
+## How It Works
+1. Clones LatentSync repository at runtime
+2. Downloads model checkpoints from `ByteDance/LatentSync-1.5`
+3. Combines the input image and audio into a static video
+4. Runs lip-sync inference
+5. Returns the generated video
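
Step 3 above (turning one image plus an audio track into a static video) is commonly done with `ffmpeg`, which `packages.txt` already installs. The sketch below only builds the command line; it is not taken from this Space's `app.py`. The function name `build_static_video_cmd` and the 25 fps default are assumptions made for illustration.

```python
from typing import List


def build_static_video_cmd(
    image_path: str,
    audio_path: str,
    output_path: str,
    fps: int = 25,  # assumed frame rate, not confirmed by the README
) -> List[str]:
    """Return an ffmpeg argv that loops one image for the length of the audio."""
    return [
        "ffmpeg", "-y",                     # overwrite the output if it exists
        "-loop", "1", "-i", image_path,     # repeat the single image indefinitely
        "-i", audio_path,                   # second input: the audio track
        # libx264 with yuv420p needs even dimensions, so round both down to even
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
        "-r", str(fps),
        "-c:v", "libx264", "-tune", "stillimage", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-shortest",                        # stop when the audio ends
        output_path,
    ]


cmd = build_static_video_cmd("face.png", "speech.wav", "static.mp4")
print(" ".join(cmd))
```

To actually produce the file, the list can be passed to `subprocess.run(cmd, check=True)`; building the argv as a list avoids shell quoting issues with user-supplied file names.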
+## Model Notes
+- Uses **LatentSync 1.5**, which works better on a T4 GPU (16 GB)
+- Config: `configs/unet/stage2.yaml` (standard stage 2 config)
+- Alternative: For v1.6, use `configs/unet/stage2_512.yaml` and update `HF_CKPT_REPO` to `ByteDance/LatentSync-1.6`
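
One way to keep the config path and the checkpoint repo in step when switching versions is a small lookup table. The sketch below uses only the paths and repo names listed in the notes above; the names `LATENTSYNC_VARIANTS` and `resolve_variant` are hypothetical, and how `app.py` actually stores `HF_CKPT_REPO` is not shown in this README.

```python
from typing import Dict, Tuple

# Version -> (config path inside the LatentSync repo, checkpoint repo on the Hub).
# Values come from the README; the table itself is a hypothetical helper.
LATENTSYNC_VARIANTS: Dict[str, Tuple[str, str]] = {
    "1.5": ("configs/unet/stage2.yaml", "ByteDance/LatentSync-1.5"),
    "1.6": ("configs/unet/stage2_512.yaml", "ByteDance/LatentSync-1.6"),
}


def resolve_variant(version: str) -> Tuple[str, str]:
    """Return (config_path, hf_ckpt_repo) for a supported LatentSync version."""
    try:
        return LATENTSYNC_VARIANTS[version]
    except KeyError:
        supported = ", ".join(sorted(LATENTSYNC_VARIANTS))
        raise ValueError(
            f"Unknown LatentSync version {version!r}; expected one of: {supported}"
        ) from None


config_path, HF_CKPT_REPO = resolve_variant("1.5")
print(config_path, HF_CKPT_REPO)
```

Changing both values through one key makes it harder to pair the 512 config with the 1.5 checkpoints by accident.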
+## Inference Parameters
+- **Inference Steps**: 10-40 (default 20)
+- **Guidance Scale**: 0.8-2.0 (default 1.0)
+- **Seed**: For reproducibility
+- **DeepCache**: Enabled by default for faster inference
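
The ranges above can be enforced before inference so that out-of-range values from the UI or an API call never reach the pipeline. This is a minimal sketch: the function name `clamp_inference_params` is hypothetical, and whether `app.py` clamps or rejects such values is not stated in this README.

```python
from typing import Tuple

# Bounds and defaults as listed in the README.
STEPS_MIN, STEPS_MAX, STEPS_DEFAULT = 10, 40, 20
GUIDANCE_MIN, GUIDANCE_MAX, GUIDANCE_DEFAULT = 0.8, 2.0, 1.0


def clamp_inference_params(
    inference_steps: int = STEPS_DEFAULT,
    guidance_scale: float = GUIDANCE_DEFAULT,
) -> Tuple[int, float]:
    """Clamp both parameters into the documented ranges."""
    steps = max(STEPS_MIN, min(STEPS_MAX, int(inference_steps)))
    scale = max(GUIDANCE_MIN, min(GUIDANCE_MAX, float(guidance_scale)))
    return steps, scale


print(clamp_inference_params())          # documented defaults
print(clamp_inference_params(100, 0.1))  # both values pulled back into range
```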
+## GPU Requirements
+- T4 Small (16GB) - Works with LatentSync 1.5
+- Inference takes ~30-60 seconds per generation
+## Common Issues
+### If you get "FileNotFoundError: configs/unet.yaml"
+- Make sure you're using the updated `app.py` with the correct path: `configs/unet/stage2.yaml`
+### If you get CUDA out of memory
+- Reduce inference steps to 15
+- Make sure DeepCache is enabled
+- Use smaller input images (256x256 recommended)
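
For the last point, the target size can be computed before resizing so that the aspect ratio is kept. The sketch below reads "256x256 recommended" as "longest side at most 256 pixels", which is an interpretation rather than something the README states; `fit_within` is a hypothetical helper, and the actual resize (for example with an image library) is left out.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int = 256) -> Tuple[int, int]:
    """Scale (width, height) down so the longest side is at most max_side.

    Smaller images are not upscaled. Both results are rounded down to even
    numbers, since many video encoders require even dimensions.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic avoids floating-point rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    # Round down to even, but never below 2 pixels.
    return max(2, width - width % 2), max(2, height - height % 2)


print(fit_within(1024, 768))
```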
+### If output quality is poor
+- Try increasing guidance_scale to 1.5-2.0
+- Increase inference_steps to 30-40
+- For v1.6, switch to `stage2_512.yaml` config for better quality
+## Credits
+Based on [LatentSync by ByteDance](https://github.com/bytedance/LatentSync)