| title: Coach Pro Lip Sync Tool | |
| emoji: π€ | |
| colorFrom: indigo | |
| colorTo: gray | |
| sdk: gradio | |
| sdk_version: 5.49.1 | |
| app_file: app.py | |
| pinned: false | |
| license: mit | |
| hardware: t4-small | |
| suggested_hardware: t4-small | |
| suggested_storage: small | |
| duplicate: true | |
| tags: | |
| - lip-sync | |
| - ai | |
| - video | |
| - audio | |
| - microphone | |
| - tts | |
| - gradio | |
| - ipad | |
| - mobile | |
| # Advanced Lip Sync Tool with Microphone Support | |
| ## Features | |
| ### **Microphone Integration** | |
| - **Real-time Recording**: Direct microphone access in browser | |
| - **Live Streaming**: Real-time audio processing | |
| - **Cross-platform**: Works on iPad, iPhone, Desktop | |
| - **Multiple Sources**: Upload files OR record live | |
| ### **AI Models Supported** | |
| - **Wav2Lip**: High accuracy lip synchronization | |
| - **MuseTalk**: Real-time processing (30fps+) | |
| - **SadTalker**: Natural expressions and emotions | |
| ### **iPad Optimized** | |
| - Touch-friendly interface | |
| - Safari/Chrome compatible | |
| - Mobile-responsive design | |
| - No local installation required | |
| ### **Audio Features** | |
| - **Text-to-Speech**: Convert text to natural speech | |
| - **Multi-language**: English, Urdu (اردو), and Hindi (हिंदी) support | |
| - **Voice Selection**: Male, Female, Natural voices | |
| - **Format Support**: WAV, MP3, M4A | |
| ## **Quick Start** | |
| ### 1. **Basic Lip Sync** | |
| - Upload your video file | |
| - Record audio with microphone OR upload audio file | |
| - Choose AI model (Wav2Lip recommended for beginners) | |
| - Click "Generate Lip Sync" | |
| ### 2. **Text-to-Speech** | |
| - Upload your video | |
| - Type text in English or Urdu (اردو) | |
| - Select voice type and language | |
| - Generate speech + lip sync automatically | |
| ### 3. **Live Recording** | |
| - Use real-time microphone monitoring | |
| - Test audio levels and quality | |
| - Perfect for testing before main processing | |
| ## **Supported Formats** | |
| ### **Video Input** | |
| - MP4, AVI, MOV, WebM | |
| - Max size: 2GB | |
| - Max duration: 10 minutes | |
| - Recommended: 720p for faster processing | |
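The video limits above can be checked before uploading. The following is a minimal sketch, not the Space's own validation code: the function name `check_video` and the constants are hypothetical, and "2GB" is interpreted as 2 GiB, which is an assumption.

```python
from pathlib import Path

# Limits transcribed from the "Video Input" list; all names are illustrative.
ALLOWED_VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".webm"}
MAX_VIDEO_BYTES = 2 * 1024**3  # "2GB", read here as 2 GiB (assumption)
MAX_VIDEO_SECONDS = 10 * 60    # 10 minutes


def check_video(path, size_bytes, duration_seconds):
    """Return a list of problems; an empty list means the video is within limits."""
    problems = []
    suffix = Path(path).suffix.lower()
    if suffix not in ALLOWED_VIDEO_EXTENSIONS:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if size_bytes > MAX_VIDEO_BYTES:
        problems.append("file is larger than 2 GB")
    if duration_seconds > MAX_VIDEO_SECONDS:
        problems.append("video is longer than 10 minutes")
    return problems


if __name__ == "__main__":
    # A 90-second, 10 MB MP4 passes every check.
    print(check_video("clip.mp4", 10_000_000, 90))
```

The size and duration are passed in as numbers so the sketch stays free of third-party dependencies; reading the duration from a real file would need a tool such as ffprobe.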
| ### **Audio Input** | |
| - WAV (recommended), MP3, M4A | |
| - Sample rates: 16 kHz to 48 kHz | |
| - Max recording length: 30 seconds | |
| - Minimum bit depth: 16-bit | |
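For WAV files, the audio limits above can be verified locally with Python's standard `wave` module. This is a sketch under stated assumptions: `check_wav` is a hypothetical helper, and applying the 30-second recording limit to uploaded files as well is an assumption, since the list only states it for recordings.

```python
import wave

# Limits transcribed from the "Audio Input" list; names are illustrative.
MIN_SAMPLE_RATE = 16_000
MAX_SAMPLE_RATE = 48_000
MIN_SAMPLE_WIDTH_BYTES = 2   # 16-bit samples
MAX_AUDIO_SECONDS = 30       # assumed to apply to uploads too


def check_wav(path):
    """Return a list of problems found in a WAV file; empty means it looks fine."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        seconds = wav.getnframes() / rate

    problems = []
    if not MIN_SAMPLE_RATE <= rate <= MAX_SAMPLE_RATE:
        problems.append(f"sample rate {rate} Hz is outside 16-48 kHz")
    if width < MIN_SAMPLE_WIDTH_BYTES:
        problems.append(f"bit depth {width * 8}-bit is below 16-bit")
    if seconds > MAX_AUDIO_SECONDS:
        problems.append(f"duration {seconds:.1f} s exceeds 30 s")
    return problems
```

MP3 and M4A files are not covered by the standard library and would need a separate decoder.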
| ## **Technical Specs** | |
| ### **Processing** | |
| - **Hardware**: NVIDIA T4 GPU acceleration | |
| - **Speed**: 30-120 seconds of processing per minute of video | |
| - **Quality**: Up to 1080p output | |
| - **Concurrent**: Multiple users supported | |
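The speed figure above turns into a simple estimate of the wait for a given clip. The sketch below is plain arithmetic on the stated range of 30 to 120 seconds per minute of video; the function name is hypothetical, and actual times depend on the model and current load.

```python
# Stated range: 30 to 120 seconds of processing per minute of input video.
SECONDS_PER_VIDEO_MINUTE = (30, 120)


def estimate_processing_seconds(video_seconds):
    """Return (best_case, worst_case) processing time in seconds."""
    minutes = video_seconds / 60
    low, high = SECONDS_PER_VIDEO_MINUTE
    return minutes * low, minutes * high


if __name__ == "__main__":
    best, worst = estimate_processing_seconds(5 * 60)
    print(f"5-minute video: {best / 60:.1f} to {worst / 60:.1f} minutes")
```

By the same arithmetic, a video at the 10-minute limit would take roughly 5 to 20 minutes.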
| ### **Mobile Compatibility** | |
| - **iOS Safari**: Full support | |
| - **Chrome Mobile**: Full support | |
| - **iPad**: Optimized interface | |
| - **Android**: Compatible | |
| ## **Model Comparison** | |
| | Feature | Wav2Lip | MuseTalk | SadTalker | | |
| |---------|---------|----------|----------| | |
| | **Accuracy** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | | |
| | **Speed** | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ | | |
| | **Quality** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | | |
| | **Expressions** | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | | |
| | **Real-time** | β | β | β | | |
| | **Any Identity** | β | β | β | | |
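One way to use the comparison is to pick the model with the highest rating for the criterion you care about most. The sketch below encodes the four rated rows of the table on a 1-5 scale; `pick_model` and `RATINGS` are hypothetical names, and the Real-time and Any Identity rows are left out.

```python
# Ratings transcribed from the comparison table (1 = lowest, 5 = highest).
RATINGS = {
    "Wav2Lip": {"accuracy": 5, "speed": 3, "quality": 5, "expressions": 2},
    "MuseTalk": {"accuracy": 4, "speed": 5, "quality": 4, "expressions": 3},
    "SadTalker": {"accuracy": 4, "speed": 2, "quality": 4, "expressions": 5},
}


def pick_model(priority):
    """Return the model name with the highest rating for the given criterion."""
    if any(priority not in scores for scores in RATINGS.values()):
        raise ValueError(f"unknown criterion: {priority!r}")
    return max(RATINGS, key=lambda name: RATINGS[name][priority])


if __name__ == "__main__":
    for criterion in ("accuracy", "speed", "expressions"):
        print(criterion, "->", pick_model(criterion))
```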
| ## **Use Cases** | |
| ### **Content Creation** | |
| - YouTube videos | |
| - Social media content | |
| - Educational materials | |
| - Marketing videos | |
| ### **Professional** | |
| - Film dubbing | |
| - Language localization | |
| - Voice-over replacement | |
| - Commercial production | |
| ### **Personal** | |
| - Family videos | |
| - Meme creation | |
| - Fun projects | |
| - Learning tools | |
| ## **iPad Setup Guide** | |
| ### **Step 1: Browser Setup** | |
| ``` | |
| 1. Open Safari or Chrome | |
| 2. Navigate to this Space | |
| 3. Allow microphone permissions | |
| 4. Test with Live Recording tab | |
| ``` | |
| ### **Step 2: Microphone Test** | |
| ``` | |
| 1. Go to "Live Recording" tab | |
| 2. Tap microphone icon | |
| 3. Speak normally | |
| 4. Check audio levels | |
| ``` | |
| ### **Step 3: Processing** | |
| ``` | |
| 1. Upload video (MP4 recommended) | |
| 2. Record or upload audio | |
| 3. Choose quality (720p for speed) | |
| 4. Select AI model | |
| 5. Process and download | |
| ``` | |
| ## **Troubleshooting** | |
| ### **Microphone Issues** | |
| - Check browser permissions | |
| - Try refreshing the page | |
| - Use headphones to avoid feedback | |
| - Ensure quiet environment | |
| ### **Performance** | |
| - Use 720p for faster processing | |
| - Keep videos to about 5 minutes or less for faster processing (the upload limit is 10 minutes) | |
| - Close other browser tabs | |
| - Use WiFi for best experience | |
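If a source video is larger than 720p, it can be downscaled locally before uploading. The sketch below only builds an `ffmpeg` command line; it assumes `ffmpeg` is installed on your own machine, which is not something this Space provides, and `build_downscale_command` is a hypothetical helper name.

```python
def build_downscale_command(src, dst, height=720):
    """Build an ffmpeg command that rescales a video to the given height.

    scale=-2:<height> keeps the aspect ratio and rounds the width to an
    even number; "-c:a copy" passes the audio stream through unchanged.
    """
    return [
        "ffmpeg",
        "-i", src,
        "-vf", f"scale=-2:{height}",
        "-c:a", "copy",
        dst,
    ]


if __name__ == "__main__":
    cmd = build_downscale_command("input.mp4", "input_720p.mp4")
    print(" ".join(cmd))
    # To run it: import subprocess; subprocess.run(cmd, check=True)
```

Note that this filter also upscales sources smaller than the target height, so it is only worth applying to videos above 720p.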
| ### **iPad Specific** | |
| - Safari works better than Chrome | |
| - Enable microphone in Settings > Safari | |
| - Use landscape mode for better view | |
| - Keep device charged during processing | |
| ## **API Usage** | |
| This Space provides API endpoints for developers: | |
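| ```python | |
| from gradio_client import Client, handle_file | |
| # Connect to the Space; replace the placeholders with the real owner and Space name | |
| client = Client("[USERNAME]/[SPACE]") | |
| # Example API call. The endpoint name and argument order are assumptions; | |
| # run client.view_api() to see what this Space actually exposes. | |
| result = client.predict( | |
|     handle_file("input_video.mp4"),  # local path or URL of the video | |
|     handle_file("speech.wav"),  # local path or URL of the audio | |
|     "Wav2Lip",  # model name | |
|     api_name="/predict", | |
| ) | |
| print(result) | |
| ``` | |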
| ## **Contributing** | |
| We welcome contributions! Areas for improvement: | |
| - Additional AI models | |
| - Better mobile optimization | |
| - More language support | |
| - Performance enhancements | |
| ## **License** | |
| MIT License - Free for personal and commercial use | |
| ## **Credits** | |
| - **Wav2Lip**: Original research by IIIT Hyderabad | |
| - **MuseTalk**: TMElyralab implementation | |
| - **SadTalker**: Xi'an Jiaotong University | |
| - **Gradio**: Interface framework | |
| - **Hugging Face**: Hosting platform | |
| ## **Support** | |
| - **Discussions**: Use the Community tab | |
| - **Bug Reports**: Create an issue | |
| - **Contact**: [Your contact info] | |
| - **Documentation**: See tabs in the app | |
| --- | |
| ### **Star this Space if you find it useful!** | |
| **Made with ❤️ for the AI community**