# Gemini Live Avatar - FAQ

## Quick Start Guide

### Prerequisites

- **GPU**: NVIDIA GPU with 11GB+ VRAM (recommended)
- **Python**: 3.10
- **CUDA**: 11.8
- **OS**: Windows/Linux

### Installation

1. **Clone Repository**

   ```bash
   git clone https://github.com/Kedreamix/Linly-Talker.git
   cd Linly-Talker
   ```

2. **Create Environment**

   ```bash
   conda create -n linly python=3.10
   conda activate linly
   ```

3. **Install PyTorch**

   ```bash
   # CUDA 11.8
   pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
   ```

4. **Install Dependencies**

   ```bash
   conda install -q ffmpeg
   pip install -r requirements_webui.txt

   # MuseTalk dependencies
   pip install --no-cache-dir -U openmim
   mim install mmengine
   mim install "mmcv>=2.0.1"
   mim install "mmdet>=3.1.0"
   mim install "mmpose>=1.1.0"
   ```

5. **Download Models**

   Download the required models from one of these sources:

   - [Baidu Netdisk](https://pan.baidu.com/s/1eF13O-8wyw4B3MtesctQyg?pwd=linl) (Password: linl)
   - [HuggingFace](https://huggingface.co/Kedreamix/Linly-Talker)
   - [ModelScope](https://modelscope.cn/models/Kedreamix/Linly-Talker)

   **Required Models:**

   - MuseTalk models → `Musetalk/models/`
   - SadTalker checkpoints → `checkpoints/`
   - Face detection models → `gfpgan/weights/`

6. **Launch**

   ```bash
   python webui.py
   ```

   Open `http://localhost:7860` in your browser.

---

## Common Issues

### 1. Installation Issues

#### Q: `Microsoft Visual C++ 14.0 is required`

**A:** Install [Microsoft C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/)

#### Q: `version GLIBCXX_3.4.* not found`

**A:** Use Python 3.10 or downgrade libraries:

```bash
pip install pyopenjtalk==0.3.1
pip install opencc==1.1.1
```

#### Q: FFMPEG not found

**A:** Install via conda:

```bash
conda install -q ffmpeg
```

Or on Linux:

```bash
sudo apt install ffmpeg
```

---

### 2. Model & Weight Issues

#### Q: `FileNotFoundError` for model weights

**A:** Ensure the models are in the correct folders:

```
Linly-Talker/
├── checkpoints/
│   ├── mapping_00109-model.pth.tar (149MB)
│   ├── mapping_00229-model.pth.tar (149MB)
│   └── ...
├── Musetalk/
│   └── models/
│       ├── musetalk/
│       ├── dwpose/
│       └── ...
└── gfpgan/
    └── weights/
```

#### Q: `SadTalker Error: invalid load key, 'v'`

**A:** Re-download the `mapping_*.pth.tar` files (they should be 149MB each):

```bash
# Run from the Linly-Talker/ root. The URLs are quoted so the shell does not
# treat `&` as a background operator; -O sets the output file name.
wget -c -O checkpoints/mapping_00109-model.pth.tar "https://modelscope.cn/api/v1/models/Kedreamix/Linly-Talker/repo?Revision=master&FilePath=checkpoints%2Fmapping_00109-model.pth.tar"
wget -c -O checkpoints/mapping_00229-model.pth.tar "https://modelscope.cn/api/v1/models/Kedreamix/Linly-Talker/repo?Revision=master&FilePath=checkpoints%2Fmapping_00229-model.pth.tar"
```

#### Q: `File is not a zip file` (NLTK error)

**A:** Manually download `nltk_data`:

```python
import nltk
print(nltk.data.path)  # Find cache path
```

Download from [Quark Netdisk](https://pan.quark.cn/s/f48f5e35796b) and place it in the cache path.

---

### 3. Runtime Issues

#### Q: VRAM overflow / Out of Memory

**A:**

- **Minimum**: 6GB VRAM (SadTalker only)
- **Recommended**: 11GB+ VRAM (MuseTalk)
- **Solution**: Use lower resolution images or reduce batch size

#### Q: `GFPGANer is not defined`

**A:** Install the enhancement module:

```bash
pip install gfpgan
```

#### Q: `Gradio Connection errored out`

**A:**

- Check firewall settings
- Try a different port in `webui.py`:

```python
demo.launch(server_port=7861)  # Change port
```

#### Q: Avatar preparation fails

**A:**

- Use clear frontal face images/videos
- Recommended resolution: 512x512 to 1024x1024
- Supported formats: `.jpg`, `.png`, `.mp4`

---

### 4. Gemini Live Specific Issues

#### Q: WebSocket connection fails

**A:**

- Verify the Railway bridge is running: `wss://gemini-live-bridge-production.up.railway.app/ws`
- Check your internet connection
- Ensure no firewall is blocking WebSocket connections

#### Q: No audio playback

**A:**

- Check browser audio permissions
- Verify the `speaker_output` component has `autoplay=True`
- Test with a different browser (Chrome recommended)

#### Q: Avatar not lip-syncing

**A:**

1. Click "🎭 Prepare Avatar" and wait for "✅ Ready"
2. Click "🔌 Connect to Gemini" and wait for "✅ Connected"
3. Ensure microphone permissions are granted
4. Check that the audio buffer is receiving data

#### Q: High latency / Lag

**A:**

- **Target**: <1 second end-to-end
- **Optimize**:
  - Use GPU (not CPU)
  - Reduce image resolution
  - Set `return_frame_only=True` in `inference_streaming()` for faster rendering
  - Check network speed to the Railway bridge

---

### 5. Usage Tips

#### Q: How to use a custom avatar?

**A:**

1. Uncheck "Use Default Avatar"
2. Upload your image/video (frontal face, clear features)
3. Adjust the "Mouth Position Fix" slider if needed
4. Click "🎭 Prepare Avatar"

#### Q: How to adjust mouth position?

**A:** Use the "BBox Shift" slider:

- **Positive values** (+): Move mouth down
- **Negative values** (-): Move mouth up
- Default: 5

#### Q: Best practices for a demo?

**A:**

1. **Preparation**: Always prepare the avatar before connecting
2. **Connection**: Wait for the "✅ Connected" status
3. **Speaking**: Speak clearly at a natural pace
4. **Interruption**: Gemini 2.5 Flash handles interruptions natively - try it!
5. **Quality**: Use a good microphone for best results

---

## Performance Benchmarks

| Component | Latency | VRAM Usage |
|-----------|---------|------------|
| WebSocket (Railway) | ~50ms | 0GB |
| Gemini 2.5 Flash | ~200ms | 0GB (Cloud) |
| MuseTalk Inference | ~40ms/frame | 6-8GB |
| Audio Buffer | ~200ms | <1GB |
| **Total End-to-End** | **~500ms** | **8-11GB** |

---

## System Requirements

### Minimum

- GPU: 6GB VRAM
- RAM: 8GB
- CPU: 4 cores
- Network: 10 Mbps

### Recommended

- GPU: 11GB+ VRAM (RTX 2080 Ti / RTX 3060 or better)
- RAM: 16GB
- CPU: 8 cores
- Network: 50 Mbps

---

## Troubleshooting Checklist

Before reporting issues, verify:

- [ ] Python 3.10 installed
- [ ] CUDA 11.8 installed (for GPU)
- [ ] All model weights downloaded (check file sizes)
- [ ] Models in correct folder structure
- [ ] Dependencies installed (`requirements_webui.txt`)
- [ ] FFMPEG installed
- [ ] Sufficient VRAM available
- [ ] Railway bridge is accessible
- [ ] Firewall allows WebSocket connections
- [ ] Browser has microphone permissions

---

## Getting Help

1. **Check this FAQ first**
2. **Review error messages** - most include hints
3. **Check model file sizes** - incomplete downloads are common
4. **Try with default avatar** - isolates custom image issues
5. **Report issues** with:
   - Full error message
   - Python version
   - GPU model
   - Steps to reproduce

---

## Links

- **GitHub**: [Kedreamix/Linly-Talker](https://github.com/Kedreamix/Linly-Talker)
- **Models**: [HuggingFace](https://huggingface.co/Kedreamix/Linly-Talker) | [ModelScope](https://modelscope.cn/models/Kedreamix/Linly-Talker)
- **Railway Bridge**: [gemini-live-bridge](https://gemini-live-bridge-production.up.railway.app)

---

**Last Updated**: February 2026  
**Version**: Gemini Live Integration v1.0
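
As a companion to the checklist items on model weights ("check file sizes", "correct folder structure"), here is a minimal sketch of an automated check. It is not part of the Linly-Talker codebase. The folder and file names come from the layout shown in this FAQ, while the function name `check_weights` and the 140 MB size threshold (a little under the stated 149MB) are assumptions chosen for illustration.

```python
from pathlib import Path

# Folders and files taken from the layout in this FAQ.
REQUIRED_DIRS = ["checkpoints", "Musetalk/models", "gfpgan/weights"]
MAPPING_FILES = [
    "checkpoints/mapping_00109-model.pth.tar",
    "checkpoints/mapping_00229-model.pth.tar",
]
# Assumed threshold: the FAQ says each mapping file should be about 149MB,
# so anything well below that is treated as an incomplete download.
MIN_MAPPING_BYTES = 140_000_000


def check_weights(root, min_bytes=MIN_MAPPING_BYTES):
    """Return a list of human-readable problems found under `root`."""
    root = Path(root)
    problems = []
    for rel in REQUIRED_DIRS:
        if not (root / rel).is_dir():
            problems.append(f"missing folder: {rel}/")
    for rel in MAPPING_FILES:
        path = root / rel
        if not path.is_file():
            problems.append(f"missing file: {rel}")
            continue
        size = path.stat().st_size
        if size < min_bytes:
            problems.append(
                f"file too small ({size} bytes), likely incomplete: {rel}"
            )
    return problems


if __name__ == "__main__":
    # Run from the Linly-Talker/ root directory.
    issues = check_weights(".")
    for issue in issues:
        print(issue)
    if not issues:
        print("Model folders and mapping files look OK")
```

A file that is only a few hundred bytes is often a placeholder or error page rather than real weights, which would be consistent with the `invalid load key` error described earlier. The script only reports problems; re-downloading is still a manual step.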