# Gemini Live Avatar - FAQ

## Quick Start Guide

### Prerequisites
- **GPU**: NVIDIA GPU with 11GB+ VRAM (recommended)
- **Python**: 3.10
- **CUDA**: 11.8
- **OS**: Windows/Linux

### Installation

1. **Clone Repository**
```bash
git clone https://github.com/Kedreamix/Linly-Talker.git
cd Linly-Talker
```

2. **Create Environment**
```bash
conda create -n linly python=3.10
conda activate linly
```

3. **Install PyTorch**
```bash
# CUDA 11.8
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
```

4. **Install Dependencies**
```bash
conda install -q ffmpeg
pip install -r requirements_webui.txt

# MuseTalk dependencies
pip install --no-cache-dir -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"
```

5. **Download Models**

Download the required models from one of these sources:
- [Baidu Netdisk](https://pan.baidu.com/s/1eF13O-8wyw4B3MtesctQyg?pwd=linl) (Password: linl)
- [HuggingFace](https://huggingface.co/Kedreamix/Linly-Talker)
- [ModelScope](https://modelscope.cn/models/Kedreamix/Linly-Talker)

**Required Models:**
- MuseTalk models → `Musetalk/models/`
- SadTalker checkpoints → `checkpoints/`
- Face detection models → `gfpgan/weights/`

6. **Launch**
```bash
python webui.py
```

Open `http://localhost:7860` in your browser.

---

## Common Issues

### 1. Installation Issues

#### Q: `Microsoft Visual C++ 14.0 is required`
**A:** Install [Microsoft C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/)

#### Q: `version GLIBCXX_3.4.* not found`

**A:** Use Python 3.10, or pin these libraries to older versions:

```bash
pip install pyopenjtalk==0.3.1
pip install opencc==1.1.1
```



#### Q: FFMPEG not found

**A:** Install via conda:

```bash
conda install -q ffmpeg
```



Or on Linux:

```bash
sudo apt install ffmpeg
```



---



### 2. Model & Weight Issues



#### Q: `FileNotFoundError` for model weights

**A:** Ensure the models are in the correct folders:

```
Linly-Talker/
├── checkpoints/
│   ├── mapping_00109-model.pth.tar (149MB)
│   ├── mapping_00229-model.pth.tar (149MB)
│   └── ...
├── Musetalk/
│   └── models/
│       ├── musetalk/
│       ├── dwpose/
│       └── ...
└── gfpgan/
    └── weights/
```
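A quick way to confirm the layout is to check that each expected folder exists and is not empty. The following is a minimal sketch, not part of Linly-Talker: the helper name `missing_model_dirs` is hypothetical, and the folder list is taken only from the tree in this FAQ.

```python
from pathlib import Path

# Folder names come from this FAQ; the helper itself is illustrative.
EXPECTED_DIRS = ("checkpoints", "Musetalk/models", "gfpgan/weights")


def missing_model_dirs(repo_root):
    """Return the expected model folders that are absent or empty."""
    root = Path(repo_root)
    missing = []
    for rel in EXPECTED_DIRS:
        folder = root / rel
        # A folder with no entries usually means the download never ran.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing
```

Calling `missing_model_dirs(".")` from the `Linly-Talker/` directory returns an empty list when all three folders contain files.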



#### Q: `SadTalker Error: invalid load key, 'v'`

**A:** Re-download `mapping_*.pth.tar` files (they should be 149MB each):

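```bash
# Quote each URL so the shell does not treat "&" as a background operator,
# and delete the corrupt copies first so -c does not resume from them.
rm -f checkpoints/mapping_00109-model.pth.tar checkpoints/mapping_00229-model.pth.tar
wget -c -O checkpoints/mapping_00109-model.pth.tar "https://modelscope.cn/api/v1/models/Kedreamix/Linly-Talker/repo?Revision=master&FilePath=checkpoints%2Fmapping_00109-model.pth.tar"
wget -c -O checkpoints/mapping_00229-model.pth.tar "https://modelscope.cn/api/v1/models/Kedreamix/Linly-Talker/repo?Revision=master&FilePath=checkpoints%2Fmapping_00229-model.pth.tar"
```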
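One common cause of `invalid load key, 'v'` is that the file on disk is a small Git LFS pointer (plain text beginning with `version https://git-lfs...`) rather than the real checkpoint. The sketch below checks for that and for an unexpected size. The helper name `diagnose_checkpoint` and the 10 MB tolerance are assumptions; only the 149MB figure comes from this FAQ.

```python
from pathlib import Path

LFS_MAGIC = b"version https://git-lfs"  # first bytes of a Git LFS pointer file
EXPECTED_MB = 149   # size quoted in this FAQ
TOLERANCE_MB = 10   # assumed slack to cover MB vs MiB rounding


def diagnose_checkpoint(path):
    """Classify a checkpoint as 'missing', 'lfs-pointer', 'unexpected-size' or 'ok'."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    with p.open("rb") as fh:
        head = fh.read(len(LFS_MAGIC))
    if head == LFS_MAGIC:
        return "lfs-pointer"
    size_mb = p.stat().st_size / 1_000_000
    if abs(size_mb - EXPECTED_MB) > TOLERANCE_MB:
        return "unexpected-size"
    return "ok"
```

Any result other than `"ok"` for `checkpoints/mapping_00109-model.pth.tar` or `checkpoints/mapping_00229-model.pth.tar` suggests the file should be downloaded again.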



#### Q: `File is not a zip file` (NLTK error)

**A:** Manually download `nltk_data`:

```python
import nltk
print(nltk.data.path)  # Find cache path
```

Download `nltk_data` from [Quark Netdisk](https://pan.quark.cn/s/f48f5e35796b) and place it in one of the printed cache paths.
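`File is not a zip file` generally points to a truncated or corrupt archive left behind by an interrupted download. As a rough sketch, the standard-library `zipfile.is_zipfile` can locate such archives so they can be deleted before the manual copy. The helper name `find_corrupt_zips` is hypothetical.

```python
import zipfile
from pathlib import Path


def find_corrupt_zips(search_dirs):
    """Return every *.zip under the given directories that is not a valid zip."""
    corrupt = []
    for base in search_dirs:
        base = Path(base)
        if not base.is_dir():
            continue  # NLTK lists candidate paths that may not exist
        for archive in base.rglob("*.zip"):
            if not zipfile.is_zipfile(archive):
                corrupt.append(archive)
    return corrupt
```

Passing `nltk.data.path` as `search_dirs` scans all locations that NLTK searches.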



---



### 3. Runtime Issues



#### Q: VRAM overflow / Out of Memory

**A:** 

- **Minimum**: 6GB VRAM (SadTalker only)

- **Recommended**: 11GB+ VRAM (MuseTalk)

- **Solution**: Use lower resolution images or reduce batch size
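To see which of these tiers a machine falls into, the VRAM thresholds above can be expressed as a small check. This is a sketch under assumptions: the function names are illustrative, the thresholds are the 6GB and 11GB figures from this FAQ, and `detect_vram_gb` returns `None` when PyTorch or a CUDA device is unavailable.

```python
def recommended_pipelines(vram_gb):
    """Map available VRAM to the pipelines this FAQ lists for that tier."""
    pipelines = []
    if vram_gb >= 6:
        pipelines.append("SadTalker")
    if vram_gb >= 11:
        pipelines.append("MuseTalk")
    return pipelines


def detect_vram_gb():
    """Return the VRAM of GPU 0 in whole GiB, or None if it cannot be read."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    total_bytes = torch.cuda.get_device_properties(0).total_memory
    # Cards usually report slightly less than their nominal size, so round.
    return round(total_bytes / 1024**3)
```

For example, `recommended_pipelines(8)` returns `["SadTalker"]`, which matches the "SadTalker only" minimum tier.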



#### Q: `GFPGANer is not defined`

**A:** Install enhancement module:

```bash
pip install gfpgan
```



#### Q: `Gradio Connection errored out`

**A:** 

- Check firewall settings

- Try a different port in `webui.py`:

```python
demo.launch(server_port=7861)  # Change port
```
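If it is unclear whether the default port is already taken, a short standard-library probe can find a free one before the value in `webui.py` is changed. This is an illustrative sketch; `port_is_free` and `first_free_port` are hypothetical helpers, not part of Gradio or Linly-Talker.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=7860, attempts=10):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None
```

The value returned by `first_free_port()` can then be passed as `server_port` to `demo.launch`.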



#### Q: Avatar preparation fails

**A:**

- Use clear frontal face images/videos

- Recommended resolution: 512x512 to 1024x1024

- Supported formats: `.jpg`, `.png`, `.mp4`



---



### 4. Gemini Live Specific Issues



#### Q: WebSocket connection fails

**A:** 

- Verify Railway bridge is running: `wss://gemini-live-bridge-production.up.railway.app/ws`

- Check internet connection

- Ensure no firewall blocking WebSocket connections
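The first and third checks can be approximated from Python by opening a TCP connection (plus a TLS handshake for `wss://`) to the bridge host. This is only a sketch with hypothetical helper names: it confirms that the host is reachable through the firewall, but it does not perform the WebSocket upgrade itself.

```python
import socket
import ssl
from urllib.parse import urlparse

BRIDGE_URL = "wss://gemini-live-bridge-production.up.railway.app/ws"


def endpoint_from_url(url):
    """Extract (host, port) from a ws:// or wss:// URL, applying default ports."""
    parsed = urlparse(url)
    default_port = 443 if parsed.scheme == "wss" else 80
    return parsed.hostname, parsed.port or default_port


def can_reach(url, timeout=5.0):
    """Return True if a TCP (and, for wss, TLS) connection to the host succeeds."""
    host, port = endpoint_from_url(url)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            if urlparse(url).scheme == "wss":
                context = ssl.create_default_context()
                with context.wrap_socket(raw, server_hostname=host):
                    pass  # handshake completed
        return True
    except OSError:  # covers DNS failures, refusals, timeouts and TLS errors
        return False
```

If `can_reach(BRIDGE_URL)` returns `False`, the problem is likely in the network path or the bridge deployment rather than in the avatar pipeline.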



#### Q: No audio playback

**A:** 

- Check browser audio permissions

- Verify `speaker_output` component has `autoplay=True`

- Test with a different browser (Chrome recommended)



#### Q: Avatar not lip-syncing

**A:**

1. Click "🎭 Prepare Avatar" and wait for "✅ Ready"

2. Click "🔌 Connect to Gemini" and wait for "✅ Connected"

3. Ensure microphone permissions are granted

4. Check audio buffer is receiving data



#### Q: High latency / Lag

**A:**

- **Target**: <1 second end-to-end

- **Optimize**:

  - Use GPU (not CPU)

  - Reduce image resolution

  - Set `return_frame_only=True` in `inference_streaming()` for faster rendering

  - Check network speed to Railway bridge



---



### 5. Usage Tips



#### Q: How to use custom avatar?

**A:**

1. Uncheck "Use Default Avatar"

2. Upload your image/video (frontal face, clear features)

3. Adjust "Mouth Position Fix" slider if needed

4. Click "🎭 Prepare Avatar"



#### Q: How to adjust mouth position?

**A:** Use the "BBox Shift" slider:

- **Positive values** (+): Move mouth down

- **Negative values** (-): Move mouth up

- Default: 5



#### Q: Best practices for demo?

**A:**

1. **Preparation**: Always prepare avatar before connecting

2. **Connection**: Wait for "✅ Connected" status

3. **Speaking**: Speak clearly, natural pace

4. **Interruption**: Gemini 2.5 Flash handles interruptions natively - try it!

5. **Quality**: Use a good microphone for best results



---



## Performance Benchmarks



| Component | Latency | VRAM Usage |
|-----------|---------|------------|
| WebSocket (Railway) | ~50ms | 0GB |
| Gemini 2.5 Flash | ~200ms | 0GB (Cloud) |
| MuseTalk Inference | ~40ms/frame | 6-8GB |
| Audio Buffer | ~200ms | <1GB |
| **Total End-to-End** | **~500ms** | **8-11GB** |
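The end-to-end figure is roughly the sum of the per-stage latencies, counting one MuseTalk frame. The arithmetic below uses only the approximate numbers from the table, so the result is an estimate rather than a measurement.

```python
# Approximate per-stage latencies from the table above, in milliseconds.
STAGE_LATENCY_MS = {
    "WebSocket (Railway)": 50,
    "Gemini 2.5 Flash": 200,
    "MuseTalk Inference (one frame)": 40,
    "Audio Buffer": 200,
}

total_ms = sum(STAGE_LATENCY_MS.values())
print(total_ms)  # 490, i.e. about 500 ms end-to-end

# At ~40 ms per frame, rendering tops out near 25 frames per second.
max_fps = 1000 / STAGE_LATENCY_MS["MuseTalk Inference (one frame)"]
print(max_fps)  # 25.0
```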



---



## System Requirements



### Minimum

- GPU: 6GB VRAM

- RAM: 8GB

- CPU: 4 cores

- Network: 10 Mbps



### Recommended

- GPU: 11GB+ VRAM (RTX 2080 Ti / RTX 3060 or better)

- RAM: 16GB

- CPU: 8 cores

- Network: 50 Mbps



---



## Troubleshooting Checklist



Before reporting issues, verify:



- [ ] Python 3.10 installed

- [ ] CUDA 11.8 installed (for GPU)

- [ ] All model weights downloaded (check file sizes)

- [ ] Models in correct folder structure

- [ ] Dependencies installed (`requirements_webui.txt`)

- [ ] FFMPEG installed

- [ ] Sufficient VRAM available

- [ ] Railway bridge is accessible

- [ ] Firewall allows WebSocket connections

- [ ] Browser has microphone permissions



---



## Getting Help



1. **Check this FAQ first**

2. **Review error messages** - most include hints

3. **Check model file sizes** - incomplete downloads are common

4. **Try with default avatar** - isolates custom image issues

5. **Report issues** with:

   - Full error message

   - Python version

   - GPU model

   - Steps to reproduce



---



## Links



- **GitHub**: [Kedreamix/Linly-Talker](https://github.com/Kedreamix/Linly-Talker)

- **Models**: [HuggingFace](https://huggingface.co/Kedreamix/Linly-Talker) | [ModelScope](https://modelscope.cn/models/Kedreamix/Linly-Talker)

- **Railway Bridge**: [gemini-live-bridge](https://gemini-live-bridge-production.up.railway.app)



---



**Last Updated**: February 2026  

**Version**: Gemini Live Integration v1.0