---
license: apache-2.0
tags:
- text-to-video
- image-to-video
- custom
- inference-endpoints
library_name: diffusers
---

# MeiGen-MultiTalk Endpoint Handler

This repository contains a custom handler for deploying MeiGen-AI's MultiTalk model on Hugging Face Inference Endpoints.

## Model Description

MeiGen-MultiTalk generates audio-driven, multi-person conversational videos. This handler wraps the original model so that it can be served from Hugging Face Inference Endpoints.

## Features

- Text-to-video generation
- Image-to-video generation
- Multi-person conversation synthesis
- Support for various resolutions (480p, 720p)
- Optimized for A100 GPUs

## Usage with Inference Endpoints

### Recommended Configuration

- **Hardware**: GPU · A100 · 1x GPU (80 GB)
- **Autoscaling**:
  - Min replicas: 0
  - Max replicas: 1
  - Scale to zero after: 300 seconds
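
With min replicas set to 0, the first request after an idle period arrives while a replica is still starting, and the endpoint typically answers with a 502 or 503 status until the model is loaded. The following is a minimal sketch of a client-side retry loop, assuming those status codes signal a cold start. `post_with_retry`, its parameters, and the retryable status set are illustrative names and values, not part of this handler.

```python
import time
from typing import Any, Callable, Iterable


def post_with_retry(
    send: Callable[[], Any],
    retry_statuses: Iterable[int] = (502, 503),  # assumed cold-start codes
    max_attempts: int = 10,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send() until it returns a response whose status is not retryable."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    retryable = set(retry_statuses)
    response = None
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code not in retryable:
            return response
        if attempt < max_attempts:
            sleep(delay_seconds)  # wait for the replica to finish starting
    # Still retryable after the last attempt; the caller decides what to do.
    return response


# Usage (hypothetical), wrapping an ordinary requests call:
# response = post_with_retry(
#     lambda: requests.post(API_URL, headers=headers, json=data, timeout=600)
# )
```

Passing the request as a callable keeps the retry logic independent of the HTTP library, and the injectable `sleep` makes the loop easy to test without waiting.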

### API Example

```python
import requests

API_URL = "https://YOUR-ENDPOINT-URL.endpoints.huggingface.cloud"
headers = {
    "Authorization": "Bearer YOUR_HF_TOKEN",
    "Content-Type": "application/json",
}

# Text-to-video generation
data = {
    "inputs": {
        "prompt": "A person giving a presentation",
    },
    "parameters": {
        "num_frames": 16,
        "height": 480,
        "width": 640,
        "num_inference_steps": 25,
        "guidance_scale": 7.5,
    },
}

# Video generation can be slow, so allow a generous timeout
response = requests.post(API_URL, headers=headers, json=data, timeout=600)
response.raise_for_status()  # fail on HTTP errors instead of parsing an error body
result = response.json()
```
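
For image-to-video requests, the handler exchanges binary data as Base64 strings. The sketch below shows one way to build such a payload and to decode a returned video. The field names `image` and `video` and both helper names are assumptions for illustration; check the handler source for the keys it actually reads and returns.

```python
import base64
from typing import Any, Dict


def build_image_to_video_payload(
    image_bytes: bytes, prompt: str, **parameters: Any
) -> Dict[str, Any]:
    """Wrap raw image bytes and a prompt in a JSON-serializable request body."""
    return {
        "inputs": {
            "prompt": prompt,
            # "image" is a hypothetical key name
            "image": base64.b64encode(image_bytes).decode("ascii"),
        },
        "parameters": parameters,
    }


def decode_video(result: Dict[str, Any], key: str = "video") -> bytes:
    """Decode a Base64 video string from a response dict; `key` is assumed."""
    if key not in result:
        raise KeyError(f"response has no {key!r} field: {sorted(result)}")
    return base64.b64decode(result[key])


# Usage (hypothetical file names):
# with open("speaker.png", "rb") as f:
#     payload = build_image_to_video_payload(f.read(), "A person talking", num_frames=16)
# result = requests.post(API_URL, headers=headers, json=payload, timeout=600).json()
# with open("output.mp4", "wb") as f:
#     f.write(decode_video(result))
```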

## Technical Details

The handler includes:

- Automatic model loading from MeiGen-AI/MeiGen-MultiTalk
- Memory optimization for GPU inference
- Support for both diffusion pipeline and transformer modes
- Error handling and logging
- Base64 encoding for image/video I/O
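
Custom handlers for Inference Endpoints are conventionally a `handler.py` file exposing an `EndpointHandler` class with `__init__` and `__call__` methods. The sketch below illustrates how the error-handling, logging, and Base64 steps listed above could fit together; it is not this repository's actual code. The `_generate` stub and the `video` and `error` response keys are placeholders.

```python
import base64
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)


class EndpointHandler:
    def __init__(self, path: str = "") -> None:
        # A real handler would load the model weights from `path` here.
        self.path = path

    def _generate(self, prompt: str, **params: Any) -> bytes:
        # Placeholder standing in for the actual video generation call.
        return f"video for: {prompt}".encode("utf-8")

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        inputs = data.get("inputs") or {}
        if isinstance(inputs, str):  # accept a bare prompt string
            inputs = {"prompt": inputs}
        if not isinstance(inputs, dict) or not inputs.get("prompt"):
            return {"error": "missing 'prompt' in inputs"}
        params = data.get("parameters") or {}
        try:
            video = self._generate(inputs["prompt"], **params)
        except Exception as exc:  # report failures as JSON instead of crashing
            logger.exception("generation failed")
            return {"error": str(exc)}
        # Binary output is returned as a Base64 string so it survives JSON.
        return {"video": base64.b64encode(video).decode("ascii")}
```

Returning errors as a JSON body keeps the worker alive after a failed request, at the cost of the client having to check for an `error` key.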

## License

Apache 2.0 License

## Credits

Based on the original [MeiGen-AI/MeiGen-MultiTalk](https://huggingface.co/MeiGen-AI/MeiGen-MultiTalk) model.