---
license: apache-2.0
tags:
- text-to-video
- image-to-video
- custom
- inference-endpoints
library_name: diffusers
---

# MeiGen-MultiTalk Endpoint Handler

This repository contains a custom handler for deploying MeiGen-AI's MultiTalk model on Hugging Face Inference Endpoints.

## Model Description

MeiGen-MultiTalk generates audio-driven, multi-person conversational videos. This handler wraps the original model so that it can be served through Hugging Face Inference Endpoints.

## Features

- Text-to-video generation
- Image-to-video generation
- Multi-person conversation synthesis
- Support for various resolutions (480p, 720p)
- Optimized for A100 GPUs

## Usage with Inference Endpoints

### Recommended Configuration

- **Hardware**: GPU · A100 · 1x GPU (80 GB)
- **Autoscaling**:
  - Min replicas: 0
  - Max replicas: 1
  - Scale to zero after: 300 seconds
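
The same settings could also be expressed programmatically. The sketch below only builds a dictionary of keyword arguments and makes no network call. It assumes the `create_inference_endpoint` function from `huggingface_hub` and its parameter names (`min_replica`, `max_replica`, and `scale_to_zero_timeout`, taken here to be in minutes). The `framework`, `instance_type`, and `instance_size` values, the repository ID, and the helper name `build_endpoint_kwargs` are illustrative assumptions that should be checked against the current endpoint catalog.

```python
def build_endpoint_kwargs(repository: str, scale_to_zero_seconds: int = 300) -> dict:
    """Translate the recommended configuration into keyword arguments (hypothetical helper)."""
    return {
        "repository": repository,
        "framework": "custom",           # assumed value for a custom handler
        "task": "custom",
        "accelerator": "gpu",
        "instance_type": "nvidia-a100",  # assumed catalog name for 1x A100 80 GB
        "instance_size": "x1",           # assumed catalog name
        "min_replica": 0,
        "max_replica": 1,
        # Assumption: the API expects minutes, so 300 seconds becomes 5
        "scale_to_zero_timeout": scale_to_zero_seconds // 60,
    }


kwargs = build_endpoint_kwargs("YOUR-USERNAME/YOUR-HANDLER-REPO")

# With huggingface_hub installed and a valid token, these could be passed on, e.g.:
# from huggingface_hub import create_inference_endpoint
# endpoint = create_inference_endpoint(
#     "multitalk-endpoint", vendor="aws", region="us-east-1", **kwargs
# )
```

Keeping the configuration in a plain dictionary makes it easy to review before any billable resource is created.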

### API Example

```python
import requests

API_URL = "https://YOUR-ENDPOINT-URL.endpoints.huggingface.cloud"
headers = {
    "Authorization": "Bearer YOUR_HF_TOKEN",
    "Content-Type": "application/json"
}

# Text-to-video generation
data = {
    "inputs": {
        "prompt": "A person giving a presentation"
    },
    "parameters": {
        "num_frames": 16,
        "height": 480,
        "width": 640,
        "num_inference_steps": 25,
        "guidance_scale": 7.5
    }
}

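# Video generation can be slow, so set an explicit (generous) timeout
response = requests.post(API_URL, headers=headers, json=data, timeout=600)
response.raise_for_status()  # Raise on HTTP errors instead of parsing an error body
result = response.json()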
```
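
Since the handler exchanges images and videos as Base64, a client needs to encode inputs and decode outputs. The following is a minimal sketch of both directions. The field names `image` and `video` and the helper names are assumptions made for illustration; the actual request and response schema should be confirmed against the handler code.

```python
import base64
from pathlib import Path


def build_image_to_video_payload(image_bytes: bytes, prompt: str) -> dict:
    """Build a request body with a Base64-encoded image (hypothetical schema)."""
    return {
        "inputs": {
            "prompt": prompt,
            # Assumed field name; JSON cannot carry raw bytes, hence Base64 text
            "image": base64.b64encode(image_bytes).decode("ascii"),
        }
    }


def save_video(result: dict, out_path: str, key: str = "video") -> int:
    """Decode a Base64 video from the response and write it to disk.

    Returns the number of bytes written. The key name is an assumption.
    """
    video_bytes = base64.b64decode(result[key])
    return Path(out_path).write_bytes(video_bytes)
```

A typical flow would be `payload = build_image_to_video_payload(Path("speaker.png").read_bytes(), "Two people talking")`, a POST of that payload to the endpoint, and then `save_video(response.json(), "output.mp4")`.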

## Technical Details

The handler includes:
- Automatic model loading from MeiGen-AI/MeiGen-MultiTalk
- Memory optimization for GPU inference
- Support for both diffusion pipeline and transformer modes
- Error handling and logging
- Base64 encoding for image/video I/O
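
To make the list above concrete, here is a minimal sketch of how such a handler could be structured. It follows the Inference Endpoints convention of an `EndpointHandler` class with `__init__` and `__call__`, but it is not the code in this repository: the model call is replaced by a placeholder function, and the `video` and `error` response keys are assumptions.

```python
import base64
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)


def _placeholder_generate(prompt: str, **params: Any) -> bytes:
    """Stand-in for the real model call; returns dummy bytes, not a video."""
    return f"video for: {prompt}".encode("utf-8")


class EndpointHandler:
    def __init__(self, path: str = "", generate: Callable[..., bytes] = _placeholder_generate):
        # A real handler would load the model weights here (once per replica).
        self.generate = generate

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        inputs = data.get("inputs", {})
        params = data.get("parameters", {})
        # Accept either {"inputs": {"prompt": ...}} or {"inputs": "..."}
        prompt = inputs.get("prompt") if isinstance(inputs, dict) else inputs
        if not prompt:
            return {"error": "Missing 'prompt' in inputs"}
        try:
            video_bytes = self.generate(prompt, **params)
        except Exception as exc:
            # Log the traceback and return a JSON-serializable error
            logger.exception("Generation failed")
            return {"error": str(exc)}
        # Base64 makes the binary output safe to embed in a JSON response
        return {"video": base64.b64encode(video_bytes).decode("ascii")}
```

Returning an error dictionary rather than raising keeps the response JSON-serializable, which is one possible design; a handler could equally let the exception propagate and rely on the HTTP status code.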

## License

Apache 2.0 License

## Credits

Based on the original [MeiGen-AI/MeiGen-MultiTalk](https://huggingface.co/MeiGen-AI/MeiGen-MultiTalk) model.