# Video Generation Endpoint API (Custom Handler)
This repository is configured for deployment as a **Hugging Face Inference Endpoint** using a **custom `handler.py`**. The endpoint generates a short video from a text prompt and can return the result as:
- **GIF** (preview-friendly)
- **WebM** (higher quality, better compression)
- **ZIP of PNG frames** (maximum control / post-processing)
---
## Endpoint URL
After deployment, your endpoint URL will have this form:
```
https://<your-endpoint>.aws.endpoints.huggingface.cloud
```
Example:
```
https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud
```
---
## Authentication
All requests require a Hugging Face token with permission to call the endpoint.
Send it as a Bearer token:
```
Authorization: Bearer YOUR_HF_TOKEN
```
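As a minimal sketch, the header can be attached from Python using only the standard library. The `HF_TOKEN` environment variable name and the `build_request` helper are assumptions made for illustration; neither comes from this repository.

```python
import json
import os
import urllib.request


def build_request(endpoint_url, payload, token=None):
    """Return a POST request carrying the Bearer token (hypothetical helper)."""
    # Assumption: the token is kept in an HF_TOKEN environment variable.
    token = token or os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError("No token given and HF_TOKEN is not set")
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. Reading the token from the environment keeps it out of source files.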
---
## Request Format
Hugging Face endpoint requests should be wrapped in a top-level `inputs` object:
```json
{
  "inputs": {
    "prompt": "cinematic sunset over mountains",
    "outputs": ["gif"]
  }
}
```
### Core Fields
| Field | Type | Default | Description |
|------|------|---------|-------------|
| `prompt` | string | **required** | Text prompt describing the video. |
| `negative_prompt` | string | `""` | Things you want to avoid. |
| `num_frames` | int | `32` | Number of frames to generate. |
| `fps` | int | `12` | Playback FPS for GIF/WebM (may be overridden per output). |
| `height` | int | `512` | Frame height. |
| `width` | int | `512` | Frame width. |
| `seed` | int | `null` | Seed for reproducibility. |
| `outputs` | array | `["gif"]` | Any subset: `["gif","webm","zip"]`. |
| `return_base64` | bool | `true` | If true, returns file contents as base64 strings. |
| `num_inference_steps` | int | `30` | More steps can improve quality but increase latency. |
| `guidance_scale` | float | `7.5` | Prompt adherence strength (higher = more literal). |
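The table above can be turned into a small client-side check before a request is sent. This is a sketch, not the handler's own validation: `build_payload` is a hypothetical helper, and it assumes the handler applies the documented defaults to any field that is left out.

```python
# Field names are copied from the Core Fields table; "gif" and "webm" are the
# optional per-output option objects that may also appear inside `inputs`.
KNOWN_FIELDS = {
    "negative_prompt", "num_frames", "fps", "height", "width", "seed",
    "outputs", "return_base64", "num_inference_steps", "guidance_scale",
    "gif", "webm",
}
ALLOWED_OUTPUTS = {"gif", "webm", "zip"}


def build_payload(prompt, **overrides):
    """Wrap a prompt and any overridden fields in the top-level `inputs` object."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")
    unknown = set(overrides) - KNOWN_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    bad = set(overrides.get("outputs", [])) - ALLOWED_OUTPUTS
    if bad:
        raise ValueError(f"unsupported outputs: {sorted(bad)}")
    # Send only what the caller set, so the server-side defaults apply.
    return {"inputs": {"prompt": prompt, **overrides}}
```

For example, `build_payload("sunset", num_frames=16)` produces a body with just `prompt` and `num_frames` inside `inputs`.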
---
## Output Configuration
You can optionally include per-output options inside `inputs`.
### GIF options
```json
"gif": { "fps": 10 }
```
### WebM options
```json
"webm": { "fps": 24, "quality": "good" }
```
Quality values:
- `"fast"` — fastest encode
- `"good"` — balanced (recommended)
- `"best"` — higher quality, slower encode
### ZIP output
ZIP output contains PNG frames:
```
frame_000000.png
frame_000001.png
...
```
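Because the frame names are zero-padded, sorting them as strings yields playback order. The sketch below reads the frames from a decoded archive in memory; `read_frames` is a hypothetical helper, and the filter assumes the `frame_*.png` naming shown above.

```python
import io
import zipfile
from pathlib import PurePosixPath


def read_frames(zip_bytes):
    """Return (name, png_bytes) pairs in frame order."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = [
            name for name in archive.namelist()
            if PurePosixPath(name).name.startswith("frame_")
            and name.endswith(".png")
        ]
        # Zero-padded indices make lexicographic order equal frame order.
        names.sort(key=lambda name: PurePosixPath(name).name)
        return [(name, archive.read(name)) for name in names]
```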
---
## Response Format
The handler returns JSON. On success:
```json
{
  "ok": true,
  "outputs": {
    "gif_base64": "...",
    "webm_base64": "...",
    "zip_base64": "..."
  },
  "diagnostics": {
    "timing_ms": { ... },
    "generator": { ... }
  }
}
```
On error:
```json
{
  "ok": false,
  "error": "human readable error message",
  "diagnostics": { ... }
}
```
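A client can branch on `ok` before touching `outputs`. The following sketch assumes the response has already been parsed into a dictionary; `EndpointError` and `check_response` are hypothetical names used only for illustration.

```python
class EndpointError(RuntimeError):
    """Raised when the handler reports `"ok": false` (hypothetical type)."""


def check_response(response):
    """Return the `outputs` object on success, or raise with the error text."""
    if response.get("ok") is not True:
        raise EndpointError(response.get("error", "endpoint returned no error message"))
    return response.get("outputs", {})
```

Raising early keeps an error payload from being mistaken for file data further down the pipeline.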
---
## Example curl Commands (Direct-to-file)
These examples download **only the file** (decoded from base64 in the JSON response) without saving the JSON to disk.
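> **Important:** `jq -e` exits with a non-zero status when the requested key is missing or `null`, which is what happens when the API returns an error. The shell still creates the output file in that case, so run `set -o pipefail` first and check the pipeline's exit status before trusting the file.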
Replace `YOUR_HF_TOKEN` and your endpoint URL as needed.
---
### 1) GIF → `output.gif`
```bash
curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "cinematic sunset over mountains, slow pan",
      "num_frames": 20,
      "fps": 10,
      "outputs": ["gif"]
    }
  }' \
  | jq -er '.outputs.gif_base64' \
  | base64 --decode > output.gif
```
---
### 2) WebM → `output.webm`
```bash
curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "a drone flying through clouds, volumetric lighting",
      "num_frames": 32,
      "fps": 24,
      "outputs": ["webm"],
      "webm": { "quality": "good" }
    }
  }' \
  | jq -er '.outputs.webm_base64' \
  | base64 --decode > output.webm
```
---
### 3) ZIP (frames) → `frames.zip`
```bash
curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "ocean waves crashing in slow motion",
      "num_frames": 16,
      "outputs": ["zip"]
    }
  }' \
  | jq -er '.outputs.zip_base64' \
  | base64 --decode > frames.zip
```
Unzip frames:
```bash
unzip frames.zip
```
---
### 4) Multi-output (GIF + WebM + ZIP)
```bash
curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "epic cinematic space nebula, slow parallax motion",
      "num_frames": 24,
      "fps": 12,
      "outputs": ["gif", "webm", "zip"],
      "gif": { "fps": 10 },
      "webm": { "fps": 24, "quality": "good" }
    }
  }' \
  -o response.json
```
Extract:
```bash
jq -er '.outputs.gif_base64' response.json | base64 --decode > output.gif
jq -er '.outputs.webm_base64' response.json | base64 --decode > output.webm
jq -er '.outputs.zip_base64' response.json | base64 --decode > frames.zip
```
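The same extraction can be done in Python when `jq` is not available. This is a sketch under the assumption that the response was saved as JSON in the success format documented above; `save_outputs` is a hypothetical helper, and the file names simply mirror the curl examples.

```python
import base64
import json
from pathlib import Path

# Output keys come from the response format; file names mirror the curl examples.
FILENAMES = {
    "gif_base64": "output.gif",
    "webm_base64": "output.webm",
    "zip_base64": "frames.zip",
}


def save_outputs(response_path, out_dir="."):
    """Decode each output present in a saved response and return the written paths."""
    response = json.loads(Path(response_path).read_text(encoding="utf-8"))
    if response.get("ok") is not True:
        raise RuntimeError(response.get("error", "endpoint returned an error"))
    written = []
    for key, filename in FILENAMES.items():
        encoded = response.get("outputs", {}).get(key)
        if encoded is None:
            continue  # this format is not in the response
        target = Path(out_dir) / filename
        target.write_bytes(base64.b64decode(encoded))
        written.append(target)
    return written
```

Unlike the shell pipeline, this writes nothing when the response reports an error.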
---
## Troubleshooting
### “Corrupted” output files
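If a decoded file will not open, the endpoint probably returned an error instead of file data. Save the raw response (for example with `-o response.json`, as in the multi-output example) and inspect it: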
```bash
jq . response.json
```
Ensure:
```
"ok": true
```
### Large outputs
If responses are too large or too slow to transfer, reduce:
- `num_frames`
- `height` / `width`

Alternatively, modify the handler to upload outputs to cloud storage and return a download URL.
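To get a rough feel for how these settings scale, the sketch below computes two quantities. Base64 encodes every 3 bytes as 4 characters, so the JSON payload is about a third larger than the file it carries. The uncompressed pixel volume is only a proxy, since GIF and WebM are compressed, and it assumes 8-bit frames with 3 channels; both function names are hypothetical.

```python
import math


def base64_length(num_bytes):
    """Length of the padded base64 text for a payload of num_bytes."""
    return 4 * math.ceil(num_bytes / 3)


def raw_frame_bytes(num_frames, height, width, channels=3):
    """Uncompressed 8-bit pixel volume (assumed proxy for output size)."""
    return num_frames * height * width * channels
```

With the default settings, `raw_frame_bytes(32, 512, 512)` is 25,165,824 bytes; halving both `height` and `width` cuts that figure to a quarter.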
---
## Repository Notes
This repo is designed for Hugging Face Inference Endpoints with a custom handler.
Key files:
- `handler.py` — request parsing, model invocation, output encoding
- `requirements.txt` — Python dependencies
If your model lives in a subdirectory, set the environment variable:
```
HF_MODEL_SUBDIR
```
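How `handler.py` consumes this variable is not shown in this document. One plausible reading, sketched here purely as an assumption, is that the value is joined onto the repository path that Inference Endpoints passes to the handler's constructor. `resolve_model_dir` is a hypothetical helper, not a function from this repository.

```python
import os


def resolve_model_dir(repo_path, env=os.environ):
    """Join the repository path with HF_MODEL_SUBDIR when it is set (assumed behavior)."""
    subdir = env.get("HF_MODEL_SUBDIR", "").strip()
    return os.path.join(repo_path, subdir) if subdir else repo_path
```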
---
## Security Notes
- Do not commit secrets or tokens into this repository.
- Use Endpoint Secrets / Environment Variables for credentials.
---
## License
Specify your license here (e.g., MIT, Apache-2.0).