
Video Generation Endpoint API (Custom Handler)

This repository is configured for deployment as a Hugging Face Inference Endpoint using a custom handler.py. The endpoint generates a short video from a text prompt and can return the result as:

  • GIF (preview-friendly)
  • WebM (higher quality, better compression)
  • ZIP of PNG frames (maximum control / post-processing)

Endpoint URL

After deployment, your endpoint URL will have this form (the region segment depends on where the endpoint is deployed):

https://<your-endpoint>.<region>.aws.endpoints.huggingface.cloud

Example:

https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud

Authentication

All requests require a Hugging Face token with permission to call the endpoint.

Send it as a Bearer token:

Authorization: Bearer YOUR_HF_TOKEN

Request Format

Hugging Face endpoint requests should be wrapped in a top-level inputs object:

{
  "inputs": {
    "prompt": "cinematic sunset over mountains",
    "outputs": ["gif"]
  }
}
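As a rough Python counterpart to the two sections above, the sketch below builds an authenticated request with the payload wrapped in a top-level inputs object. The helper name build_request is hypothetical, and the URL and token are the placeholders used elsewhere in this document. Nothing is sent until the request is passed to urllib.request.urlopen.

```python
import json
import urllib.request


def build_request(endpoint_url: str, token: str, inputs: dict) -> urllib.request.Request:
    """Wrap `inputs` in the top-level object and attach the Bearer token."""
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values only; constructing the Request does not contact the network.
req = build_request(
    "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud",
    "YOUR_HF_TOKEN",
    {"prompt": "cinematic sunset over mountains", "outputs": ["gif"]},
)
# To actually call the endpoint:
#   with urllib.request.urlopen(req) as resp:
#       response = json.load(resp)
```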

Core Fields

| Field | Type | Default | Description |
|---|---|---|---|
| prompt | string | required | Text prompt describing the video. |
| negative_prompt | string | "" | Things you want to avoid. |
| num_frames | int | 32 | Number of frames to generate. |
| fps | int | 12 | Playback FPS for GIF/WebM (may be overridden per output). |
| height | int | 512 | Frame height in pixels. |
| width | int | 512 | Frame width in pixels. |
| seed | int | null | Seed for reproducibility. |
| outputs | array | ["gif"] | Any subset of ["gif","webm","zip"]. |
| return_base64 | bool | true | If true, returns file contents as base64 strings. |
| num_inference_steps | int | 30 | More steps can improve quality but increase latency. |
| guidance_scale | float | 7.5 | Prompt adherence strength (higher = more literal). |
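A client can apply these defaults and catch obvious mistakes before sending a request. The following is a minimal sketch, not the code in handler.py: the names DEFAULTS, ALLOWED_OUTPUTS, and normalize_inputs are hypothetical, and the handler may validate differently.

```python
import copy

# Defaults copied from the Core Fields table; `prompt` has no default.
DEFAULTS = {
    "negative_prompt": "",
    "num_frames": 32,
    "fps": 12,
    "height": 512,
    "width": 512,
    "seed": None,
    "outputs": ["gif"],
    "return_base64": True,
    "num_inference_steps": 30,
    "guidance_scale": 7.5,
}
ALLOWED_OUTPUTS = {"gif", "webm", "zip"}


def normalize_inputs(inputs: dict) -> dict:
    """Return a copy of `inputs` with documented defaults filled in."""
    prompt = inputs.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("'prompt' is required and must be a non-empty string")
    # Caller-supplied values win over defaults.
    merged = {**copy.deepcopy(DEFAULTS), **inputs}
    unknown = set(merged["outputs"]) - ALLOWED_OUTPUTS
    if unknown:
        raise ValueError(f"unsupported outputs: {sorted(unknown)}")
    return merged


normalized = normalize_inputs({"prompt": "ocean waves", "num_frames": 16})
```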

Output Configuration

You can optionally include per-output options inside inputs, keyed by the output name.

GIF options

"gif": { "fps": 10 }

WebM options

"webm": { "fps": 24, "quality": "good" }

Quality values:

  • "fast" — fastest encode
  • "good" — balanced (recommended)
  • "best" — higher quality, slower encode
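To illustrate how a per-output fps could interact with the top-level fps field, here is a small sketch. The precedence shown (per-output value, then top-level fps, then the documented default of 12) is an assumption based on the "may be overridden per output" note; the handler's actual resolution logic is not shown in this document. The names effective_fps and WEBM_QUALITIES are hypothetical.

```python
WEBM_QUALITIES = {"fast", "good", "best"}


def effective_fps(inputs: dict, output: str, default_fps: int = 12) -> int:
    """Assumed precedence: inputs[output]["fps"] > inputs["fps"] > default."""
    per_output = inputs.get(output) or {}
    return int(per_output.get("fps", inputs.get("fps", default_fps)))


def check_webm_quality(inputs: dict) -> None:
    """Reject quality values other than the three documented ones."""
    quality = (inputs.get("webm") or {}).get("quality")
    if quality is not None and quality not in WEBM_QUALITIES:
        raise ValueError(f"webm.quality must be one of {sorted(WEBM_QUALITIES)}")


example = {
    "fps": 12,
    "gif": {"fps": 10},
    "webm": {"fps": 24, "quality": "good"},
}
gif_fps = effective_fps(example, "gif")    # per-output override applies
webm_fps = effective_fps(example, "webm")  # per-output override applies
zip_fps = effective_fps(example, "zip")    # falls back to the top-level fps
check_webm_quality(example)
```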

ZIP output

ZIP output contains PNG frames:

frame_000000.png
frame_000001.png
...
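Because the frame names are zero-padded, sorting them as plain strings gives playback order. The sketch below lists frames from ZIP bytes held in memory; list_frames is a hypothetical helper, and the demo archive contains placeholder bytes rather than real PNG data.

```python
import io
import zipfile


def list_frames(zip_bytes: bytes) -> list:
    """Return frame file names from the archive in playback order."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return sorted(
            name for name in zf.namelist()
            if name.startswith("frame_") and name.endswith(".png")
        )


# Build a tiny stand-in archive, written out of order on purpose.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for index in (1, 0):
        zf.writestr(f"frame_{index:06d}.png", b"placeholder, not real PNG data")

frames = list_frames(buf.getvalue())
```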

Response Format

The handler returns JSON. On success:

{
  "ok": true,
  "outputs": {
    "gif_base64": "...",
    "webm_base64": "...",
    "zip_base64": "..."
  },
  "diagnostics": {
    "timing_ms": { ... },
    "generator": { ... }
  }
}

On error:

{
  "ok": false,
  "error": "human readable error message",
  "diagnostics": { ... }
}
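A client can branch on the ok field before touching the outputs. The following sketch decodes every *_base64 entry of a parsed response and raises on the error shape shown above; EndpointError and decode_outputs are hypothetical names, and the demo response is hand-built rather than real endpoint output.

```python
import base64


class EndpointError(RuntimeError):
    """Raised when the response carries "ok": false."""


def decode_outputs(response: dict) -> dict:
    """Map e.g. "gif_base64" -> "gif" and return the decoded bytes."""
    if not response.get("ok"):
        raise EndpointError(response.get("error", "unknown error"))
    decoded = {}
    for key, value in response.get("outputs", {}).items():
        if key.endswith("_base64"):
            decoded[key[: -len("_base64")]] = base64.b64decode(value)
    return decoded


# Hand-built example response; the payload is just the GIF magic bytes.
demo_response = {
    "ok": True,
    "outputs": {"gif_base64": base64.b64encode(b"GIF89a").decode("ascii")},
}
files = decode_outputs(demo_response)
```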

Example curl Commands (Direct-to-file)

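These examples pipe the JSON response through jq and base64 so that only the decoded file is written to disk; the JSON itself is not saved.

Important: jq -er exits with a non-zero status when the output key is missing or null, which is what happens when the API returns an error. The shell redirect still creates the output file in that case, and the pipeline as a whole reports jq's failure only if pipefail is enabled (set -o pipefail in bash), so check the exit status before using the file.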

Replace YOUR_HF_TOKEN and your endpoint URL as needed.


1) GIF → output.gif

curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "cinematic sunset over mountains, slow pan",
      "num_frames": 20,
      "fps": 10,
      "outputs": ["gif"]
    }
  }' \
| jq -er '.outputs.gif_base64' \
| base64 --decode > output.gif

2) WebM → output.webm

curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "a drone flying through clouds, volumetric lighting",
      "num_frames": 32,
      "fps": 24,
      "outputs": ["webm"],
      "webm": { "quality": "good" }
    }
  }' \
| jq -er '.outputs.webm_base64' \
| base64 --decode > output.webm

3) ZIP (frames) → frames.zip

curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "ocean waves crashing in slow motion",
      "num_frames": 16,
      "outputs": ["zip"]
    }
  }' \
| jq -er '.outputs.zip_base64' \
| base64 --decode > frames.zip

Unzip frames:

unzip frames.zip

4) Multi-output (GIF + WebM + ZIP)

curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "epic cinematic space nebula, slow parallax motion",
      "num_frames": 24,
      "fps": 12,
      "outputs": ["gif", "webm", "zip"],
      "gif":  { "fps": 10 },
      "webm": { "fps": 24, "quality": "good" }
    }
  }' \
  -o response.json

Extract:

jq -er '.outputs.gif_base64'  response.json | base64 --decode > output.gif
jq -er '.outputs.webm_base64' response.json | base64 --decode > output.webm
jq -er '.outputs.zip_base64'  response.json | base64 --decode > frames.zip
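The same extraction can be done in Python when jq is not available. This is a sketch under the response format documented above; save_outputs and FILENAMES are hypothetical names, and the demo writes a hand-built response.json into a temporary directory instead of calling the endpoint.

```python
import base64
import json
import tempfile
from pathlib import Path

# Output file names mirror the jq commands above.
FILENAMES = {
    "gif_base64": "output.gif",
    "webm_base64": "output.webm",
    "zip_base64": "frames.zip",
}


def save_outputs(response_path: str, out_dir: str = ".") -> list:
    """Decode each present output from a saved response and write it to disk."""
    response = json.loads(Path(response_path).read_text(encoding="utf-8"))
    if not response.get("ok"):
        raise RuntimeError(response.get("error", "endpoint returned ok=false"))
    written = []
    for key, filename in FILENAMES.items():
        encoded = response.get("outputs", {}).get(key)
        if encoded is None:
            continue  # this format was not requested
        target = Path(out_dir) / filename
        target.write_bytes(base64.b64decode(encoded))
        written.append(target)
    return written


# Demo with a hand-built response containing only a GIF payload.
with tempfile.TemporaryDirectory() as tmp:
    saved = Path(tmp) / "response.json"
    payload = base64.b64encode(b"GIF89a").decode("ascii")
    saved.write_text(json.dumps({"ok": True, "outputs": {"gif_base64": payload}}))
    written_names = [p.name for p in save_outputs(str(saved), tmp)]
    gif_bytes = (Path(tmp) / "output.gif").read_bytes()
```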

Troubleshooting

“Corrupted” output files

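If a decoded file will not open, the response was most likely an error object rather than a success payload. Save the response with -o response.json (as in example 4) and inspect it:

jq . response.json

Confirm that it contains:

"ok": true

If "ok" is false, the error field describes what went wrong.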

Large outputs

If the response is too large, reduce:

  • num_frames
  • height / width

Alternatively, modify the handler to upload the outputs to cloud storage and return a download URL.
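A quick estimate shows why these two knobs matter. Base64 text is 4 characters for every 3 bytes of binary data, and the raw pixel data scales with num_frames × height × width. The sketch below uses uncompressed 8-bit RGB as a rough proxy; real GIF, WebM, and PNG outputs are compressed, so treat the numbers as relative rather than as actual response sizes. Both function names are hypothetical.

```python
import base64


def raw_rgb_bytes(num_frames: int, height: int, width: int) -> int:
    """Uncompressed 8-bit RGB size: 3 bytes per pixel per frame."""
    return num_frames * height * width * 3


def base64_length(n_bytes: int) -> int:
    """Length of padded base64 text for n_bytes of binary data."""
    return 4 * ((n_bytes + 2) // 3)


full = raw_rgb_bytes(32, 512, 512)  # documented defaults
half = raw_rgb_bytes(32, 256, 256)  # height and width halved
ratio = full // half                # halving both dimensions quarters the data

# Cross-check the length formula against the standard library.
encoded_len = len(base64.b64encode(bytes(10)))
```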


Repository Notes

This repo is designed for Hugging Face Inference Endpoints with a custom handler.

Key files:

  • handler.py — request parsing, model invocation, output encoding
  • requirements.txt — Python dependencies

If your model lives in a subdirectory, set the environment variable:

HF_MODEL_SUBDIR
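How handler.py consumes HF_MODEL_SUBDIR is not shown in this document. One plausible reading, sketched below, is that the value is a path relative to the repository root that the handler joins onto the model path it receives at startup. The helper name resolve_model_dir, the /repository root, and the ltx-video subdirectory are all illustrative assumptions.

```python
import os
from pathlib import Path


def resolve_model_dir(repo_root: str, env=os.environ) -> Path:
    """Assumed behavior: join HF_MODEL_SUBDIR onto the repo root when it is set."""
    subdir = env.get("HF_MODEL_SUBDIR", "").strip()
    return Path(repo_root) / subdir if subdir else Path(repo_root)


# Passing a plain dict keeps the demo independent of the real environment.
with_subdir = resolve_model_dir("/repository", {"HF_MODEL_SUBDIR": "ltx-video"})
without_subdir = resolve_model_dir("/repository", {})
```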

Security Notes

  • Do not commit secrets or tokens into this repository.
  • Use Endpoint Secrets / Environment Variables for credentials.
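On the client side, the same principle means reading the token from the environment instead of hard-coding it in scripts. A minimal sketch follows, assuming the token is stored in a variable named HF_TOKEN; the variable name and the load_token helper are illustrative choices, not something this repository prescribes.

```python
import os


def load_token(var_name: str = "HF_TOKEN", env=os.environ) -> str:
    """Fetch the token from the environment and fail loudly if it is absent."""
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"set {var_name} in the environment before calling the endpoint")
    return token


# Demo with an explicit mapping so no real secret is needed here.
header_value = f"Bearer {load_token(env={'HF_TOKEN': 'YOUR_HF_TOKEN'})}"
```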

License

Specify your license here (e.g., MIT, Apache-2.0).