Video Generation Endpoint API (Custom Handler)

This repository is configured for deployment as a Hugging Face Inference Endpoint using a custom handler.py. The endpoint generates a short video from a text prompt and can return the result as:

GIF (preview-friendly)
WebM (higher quality, better compression)
ZIP of PNG frames (maximum control / post-processing)

Endpoint URL

After deployment, your endpoint URL will look like:

https://<your-endpoint>.<region>.aws.endpoints.huggingface.cloud

Example:

https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud

Authentication

All requests require a Hugging Face token with permission to call the endpoint.

Send it as a Bearer token:

Authorization: Bearer YOUR_HF_TOKEN

Request Format

Hugging Face endpoint requests should be wrapped in a top-level inputs object:

{
  "inputs": {
    "prompt": "cinematic sunset over mountains",
    "outputs": ["gif"]
  }
}
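As a rough illustration of the two sections above, the sketch below builds the same request in Python using only the standard library. The function name `build_request` and the placeholder URL are assumptions made for this example; the only parts taken from this README are the Bearer header and the top-level `inputs` wrapper.

```python
import json
import urllib.request


def build_request(endpoint_url: str, token: str, inputs: dict) -> urllib.request.Request:
    """Wrap `inputs` in the top-level object and attach the Bearer token.

    Hypothetical helper for illustration; not part of handler.py.
    """
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_request(
    "https://example.invalid",  # placeholder, replace with your endpoint URL
    "YOUR_HF_TOKEN",
    {"prompt": "cinematic sunset over mountains", "outputs": ["gif"]},
)
# Sending is left out on purpose: urllib.request.urlopen(req) would perform the call.
```

Building the request separately from sending it makes it easy to inspect the exact JSON body before spending GPU time on a call.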

Core Fields

| Field | Type | Default | Description |
|---|---|---|---|
| `prompt` | string | required | Text prompt describing the video. |
| `negative_prompt` | string | `""` | Things you want to avoid. |
| `num_frames` | int | `32` | Number of frames to generate. |
| `fps` | int | `12` | Playback FPS for GIF/WebM (may be overridden per output). |
| `height` | int | `512` | Frame height in pixels. |
| `width` | int | `512` | Frame width in pixels. |
| `seed` | int | `null` | Seed for reproducibility. |
| `outputs` | array | `["gif"]` | Any subset of `["gif","webm","zip"]`. |
| `return_base64` | bool | `true` | If true, returns file contents as base64 strings. |
| `num_inference_steps` | int | `30` | More steps can improve quality but increase latency. |
| `guidance_scale` | float | `7.5` | Prompt adherence strength (higher = more literal). |
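A client can apply the same defaults locally to catch mistakes before calling the endpoint. The sketch below is one possible way to do that. `DEFAULTS` mirrors the table above, while `normalize_inputs` and its error messages are hypothetical and say nothing about how handler.py validates requests.

```python
# Defaults copied from the Core Fields table; `seed` null maps to None.
DEFAULTS = {
    "negative_prompt": "",
    "num_frames": 32,
    "fps": 12,
    "height": 512,
    "width": 512,
    "seed": None,
    "outputs": ["gif"],
    "return_base64": True,
    "num_inference_steps": 30,
    "guidance_scale": 7.5,
}
ALLOWED_OUTPUTS = {"gif", "webm", "zip"}


def normalize_inputs(inputs: dict) -> dict:
    """Return a copy of `inputs` with table defaults filled in (client-side only)."""
    prompt = inputs.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("'prompt' is required and must be a non-empty string")
    merged = {**DEFAULTS, **inputs}
    merged["outputs"] = list(merged["outputs"])  # avoid sharing the default list
    unknown = set(merged["outputs"]) - ALLOWED_OUTPUTS
    if unknown:
        raise ValueError(f"unsupported outputs: {sorted(unknown)}")
    return merged


full = normalize_inputs({"prompt": "ocean waves", "num_frames": 16})
# full["num_frames"] is 16 (caller value), full["fps"] is 12 (table default)
```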

Output Configuration

Per-output options are optional. Add them inside inputs under a key named after the output format.

GIF options

"gif": { "fps": 10 }

WebM options

"webm": { "fps": 24, "quality": "good" }

Quality values:

"fast" — fastest encode
"good" — balanced (recommended)
"best" — higher quality, slower encode

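The Core Fields table says the top-level `fps` may be overridden per output. One plausible reading of that rule is sketched below: a per-output `fps` wins, then the top-level `fps`, then the documented default of 12. The function `effective_fps` is hypothetical, and the precedence actually used by handler.py is not shown in this README.

```python
def effective_fps(inputs: dict, output: str, default_fps: int = 12) -> int:
    """Resolve the FPS for one output format (assumed precedence, see lead-in)."""
    per_output = inputs.get(output) or {}
    return int(per_output.get("fps", inputs.get("fps", default_fps)))


example_inputs = {
    "prompt": "epic cinematic space nebula",
    "fps": 12,
    "outputs": ["gif", "webm"],
    "gif": {"fps": 10},
    "webm": {"quality": "good"},
}
gif_fps = effective_fps(example_inputs, "gif")    # 10, from the "gif" options
webm_fps = effective_fps(example_inputs, "webm")  # 12, falls back to top-level fps
```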
ZIP output

ZIP output contains PNG frames:

frame_000000.png
frame_000001.png
...
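Because the frame names are zero-padded, sorting them as plain strings gives playback order. The sketch below reads the frames back from the decoded ZIP bytes with the standard `zipfile` module. `read_frames` is a hypothetical helper, and it assumes the archive contains only the PNG frames listed above at the top level.

```python
import io
import zipfile


def read_frames(zip_bytes: bytes) -> list[tuple[str, bytes]]:
    """Return (name, png_bytes) pairs in frame order from an in-memory ZIP."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = sorted(n for n in archive.namelist() if n.endswith(".png"))
        return [(name, archive.read(name)) for name in names]
```

Working from bytes in memory avoids writing the archive to disk when the frames go straight into further processing.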

Response Format

The handler returns JSON. On success:

{
  "ok": true,
  "outputs": {
    "gif_base64": "...",
    "webm_base64": "...",
    "zip_base64": "..."
  },
  "diagnostics": {
    "timing_ms": { ... },
    "generator": { ... }
  }
}

On error:

{
  "ok": false,
  "error": "human readable error message",
  "diagnostics": { ... }
}
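A client should check `ok` before touching any output key. The sketch below shows one way to turn a parsed response into raw bytes per format. `decode_outputs` is a hypothetical helper; it relies only on the `ok`, `error`, and `*_base64` keys shown above.

```python
import base64


def decode_outputs(response: dict) -> dict[str, bytes]:
    """Map e.g. "gif_base64" to {"gif": <bytes>}; raise if the handler reported an error."""
    if not response.get("ok"):
        raise RuntimeError(response.get("error", "unknown error"))
    files = {}
    for key, value in response.get("outputs", {}).items():
        if key.endswith("_base64") and isinstance(value, str):
            files[key[: -len("_base64")]] = base64.b64decode(value)
    return files
```

Raising on `ok: false` keeps an error message from ever being written to disk as if it were a video file.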

Example curl Commands (Direct-to-file)

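Examples 1 to 3 pipe the JSON response straight through jq and base64, so only the decoded file is written to disk.

Important: jq -e makes jq exit with a non-zero status when the selected key is missing or null, and -r prints the raw string without JSON quotes. On its own this does not stop the pipeline: the shell still creates the output file, and base64 decodes whatever jq printed (the literal text null when the key is missing). In bash, run set -o pipefail first so the pipeline reports jq's failure, and delete the output file whenever the command fails.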

Replace YOUR_HF_TOKEN and your endpoint URL as needed.

1) GIF → `output.gif`

curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "cinematic sunset over mountains, slow pan",
      "num_frames": 20,
      "fps": 10,
      "outputs": ["gif"]
    }
  }' \
| jq -er '.outputs.gif_base64' \
| base64 --decode > output.gif

2) WebM → `output.webm`

curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "a drone flying through clouds, volumetric lighting",
      "num_frames": 32,
      "fps": 24,
      "outputs": ["webm"],
      "webm": { "quality": "good" }
    }
  }' \
| jq -er '.outputs.webm_base64' \
| base64 --decode > output.webm

3) ZIP (frames) → `frames.zip`

curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "ocean waves crashing in slow motion",
      "num_frames": 16,
      "outputs": ["zip"]
    }
  }' \
| jq -er '.outputs.zip_base64' \
| base64 --decode > frames.zip

Unzip frames:

unzip frames.zip

4) Multi-output (GIF + WebM + ZIP)

curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "epic cinematic space nebula, slow parallax motion",
      "num_frames": 24,
      "fps": 12,
      "outputs": ["gif", "webm", "zip"],
      "gif":  { "fps": 10 },
      "webm": { "fps": 24, "quality": "good" }
    }
  }' \
  -o response.json

Extract:

jq -er '.outputs.gif_base64'  response.json | base64 --decode > output.gif
jq -er '.outputs.webm_base64' response.json | base64 --decode > output.webm
jq -er '.outputs.zip_base64'  response.json | base64 --decode > frames.zip

Troubleshooting

“Corrupted” output files

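A file that will not open usually means the API returned an error and the expected output key was missing. Save the response with -o response.json (as in example 4) and inspect it before decoding:

jq . response.json

Confirm that "ok" is true and that the key you are decoding (for example gif_base64) is present under "outputs".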

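If the response reports success but a decoded file still fails to open, checking its first bytes can tell a truncated or mis-decoded file from a player problem. The sketch below compares against the standard signatures for GIF, WebM (an EBML container), and ZIP. `looks_like` is a hypothetical helper, and a matching signature does not prove the rest of the file is intact.

```python
# Leading bytes of each format: GIF87a/GIF89a, the EBML header, ZIP local file header.
SIGNATURES = {
    "gif": (b"GIF87a", b"GIF89a"),
    "webm": (b"\x1a\x45\xdf\xa3",),
    "zip": (b"PK\x03\x04",),
}


def looks_like(kind: str, data: bytes) -> bool:
    """Return True when `data` starts with a known signature for `kind`."""
    return data.startswith(SIGNATURES[kind])
```

For instance, the three bytes produced by decoding the text `null` fail all three checks.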
Large outputs

Reduce:

num_frames
height / width

Or modify the handler to upload to cloud storage and return a download URL.
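To see why these two settings matter, it helps to look at how the amount of pixel data scales and how much base64 adds on top. The sketch below is a back-of-the-envelope estimate only: it counts uncompressed RGB bytes, which is an assumption for illustration, whereas the real GIF, WebM, and PNG files are compressed and their sizes depend on the content.

```python
import math


def raw_rgb_bytes(num_frames: int, height: int, width: int) -> int:
    """Uncompressed 8-bit RGB size: 3 bytes per pixel per frame."""
    return num_frames * height * width * 3


def base64_length(n_bytes: int) -> int:
    """Length of padded base64 text for n_bytes of input (4 chars per 3 bytes)."""
    return 4 * math.ceil(n_bytes / 3)


default_size = raw_rgb_bytes(32, 512, 512)  # 25165824 bytes at the table defaults
smaller_size = raw_rgb_bytes(32, 256, 256)  # 6291456 bytes, one quarter of the above
overhead = base64_length(default_size) / default_size  # about 1.33
```

Halving both height and width cuts the pixel data to a quarter, while halving `num_frames` only halves it, and base64 then adds roughly a third to whatever the encoder produces.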

Repository Notes

This repo is designed for Hugging Face Inference Endpoints with a custom handler.

Key files:

handler.py — request parsing, model invocation, output encoding
requirements.txt — Python dependencies

If your model lives in a subdirectory, set the environment variable:

HF_MODEL_SUBDIR
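This README does not show how handler.py consumes the variable. As a sketch of one common pattern, the code below joins the subdirectory onto a base path when the variable is set and falls back to the base path otherwise. `resolve_model_dir` and the `/repository` base path are assumptions made for this example.

```python
import os
from pathlib import Path


def resolve_model_dir(base_dir: str, env=None) -> Path:
    """Return base_dir/HF_MODEL_SUBDIR if the variable is set, else base_dir."""
    env = os.environ if env is None else env
    subdir = env.get("HF_MODEL_SUBDIR", "").strip()
    return Path(base_dir) / subdir if subdir else Path(base_dir)


# Passing a dict instead of os.environ keeps the example independent of the host.
with_subdir = resolve_model_dir("/repository", {"HF_MODEL_SUBDIR": "models/ltx"})
without_subdir = resolve_model_dir("/repository", {})
```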

Security Notes

Do not commit secrets or tokens into this repository.
Use Endpoint Secrets / Environment Variables for credentials.

License

Specify your license here (e.g., MIT, Apache-2.0).