# Video Generation Endpoint API (Custom Handler)

This repository is configured for deployment as a Hugging Face Inference Endpoint using a custom `handler.py`. The endpoint generates a short video from a text prompt and can return the result as:
- GIF (preview-friendly)
- WebM (higher quality, better compression)
- ZIP of PNG frames (maximum control / post-processing)
## Endpoint URL

After deployment, your endpoint URL will look like:

```
https://<your-endpoint>.aws.endpoints.huggingface.cloud
```

Example:

```
https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud
```
## Authentication

All requests require a Hugging Face token with permission to call the endpoint. Send it as a Bearer token:

```
Authorization: Bearer YOUR_HF_TOKEN
```
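To keep the token out of individual commands, you can store it in an environment variable and build the header from it. The variable name `HF_TOKEN` below is just a local convention, not something the endpoint requires:

```bash
# Keep the token in the environment rather than pasted into each command.
export HF_TOKEN="YOUR_HF_TOKEN"

# The header every request needs:
printf 'Authorization: Bearer %s\n' "$HF_TOKEN"
```

In the curl examples below you would then pass `-H "Authorization: Bearer $HF_TOKEN"`.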
## Request Format

Request bodies must wrap all parameters in a top-level `inputs` object:

```json
{
  "inputs": {
    "prompt": "cinematic sunset over mountains",
    "outputs": ["gif"]
  }
}
```
### Core Fields

| Field | Type | Default | Description |
|---|---|---|---|
| `prompt` | string | required | Text prompt describing the video. |
| `negative_prompt` | string | `""` | Things you want to avoid. |
| `num_frames` | int | `32` | Number of frames to generate. |
| `fps` | int | `12` | Playback FPS for GIF/WebM (may be overridden per output). |
| `height` | int | `512` | Frame height. |
| `width` | int | `512` | Frame width. |
| `seed` | int | `null` | Seed for reproducibility. |
| `outputs` | array | `["gif"]` | Any subset of `["gif", "webm", "zip"]`. |
| `return_base64` | bool | `true` | If `true`, returns file contents as base64 strings. |
| `num_inference_steps` | int | `30` | More steps can improve quality but increase latency. |
| `guidance_scale` | float | `7.5` | Prompt-adherence strength (higher = more literal). |
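As a sketch, the fields above can be combined into a single payload. Building it with `jq -n` keeps shell quoting safe when the prompt itself contains quotes; all values here are illustrative, not recommendations:

```bash
# Construct a full request body from the documented fields.
jq -n --arg prompt "cinematic sunset over mountains" '{
  inputs: {
    prompt: $prompt,
    negative_prompt: "blurry, low quality",
    num_frames: 24,
    fps: 12,
    height: 512,
    width: 512,
    seed: 42,
    outputs: ["gif", "webm"],
    num_inference_steps: 30,
    guidance_scale: 7.5
  }
}'
```

The resulting JSON can be passed to curl via `-d @-` or saved to a file.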
## Output Configuration

You can optionally include per-output options inside `inputs`.

### GIF options

```json
"gif": { "fps": 10 }
```

### WebM options

```json
"webm": { "fps": 24, "quality": "good" }
```

Quality values:

- `"fast"`: fastest encode
- `"good"`: balanced (recommended)
- `"best"`: higher quality, slower encode
### ZIP output

The ZIP output contains individual PNG frames:

```
frame_000000.png
frame_000001.png
...
```
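Because the frame numbers are zero-padded to six digits, plain lexicographic sorting (e.g. `ls` or `sort`) already yields playback order. The naming pattern is equivalent to:

```bash
# Six-digit zero padding keeps filename order equal to frame order.
printf 'frame_%06d.png\n' 0 1 10
# → frame_000000.png
#   frame_000001.png
#   frame_000010.png
```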
## Response Format

The handler returns JSON. On success:

```json
{
  "ok": true,
  "outputs": {
    "gif_base64": "...",
    "webm_base64": "...",
    "zip_base64": "..."
  },
  "diagnostics": {
    "timing_ms": { ... },
    "generator": { ... }
  }
}
```

On error:

```json
{
  "ok": false,
  "error": "human readable error message",
  "diagnostics": { ... }
}
```
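Before decoding anything, a caller can branch on `ok`. Since `jq -e` maps `false` and `null` results to a nonzero exit status, it works directly in an `if`; here is a sketch against a hand-written stand-in for an error response:

```bash
# Stand-in for an error response from the endpoint.
response='{"ok": false, "error": "generation failed", "diagnostics": {}}'

if printf '%s' "$response" | jq -e '.ok' > /dev/null; then
  echo "success"
else
  # Surface the server-side message instead of decoding missing outputs.
  printf '%s' "$response" | jq -r '.error'
fi
# → generation failed
```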
## Example curl Commands (Direct-to-file)

These examples write only the decoded file (extracted from the base64 field in the JSON response) without saving the JSON to disk.

Important: we use `jq -er` so that jq exits nonzero when the output key is missing or null. Note that in a plain pipeline the jq exit status is masked by `base64`; run with `set -o pipefail` (bash/zsh) if you want the whole command to fail, and consider appending `// empty` to the jq filter so that the literal text `null` is never piped into `base64 --decode`.

Replace `YOUR_HF_TOKEN` and the endpoint URL as needed.
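To see why those guards matter, here is the failure path exercised against a hand-written error response: with `// empty` the filter produces no output for a missing key, `jq -e` exits nonzero, and `set -o pipefail` propagates that status through the pipeline (assumes bash or zsh):

```bash
set -o pipefail

# Stand-in for an error response that has no outputs at all.
response='{"ok": false, "error": "out of memory"}'

if printf '%s' "$response" | jq -er '.outputs.gif_base64 // empty' | base64 --decode > /dev/null; then
  echo "decoded a gif"
else
  echo "no gif_base64 in response, nothing decoded"
fi
```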
### 1) GIF → output.gif

```bash
curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "cinematic sunset over mountains, slow pan",
      "num_frames": 20,
      "fps": 10,
      "outputs": ["gif"]
    }
  }' \
  | jq -er '.outputs.gif_base64' \
  | base64 --decode > output.gif
```
### 2) WebM → output.webm

```bash
curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "a drone flying through clouds, volumetric lighting",
      "num_frames": 32,
      "fps": 24,
      "outputs": ["webm"],
      "webm": { "quality": "good" }
    }
  }' \
  | jq -er '.outputs.webm_base64' \
  | base64 --decode > output.webm
```
### 3) ZIP (frames) → frames.zip

```bash
curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "ocean waves crashing in slow motion",
      "num_frames": 16,
      "outputs": ["zip"]
    }
  }' \
  | jq -er '.outputs.zip_base64' \
  | base64 --decode > frames.zip
```

Unzip the frames:

```bash
unzip frames.zip
```
### 4) Multi-output (GIF + WebM + ZIP)

```bash
curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "epic cinematic space nebula, slow parallax motion",
      "num_frames": 24,
      "fps": 12,
      "outputs": ["gif", "webm", "zip"],
      "gif": { "fps": 10 },
      "webm": { "fps": 24, "quality": "good" }
    }
  }' \
  -o response.json
```

Extract each file:

```bash
jq -er '.outputs.gif_base64' response.json | base64 --decode > output.gif
jq -er '.outputs.webm_base64' response.json | base64 --decode > output.webm
jq -er '.outputs.zip_base64' response.json | base64 --decode > frames.zip
```
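The three extractions can also be wrapped in a small helper that only decodes keys actually present in `response.json`, so a partial response does not produce empty or garbage files. The `extract` function name is mine, not part of the API:

```bash
extract() {
  key=$1; file=$2
  # Skip keys the response does not contain (jq -e exits nonzero on null).
  jq -e ".outputs.${key}" response.json > /dev/null 2>&1 || return 0
  jq -r ".outputs.${key}" response.json | base64 --decode > "$file"
  echo "wrote $file"
}

extract gif_base64  output.gif
extract webm_base64 output.webm
extract zip_base64  frames.zip
```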
## Troubleshooting

### "Corrupted" output files

Inspect the JSON before decoding anything:

```bash
jq . response.json
```

Ensure the response contains `"ok": true` before extracting any of the base64 fields.
### Large outputs

If the response is too large, reduce:

- `num_frames`
- `height` / `width`

Or modify the handler to upload results to cloud storage and return a download URL instead of base64 payloads.
## Repository Notes

This repo is designed for Hugging Face Inference Endpoints with a custom handler.

Key files:

- `handler.py`: request parsing, model invocation, output encoding
- `requirements.txt`: Python dependencies

If your model lives in a subdirectory, set the `HF_MODEL_SUBDIR` environment variable.
## Security Notes

- Do not commit secrets or tokens into this repository.
- Use Endpoint Secrets / Environment Variables for credentials.

## License

Specify your license here (e.g., MIT, Apache-2.0).