# Video Generation Endpoint API (Custom Handler)

This repository is configured for deployment as a **Hugging Face Inference Endpoint** using a **custom `handler.py`**. The endpoint generates a short video from a text prompt and can return the result as:

- **GIF** (preview-friendly)
- **WebM** (higher quality, better compression)
- **ZIP of PNG frames** (maximum control / post-processing)

---

## Endpoint URL

After deployment, your endpoint URL will look like:

```
https://<your-endpoint>.<region>.aws.endpoints.huggingface.cloud
```

Example:

```
https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud
```

---

## Authentication

All requests require a Hugging Face token with permission to call the endpoint. Send it as a Bearer token in the `Authorization` header:

```
Authorization: Bearer YOUR_HF_TOKEN
```

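For Python clients, the header can be built from an environment variable so the token never appears in source code. This is a minimal sketch: the helper name `build_headers` and the variable name `HF_TOKEN` are assumptions chosen for illustration, not names defined by this repository.

```python
import os


def build_headers(token=None):
    """Return HTTP headers for an authenticated JSON request.

    Falls back to the HF_TOKEN environment variable (an assumed name)
    when no token is passed explicitly.
    """
    token = token or os.environ.get("HF_TOKEN")
    if not token:
        raise ValueError("No token provided and HF_TOKEN is not set")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```
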
---

## Request Format

Hugging Face endpoint requests should be wrapped in a top-level `inputs` object:

```json
{
  "inputs": {
    "prompt": "cinematic sunset over mountains",
    "outputs": ["gif"]
  }
}
```

### Core Fields

| Field | Type | Default | Description |
|------|------|---------|-------------|
| `prompt` | string | **required** | Text prompt describing the video. |
| `negative_prompt` | string | `""` | Things you want to avoid. |
| `num_frames` | int | `32` | Number of frames to generate. |
| `fps` | int | `12` | Playback FPS for GIF/WebM (may be overridden per output). |
| `height` | int | `512` | Frame height in pixels. |
| `width` | int | `512` | Frame width in pixels. |
| `seed` | int | `null` | Seed for reproducibility. |
| `outputs` | array | `["gif"]` | Any subset of `["gif","webm","zip"]`. |
| `return_base64` | bool | `true` | If true, returns file contents as base64 strings. |
| `num_inference_steps` | int | `30` | More steps can improve quality but increase latency. |
| `guidance_scale` | float | `7.5` | Prompt adherence strength (higher = more literal). |

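The request body can also be assembled programmatically. The following sketch wraps the fields from the table in the `inputs` object and rejects output names outside the documented set before anything is sent. The helper `build_payload` and the constant `ALLOWED_OUTPUTS` are hypothetical names; the handler's own validation may differ.

```python
ALLOWED_OUTPUTS = {"gif", "webm", "zip"}


def build_payload(prompt, outputs=("gif",), **options):
    """Wrap generation parameters in the top-level `inputs` object."""
    if not prompt:
        raise ValueError("prompt is required")
    unknown = set(outputs) - ALLOWED_OUTPUTS
    if unknown:
        raise ValueError(f"unsupported outputs: {sorted(unknown)}")
    inputs = {"prompt": prompt, "outputs": list(outputs)}
    # Extra fields (num_frames, fps, seed, ...) are passed through unchanged;
    # fields that are omitted fall back to the handler defaults in the table.
    inputs.update(options)
    return {"inputs": inputs}
```

For example, `build_payload("ocean waves", outputs=["zip"], num_frames=16)` produces the same structure as the JSON bodies used in the curl examples.
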
---

## Output Configuration

Per-output options are optional. Add them inside `inputs` under a key named after the output format.

### GIF options

```json
"gif": { "fps": 10 }
```

### WebM options

```json
"webm": { "fps": 24, "quality": "good" }
```

Quality values:

- `"fast"`: fastest encode
- `"good"`: balanced (recommended)
- `"best"`: higher quality, slower encode

### ZIP output

The ZIP archive contains the generated frames as numbered PNG files:

```
frame_000000.png
frame_000001.png
...
```

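Once the ZIP has been decoded to bytes, the frames can be read back in order without unpacking to disk. This sketch assumes the six-digit, zero-padded naming shown in the listing above, which makes a plain lexicographic sort match frame order. `frame_name` and `read_frames` are illustrative helpers, not part of the handler.

```python
import io
import zipfile


def frame_name(index):
    """Return the archive name for a frame, e.g. frame_000003.png."""
    return f"frame_{index:06d}.png"


def read_frames(zip_bytes):
    """Return (name, png_bytes) pairs from a frames ZIP, in frame order."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Zero-padded names sort lexicographically in frame order.
        names = sorted(n for n in archive.namelist() if n.endswith(".png"))
        return [(name, archive.read(name)) for name in names]
```
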
---

## Response Format

The handler returns JSON. On success:

```json
{
  "ok": true,
  "outputs": {
    "gif_base64": "...",
    "webm_base64": "...",
    "zip_base64": "..."
  },
  "diagnostics": {
    "timing_ms": { ... },
    "generator": { ... }
  }
}
```

On error:

```json
{
  "ok": false,
  "error": "human readable error message",
  "diagnostics": { ... }
}
```

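A client should check `ok` before touching `outputs`, since the error shape has no `outputs` key at all. The sketch below shows one way to do that in Python for an already-parsed response; `decode_outputs` is a hypothetical helper, and it assumes every output key ends in `_base64` as in the success example above.

```python
import base64


def decode_outputs(response):
    """Return {"gif": bytes, ...} from a parsed JSON response.

    Raises RuntimeError with the handler's message if `ok` is not true.
    """
    if not response.get("ok"):
        raise RuntimeError(response.get("error", "unknown error"))
    decoded = {}
    for key, value in response.get("outputs", {}).items():
        if key.endswith("_base64"):
            # Strip the suffix: "gif_base64" -> "gif".
            decoded[key[: -len("_base64")]] = base64.b64decode(value)
    return decoded
```
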
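---

## Example curl Commands (Direct-to-file)

Examples 1 to 3 write **only the decoded file** to disk: the JSON response is piped through `jq` and `base64` without being saved. Example 4 saves the JSON first because it extracts several files from one response.

> **Important:** `jq -e` exits with a non-zero status when the requested key is missing or `null`, which is what happens when the API returns an error. The shell reports only the last command's status unless `set -o pipefail` is enabled, and the redirect still creates the output file, so check the exit status before trusting the file.

Replace `YOUR_HF_TOKEN` and the endpoint URL with your own values.

---
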
### 1) GIF → `output.gif`

```bash
curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "cinematic sunset over mountains, slow pan",
      "num_frames": 20,
      "fps": 10,
      "outputs": ["gif"]
    }
  }' \
  | jq -er '.outputs.gif_base64' \
  | base64 --decode > output.gif
```

---

### 2) WebM → `output.webm`

```bash
curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "a drone flying through clouds, volumetric lighting",
      "num_frames": 32,
      "fps": 24,
      "outputs": ["webm"],
      "webm": { "quality": "good" }
    }
  }' \
  | jq -er '.outputs.webm_base64' \
  | base64 --decode > output.webm
```

---

### 3) ZIP (frames) → `frames.zip`

```bash
curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "ocean waves crashing in slow motion",
      "num_frames": 16,
      "outputs": ["zip"]
    }
  }' \
  | jq -er '.outputs.zip_base64' \
  | base64 --decode > frames.zip
```

Unzip the frames:

```bash
unzip frames.zip
```

---

### 4) Multi-output (GIF + WebM + ZIP)

```bash
curl -sS -X POST "https://cyjm1rsdzy6la31w.us-east-1.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "epic cinematic space nebula, slow parallax motion",
      "num_frames": 24,
      "fps": 12,
      "outputs": ["gif", "webm", "zip"],
      "gif": { "fps": 10 },
      "webm": { "fps": 24, "quality": "good" }
    }
  }' \
  -o response.json
```

Extract each file from the saved response:

```bash
jq -er '.outputs.gif_base64' response.json | base64 --decode > output.gif
jq -er '.outputs.webm_base64' response.json | base64 --decode > output.webm
jq -er '.outputs.zip_base64' response.json | base64 --decode > frames.zip
```

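The same multi-output flow can be written in Python with only the standard library. This is a sketch under stated assumptions: `post_json`, `save_outputs`, and the `FILENAMES` mapping are illustrative names, and the file names simply mirror the curl examples above. Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for non-2xx statuses, so an error body returned with such a status would need separate handling.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Output key -> file name, mirroring the curl examples.
FILENAMES = {
    "gif_base64": "output.gif",
    "webm_base64": "output.webm",
    "zip_base64": "frames.zip",
}


def post_json(url, token, payload):
    """POST a JSON payload to the endpoint and return the parsed response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


def save_outputs(response, directory="."):
    """Decode every returned output to disk and return the written paths."""
    if not response.get("ok"):
        raise RuntimeError(response.get("error", "unknown error"))
    written = []
    for key, filename in FILENAMES.items():
        if key in response.get("outputs", {}):
            path = Path(directory) / filename
            path.write_bytes(base64.b64decode(response["outputs"][key]))
            written.append(path)
    return written
```

A call such as `save_outputs(post_json(url, token, {"inputs": {...}}))` then replaces the separate `curl` and `jq` steps, and no file is written when the handler reports an error.
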
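---

## Troubleshooting

### "Corrupted" output files

A decoded file that will not open usually means the API returned an error instead of the expected output. Save the response to `response.json` (as in example 4) and inspect it first:

```bash
jq . response.json
```

Confirm that it contains:

```
"ok": true
```

If `ok` is `false`, the `error` field describes what went wrong.

### Large outputs

Base64 responses grow with the number and size of the frames. To shrink them, reduce:

- `num_frames`
- `height` / `width`

Or modify the handler to upload to cloud storage and return a download URL.
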
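To get a feel for how these parameters scale, the arithmetic below estimates the uncompressed frame data and the overhead of base64, which emits 4 characters for every 3 bytes. It is only a rough guide, assuming 8-bit RGB frames: GIF, WebM, and PNG are all compressed, so real responses will be smaller. Both function names are illustrative.

```python
def raw_frame_bytes(num_frames, height, width, channels=3):
    """Uncompressed size of all frames, assuming 8 bits per channel."""
    return num_frames * height * width * channels


def base64_length(num_bytes):
    """Length in characters of the padded base64 encoding of num_bytes."""
    return 4 * ((num_bytes + 2) // 3)
```

With the defaults (32 frames at 512×512), `raw_frame_bytes(32, 512, 512)` is 25,165,824 bytes, and base64 would turn that many bytes into 33,554,432 characters. Halving both `height` and `width` cuts the raw figure to a quarter.
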
---

## Repository Notes

This repo is designed for Hugging Face Inference Endpoints with a custom handler.

Key files:

- `handler.py`: request parsing, model invocation, output encoding
- `requirements.txt`: Python dependencies

If your model lives in a subdirectory, set this environment variable to point at it:

```
HF_MODEL_SUBDIR
```

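How `handler.py` reads this variable is not shown here. As one plausible sketch, a custom handler (which receives the repository path in `EndpointHandler.__init__(self, path)`) could join the subdirectory onto that path. The function `resolve_model_dir` and the treatment of the value as a path relative to the repository root are assumptions.

```python
import os


def resolve_model_dir(path, environ=os.environ):
    """Return the directory to load the model from.

    `path` is the repository root passed to the handler; the optional
    HF_MODEL_SUBDIR value is joined onto it when set and non-empty.
    """
    subdir = environ.get("HF_MODEL_SUBDIR", "").strip()
    return os.path.join(path, subdir) if subdir else path
```
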
---

## Security Notes

- Do not commit secrets or tokens to this repository.
- Use Endpoint Secrets / Environment Variables for credentials.

---

## License

Specify your license here (e.g., MIT, Apache-2.0).