# REST API & Automation (Quick Reference)

LightDiffusion-Next ships with a FastAPI service (`server.py`) that sits in front of the shared pipeline. It batches compatible requests, streams telemetry and exposes health probes so you can plug the system into automation workflows, bots or orchestrators.

## Common endpoints

| Method | Path | Description |
| --- | --- | --- |
| `GET` | `/health` | Lightweight readiness probe. Returns `{ "status": "ok" }` when the server is reachable. |
| `GET` | `/api/telemetry` | Queue and VRAM telemetry: batching stats, pending requests, cache state, uptime. |
| `POST` | `/api/generate` | Submit a generation job. Requests are buffered, batched when signatures match and resolved asynchronously. |
The service listens on port `7861` by default. Launch it with:

```fish
uvicorn server:app --host 0.0.0.0 --port 7861
```
## Payload schema (`/api/generate`)

```json
{
  "prompt": "string",
  "negative_prompt": "string",
  "width": 512,
  "height": 512,
  "num_images": 1,
  "batch_size": 1,
  "scheduler": "ays",
  "sampler": "dpmpp_sde_cfgpp",
  "steps": 20,
  "hires_fix": false,
  "adetailer": false,
  "enhance_prompt": false,
  "img2img_enabled": false,
  "img2img_image": null,
  "stable_fast": false,
  "reuse_seed": false,
  "flux_enabled": false,
  "realistic_model": false,
  "multiscale_enabled": true,
  "multiscale_intermittent": true,
  "multiscale_factor": 0.5,
  "multiscale_fullres_start": 10,
  "multiscale_fullres_end": 8,
  "keep_models_loaded": true,
  "enable_preview": false,
  "preview_fidelity": "balanced",
  "guidance_scale": null,
  "seed": null
}
```
Not all fields are required: only `prompt`, `width`, `height` and `num_images` are strictly necessary. Any unknown keys are ignored, making the endpoint forward-compatible with UI features.
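For scripting clients, that means a payload only needs the four required fields. A small helper like the following (hypothetical; not part of the repository) builds a valid request body and merges any optional overrides:

```python
def build_payload(prompt: str, width: int = 512, height: int = 512,
                  num_images: int = 1, **overrides) -> dict:
    """Build a /api/generate payload with the required fields plus
    optional overrides (unknown keys are ignored server-side)."""
    payload = {
        "prompt": prompt,
        "width": width,
        "height": height,
        "num_images": num_images,
    }
    payload.update(overrides)
    return payload
```

Because the server ignores unknown keys, passing an extra option that a given build does not support is harmless.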
### Response format

Successful requests return either:

```json
{ "image": "<base64-png>" }
```

or, if multiple images were requested:

```json
{ "images": ["<base64-png>", "<base64-png>"] }
```

Base64 strings represent PNG files with embedded metadata identical to the Streamlit UI output. Decode them and write the results to disk.
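A client can unwrap either response shape and write each PNG to disk; a minimal sketch (`save_generated` is an illustrative name, not part of the repository):

```python
import base64
from pathlib import Path

def save_generated(payload: dict, out_dir: str = ".") -> list:
    """Decode a /api/generate response ({'image': ...} or
    {'images': [...]}) and write each PNG to out_dir."""
    blobs = payload.get("images") or (
        [payload["image"]] if "image" in payload else []
    )
    paths = []
    for i, b64 in enumerate(blobs):
        path = Path(out_dir) / f"result_{i:03d}.png"
        path.write_bytes(base64.b64decode(b64))
        paths.append(str(path))
    return paths
```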
### Img2Img uploads

When `img2img_enabled` is `true`, `img2img_image` may be provided as any of the following:

- A local file path (e.g., `"tests/test.png"`)
- A data URL (e.g., `"data:image/png;base64,<...>"`)
- A raw Base64-encoded PNG string

The server decodes data URLs and raw Base64 strings and saves them to the system temporary directory before processing (default max upload size: 10 MB). Keep payloads under a few megabytes to avoid HTTP timeouts.
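When the source image lives on the client machine rather than on the server, encoding it as a data URL keeps the request self-contained. A sketch (the helper name is illustrative):

```python
import base64
import mimetypes

def to_data_url(path: str) -> str:
    """Encode a local image file as a data URL suitable for
    the img2img_image field."""
    mime = mimetypes.guess_type(path)[0] or "image/png"
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```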
## Telemetry shape (`/api/telemetry`)

The telemetry endpoint returns operational stats that help with autoscaling or queue dashboards. Example snippet:

```json
{
  "uptime_seconds": 1234.56,
  "pending_count": 2,
  "pending_by_signature": {
    "(False, 512, 512, True, False, False, True, True, 0.5, 10, 8, False, True, False)": 2
  },
  "pending_preview": [
    {"request_id": "a1b2c3d4", "waiting_s": 0.42, "prompt_preview": "a cinematic robot..."}
  ],
  "max_batch_size": 4,
  "max_images_per_group": 256,
  "batch_timeout": 0.5,
  "batches_processed": 12,
  "items_processed": 24,
  "requests_processed": 12,
  "avg_processed_wait_s": 0.31,
  "pending_avg_wait_s": 0.12,
  "memory_info": {
    "vram_allocated_mb": 5623,
    "vram_reserved_mb": 6144,
    "system_ram_mb": 12345
  },
  "loaded_models_count": 2,
  "loaded_models": ["SD15 UNet", "SD15 VAE"],
  "pipeline_import_ok": true,
  "pipeline_import_error": null
}
```
Use this data to spot batching mismatches (different signatures cannot be coalesced), monitor VRAM usage, or expose metrics to Prometheus/Grafana.
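As one illustrative heuristic (not something the server provides), a dashboard or autoscaler could reduce the telemetry to a single pressure score:

```python
def queue_pressure(telemetry: dict) -> float:
    """Rough load score: pending items relative to batch capacity,
    weighted up when requests have been waiting longer on average."""
    capacity = max(telemetry.get("max_batch_size", 1), 1)
    pending = telemetry.get("pending_count", 0)
    avg_wait = telemetry.get("pending_avg_wait_s", 0.0)
    return (pending / capacity) * (1.0 + avg_wait)
```

A score persistently above `1.0` suggests more work is queued than a single batch can absorb, which is a reasonable trigger for adding a worker.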
## Queue tuning knobs

The queue accepts a few environment variables that influence behaviour:

| Variable | Default | Effect |
| --- | --- | --- |
| `LD_MAX_BATCH_SIZE` | `4` | Maximum items processed together when signatures match. |
| `LD_BATCH_TIMEOUT` | `0.5` | Seconds to wait before flushing a batch. |
| `LD_BATCH_WAIT_SINGLETONS` | `0` | If `1`, single jobs wait the timeout hoping for companions. Set to `0` to process singletons immediately. |
| `LD_MAX_IMAGES_PER_GROUP` | `256` | Maximum combined images processed in a single pipeline run when coalescing multiple requests. Groups larger than this are processed sequentially in smaller chunks to avoid memory and disk pressure. |
| `LD_MAX_IMAGES_PER_SAVE` | `16` | Maximum images allowed in a single `save_images` call. If exceeded, the save is aborted to avoid creating many tile files. |
| `LD_SERVER_LOGLEVEL` | `DEBUG` | Logging verbosity for `logs/server.log`. |
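For reference, the defaults in the table can be mirrored client-side when building dashboards or launch scripts. This sketch reads the same knobs with the documented defaults (illustrative only; the server parses them internally):

```python
import os

def queue_config() -> dict:
    """Read the queue tuning knobs with their documented defaults."""
    return {
        "max_batch_size": int(os.environ.get("LD_MAX_BATCH_SIZE", "4")),
        "batch_timeout": float(os.environ.get("LD_BATCH_TIMEOUT", "0.5")),
        "wait_singletons": os.environ.get("LD_BATCH_WAIT_SINGLETONS", "0") == "1",
        "max_images_per_group": int(os.environ.get("LD_MAX_IMAGES_PER_GROUP", "256")),
        "max_images_per_save": int(os.environ.get("LD_MAX_IMAGES_PER_SAVE", "16")),
    }
```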
## Deploying behind a reverse proxy

When hosting remotely:

- Front the FastAPI app with Nginx/Caddy and increase the client body size limit if you accept Img2Img uploads.
- Expose `/health` for liveness checks and `/api/telemetry` for readiness/autoscaling gates.
- Mount `./include`, `./output` and `~/.cache/torch_extensions` as volumes so workers share models, outputs and compiled kernels.
## Testing the service quickly

```fish
# Send a simple generation job
curl -X POST http://localhost:7861/api/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "painted nebula over distant mountains", "width": 512, "height": 512, "num_images": 1}' \
  | jq -r '.image' | base64 -d > nebula.png

# Inspect queue state
curl http://localhost:7861/api/telemetry | jq
```

That’s it! Check the [Troubleshooting guide](quirks.md) if the service reports missing models or the queue appears stalled.