cosmos-transfer 🎬
A REST API wrapper around NVIDIA Cosmos-Transfer2.5, a 2B-parameter video diffusion model that converts synthetic renders into photorealistic video (Sim2Real).
Packaged as a ready-to-run Docker microservice with battle-tested parameters tuned across 80+ surveillance clips.
Quick Start
docker pull ghcr.io/eyalenav/cosmos-transfer:latest
docker run --rm --gpus '"device=0"' -p 8080:8080 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
-e HUGGINGFACE_TOKEN=hf_... \
ghcr.io/eyalenav/cosmos-transfer:latest
⚠️ The first run downloads the Cosmos-Transfer2.5-2B weights (~20 GB). It requires a Hugging Face token with access to nvidia/Cosmos-Transfer2.5-2B.
API
POST /transfer
Convert a synthetic video to photorealistic.
curl -X POST http://localhost:8080/transfer \
-F "video=@synthetic_render.mp4" \
-F "prompt=surveillance camera footage of a crowded street" \
--output photorealistic.mp4
Parameters:
| Field | Default | Description |
|---|---|---|
| video | required | Input synthetic MP4 |
| prompt | "" | Text guidance for the scene |
| edge_strength | 0.85 | Canny edge control (geometry preservation) |
| vis_strength | 0.45 | Visual blur control (scene structure) |
| sigma | 100 | Noise level (realism vs. fidelity) |
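For callers who would rather not shell out to curl, the request above can be sketched in Python with only the standard library. This is a minimal sketch, not the project's official client. It assumes the optional fields (edge_strength, vis_strength, sigma) are sent as multipart form fields in the same way as prompt, and that the response body is the MP4 itself, as the `--output` flag in the curl example suggests. The function names `build_multipart` and `transfer` are hypothetical.

```python
import urllib.request
import uuid
from pathlib import Path


def build_multipart(fields, video_name, video_bytes):
    """Encode text fields plus one MP4 upload as multipart/form-data.

    Returns (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # The file part is named "video", matching the curl example.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="video"; '
            f'filename="{video_name}"\r\n'
            f"Content-Type: video/mp4\r\n\r\n"
        ).encode("utf-8")
        + video_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transfer(video_path, out_path, prompt="", edge_strength=0.85,
             vis_strength=0.45, sigma=100,
             base_url="http://localhost:8080"):
    """POST a synthetic clip to /transfer and save the returned video."""
    # Assumption: the tuning fields are accepted as plain form fields.
    fields = {
        "prompt": prompt,
        "edge_strength": edge_strength,
        "vis_strength": vis_strength,
        "sigma": sigma,
    }
    video = Path(video_path)
    body, content_type = build_multipart(fields, video.name, video.read_bytes())
    request = urllib.request.Request(
        f"{base_url}/transfer",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # No timeout is set here because generation time is not documented.
    with urllib.request.urlopen(request) as response:
        Path(out_path).write_bytes(response.read())
    return out_path
```

A call such as `transfer("synthetic_render.mp4", "photorealistic.mp4", prompt="surveillance camera footage of a crowded street")` would then mirror the curl command.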
GET /health
curl http://localhost:8080/health
# {"status": "ok"}
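Because the first run pulls roughly 20 GB of weights, a script that starts the container may want to poll /health before sending work. The sketch below does that with the standard library. It assumes the endpoint returns `{"status": "ok"}` as shown above; whether the service reports "ok" before the weights have finished loading is not documented here, so treat a healthy response as necessary rather than sufficient. The names `fetch_health` and `wait_until_healthy`, and the default timeout and interval, are illustrative.

```python
import json
import time
import urllib.request


def fetch_health(base_url):
    """Return the parsed /health JSON, or None if the call fails."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection errors and timeouts (URLError is a
        # subclass); ValueError covers a body that is not valid JSON.
        return None


def wait_until_healthy(base_url="http://localhost:8080", timeout_s=1800,
                       interval_s=10, fetch=fetch_health, sleep=time.sleep):
    """Poll /health until it reports ok or the timeout elapses.

    Returns True once the service answers {"status": "ok"}, else False.
    The fetch and sleep parameters are injectable to simplify testing.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch(base_url)
        if isinstance(status, dict) and status.get("status") == "ok":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```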
Tuned Parameters
After 80+ clips, the sweet spot for surveillance synthetic data:
edge=0.85 + vis=0.45 + sigma=100
- edge 0.85: strong geometry/silhouette preservation from Canny
- vis 0.45: moderate scene structure preservation
- sigma 100: balanced realism without losing the synthetic layout
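If footage from another domain needs different settings, one way to explore is a small grid around the baseline above. The sketch below only generates candidate settings and does not call the service. The step sizes are arbitrary illustrative choices, not values from the original tuning, and the valid range of each parameter is not documented here, so out-of-range candidates may need to be filtered. `BASELINE` and `neighbor_grid` are hypothetical names.

```python
from itertools import product

# Baseline taken from the tuned values listed above.
BASELINE = {"edge_strength": 0.85, "vis_strength": 0.45, "sigma": 100}


def neighbor_grid(baseline=BASELINE, edge_step=0.1, vis_step=0.1,
                  sigma_step=20):
    """Return the 27 settings one step below, at, and above the baseline."""
    offsets = (-1, 0, 1)
    # Round the float fields to avoid artifacts such as 0.7499999.
    edges = [round(baseline["edge_strength"] + d * edge_step, 2) for d in offsets]
    visuals = [round(baseline["vis_strength"] + d * vis_step, 2) for d in offsets]
    sigmas = [baseline["sigma"] + d * sigma_step for d in offsets]
    return [
        {"edge_strength": e, "vis_strength": v, "sigma": s}
        for e, v, s in product(edges, visuals, sigmas)
    ]
```

Each dictionary in the returned list uses the same field names as the Parameters table, so it can be passed straight through as form fields.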
Requirements
| Resource | Minimum |
|---|---|
| GPU | A100 / RTX 6000 Ada / H100 |
| VRAM | 40 GB |
| RAM | 64 GB |
| Disk | 30 GB (model weights) |
Part of VisionAI-Flywheel
This service is one component of a full synthetic surveillance data pipeline:
[kimodo-api] → NPZ motion
↓
[render-api] → SOMA mesh render (MP4)
↓
[cosmos-transfer] → Sim2Real photorealistic video ← this image
↓
[NVIDIA VSS] → VLM annotation → fine-tuning dataset
Full pipeline: github.com/EyalEnav/VisionAI-Flywheel
License
Apache 2.0 (see LICENSE)
Cosmos-Transfer2.5 model weights are released under the NVIDIA Open Model License and downloaded at runtime. They are not bundled in this image.