nvidia/Cosmos3-Nano
Use async SGLang video API
MickJ committed cfa6ad7 · Parent(s): 24d4790
README.md CHANGED
@@ -928,6 +928,8 @@ Cosmos3 outputs should not be treated as physically accurate simulation, reliabl
 [SGLang Diffusion](https://github.com/sgl-project/sglang) can serve Cosmos3-Nano through OpenAI-compatible image and video endpoints. Install SGLang from source with diffusion dependencies, then start a server:

 ```shell
+git clone https://github.com/sgl-project/sglang.git
+cd sglang
 pip install -e "python[diffusion]"
 pip install "cosmos-guardrail==0.3.1"

@@ -945,14 +947,13 @@ Supported SGLang endpoints:
 | Mode | Endpoint | Notes |
 | --- | --- | --- |
 | Text to image | `POST /v1/images/generations` | Returns base64 image data by default |
-| Text to video | `POST /v1/videos
-| Image to video | `POST /v1/videos
+| Text to video | `POST /v1/videos` | Creates an async job; poll `GET /v1/videos/{id}` and download `/content` |
+| Image to video | `POST /v1/videos` | Upload the conditioning image with `input_reference` |

 Example text-to-video request:

 ```shell
-curl -sS -X POST http://localhost:8000/v1/videos
-  -H "Accept: video/mp4" \
+job_id=$(curl -sS -X POST http://localhost:8000/v1/videos \
   --form-string "prompt=A small warehouse robot moves a blue box across a clean floor." \
   --form-string "negative_prompt=blurry, distorted, low quality" \
   --form-string "size=1280x720" \
@@ -963,6 +964,17 @@ curl -sS -X POST http://localhost:8000/v1/videos/sync \
   --form-string "flow_shift=10.0" \
   --form-string "seed=42" \
   --form-string 'extra_params={"guardrails":true,"use_resolution_template":false,"use_duration_template":false}' \
+  | python -c 'import json, sys; print(json.load(sys.stdin)["id"])')
+
+while true; do
+  status=$(curl -sS "http://localhost:8000/v1/videos/${job_id}" \
+    | python -c 'import json, sys; print(json.load(sys.stdin)["status"])')
+  [ "$status" = "completed" ] && break
+  [ "$status" = "failed" ] && exit 1
+  sleep 1
+done
+
+curl -sS -L "http://localhost:8000/v1/videos/${job_id}/content" \
   -o cosmos3_t2v_output.mp4
 ```

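The poll-and-download steps added in this commit could also be driven from Python. The sketch below is a rough illustration, not part of the commit: the helper names (`get_status`, `wait_for_video`, `download_video`) and the `timeout` guard are hypothetical, and it assumes only what the diff shows, namely that `GET /v1/videos/{id}` returns JSON with a `status` field that eventually becomes `completed` or `failed`, and that `/v1/videos/{id}/content` serves the finished file.

```python
import json
import shutil
import time
import urllib.request

BASE_URL = "http://localhost:8000"  # host and port taken from the curl example


def get_status(job_id, base_url=BASE_URL):
    """Return the "status" field of GET /v1/videos/{id}."""
    with urllib.request.urlopen(f"{base_url}/v1/videos/{job_id}") as resp:
        return json.load(resp)["status"]


def wait_for_video(job_id, fetch_status=get_status, interval=1.0,
                   timeout=600.0, sleep=time.sleep):
    """Poll until the job is "completed"; raise on "failed" or on timeout.

    The timeout is an addition of this sketch: the shell loop in the
    README polls indefinitely.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(job_id)
        if status == "completed":
            return
        if status == "failed":
            raise RuntimeError(f"video job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"video job {job_id} still {status!r} after {timeout}s"
            )
        sleep(interval)


def download_video(job_id, path, base_url=BASE_URL):
    """Stream /v1/videos/{id}/content to a local file (redirects are followed)."""
    url = f"{base_url}/v1/videos/{job_id}/content"
    with urllib.request.urlopen(url) as resp, open(path, "wb") as out:
        shutil.copyfileobj(resp, out)
```

Given a `job_id` taken from the `id` field of the `POST /v1/videos` response, `wait_for_video(job_id)` followed by `download_video(job_id, "cosmos3_t2v_output.mp4")` mirrors the shell loop. Passing `fetch_status` and `sleep` as parameters keeps the polling logic checkable without a running server.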