Add SGLang to model card

#7
by majchrow - opened
Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -10,6 +10,8 @@ tags:
  - cosmos3
  - vllm-omni
  - diffusers
+ - sglang
+ - sglang-diffusion
  - image-to-video
  - video-generation
  countDownloads:
@@ -211,6 +213,7 @@ Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated sys
  - [PyTorch](https://github.com/nvidia/cosmos3)
  - [vLLM-Omni](https://github.com/vllm-project/vllm-omni)
  - [Hugging Face Diffusers](https://huggingface.co/docs/diffusers/en/index)
+ - [SGLang](https://sgl-project.github.io/)
 
  **Supported Hardware Microarchitecture Compatibility:**
 
@@ -527,6 +530,12 @@ Example output generated by Diffusers:
 
  <video controls width="832" height="480" src="https://huggingface.co/nvidia/Cosmos3-Super-Image2Video/resolve/main/assets/example_output_diffusers.mp4"></video>
 
+ ### SGLang
+
+ [SGLang Diffusion](https://sgl-project.github.io/diffusion) can serve `nvidia/Cosmos3-Super-Image2Video` through OpenAI-compatible video generation endpoints.
+
+ For complete serving instructions and request examples, see the [Cosmos3 SGLang cookbook](https://lmsysorg.mintlify.app/cookbook/diffusion/Cosmos/Cosmos3).
+
  ## Limitations
 
  Cosmos3 may produce imperfect outputs in challenging scenarios. Generation artifacts include temporal inconsistency, unstable camera or object motion, imprecise physical interactions, inaccurate audio-video synchronization, and action-state drift — especially in long-horizon or high-resolution outputs. Reasoning may also be incorrect: object states, causal relationships, spatial geometry, temporal ordering, agent intent, and future outcomes can be misinferred, and complex or long-context inputs may yield hallucinated entities, inconsistent interpretations, or implausible predictions. Because the model lacks an explicit physics simulator, 3D geometry, 4D space-time evolution, object permanence, contact dynamics, and physical laws are only approximated — producing artifacts such as disappearing or morphing objects, unrealistic collisions, and physically implausible motions. Quality further degrades in out-of-distribution environments, safety-critical edge cases, and domains underrepresented in training.
@@ -535,7 +544,7 @@ Cosmos3 outputs should not be treated as physically accurate simulation, reliabl
 
  ## Inference
 
- **Acceleration Engine:** [PyTorch](https://pytorch.org/), [vLLM](https://github.com/vllm-project/vllm), [vLLM-Omni](https://github.com/vllm-project/vllm-omni), [Hugging Face Diffusers](https://github.com/huggingface/diffusers)
+ **Acceleration Engine:** [PyTorch](https://pytorch.org/), [vLLM](https://github.com/vllm-project/vllm), [vLLM-Omni](https://github.com/vllm-project/vllm-omni), [Hugging Face Diffusers](https://github.com/huggingface/diffusers), [SGLang](https://sgl-project.github.io/), [SGLang Diffusion](https://sgl-project.github.io/diffusion)
 
  **Test Hardware:** GB200 and H100
 