NeverMore0123's picture
clone cosmos to huggingface
6d46028
# Release Notes
# [01/27/2025](https://github.com/NVIDIA/Cosmos/commit/c82c9dc6f9a2f046033d0a26ec525bc389b641ef)
- Stability and safety improvements
# [01/09/2025](https://github.com/NVIDIA/Cosmos/commit/a6e2fdd49053ae75836cedc2a99c7c84bc1c8c1b)
- Support [General Post-Training](../cosmos1/models/POST_TRAINING.md) through NeMO
# [01/06/2025](https://github.com/NVIDIA/Cosmos/commit/00d50f897a111069d43386e626aecb2167259bca)
- Initial release of Cosmos 1.0 along with the [Cosmos paper](https://research.nvidia.com/publication/2025-01_cosmos-world-foundation-model-platform-physical-ai)
- 13 models were released in the [Hugging Face](https://huggingface.co/collections/nvidia/cosmos-6751e884dc10e013a0a0d8e6) as shown in the table below.
- Inference scripts for the models were released in the [Cosmos repository](https://github.com/NVIDIA/Cosmos).
| Item | Model name | Description | Try it out |
|--|------------|----------|----------|
|1| [Cosmos-1.0-Diffusion-7B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Text2World) | Text to visual world generation | [Inference](cosmos1/models/diffusion/README.md) |
|2| [Cosmos-1.0-Diffusion-14B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Text2World) | Text to visual world generation | [Inference](cosmos1/models/diffusion/README.md) |
|3| [Cosmos-1.0-Diffusion-7B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Video2World) | Video + Text based future visual world generation | [Inference](cosmos1/models/diffusion/README.md) |
|4| [Cosmos-1.0-Diffusion-14B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Video2World) | Video + Text based future visual world generation | [Inference](cosmos1/models/diffusion/README.md) |
|5| [Cosmos-1.0-Autoregressive-4B](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-4B) | Future visual world generation | [Inference](cosmos1/models/autoregressive/README.md) |
|6| [Cosmos-1.0-Autoregressive-12B](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-12B) | Future visual world generation | [Inference](cosmos1/models/autoregressive/README.md) |
|7| [Cosmos-1.0-Autoregressive-5B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-5B-Video2World) | Video + Text based future visual world generation | [Inference](cosmos1/models/autoregressive/README.md) |
|8| [Cosmos-1.0-Autoregressive-13B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-13B-Video2World) | Video + Text based future visual world generation | [Inference](cosmos1/models/autoregressive/README.md) |
|9| [Cosmos-1.0-Tokenizer-CV8x8x8](https://huggingface.co/nvidia/Cosmos-1.0-Tokenizer-CV8x8x8) | Continuous video tokenizer with 8x8x8 compression ratio | [Inference](cosmos1/models/diffusion/README.md) |
|10| [Cosmos-1.0-Tokenizer-DV8x16x16](https://huggingface.co/nvidia/Cosmos-1.0-Tokenizer-DV8x16x16) | Discrete video tokenizer with 16x8x8 compression ratio | [Inference](cosmos1/models/autoregressive/README.md) |
|11| [Cosmos-1.0-PromptUpsampler-12B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Prompt-Upsampler-12B-Text2World) | Prompt upsampler for Text2World | [Inference](cosmos1/models/diffusion/README.md) |
|12| [Cosmos-1.0-Diffusion-7B-Decoder-DV8x16x16ToCV8x8x8](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Decoder-DV8x16x16ToCV8x8x8) | Diffusion decoder for enhancing Cosmos 1.0 autoregressive WFMs' outputs | [Inference](cosmos1/models/autoregressive/README.md) |
|13| [Cosmos-1.0-Guardrail](https://huggingface.co/nvidia/Cosmos-1.0-Guardrail) | Guardrail contains pre-Guard and post-Guard for safe use | Embedded in model inference scripts |