clone cosmos to huggingface

6d46028 about 1 year ago

3.69 kB

	# Release Notes

	# [01/27/2025](https://github.com/NVIDIA/Cosmos/commit/c82c9dc6f9a2f046033d0a26ec525bc389b641ef)
	- Stability and safety improvements

	# [01/09/2025](https://github.com/NVIDIA/Cosmos/commit/a6e2fdd49053ae75836cedc2a99c7c84bc1c8c1b)
	- Support [General Post-Training](../cosmos1/models/POST_TRAINING.md) through NeMO

	# [01/06/2025](https://github.com/NVIDIA/Cosmos/commit/00d50f897a111069d43386e626aecb2167259bca)

	- Initial release of Cosmos 1.0 along with the [Cosmos paper](https://research.nvidia.com/publication/2025-01_cosmos-world-foundation-model-platform-physical-ai)
	- 13 models were released in the [Hugging Face](https://huggingface.co/collections/nvidia/cosmos-6751e884dc10e013a0a0d8e6) as shown in the table below.
	- Inference scripts for the models were released in the [Cosmos repository](https://github.com/NVIDIA/Cosmos).

	\| Item \| Model name \| Description \| Try it out \|
	\|--\|------------\|----------\|----------\|
	\|1\| [Cosmos-1.0-Diffusion-7B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Text2World) \| Text to visual world generation \| [Inference](cosmos1/models/diffusion/README.md) \|
	\|2\| [Cosmos-1.0-Diffusion-14B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Text2World) \| Text to visual world generation \| [Inference](cosmos1/models/diffusion/README.md) \|
	\|3\| [Cosmos-1.0-Diffusion-7B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Video2World) \| Video + Text based future visual world generation \| [Inference](cosmos1/models/diffusion/README.md) \|
	\|4\| [Cosmos-1.0-Diffusion-14B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Video2World) \| Video + Text based future visual world generation \| [Inference](cosmos1/models/diffusion/README.md) \|
	\|5\| [Cosmos-1.0-Autoregressive-4B](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-4B) \| Future visual world generation \| [Inference](cosmos1/models/autoregressive/README.md) \|
	\|6\| [Cosmos-1.0-Autoregressive-12B](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-12B) \| Future visual world generation \| [Inference](cosmos1/models/autoregressive/README.md) \|
	\|7\| [Cosmos-1.0-Autoregressive-5B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-5B-Video2World) \| Video + Text based future visual world generation \| [Inference](cosmos1/models/autoregressive/README.md) \|
	\|8\| [Cosmos-1.0-Autoregressive-13B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-13B-Video2World) \| Video + Text based future visual world generation \| [Inference](cosmos1/models/autoregressive/README.md) \|
	\|9\| [Cosmos-1.0-Tokenizer-CV8x8x8](https://huggingface.co/nvidia/Cosmos-1.0-Tokenizer-CV8x8x8) \| Continuous video tokenizer with 8x8x8 compression ratio \| [Inference](cosmos1/models/diffusion/README.md) \|
	\|10\| [Cosmos-1.0-Tokenizer-DV8x16x16](https://huggingface.co/nvidia/Cosmos-1.0-Tokenizer-DV8x16x16) \| Discrete video tokenizer with 16x8x8 compression ratio \| [Inference](cosmos1/models/autoregressive/README.md) \|
	\|11\| [Cosmos-1.0-PromptUpsampler-12B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Prompt-Upsampler-12B-Text2World) \| Prompt upsampler for Text2World \| [Inference](cosmos1/models/diffusion/README.md) \|
	\|12\| [Cosmos-1.0-Diffusion-7B-Decoder-DV8x16x16ToCV8x8x8](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Decoder-DV8x16x16ToCV8x8x8) \| Diffusion decoder for enhancing Cosmos 1.0 autoregressive WFMs' outputs \| [Inference](cosmos1/models/autoregressive/README.md) \|
	\|13\| [Cosmos-1.0-Guardrail](https://huggingface.co/nvidia/Cosmos-1.0-Guardrail) \| Guardrail contains pre-Guard and post-Guard for safe use \| Embedded in model inference scripts \|

	# Release Notes

	# [01/27/2025](https://github.com/NVIDIA/Cosmos/commit/c82c9dc6f9a2f046033d0a26ec525bc389b641ef)
	- Stability and safety improvements

	# [01/09/2025](https://github.com/NVIDIA/Cosmos/commit/a6e2fdd49053ae75836cedc2a99c7c84bc1c8c1b)
	- Support [General Post-Training](../cosmos1/models/POST_TRAINING.md) through NeMO

	# [01/06/2025](https://github.com/NVIDIA/Cosmos/commit/00d50f897a111069d43386e626aecb2167259bca)

	- Initial release of Cosmos 1.0 along with the [Cosmos paper](https://research.nvidia.com/publication/2025-01_cosmos-world-foundation-model-platform-physical-ai)
	- 13 models were released in the [Hugging Face](https://huggingface.co/collections/nvidia/cosmos-6751e884dc10e013a0a0d8e6) as shown in the table below.
	- Inference scripts for the models were released in the [Cosmos repository](https://github.com/NVIDIA/Cosmos).

	\| Item \| Model name \| Description \| Try it out \|
	\|--\|------------\|----------\|----------\|
	\|1\| [Cosmos-1.0-Diffusion-7B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Text2World) \| Text to visual world generation \| [Inference](cosmos1/models/diffusion/README.md) \|
	\|2\| [Cosmos-1.0-Diffusion-14B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Text2World) \| Text to visual world generation \| [Inference](cosmos1/models/diffusion/README.md) \|
	\|3\| [Cosmos-1.0-Diffusion-7B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Video2World) \| Video + Text based future visual world generation \| [Inference](cosmos1/models/diffusion/README.md) \|
	\|4\| [Cosmos-1.0-Diffusion-14B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-14B-Video2World) \| Video + Text based future visual world generation \| [Inference](cosmos1/models/diffusion/README.md) \|
	\|5\| [Cosmos-1.0-Autoregressive-4B](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-4B) \| Future visual world generation \| [Inference](cosmos1/models/autoregressive/README.md) \|
	\|6\| [Cosmos-1.0-Autoregressive-12B](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-12B) \| Future visual world generation \| [Inference](cosmos1/models/autoregressive/README.md) \|
	\|7\| [Cosmos-1.0-Autoregressive-5B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-5B-Video2World) \| Video + Text based future visual world generation \| [Inference](cosmos1/models/autoregressive/README.md) \|
	\|8\| [Cosmos-1.0-Autoregressive-13B-Video2World](https://huggingface.co/nvidia/Cosmos-1.0-Autoregressive-13B-Video2World) \| Video + Text based future visual world generation \| [Inference](cosmos1/models/autoregressive/README.md) \|
	\|9\| [Cosmos-1.0-Tokenizer-CV8x8x8](https://huggingface.co/nvidia/Cosmos-1.0-Tokenizer-CV8x8x8) \| Continuous video tokenizer with 8x8x8 compression ratio \| [Inference](cosmos1/models/diffusion/README.md) \|
	\|10\| [Cosmos-1.0-Tokenizer-DV8x16x16](https://huggingface.co/nvidia/Cosmos-1.0-Tokenizer-DV8x16x16) \| Discrete video tokenizer with 16x8x8 compression ratio \| [Inference](cosmos1/models/autoregressive/README.md) \|
	\|11\| [Cosmos-1.0-PromptUpsampler-12B-Text2World](https://huggingface.co/nvidia/Cosmos-1.0-Prompt-Upsampler-12B-Text2World) \| Prompt upsampler for Text2World \| [Inference](cosmos1/models/diffusion/README.md) \|
	\|12\| [Cosmos-1.0-Diffusion-7B-Decoder-DV8x16x16ToCV8x8x8](https://huggingface.co/nvidia/Cosmos-1.0-Diffusion-7B-Decoder-DV8x16x16ToCV8x8x8) \| Diffusion decoder for enhancing Cosmos 1.0 autoregressive WFMs' outputs \| [Inference](cosmos1/models/autoregressive/README.md) \|
	\|13\| [Cosmos-1.0-Guardrail](https://huggingface.co/nvidia/Cosmos-1.0-Guardrail) \| Guardrail contains pre-Guard and post-Guard for safe use \| Embedded in model inference scripts \|