Environment setup
Cosmos runs only on Linux systems. We have tested the installation with Ubuntu 24.04, 22.04, and 20.04.
Cosmos requires Python 3.10.x. Please also make sure you have conda installed (see the official conda installation instructions).
The commands below create the lyra conda environment and install the dependencies for inference:
# Create the lyra conda environment.
conda env create --file lyra.yaml
# Activate the lyra conda environment.
conda activate lyra
# Install the dependencies.
pip install -r requirements_gen3c.txt
pip install -r requirements_lyra.txt
# Patch Transformer Engine linking issues in conda environments.
ln -sf $CONDA_PREFIX/lib/python3.10/site-packages/nvidia/*/include/* $CONDA_PREFIX/include/
ln -sf $CONDA_PREFIX/lib/python3.10/site-packages/nvidia/*/include/* $CONDA_PREFIX/include/python3.10
# Install Transformer Engine. (Quoting the requirement avoids shell globbing of the brackets.)
pip install "transformer-engine[pytorch]==1.12.0"
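# Optional sanity check (not part of the original instructions): confirm
# Transformer Engine imports correctly after the include-path patch above.
python -c "import transformer_engine.pytorch; print('Transformer Engine OK')"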
# Install Apex for inference.
git clone https://github.com/NVIDIA/apex
CUDA_HOME=$CONDA_PREFIX pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./apex
# Install MoGe for inference.
pip install git+https://github.com/microsoft/MoGe.git
# Install Mamba for reconstruction model.
pip install --no-build-isolation "git+https://github.com/state-spaces/mamba@v2.2.4"
You can test the environment setup for inference with:
CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python scripts/test_environment.py
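If the test fails, a minimal manual check along these lines (a sketch, not the repository's script) can help isolate which dependency is broken:

```python
# Manual sanity check (a sketch; scripts/test_environment.py is the
# authoritative test). Verifies the GPU is visible and that the compiled
# dependencies installed above can be imported.
import torch

assert torch.cuda.is_available(), "CUDA is not visible to PyTorch"
print(f"PyTorch {torch.__version__}, CUDA {torch.version.cuda}")

# Each of these imports fails if the corresponding install step above
# (or the conda include-path patch) did not complete correctly.
import transformer_engine.pytorch  # noqa: F401
import apex  # noqa: F401
import mamba_ssm  # noqa: F401

print("Core dependencies import cleanly.")
```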
Download Cosmos-Predict1 tokenizer
1. Generate a Hugging Face access token (if you haven't done so already). Set the access token to `Read` permission (default is `Fine-grained`).
2. Log in to Hugging Face with the access token:
   huggingface-cli login
3. Download the Cosmos Tokenize model weights from Hugging Face:
python3 -m scripts.download_tokenizer_checkpoints --checkpoint_dir checkpoints/cosmos_predict1 --tokenizer_types CV8x8x8-720p
The downloaded files should be in the following structure:
checkpoints/
├── Cosmos-Tokenize1-CV8x8x8-720p
├── Cosmos-Tokenize1-DV8x16x16-720p
├── Cosmos-Tokenize1-CI8x8-360p
├── Cosmos-Tokenize1-CI16x16-360p
├── Cosmos-Tokenize1-CV4x8x8-360p
├── Cosmos-Tokenize1-DI8x8-360p
├── Cosmos-Tokenize1-DI16x16-360p
└── Cosmos-Tokenize1-DV4x8x8-360p
Under each checkpoint directory checkpoints/<model-name>, we provide the encoder, the decoder, and the full autoencoder in TorchScript (PyTorch JIT mode), as well as the native PyTorch checkpoint. For instance, for the Cosmos-Tokenize1-CV8x8x8-720p model:
├── checkpoints/
│   ├── Cosmos-Tokenize1-CV8x8x8-720p/
│   │   ├── encoder.jit
│   │   ├── decoder.jit
│   │   ├── autoencoder.jit
│   │   └── model.pt
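For reference, the `.jit` files are plain TorchScript modules and can be loaded directly with PyTorch. The sketch below is an illustration, not the repository's API: the input layout (batch, channels, frames, height, width), the 8k+1 frame count, and the bfloat16 dtype are assumptions based on the tokenizer's naming, so consult the Cosmos-Predict1 documentation for the exact calling convention.

```python
# Illustrative only: load the TorchScript tokenizer checkpoints directly.
# The dummy input shape and dtype below are assumptions, not documented values.
import torch

ckpt_dir = "checkpoints/Cosmos-Tokenize1-CV8x8x8-720p"
encoder = torch.jit.load(f"{ckpt_dir}/encoder.jit").cuda().eval()
decoder = torch.jit.load(f"{ckpt_dir}/decoder.jit").cuda().eval()

with torch.no_grad():
    # Assumed layout: (batch, channels, frames, height, width) in [-1, 1],
    # with frames = 8k + 1 for the causal 8x8x8 video tokenizer.
    video = torch.randn(1, 3, 9, 704, 1280, dtype=torch.bfloat16, device="cuda")
    out = encoder(video)
    print(type(out))  # inspect the output structure before indexing into it
```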
Download GEN3C checkpoints
1. Generate a Hugging Face access token (if you haven't done so already). Set the access token to `Read` permission (default is `Fine-grained`).
2. Log in to Hugging Face with the access token:
   huggingface-cli login
3. Download the GEN3C model weights from Hugging Face:
CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python scripts/download_gen3c_checkpoints.py --checkpoint_dir checkpoints
Download Lyra checkpoints
- Download the Lyra model weights from Hugging Face:
CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python scripts/download_lyra_checkpoints.py --checkpoint_dir checkpoints
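After the three download steps, a quick listing (a convenience snippet, not a repository script) confirms what landed in `checkpoints/`:

```python
# List the contents of checkpoints/ to confirm the downloads completed.
from pathlib import Path

for path in sorted(Path("checkpoints").iterdir()):
    print(path)
```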