Spatio-temporal Transformer: 1X World Model Compression Challenge

Code for our spatio-temporal Transformer, which won the 1X World Model Compression Challenge.

The model has 136M parameters.

Setup

Install the environment with

uv sync --all-extras --group gpu

Configure accelerate with

uv run accelerate config

Download the data

Download the tokenized data with

uv run huggingface-cli download 1x-technologies/worldmodel --repo-type dataset --local-dir data/tokenized

Download the raw data with

uv run huggingface-cli download 1x-technologies/worldmodel_raw_data --repo-type dataset --local-dir data/raw

Download the Cosmos tokenizer

Download the Cosmos tokenizers (Cosmos-0.1-Tokenizer-DV8x8x8 and Cosmos-0.1-Tokenizer-DV8x16x16) with

uv run python download_cosmos_tokenizer.py

Running experiments

The run with a test loss of 6.9334 was launched with

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 uv run accelerate launch --config_file accelerate/default_config.yaml src/train.py ++use_wandb=True ++per_device_batch_size=20 ++lr=8e-4 ++grad_accum_steps=1

Note that we used an effective batch size of 20 x 8 = 160 (per_device_batch_size x number of GPUs) without any gradient accumulation. If you are short on GPU memory, decrease per_device_batch_size and increase grad_accum_steps to keep the same effective batch size.
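The batch-size arithmetic above can be sketched in a few lines of Python. This is only an illustration and not part of the repository: the helper names effective_batch_size and equivalent_settings are hypothetical, and the sketch assumes the effective batch size is the product of per_device_batch_size, the number of GPUs, and grad_accum_steps.

```python
# Illustrative helpers only (hypothetical names, not part of this repository).
# Assumption: effective batch = per_device_batch_size * num_gpus * grad_accum_steps.


def effective_batch_size(per_device_batch_size: int, num_gpus: int,
                         grad_accum_steps: int = 1) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device_batch_size * num_gpus * grad_accum_steps


def equivalent_settings(target: int, num_gpus: int) -> list[tuple[int, int]]:
    """All (per_device_batch_size, grad_accum_steps) pairs that reach `target`."""
    if target % num_gpus != 0:
        raise ValueError("target must be divisible by the number of GPUs")
    per_gpu = target // num_gpus  # samples each GPU must process per optimizer step
    # Every divisor of per_gpu is a valid per-device batch size;
    # the matching quotient is the number of accumulation steps.
    return [(b, per_gpu // b) for b in range(per_gpu, 0, -1) if per_gpu % b == 0]


if __name__ == "__main__":
    # The configuration used for the reported run: 20 per device, 8 GPUs, no accumulation.
    print(effective_batch_size(20, 8, 1))
    # Lower-memory alternatives with the same effective batch size on 8 GPUs.
    for batch, accum in equivalent_settings(160, 8):
        print(f"++per_device_batch_size={batch} ++grad_accum_steps={accum}")
```

For example, on 8 GPUs the pair (10, 2) corresponds to passing ++per_device_batch_size=10 ++grad_accum_steps=2, which halves the per-GPU batch while keeping the effective batch size at 160.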

The extra-small Transformer can be trained with

CUDA_VISIBLE_DEVICES=0 uv run python src/train.py ++use_wandb=True +model=xsmall

Inference from a checkpoint

Load a checkpoint and run inference with

CUDA_VISIBLE_DEVICES=0,1,2,3 uv run accelerate launch --num_processes 4 src/inference.py ++ckpt_dir="output/hydra/train_accelerate/2025-09-10_11-33-19"

Generate submission

Generate a submission with

CUDA_VISIBLE_DEVICES=0 uv run python src/generate_submission_compression.py ++ckpt_dir="output/hydra/train_accelerate/2025-09-10_11-33-19"

This takes a little while to run because it saves a separate NumPy file for each data point.
