# WorldMem

Long-term consistent world simulation with memory.

## Environment (conda)

```bash
conda create -n worldmem python=3.10
conda activate worldmem
pip install -r requirements.txt
conda install -c conda-forge ffmpeg=4.3.2
```

## Data preparation (data folder)

1. Download the Minecraft dataset:
   https://huggingface.co/datasets/zeqixiao/worldmem_minecraft_dataset
2. Place it under `data/` with this structure:

```text
data/
└── minecraft/
    ├── training/
    ├── validation/
    └── test/
```

The training and evaluation scripts expect the dataset to live at `data/minecraft` by default.
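One way to fetch the dataset is with the Hugging Face CLI; the sketch below pairs that (as a comment, since it is a large download) with a quick sanity check of the layout. The `mkdir -p` line only illustrates the expected tree; when you actually download, the dataset itself should provide these folders, assuming the repo unpacks into the splits shown above.

```shell
# Download (requires huggingface_hub; dataset repo from this README):
#   huggingface-cli download zeqixiao/worldmem_minecraft_dataset \
#     --repo-type dataset --local-dir data/minecraft
mkdir -p data/minecraft/{training,validation,test}  # illustrates the expected tree
for split in training validation test; do
  test -d "data/minecraft/$split" || { echo "missing: $split"; exit 1; }
done
echo "dataset layout ok"
```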

## Checkpoints

Pretrained checkpoints are hosted on Hugging Face: `zeqixiao/worldmem_checkpoints`.

Example download to `checkpoints/`:

```bash
huggingface-cli download zeqixiao/worldmem_checkpoints diffusion_only.ckpt --local-dir checkpoints
huggingface-cli download zeqixiao/worldmem_checkpoints vae_only.ckpt --local-dir checkpoints
huggingface-cli download zeqixiao/worldmem_checkpoints pose_prediction_model_only.ckpt --local-dir checkpoints
```

Then point your scripts or configs to these files, for example:

```bash
python -m main +name=train +diffusion_model_path=checkpoints/diffusion_only.ckpt +vae_path=checkpoints/vae_only.ckpt
```

## Training

Run a single stage:

```bash
sh train_stage_1.sh
sh train_stage_2.sh
sh train_stage_3.sh
```

Run all stages:

```bash
sh train_3stages.sh
```

The stage scripts include dataset and checkpoint paths. Update those paths or override them on the CLI to match your local setup.

## Training config (exp_video.yaml)

Defaults live in `configurations/experiment/exp_video.yaml`.

Common fields to edit:
- `training.lr`
- `training.precision`
- `training.batch_size`
- `training.max_steps`
- `training.checkpointing.every_n_train_steps`
- `validation.val_every_n_step`
- `validation.batch_size`
- `test.batch_size`

You can also override values from the CLI used in the scripts:

```bash
python -m main +name=train experiment.training.batch_size=8 experiment.training.max_steps=100000
```

W&B run IDs: to resume or reload a Weights & Biases run, set the `resume` and `load` fields in `configurations/training.yaml`. The run ID is the short token in the run URL (for example, `ot7jqmgn`).
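A minimal sketch of those fields (the run ID is the illustrative example above; the exact semantics of `resume` versus `load` are defined by the codebase, not by this sketch):

```yaml
# configurations/training.yaml (fragment)
resume: ot7jqmgn  # W&B run ID to resume
load: ot7jqmgn    # W&B run ID to load weights from
```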