# WorldMem

Long-term consistent world simulation with memory.

## Environment (conda)

```bash
conda create -n worldmem python=3.10
conda activate worldmem
pip install -r requirements.txt
conda install -c conda-forge ffmpeg=4.3.2
```

## Data preparation (data folder)

1. Download the Minecraft dataset:
   https://huggingface.co/datasets/zeqixiao/worldmem_minecraft_dataset
2. Place it under `data/` with this structure:

```text
data/
└── minecraft/
    ├── training/
    ├── validation/
    └── test/
```

The training and evaluation scripts expect the dataset to live at `data/minecraft` by default.
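One way to fetch the dataset is with the Hugging Face CLI; the sketch below pairs that (as a comment, since it is a large download) with a quick sanity check of the layout. The `mkdir -p` line only illustrates the expected tree; when you actually download, the dataset itself should provide these folders, assuming the repo unpacks into the splits shown above.

```shell
# Download (requires huggingface_hub; dataset repo from this README):
#   huggingface-cli download zeqixiao/worldmem_minecraft_dataset \
#     --repo-type dataset --local-dir data/minecraft
mkdir -p data/minecraft/{training,validation,test}  # illustrates the expected tree
for split in training validation test; do
  test -d "data/minecraft/$split" || { echo "missing: $split"; exit 1; }
done
echo "dataset layout ok"
```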

## Checkpoints

Pretrained checkpoints are hosted on Hugging Face: `zeqixiao/worldmem_checkpoints`.

Example download to `checkpoints/`:

```bash
huggingface-cli download zeqixiao/worldmem_checkpoints diffusion_only.ckpt --local-dir checkpoints
huggingface-cli download zeqixiao/worldmem_checkpoints vae_only.ckpt --local-dir checkpoints
huggingface-cli download zeqixiao/worldmem_checkpoints pose_prediction_model_only.ckpt --local-dir checkpoints
```

Then point your scripts or configs to these files, for example:

```bash
python -m main +name=train +diffusion_model_path=checkpoints/diffusion_only.ckpt +vae_path=checkpoints/vae_only.ckpt
```

## Training

Run a single stage:

```bash
sh train_stage_1.sh
sh train_stage_2.sh
sh train_stage_3.sh
```

Run all stages:

```bash
sh train_3stages.sh
```

The stage scripts include dataset and checkpoint paths. Update those paths or override them on the CLI to match your local setup.

## Training config (exp_video.yaml)

Defaults live in `configurations/experiment/exp_video.yaml`.

Common fields to edit:
- `training.lr`
- `training.precision`
- `training.batch_size`
- `training.max_steps`
- `training.checkpointing.every_n_train_steps`
- `validation.val_every_n_step`
- `validation.batch_size`
- `test.batch_size`

You can also override values from the CLI used in the scripts:

```bash
python -m main +name=train experiment.training.batch_size=8 experiment.training.max_steps=100000
```

W&B run IDs: to resume or reload a Weights & Biases run, set the `resume` and `load` fields in `configurations/training.yaml`. The run ID is the short token in the run URL (for example, `ot7jqmgn`).
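A minimal sketch of those fields (the run ID is the illustrative example above; the exact semantics of `resume` versus `load` are defined by the codebase, not by this sketch):

```yaml
# configurations/training.yaml (fragment)
resume: ot7jqmgn  # W&B run ID to resume
load: ot7jqmgn    # W&B run ID to load weights from
```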