---
license: cc-by-nc-4.0
tags:
- robotics
- world-model
- jepa
- planning
- pytorch
library_name: pytorch
pipeline_tag: robotics
datasets:
- facebook/jepa-wms
arxiv: "2512.24497"
---

<h1 align="center">
    <p>πŸ€– <b>JEPA-WMs Pretrained Models</b></p>
</h1>

<div align="center" style="line-height: 1;">
  <a href="https://github.com/facebookresearch/jepa-wms" target="_blank" style="margin: 2px;"><img alt="Github" src="https://img.shields.io/badge/Github-facebookresearch%2Fjepa--wms-black?logo=github" style="display: inline-block; vertical-align: middle;"/></a>
  <a href="https://huggingface.co/facebook/jepa-wms" target="_blank" style="margin: 2px;"><img alt="HuggingFace" src="https://img.shields.io/badge/πŸ€—%20HuggingFace-facebook%2Fjepa--wms-ffc107" style="display: inline-block; vertical-align: middle;"/></a>
  <a href="https://arxiv.org/abs/2512.24497" target="_blank" style="margin: 2px;"><img alt="ArXiv" src="https://img.shields.io/badge/arXiv-2512.24497-b5212f?logo=arxiv" style="display: inline-block; vertical-align: middle;"/></a>
</div>

<br>

<p align="center">
  <b><a href="https://ai.facebook.com/research/">Meta AI Research, FAIR</a></b>
</p>

<p align="center">
  This πŸ€— HuggingFace repository hosts pretrained <b>JEPA-WM</b> world models.<br>
  πŸ‘‰ See the <a href="https://github.com/facebookresearch/jepa-wms">main repository</a> for training code and datasets.
</p>

This repository contains the pretrained world model checkpoints from the paper
["What Drives Success in Physical Planning with Joint-Embedding Predictive World Models?"](https://arxiv.org/abs/2512.24497).

## Available Models

### JEPA-WM Models

| Model | Environment | Resolution | Encoder | Pred. Depth |
|-------|-------------|------------|---------|-------------|
| `jepa_wm_droid` | DROID & RoboCasa | 256Γ—256 | DINOv3 ViT-L/16 | 12 |
| `jepa_wm_metaworld` | Metaworld | 224Γ—224 | DINOv2 ViT-S/14 | 6 |
| `jepa_wm_pusht` | Push-T | 224Γ—224 | DINOv2 ViT-S/14 | 6 |
| `jepa_wm_pointmaze` | PointMaze | 224Γ—224 | DINOv2 ViT-S/14 | 6 |
| `jepa_wm_wall` | Wall | 224Γ—224 | DINOv2 ViT-S/14 | 6 |

### DINO-WM Baseline Models

| Model | Environment | Resolution | Encoder | Pred. Depth |
|-------|-------------|------------|---------|-------------|
| `dino_wm_droid` | DROID & RoboCasa | 224Γ—224 | DINOv2 ViT-S/14 | 6 |
| `dino_wm_metaworld` | Metaworld | 224Γ—224 | DINOv2 ViT-S/14 | 6 |
| `dino_wm_pusht` | Push-T | 224Γ—224 | DINOv2 ViT-S/14 | 6 |
| `dino_wm_pointmaze` | PointMaze | 224Γ—224 | DINOv2 ViT-S/14 | 6 |
| `dino_wm_wall` | Wall | 224Γ—224 | DINOv2 ViT-S/14 | 6 |

### V-JEPA-2-AC Baseline Models

| Model | Environment | Resolution | Encoder | Pred. Depth |
|-------|-------------|------------|---------|-------------|
| `vjepa2_ac_droid` | DROID & RoboCasa | 256Γ—256 | V-JEPA-2 ViT-G/16 | 24 |
| `vjepa2_ac_oss` | DROID & RoboCasa | 256Γ—256 | V-JEPA-2 ViT-G/16 | 24 |

### VM2M Decoder Heads

| Model | Encoder | Resolution |
|-------|---------|------------|
| `dinov2_vits_224` | DINOv2 ViT-S/14 | 224Γ—224 |
| `dinov2_vits_224_INet` | DINOv2 ViT-S/14 | 224Γ—224 |
| `dinov3_vitl_256_INet` | DINOv3 ViT-L/16 | 256Γ—256 |
| `vjepa2_vitg_256_INet` | V-JEPA-2 ViT-G/16 | 256Γ—256 |

## Usage

### Via PyTorch Hub (Recommended)

```python
import torch

# Load JEPA-WM models
model, preprocessor = torch.hub.load('facebookresearch/jepa-wms', 'jepa_wm_droid')
model, preprocessor = torch.hub.load('facebookresearch/jepa-wms', 'jepa_wm_metaworld')

# Load DINO-WM baselines
model, preprocessor = torch.hub.load('facebookresearch/jepa-wms', 'dino_wm_metaworld')

# Load V-JEPA-2-AC baseline
model, preprocessor = torch.hub.load('facebookresearch/jepa-wms', 'vjepa2_ac_droid')
```

### Via Hugging Face Hub

```python
from huggingface_hub import hf_hub_download
import torch

# Download a specific checkpoint
checkpoint_path = hf_hub_download(
    repo_id="facebook/jepa-wms",
    filename="jepa_wm_droid.pth.tar"
)

# Load checkpoint (contains 'encoder', 'predictor', and 'heads' state dicts).
# On PyTorch >= 2.6, pass weights_only=False: the checkpoint stores optimizer
# and scaler state alongside the weights, which the default safe loader rejects.
checkpoint = torch.load(checkpoint_path, map_location="cpu", weights_only=False)
print(checkpoint.keys())  # dict_keys(['encoder', 'predictor', 'heads', 'opt', 'scaler', 'epoch', 'batch_size', 'lr', 'amp'])
```

> **Note**: This only downloads the weights. To instantiate the full model with the correct
> architecture and load the weights, we recommend using PyTorch Hub (see above) or cloning the
> [jepa-wms repository](https://github.com/facebookresearch/jepa-wms) and using the training/eval scripts.

## Citation

```bibtex
@misc{terver2025drivessuccessphysicalplanning,
      title={What Drives Success in Physical Planning with Joint-Embedding Predictive World Models?},
      author={Basile Terver and Tsung-Yen Yang and Jean Ponce and Adrien Bardes and Yann LeCun},
      year={2025},
      eprint={2512.24497},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2512.24497},
}
```

## License

These models are licensed under [CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).

## Links

- πŸ“„ [Paper](https://arxiv.org/abs/2512.24497)
- πŸ’» [GitHub Repository](https://github.com/facebookresearch/jepa-wms)
- πŸ€— [Datasets](https://huggingface.co/datasets/facebook/jepa-wms)
- πŸ€— [Models](https://huggingface.co/facebook/jepa-wms)