Shell Game World Model Demo (~18M)

A small shell-game world model that predicts which cup holds a hidden ball from a single contact-sheet image.

[Screenshot: local UI overview (local-ui-overview.png)]

What This Model Does

The bundled checkpoint is designed for a shell-game benchmark:

  • three cups
  • one hidden ball
  • smooth cup swaps
  • predict the final cup from the observed sequence

The repo slices a contact sheet into ordered frames, runs the checkpoint locally, and returns the final cup prediction.
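
The slicing step could be sketched as below. This is a minimal illustration, not the repo's actual code: it assumes the contact sheet is a regular grid of equally sized tiles read left to right, top to bottom, and the function name `frame_boxes` and the `rows`/`cols` parameters are hypothetical. It returns crop boxes in the (left, upper, right, lower) form that Pillow's `Image.crop` accepts.

```python
from typing import List, Tuple

# (left, upper, right, lower), the box format accepted by Pillow's Image.crop
Box = Tuple[int, int, int, int]


def frame_boxes(sheet_width: int, sheet_height: int, rows: int, cols: int) -> List[Box]:
    """Return crop boxes for a rows x cols contact sheet in row-major order.

    Assumption (not confirmed by the repo): tiles are equally sized and the
    frame order is left to right, then top to bottom.
    """
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and cols must be positive")
    if sheet_width % cols or sheet_height % rows:
        raise ValueError("sheet size is not an exact multiple of the grid")

    tile_w = sheet_width // cols
    tile_h = sheet_height // rows

    boxes: List[Box] = []
    for r in range(rows):          # top to bottom
        for c in range(cols):      # left to right within a row
            left, upper = c * tile_w, r * tile_h
            boxes.append((left, upper, left + tile_w, upper + tile_h))
    return boxes
```

With Pillow installed, `[sheet.crop(b) for b in frame_boxes(*sheet.size, rows, cols)]` would yield the frames in order, provided the sheet really uses this layout.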

Main Result

Hidden-ball balanced accuracy:

Setting               Result
1 swap                100.0% ± 0.0%
3 swaps               74.3% ± 2.9%
1-2-3-4 swaps         75.8% ± 2.7%
1-2-3-4-5 swaps       74.5% ± 1.2%
1-2-3-4-5-6 swaps     73.7% ± 1.1%

With three cups, random chance is 33.3%.
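
For readers unfamiliar with the metric, balanced accuracy is usually defined as the mean of per-class recall, so each cup counts equally regardless of how often it is the true answer. The sketch below uses that standard definition; whether the repo computes its numbers in exactly this way is an assumption, and the function name and toy labels are illustrative only.

```python
from collections import defaultdict
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean per-class recall over the classes that appear in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one example")

    total = defaultdict(int)    # examples per true class
    correct = defaultdict(int)  # correctly predicted examples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    # Average the recall of each class, giving every class equal weight.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy labels: cup indices 0, 1, 2 (made up for illustration).
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
score = balanced_accuracy(y_true, y_pred)          # recalls: 0.75, 1.0, 0.5
always_cup_0 = balanced_accuracy(y_true, [0] * 8)  # recalls: 1.0, 0.0, 0.0
```

Under this definition, a predictor that always picks the same cup scores one third on a three-cup task, which matches the chance level quoted above.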

Important Caveat

This is not a plain next-step JEPA checkpoint.

The working shell-game variant uses explicit hidden-state supervision during training. Pure next-step JEPA-style prediction did not solve this benchmark.

Quick Start

git clone https://github.com/illegalcall/jepa-track-hidden-ball.git
cd jepa-track-hidden-ball
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

PYTHONPATH=local_inference_assets python3 demo_jepawm_predict.py \
  --checkpoint /path/to/lewm_auxonly_123456_h12_epoch_12_object.ckpt \
  --sheet demo_cases/case_1/sheet.png \
  --history-size 12 \
  --device cpu \
  --output result.json

Local Demo UI

python3 serve_demo_ui.py --host 127.0.0.1 --port 8123

Then open:

  • http://127.0.0.1:8123/demo_ui/

[Screenshot: local UI output card (local-ui-output-card.png)]

Model Size

This checkpoint has 18,048,683 trainable parameters.

Files In This Release

  • lewm_auxonly_123456_h12_epoch_12_object.ckpt
  • local-ui-overview.png
  • local-ui-output-card.png

Limitations

  • this checkpoint is specialized to the shell-game benchmark in the repo
  • it is not a general-purpose vision model
  • retraining from scratch still depends on upstream LeWM / stable-worldmodel code