license: mit

Models used in 'Verification of the Implicit World Model in a Generative Model via Adversarial Sequences' (ICLR 2026).

This repo contains the 48 chess-playing GPT-2 and LLaMA models and the 24 board state probes used in the experiments of the paper.

Each model architecture folder contains six subfolders, one per dataset used in our experiments. Each dataset subfolder contains four checkpoint files, one per training method:

Next-token prediction (NT) → next_token.ckpt
Matching the probability distribution (PD) of valid single token continuations → prob_dist.ckpt
NT with a jointly trained board state probe (NT+JP) → next_token_joint_probe.ckpt
PD with a jointly trained board state probe (PD+JP) → prob_dist_joint_probe.ckpt
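
The layout described above can be sketched in a few lines of Python. This is only an illustration: the architecture and dataset folder names below are hypothetical placeholders, not the actual folder names in this repo. The checkpoint filenames are the ones listed above.

```python
from itertools import product

# Checkpoint filenames as listed above, keyed by training-method abbreviation.
CHECKPOINT_FILES = {
    "NT": "next_token.ckpt",
    "PD": "prob_dist.ckpt",
    "NT+JP": "next_token_joint_probe.ckpt",
    "PD+JP": "prob_dist_joint_probe.ckpt",
}

# Hypothetical folder names (placeholders only; check the repo for the real ones).
ARCHITECTURES = ["gpt2", "llama"]
DATASETS = [f"dataset_{i}" for i in range(1, 7)]


def checkpoint_paths(architectures, datasets):
    """Return every <architecture>/<dataset>/<checkpoint> path in the assumed layout."""
    return [
        f"{arch}/{dataset}/{filename}"
        for arch, dataset, filename in product(
            architectures, datasets, CHECKPOINT_FILES.values()
        )
    ]


paths = checkpoint_paths(ARCHITECTURES, DATASETS)
print(len(paths))  # 2 architectures x 6 datasets x 4 methods = 48
```

With two architectures, six datasets, and four training methods, the enumeration yields the 48 models mentioned above.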

For models trained without a joint probe (NT and PD), the corresponding linear board state probes are stored in the probes folder.
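
A minimal sketch of fetching a single checkpoint with `huggingface_hub` and deciding whether it needs a separate probe from the probes folder. The repository id and the folder names passed to `fetch_checkpoint` are assumptions made for illustration, and the helper names are hypothetical. The file layout inside the probes folder is not described here, so the sketch does not try to resolve probe paths.

```python
# Assumed repository id; adjust it if the repo lives elsewhere.
REPO_ID = "Erasiel/chess-models"


def needs_separate_probe(checkpoint_name: str) -> bool:
    """True for NT and PD checkpoints, whose probes are stored separately."""
    return "joint_probe" not in checkpoint_name


def fetch_checkpoint(arch_dir: str, dataset_dir: str, checkpoint_name: str,
                     repo_id: str = REPO_ID) -> str:
    """Download one checkpoint and return its local path.

    arch_dir and dataset_dir must be the actual folder names in the repo;
    they are not hard-coded here because this sketch does not know them.
    """
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import hf_hub_download

    filename = f"{arch_dir}/{dataset_dir}/{checkpoint_name}"
    return hf_hub_download(repo_id=repo_id, filename=filename)


checkpoint_names = [
    "next_token.ckpt",
    "prob_dist.ckpt",
    "next_token_joint_probe.ckpt",
    "prob_dist_joint_probe.ckpt",
]
separate = [name for name in checkpoint_names if needs_separate_probe(name)]
print(separate)
```

Only `next_token.ckpt` and `prob_dist.ckpt` pass the check, which is consistent with the 24 probes for the 24 models trained without a joint probe.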

Links

Paper links:
arXiv: https://arxiv.org/abs/2602.05903
HuggingFace: https://huggingface.co/papers/2602.05903

All corresponding code and links to further resources are available at https://github.com/szegedai/world-model-verification

Citation

If you use our code, models, or datasets, please cite the following:

@inproceedings{
  balogh2026verification,
  title={Verification of the Implicit World Model in a Generative Model via Adversarial Sequences},
  author={Andr{\'a}s Balogh and M{\'a}rk Jelasity},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=BLOIB8CwBI}
}
