Update README.md
README.md
CHANGED
@@ -14,4 +14,16 @@ P2P is trained on 8,000+ hours of human-annotated gameplay videos. The full data
 
 Our smallest model (150M parameters) can be trained in ~70 hours, and the largest model (1.2B parameters) can be trained in ~140 hours on 8× H100 GPUs.
 
-Please checkout our [website](https://elefant-ai.github.io/open-p2p/) to watch our model play against real human player on Roblox games,
+Please checkout our [website](https://elefant-ai.github.io/open-p2p/) to watch our model play against real human player on Roblox games,
+and checkout our [github](https://github.com/elefant-ai/open-p2p) for training/inference details. Our [arxiv paper](https://arxiv.org/abs/2601.04575) is also available.
+
+If you use our models, please kindly consider citing our paper:
+```bibtex
+@misc{yue2026scaling,
+title={Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing},
+author={Yuguang Yue and Irakli Salia and Samuel Hunt and Chris Green and Wenzhe Shi and Jonathan J. Hunt},
+year={2026},
+eprint={2601.04575},
+archivePrefix={arXiv},
+primaryClass={cs.LG}
+}