Improve model card: Add pipeline tag, links, and structure (#1)
Opened by nielsr (HF Staff)

README.md CHANGED
@@ -1,23 +1,35 @@
 ---
-license: mit
-language:
-- en
 datasets:
 - elefantai/p2p-full-data
 - elefantai/p2p-toy-examples
+language:
+- en
+license: mit
+pipeline_tag: image-text-to-text
 ---
+
+# Open Pixel2Play (P2P)
+
 
 
 Open Pixel2Play (P2P) is an open foundation model trained to play video games in real time. The model takes visual input (images) and text instructions and outputs keyboard and mouse actions, enabling direct interaction with real game environments.
 
-
+- **Paper:** [Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing](https://huggingface.co/papers/2601.04575)
+- **Project Page:** [elefant-ai.github.io/open-p2p](https://elefant-ai.github.io/open-p2p/)
+- **Repository:** [github.com/elefant-ai/open-p2p](https://github.com/elefant-ai/open-p2p)
+
+P2P is trained on 8,000+ hours of human-annotated gameplay videos. The full dataset is available at [elefantai/p2p-full-data](https://huggingface.co/datasets/elefantai/p2p-full-data).
 
 Our smallest model (150M parameters) can be trained in ~70 hours, and the largest model (1.2B parameters) can be trained in ~140 hours on 8× H100 GPUs.
 
-
-
+## Usage
+
+For detailed instructions on installation, training, offline inference, and real-time game inference (including integration with Recap on Windows), please refer to the [official GitHub repository](https://github.com/elefant-ai/open-p2p).
+
+## Citation
+
+If you use our models or data in your research, please kindly consider citing our paper:
 
-If you use our models, please kindly consider citing our paper:
 ```bibtex
 @misc{yue2026scaling,
 title={Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing},
@@ -27,3 +39,4 @@ If you use our models, please kindly consider citing our paper:
 archivePrefix={arXiv},
 primaryClass={cs.LG}
 }
+```
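
One way to sanity-check the metadata change in this diff is to parse the new front matter and confirm that `pipeline_tag` sits alongside the existing keys. The sketch below is a minimal, standard-library-only illustration. The helper `parse_front_matter` and the constant `NEW_README_HEAD` are hypothetical names introduced here, not part of the repository, and the parser handles only the flat shape shown above (top-level `key: value` pairs and `key:` followed by `- item` lines), not general YAML.

```python
# Hypothetical sketch: NEW_README_HEAD copies the added front matter from the diff.
NEW_README_HEAD = """\
---
datasets:
- elefantai/p2p-full-data
- elefantai/p2p-toy-examples
language:
- en
license: mit
pipeline_tag: image-text-to-text
---

# Open Pixel2Play (P2P)
"""


def parse_front_matter(text):
    """Parse a flat front-matter block into a dict (not a general YAML parser)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file
    meta = {}
    current_key = None  # key whose "- item" list is currently being filled
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the metadata block
            break
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value  # scalar entry such as "license: mit"
                current_key = None
            else:
                meta[key] = []  # list entry such as "datasets:"
                current_key = key
    return meta


metadata = parse_front_matter(NEW_README_HEAD)
print(metadata)
```

For real model cards, a full YAML parser is the safer choice; this hand-rolled version only keeps the example dependency-free.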