Add `library_name` metadata, improve top links, and update citation URL
#1 opened by nielsr (HF Staff)
README.md CHANGED
|
@@ -2,6 +2,8 @@
|
|
| 2 |
language:
|
| 3 |
- en
|
| 4 |
- multilingual
|
|
|
|
|
|
|
| 5 |
tags:
|
| 6 |
- physics
|
| 7 |
- reinforcement-learning
|
|
@@ -9,8 +11,7 @@ tags:
|
|
| 9 |
- reasoning
|
| 10 |
- competition
|
| 11 |
- education
|
| 12 |
-
|
| 13 |
-
pipeline_tag: text-generation
|
| 14 |
---
|
| 15 |
|
| 16 |
<div align="center">
|
|
@@ -18,6 +19,8 @@ pipeline_tag: text-generation
|
|
| 18 |
</div>
|
| 19 |
|
| 20 |
<p align="center">
|
|
|
|
|
|
|
| 21 |
<a href="https://prime-rl.github.io/P1/"><b>P1 Project Page</b></a> |
|
| 22 |
<a href="https://phyarena.github.io/"><b>HiPhO Leaderboard</b></a>
|
| 23 |
</p>
|
|
@@ -47,7 +50,7 @@ pipeline_tag: text-generation
|
|
| 47 |
<div align="center">
|
| 48 |
|
| 49 |
| Model | Score | Medal |
|
| 50 |
-
|
| 51 |
| **P1-30B-A3B** | **18.5** | **🥈 Silver** |
|
| 52 |
| DeepSeek-R1 | 18.5 | **🥈 Silver** |
|
| 53 |
| Qwen3-235B-A22B-Thinking-2507 | 17.1 | **🥈 Silver** |
|
|
@@ -59,7 +62,7 @@ pipeline_tag: text-generation
|
|
| 59 |
<div align="center">
|
| 60 |
|
| 61 |
| Category | P1-30B-A3B | Qwen3-235B-A22B | DeepSeek-R1 | Qwen3-30B-A3B (Base) |
|
| 62 |
-
|
| 63 |
| **Overall Score** | **32.5** | 33.5 | 32.9 | 29.9 |
|
| 64 |
| Gold Medals (🥇) | 8 | 10 | 9 | 6 |
|
| 65 |
| Silver Medals (🥈) | 4 | 3 | 3 | 6 |
|
|
@@ -74,8 +77,8 @@ Beyond physics reasoning, P1 improves across multiple domains. As shown below, P
|
|
| 74 |
|
| 75 |
<div align="center">
|
| 76 |
|
| 77 |
-
| Model | AIME24 | AIME25 | HMMT | GPQA | HLE | LiveCodeBench | LiveBench
|
| 78 |
-
|
| 79 |
| Qwen3-30B-A3B-Thinking-2507 (Base) | 90.4 | 85.0 | 71.3 | 73.0 | 11.6 | 66.7 | 76.6 |
|
| 80 |
| **P1-30B-A3B** | **91.0** | **91.0** | **76.9** | **74.4** | **14.3** | **68.1** | **77.0** |
|
| 81 |
|
|
@@ -138,8 +141,6 @@ We are grateful to the open-source community for their invaluable contributions.
|
|
| 138 |
title={P1: Mastering Physics Olympiads with Reinforcement Learning},
|
| 139 |
author={P1 Team},
|
| 140 |
year={2025},
|
| 141 |
-
url={https://
|
| 142 |
}
|
| 143 |
-
```
|
| 144 |
-
|
| 145 |
-
</div>
|
|
|
|
| 2 |
language:
|
| 3 |
- en
|
| 4 |
- multilingual
|
| 5 |
+
license: apache-2.0
|
| 6 |
+
pipeline_tag: text-generation
|
| 7 |
tags:
|
| 8 |
- physics
|
| 9 |
- reinforcement-learning
|
|
|
|
| 11 |
- reasoning
|
| 12 |
- competition
|
| 13 |
- education
|
| 14 |
+
library_name: transformers
|
|
|
|
| 15 |
---
|
| 16 |
|
| 17 |
<div align="center">
|
|
|
|
| 19 |
</div>
|
| 20 |
|
| 21 |
<p align="center">
|
| 22 |
+
<a href="https://huggingface.co/papers/2511.13612"><b>Paper</b></a> |
|
| 23 |
+
<a href="https://github.com/PRIME-RL/P1"><b>💻 Code</b></a> |
|
| 24 |
<a href="https://prime-rl.github.io/P1/"><b>P1 Project Page</b></a> |
|
| 25 |
<a href="https://phyarena.github.io/"><b>HiPhO Leaderboard</b></a>
|
| 26 |
</p>
|
|
|
|
| 50 |
<div align="center">
|
| 51 |
|
| 52 |
| Model | Score | Medal |
|
| 53 |
+
|:-----:|:-----:|:-----:|
|
| 54 |
| **P1-30B-A3B** | **18.5** | **🥈 Silver** |
|
| 55 |
| DeepSeek-R1 | 18.5 | **🥈 Silver** |
|
| 56 |
| Qwen3-235B-A22B-Thinking-2507 | 17.1 | **🥈 Silver** |
|
|
|
|
| 62 |
<div align="center">
|
| 63 |
|
| 64 |
| Category | P1-30B-A3B | Qwen3-235B-A22B | DeepSeek-R1 | Qwen3-30B-A3B (Base) |
|
| 65 |
+
|:--------:|:----------:|:---------------:|:-----------:|:--------------------:|
|
| 66 |
| **Overall Score** | **32.5** | 33.5 | 32.9 | 29.9 |
|
| 67 |
| Gold Medals (🥇) | 8 | 10 | 9 | 6 |
|
| 68 |
| Silver Medals (🥈) | 4 | 3 | 3 | 6 |
|
|
|
|
| 77 |
|
| 78 |
<div align="center">
|
| 79 |
|
| 80 |
+
| Model | AIME24 | AIME25 | HMMT | GPQA | HLE | LiveCodeBench | LiveBench |
|
| 81 |
+
|:-----:|:------:|:------:|:----:|:----:|:---:|:-------------:|:---------:|
|
| 82 |
| Qwen3-30B-A3B-Thinking-2507 (Base) | 90.4 | 85.0 | 71.3 | 73.0 | 11.6 | 66.7 | 76.6 |
|
| 83 |
| **P1-30B-A3B** | **91.0** | **91.0** | **76.9** | **74.4** | **14.3** | **68.1** | **77.0** |
|
| 84 |
|
|
|
|
| 141 |
title={P1: Mastering Physics Olympiads with Reinforcement Learning},
|
| 142 |
author={P1 Team},
|
| 143 |
year={2025},
|
| 144 |
+
url={https://huggingface.co/papers/2511.13612}
|
| 145 |
}
|
| 146 |
+
```
|
|
|
|
|
|
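The metadata change in this PR can be sanity-checked with a small script. The following is a minimal sketch, not part of the PR: `parse_front_matter` is a hypothetical hand-rolled helper that only understands top-level `key: value` pairs and `- item` lists, and `FRONT_MATTER` is assembled from the lines visible on the new side of the diff. The opening `---` is assumed, and the one tag line the diff does not show is omitted.

```python
# Sketch: parse the simple YAML front matter of a model card and check the
# metadata keys touched by this PR. Hypothetical helper; handles only
# top-level "key: value" scalars and "- item" lists, nothing nested.

FRONT_MATTER = """---
language:
- en
- multilingual
license: apache-2.0
pipeline_tag: text-generation
tags:
- physics
- reinforcement-learning
- reasoning
- competition
- education
library_name: transformers
---
"""


def parse_front_matter(text):
    """Return a dict of the front matter between the first two '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None  # key whose list items are currently being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of the front matter block
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            continue  # not a "key: value" line; ignored in this sketch
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value
            current_key = None
        else:
            meta[key] = []
            current_key = key
    return meta


meta = parse_front_matter(FRONT_MATTER)
for required in ("library_name", "license", "pipeline_tag"):
    print(required, "->", meta.get(required, "MISSING"))
```

For a real model card, a full YAML parser would be a safer choice than this helper, since front matter can contain nested or quoted values that the sketch ignores.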