Update README.md
README.md CHANGED

@@ -1,6 +1,10 @@
 ---
 library_name: transformers
-
+license: apache-2.0
+language:
+- en
+base_model:
+- Qwen/Qwen3-4B-Thinking-2507
 ---
 
 # QED-Nano
@@ -21,16 +25,10 @@ tags: []
 
 QED-Nano is a 4B parameter language model designed to push the theorem-proving boundaries of small models. It was post-trained to solve Olympiad-level proof problems by continually improving at test time. When we scale test-time compute for our model, we get QED-Nano (Agent), which outperforms several much larger open-source models (notably GPT-OSS-120B and Qwen3-235B-Thinking) and comes close to matching Gemini-3-Pro on the challenging IMOProofBench benchmark:
 
-[
+
 
 QED-Nano is based on Qwen/Qwen3-4B-Thinking-2507 and was post-trained via a combination of supervised fine-tuning and reinforcement learning with a reasoning cache on a mixture of Olympiad proof problems from various sources.
 
-### Key features
-- Reasoning model optimized for **theorem proving**
-- **Fully open model**: open weights + full training details including public data mixture and training configs
-
-For more details refer to our blog post: https://huggingface.co/spaces/lm-provers/qed-nano-blogpost
-
 ## How to use
 
 ```python
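# Usage sketch (an assumption, not this README's official snippet): the repo ID
# "lm-provers/QED-Nano" is hypothetical, inferred from the blog's namespace, and
# the flow below is the standard transformers chat-template pattern for
# Qwen3-Thinking-based models.

def build_messages(problem: str) -> list[dict]:
    # Wrap an Olympiad-style problem statement as a single user turn.
    return [{"role": "user", "content": problem}]


if __name__ == "__main__":
    # Heavy dependencies are only needed when actually generating.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "lm-provers/QED-Nano"  # hypothetical model ID
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    messages = build_messages("Prove that the sum of two even integers is even.")
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=4096)
    # Decode only the newly generated tokens (reasoning + final proof).
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))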