ars22 committed on
Commit a03dd85 · verified · 1 Parent(s): 455ac25

Update README.md

Files changed (1): README.md (+1 -1)
README.md CHANGED
@@ -22,7 +22,7 @@ base_model:
 
 ## Model Summary
 
-QED-Nano is a 4B parameter model explicitly post-trained to strengthen its proof-writing capabilities. Despite its small size, QED-Nano achieves an impressive 40% score on the challenging IMO-ProofBench benchmark (+20% over the Qwen3 base model), matching the performance of [GPT-OSS-120B](https://huggingface.co/openai/gpt-oss-120b) from OpenAI. With an agent scaffold that scales inference-time compute to over 1M tokens per problem, QED-Nano approaches the performance of Gemini-3-Pro:
+QED-Nano is a 4B parameter model explicitly post-trained to strengthen its proof-writing capabilities. Despite its small size, QED-Nano achieves an impressive 40% score on the challenging IMO-ProofBench benchmark (+20% over the Qwen3 base model), matching the performance of [GPT-OSS-120B](https://huggingface.co/openai/gpt-oss-120b) from OpenAI. With an agent scaffold that scales inference-time compute to over 1M tokens per problem, QED-Nano approaches the performance of Gemini-3-Pro. Crucially, the same agentic scaffold on the base model (Qwen3-4B-Thinking-2507) barely improves performance.
 
 ![imoproofbench.png](https://huggingface.co/lm-provers/QED-Nano/resolve/main/imoproofbench.png)
 