SJTULean
/

LeanFormalizer_PPO

Model card Files Files and versions

Improve language tag

#1

by lbourdois - opened Apr 28, 2025

base: refs/heads/main

←

from: refs/pr/1

Discussion Files changed

Files changed (1) hide show

README.md +26 -14

README.md CHANGED Viewed

@@ -1,14 +1,26 @@
----
-license: apache-2.0
-datasets:
-- SJTULean/LeanStatement_SFT
-- SJTULean/LeanStatement_RL
-language:
-- en
-metrics:
-- accuracy
-base_model:
-- Qwen/Qwen2.5-7B-Instruct
----
-Through PPO using a reward model trained on merely 5% of the LeanStatement_RL dataset, our model achieved enhanced performance, reaching a **pass@1 compilation success rate** of **95.3%** (465/488) on MiniF2F and **76.0%** (284/374) on ProofNet benchmarks.
-Combined with our SFT results, these findings establish a new **state-of-the-art** in mathematics formalization as of December 2024.

+---
+license: apache-2.0
+datasets:
+- SJTULean/LeanStatement_SFT
+- SJTULean/LeanStatement_RL
+language:
+- zho
+- eng
+- fra
+- spa
+- por
+- deu
+- ita
+- rus
+- jpn
+- kor
+- vie
+- tha
+- ara
+metrics:
+- accuracy
+base_model:
+- Qwen/Qwen2.5-7B-Instruct
+---
+Through PPO using a reward model trained on merely 5% of the LeanStatement_RL dataset, our model achieved enhanced performance, reaching a **pass@1 compilation success rate** of **95.3%** (465/488) on MiniF2F and **76.0%** (284/374) on ProofNet benchmarks.
+Combined with our SFT results, these findings establish a new **state-of-the-art** in mathematics formalization as of December 2024.