RanXinByteDance commited on
Commit
bc433dd
Β·
verified Β·
1 Parent(s): 3908c64

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +95 -3
README.md CHANGED
@@ -1,3 +1,95 @@
1
- ---
2
- license: apache-2.0
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ base_model:
4
+ - Qwen/Qwen2.5-Math-7B
5
+ pipeline_tag: text-generation
6
+ tags:
7
+ - lean4
8
+ - step-prover
9
+ ---
10
+
11
+ <div align="center">
12
+ <h1 style="font-size: 2.0em;">πŸš€ BFS-Prover-V2: Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers</h1>
13
+ <div style="display: flex; justify-content: center; gap: 8px; flex-wrap: wrap;">
14
+ <a href="https://arxiv.org/abs/2509.06493"><img src="https://img.shields.io/badge/arXiv-2509.06493-b31b1b.svg" alt="arXiv"></a>
15
+ <a href="https://choosealicense.com/licenses/apache-2.0/"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="License: Apache 2.0"></a>
16
+ <a href="https://github.com/leanprover-community/mathlib4"><img src="https://img.shields.io/badge/Lean-4-orange" alt="Lean 4"></a>
17
+ </div>
18
+ <h2>State-of-the-art tactic generation model in Lean4</h2>
19
+ </div>
20
+ This repository contains the latest tactic generator model checkpoint from BFS-Prover-V2, a state-of-the-art step-level theorem proving system in Lean4. While the full BFS-Prover-V2 system integrates multiple components for scalable theorem proving, we are releasing the core tactic generation model here. Given a proof state in Lean4, the model generates a tactic that transforms the current proof state into a new state, progressively working towards completing the proof.
21
+
22
+ **πŸ“„ Paper: [Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers](https://arxiv.org/abs/2509.06493)**
23
+
24
+
25
+ ## ✨ Model Details
26
+
27
+ - Base Model: Qwen2.5-32B
28
+ - Training Approach: Multi-stage expert iteration with best-first tree search
29
+ - Training Data Sources:
30
+ - Mathlib (via LeanDojo)
31
+ - Lean-Github repositories
32
+ - Autoformalized NuminaMath datasets
33
+
34
+ ## πŸ“ˆ Performance
35
+ BFS-Prover-V2-32B achieves 95.08% on the miniF2F test, when integrated with the planner-based multi-agent tree search system, which significantly outperforms all previous step-provers.
36
+ Additionally, the model demonstrates strong generalization to undergraduate-level mathematics, independently attaining 41.4% on the ProofNet test without a planner.
37
+
38
+ ## βš™οΈ Usage
39
+ - The model expects Lean4 tactic states in the format `"{state}:::"`
40
+ - `:::` serves as a special indicator to signal the model to generate a tactic for the given state.
41
+ - The model will echo back the input state followed by the generated tactic.
42
+
43
+
44
+ ```python
45
+ # Example code for loading and using the tactic generator model
46
+ from transformers import AutoModelForCausalLM, AutoTokenizer
47
+ model = AutoModelForCausalLM.from_pretrained("ByteDance-Seed/BFS-Prover-V2-32B")
48
+ tokenizer = AutoTokenizer.from_pretrained("ByteDance-Seed/BFS-Prover-V2-32B")
49
+
50
+ # imo_1964_p2 from miniF2F
51
+ state = """a b c : ℝ
52
+
53
+ hβ‚€ : 0 < a ∧ 0 < b ∧ 0 < c
54
+
55
+ h₁ : c < a + b
56
+
57
+ hβ‚‚ : b < a + c
58
+
59
+ h₃ : a < b + c
60
+
61
+ ⊒ a ^ 2 * (b + c - a) + b ^ 2 * (c + a - b) + c ^ 2 * (a + b - c) ≀ 3 * a * b * c"""
62
+
63
+ # Tactic generation
64
+ sep = ":::"
65
+ prompt = state + sep
66
+ inputs = tokenizer(prompt, return_tensors="pt")
67
+ outputs = model.generate(**inputs)
68
+ tactic = tokenizer.decode(outputs[0], skip_special_tokens=True).split(sep)[1]
69
+ print(tactic)
70
+
71
+ # Generated tactic: "nlinarith [sq_nonneg (a - b), sq_nonneg (c - a), sq_nonneg (b - c)]"
72
+ ```
73
+
74
+ ## πŸ“š Citation
75
+
76
+ If you use this model in your research, please cite our paper:
77
+
78
+ ```bibtex
79
+ @article{xin2025bfsproverv2,
80
+ title={Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers},
81
+ author={Xin, Ran and Zheng, Zeyu and Nie, Yanchen and Yuan, Kun and Xiao, Xia},
82
+ journal={arXiv preprint arXiv:2509.06493},
83
+ year={2025}
84
+ }
85
+ ```
86
+
87
+ ## πŸ“„ License
88
+
89
+ https://choosealicense.com/licenses/apache-2.0/
90
+
91
+ ## πŸ“§ Contact
92
+
93
+ For questions and feedback about the tactic generator model, please contact:
94
+ - Ran Xin (ran.xin@bytedance.com)
95
+ - Zeyu Zheng (zeyuzhen@andrew.cmu.edu)