ByteDance-Seed
/

BFS-Prover-V2-7B

+---
+license: apache-2.0
+base_model:
+- Qwen/Qwen2.5-Math-7B
+pipeline_tag: text-generation
+tags:
+- lean4
+- step-prover
+---
+<div align="center">
+  <h1 style="font-size: 2.0em;">🚀 BFS-Prover-V2: Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers</h1>
+  <div style="display: flex; justify-content: center; gap: 8px; flex-wrap: wrap;">
+    <a href="https://arxiv.org/abs/2509.06493"><img src="https://img.shields.io/badge/arXiv-2509.06493-b31b1b.svg" alt="arXiv"></a>
+    <a href="https://choosealicense.com/licenses/apache-2.0/"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="License: Apache 2.0"></a>
+    <a href="https://github.com/leanprover-community/mathlib4"><img src="https://img.shields.io/badge/Lean-4-orange" alt="Lean 4"></a>
+  </div>
+  <h2>State-of-the-art tactic generation model in Lean4</h2>
+</div>
+This repository contains the latest tactic generator model checkpoint from BFS-Prover-V2, a state-of-the-art step-level theorem proving system in Lean4. While the full BFS-Prover-V2 system integrates multiple components for scalable theorem proving, we are releasing the core tactic generation model here. Given a proof state in Lean4, the model generates a tactic that transforms the current proof state into a new state, progressively working towards completing the proof.
+**📄 Paper: [Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers](https://arxiv.org/abs/2509.06493)**
+## ✨ Model Details
+- Base Model: Qwen2.5-32B
+- Training Approach: Multi-stage expert iteration with best-first tree search
+- Training Data Sources:
+  - Mathlib (via LeanDojo)
+  - Lean-Github repositories
+  - Autoformalized NuminaMath datasets
+## 📈 Performance
+BFS-Prover-V2-32B achieves 95.08% on the miniF2F test, when integrated with the planner-based multi-agent tree search system, which significantly outperforms all previous step-provers.
+Additionally, the model demonstrates strong generalization to undergraduate-level mathematics, independently attaining 41.4% on the ProofNet test without a planner.
+## ⚙️ Usage
+- The model expects Lean4 tactic states in the format `"{state}:::"`
+- `:::` serves as a special indicator to signal the model to generate a tactic for the given state.
+- The model will echo back the input state followed by the generated tactic.
+```python
+# Example code for loading and using the tactic generator model
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model = AutoModelForCausalLM.from_pretrained("ByteDance-Seed/BFS-Prover-V2-32B")
+tokenizer = AutoTokenizer.from_pretrained("ByteDance-Seed/BFS-Prover-V2-32B")
+# imo_1964_p2 from miniF2F
+state = """a b c : ℝ
+    h₀ : 0 < a ∧ 0 < b ∧ 0 < c
+    h₁ : c < a + b
+    h₂ : b < a + c
+    h₃ : a < b + c
+    ⊢ a ^ 2 * (b + c - a) + b ^ 2 * (c + a - b) + c ^ 2 * (a + b - c) ≤ 3 * a * b * c"""
+# Tactic generation
+sep = ":::"
+prompt = state + sep
+inputs = tokenizer(prompt, return_tensors="pt")
+outputs = model.generate(**inputs)
+tactic = tokenizer.decode(outputs[0], skip_special_tokens=True).split(sep)[1]
+print(tactic)
+# Generated tactic: "nlinarith [sq_nonneg (a - b), sq_nonneg (c - a), sq_nonneg (b - c)]"
+```
+## 📚 Citation
+If you use this model in your research, please cite our paper:
+```bibtex
+@article{xin2025bfsproverv2,
+  title={Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers},
+  author={Xin, Ran and Zheng, Zeyu and Nie, Yanchen and Yuan, Kun and Xiao, Xia},
+  journal={arXiv preprint arXiv:2509.06493},
+  year={2025}
+}
+```
+## 📄 License
+https://choosealicense.com/licenses/apache-2.0/
+## 📧 Contact
+For questions and feedback about the tactic generator model, please contact:
+- Ran Xin (ran.xin@bytedance.com)
+- Zeyu Zheng (zeyuzhen@andrew.cmu.edu)