---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-Math-7B
pipeline_tag: text-generation
tags:
- lean4
- step-prover
---

<div align="center">
<h1 style="font-size: 2.0em;">BFS-Prover-V2: Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers</h1>
<div align="center" style="line-height: 1;">
  <a href="https://bfs-prover.github.io/V2/">
    <img src="https://img.shields.io/badge/Homepage-BFS--Prover--V2-78DED4?style=flat-square&labelColor=2E5AA8">
  </a>
  <a href="https://arxiv.org/abs/2509.06493">
    <img src="https://img.shields.io/badge/arXiv-2509.06493-b31b1b.svg?style=flat-square&labelColor=2E5AA8">
  </a>
  <a href="https://huggingface.co/collections/ByteDance-Seed/bfs-prover-68db961a5fdf9de045440230">
    <img src="https://img.shields.io/badge/Collection-BFS--Prover--V2-808080?&style=flat-square&labelColor=2E5AA8">
  </a>
  <a href="https://github.com/cmu-l3/llmlean">
    <img src="https://img.shields.io/badge/Integration-LLMLean-black?style=flat-square&labelColor=2E5AA8">
  </a>
  <a href="https://www.apache.org/licenses/LICENSE-2.0.txt">
    <img src="https://img.shields.io/badge/License-Apache%202.0-purple.svg?style=flat-square&labelColor=2E5AA8">
  </a>
</div>
</div>

## Introduction

We introduce **BFS-Prover-V2**, a state-of-the-art open-source step-level theorem-proving system for Lean4, designed to address the dual challenges of scaling both training and inference in neural theorem proving. BFS-Prover-V2 tackles these challenges through:

1. **Training-time scaling**: A multi-stage expert iteration framework with adaptive tactic-level data filtering and periodic retraining, designed to surmount the performance plateaus that typically curtail long-term post-training.

2. **Inference-time scaling**: A planner-enhanced multi-agent tree search system for hierarchical reasoning that scales performance at inference time.

**BFS-Prover-V2** achieves 95.08% on the miniF2F test set and 41.4% on the ProofNet test set, setting a new state of the art for step-level provers.

This repo contains the **BFS-Prover-V2-7B** model, with the following features:

- Base Model: Qwen2.5-Math-7B
- Training Approach: Multi-stage expert iteration with best-first tree search
- Training Data Sources:
  - Mathlib (via LeanDojo)
  - Lean-Github repositories
  - Autoformalized NuminaMath datasets
  - Goedel-Pset

## Benchmark Performance

<div align="center">

| Model | miniF2F-test | miniF2F-valid | ProofNet-test |
|:------|:------------:|:-------------:|:-------------:|
| 👉 **BFS-Prover-V2-7B** | 82.4% | - | - |
| BFS-Prover-V2-32B | 86.1% | 85.5% | 41.4% |
| BFS-Prover-V2-32B w/ Planner | 95.08% | 95.5% | - |

</div>

## Usage

- The model expects input in the format `"{state}:::"`, where `{state}` is a Lean4 tactic state.
- The `:::` separator signals the model to generate a tactic for the given state.
- The model echoes back the input state followed by the generated tactic, so the tactic is the text after `:::` in the decoded output.
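The prompt convention above can be captured in two small helpers (hypothetical names, shown only to illustrate the `:::` format and the echo-parsing step; they are not part of the released code):

```python
SEP = ":::"

def make_prompt(state: str) -> str:
    """Append the ::: separator that cues the model to emit a tactic."""
    return state + SEP

def extract_tactic(decoded: str) -> str:
    """The model echoes the input state, so the tactic is everything after :::."""
    return decoded.split(SEP, 1)[1].strip()

# Simulated round trip: the model would return the state, the separator, then a tactic
state = "⊢ 1 + 1 = 2"
echoed_output = make_prompt(state) + " norm_num"
print(extract_tactic(echoed_output))  # prints: norm_num
```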
```python
# Load the tactic generator model and tokenizer
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("ByteDance-Seed/BFS-Prover-V2-7B")
tokenizer = AutoTokenizer.from_pretrained("ByteDance-Seed/BFS-Prover-V2-7B")

# imo_1964_p2 from miniF2F
state = """a b c : ℝ
h₀ : 0 < a ∧ 0 < b ∧ 0 < c
h₁ : c < a + b
h₂ : b < a + c
h₃ : a < b + c
⊢ a ^ 2 * (b + c - a) + b ^ 2 * (c + a - b) + c ^ 2 * (a + b - c) ≤ 3 * a * b * c"""

# Tactic generation: append the ::: separator to the tactic state
sep = ":::"
prompt = state + sep
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)

# The model echoes the state, so keep only the text after the separator
tactic = tokenizer.decode(outputs[0], skip_special_tokens=True).split(sep, 1)[1].strip()
print(tactic)

# Generated tactic: "nlinarith [sq_nonneg (a - b), sq_nonneg (c - a), sq_nonneg (b - c)]"
```
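For context, a step-prover like this is typically driven by a best-first search loop over tactic states. The sketch below is a toy illustration only, not the released search code: in the real system, `score` would come from the model's tactic log-probabilities and `expand` would apply generated tactics in Lean4, both mocked here with a trivial numeric "goal".

```python
import heapq
from itertools import count

def best_first_search(root, expand, score, is_proved, budget=100):
    """Repeatedly expand the highest-scoring open state until one is proved."""
    tie = count()  # tie-breaker so heapq never compares states directly
    frontier = [(-score(root), next(tie), root, [])]
    while frontier and budget > 0:
        _, _, state, path = heapq.heappop(frontier)
        if is_proved(state):
            return path  # sequence of tactics that closed the goal
        budget -= 1
        for tactic, child in expand(state):
            heapq.heappush(frontier, (-score(child), next(tie), child, path + [tactic]))
    return None  # search budget exhausted

# Toy stand-in: "prove" a number by reducing it to 0, one or two units per step
expand = lambda n: [("dec1", n - 1), ("dec2", n - 2)] if n > 0 else []
path = best_first_search(3, expand, score=lambda n: -n, is_proved=lambda n: n == 0)
print(path)  # prints: ['dec2', 'dec1']
```

The priority queue is what makes the search "best-first": rather than exploring depth- or breadth-first, it always continues from the state the scoring function considers most promising, which is how the model's confidence steers the proof search.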

## Citation

```bibtex
@article{xin2025scaling,
  title={Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers},
  author={Xin, Ran and Zheng, Zeyu and Nie, Yanchen and Yuan, Kun and Xiao, Xia},
  journal={arXiv preprint arXiv:2509.06493},
  year={2025}
}
```

## License

This project is licensed under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0.txt).

## Contact

For questions and feedback about the tactic generator model, please contact:
- Ran Xin (ran.xin@bytedance.com)
- Zeyu Zheng (zeyuzhen@andrew.cmu.edu)