mmtke commited on
Commit
b3cc8b0
·
verified ·
1 Parent(s): ae49dc4

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +77 -3
README.md CHANGED
@@ -1,3 +1,77 @@
1
- ---
2
- license: apache-2.0
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - your-org/expr_2000
5
+ language:
6
+ - en
7
+ base_model:
8
+ - Qwen/Qwen2.5-Math-7B
9
+ pipeline_tag: text-generation
10
+ library_name: transformers
11
+ tags:
12
+ - lean4
13
+ - theorem-proving
14
+ - formal-mathematics
15
+ - retrieval-augmented
16
+ - mathematical-reasoning
17
+ ---
18
+
19
+ <div align="center">
20
+ <h1 style="font-size: 2.0em;">REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning</h1>
21
+ <div style="display: flex; justify-content: center; gap: 8px; flex-wrap: wrap;">
22
+ <a href="https://arxiv.org/abs/2505.20613"><img src="https://img.shields.io/badge/arXiv-Paper-b31b1b.svg" alt="arXiv"></a>
23
+ <a href="https://github.com/frenzymath/REAL-Prover"><img src="https://img.shields.io/badge/GitHub-REAL--Prover-blue.svg" alt="REAL-Prover"></a>
24
+ <a href="https://choosealicense.com/licenses/apache-2.0/"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="License: Apache 2.0"></a>
25
+ <a href="https://leanprover-community.github.io/"><img src="https://img.shields.io/badge/Lean-4-orange" alt="Lean 4"></a>
26
+ </div>
27
+ </div>
28
+
29
+ ## Overview
30
+
31
+ **REAL-Prover** is a retrieval-augmented, stepwise large language model-based theorem prover for **Lean 4**. It is built on top of **Qwen2.5-Math-7B**, and fine-tuned using our custom pipeline and dataset to support formal reasoning in Lean. The model is integrated with a retrieval module (**Leansearch-PS**) that fetches relevant theorems from the Lean math library to enhance generation quality.
32
+
33
+ Our approach introduces several innovations:
34
+
35
+ - A retrieval-enhanced generation pipeline
36
+ - A custom interactive environment (**Jixia-interactive**) for data synthesis
37
+ - Evaluations on both **ProofNet** and our proposed benchmark **FATE-M**
38
+
39
+
40
+ ## Associated Paper
41
+
42
+ **REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning**
43
+ [arXiv:2505.20613](https://arxiv.org/abs/2505.20613)
44
+ ## Running Experiments
45
+ **Repository**: [https://github.com/frenzymath/REAL-Prover](https://github.com/frenzymath/REAL-Prover)
46
+
47
+ ### Input Data Format
48
+ Each input data should be a JSONL line with two keys: `id` and `formal_statement`.
49
+ See `data/minif2f_test.jsonl` for examples.
50
+
51
+ ### Start Leansearch-PS-inference server
52
+ See the detailed documentation in [LeanSearch-PS-inference/README.md](../LeanSearch-PS-inference/README.md).
53
+
54
+
55
+ ### Running
56
+
57
+
58
+
59
+ Create a runtime configuration file (in TOML format) in `./experiment/examples/`, then run:
60
+
61
+ ```bash
62
+ cd Realprover
63
+ python -m experiment.run /path/to/config.toml
64
+ ```
65
+
66
+ ## Citation
67
+
68
+ ```bibtex
69
+ @article{shen2025realproverretrievalaugmentedlean,
70
+ title={REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning},
71
+ author={Ziju Shen and Naohao Huang and Fanyi Yang and Yutong Wang and Guoxiong Gao and Tianyi Xu and Jiedong Jiang and Wanyi He and Pu Yang and Mengzhou Sun and Haocheng Ju and Peihao Wu and Bryan Dai and Bin Dong},
72
+ year={2025},
73
+ eprint={2505.20613},
74
+ archivePrefix={arXiv},
75
+ primaryClass={cs.CL},
76
+ url={https://arxiv.org/abs/2505.20613},
77
+ }