File size: 3,933 Bytes
55a1033
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
5ffa9a8
55a1033
 
 
 
 
b1e10aa
55a1033
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
5ffa9a8
55a1033
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
---
license: apache-2.0
datasets:
- purewhite42/DExploration_Dataset
language:
- en
base_model: Goedel-Prover/Goedel-Prover-V2-8B
pipeline_tag: text-generation
library_name: transformers
tags:
- formal-mathematics
- lean4
- data-synthesis
- problem-generation
- formal-statement-synthesis
---

<div align="center">
  <h1 style="font-size: 1.5em;">[ICLR'26] Let's Explore Step by Step: Generating Provable Formal Statements with Deductive Exploration</h1>
  <div style="display: flex; justify-content: center; gap: 8px; flex-wrap: wrap;">
    <a href="https://choosealicense.com/licenses/apache-2.0/"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="License: Apache 2.0"></a>
    <a href="https://github.com/leanprover-community/mathlib4"><img src="https://img.shields.io/badge/Lean-4-orange" alt="Lean 4"></a>
    <a href="https://github.com/Purewhite2019/dexploration_main"><img src="https://img.shields.io/badge/GitHub-%23121011.svg?logo=github&logoColor=white" alt="GitHub"></a>
  </div>
  <h2>Qi Liu, Kangjie Bao, Yue Yang, Xinhao Zheng, Renqiu Xia, Qinxiang Cao, Junchi Yan* (* indicates corresponding author)</h2>
  <h3>School of Computer Science & School of Artificial Intelligence, Shanghai Jiao Tong University</h3>
  <h3>Shanghai Innovation Institute</h3>
</div>

Please refer to the [📺GitHub repo](https://github.com/Purewhite2019/dexploration_main) and
[📃Paper](https://openreview.net/pdf?id=Njrkeo3DiJ) for more details.

## 🔍 About

**DExplorer-8B** is a Lean 4-based agent that generates provable formal mathematical statements through step-by-step **Deductive Exploration (DExploration)**. Instead of directly synthesizing problems in one shot, DExplorer explores the mathematical world step by step — introducing variables/hypotheses, deriving intermediate facts, and submitting conclusions — with each step verified by the Lean 4 kernel. This ensures the **provability** of generated statements while enabling the synthesis of **complex** problems that push the limits of state-of-the-art provers.

This model is fine-tuned from [Goedel-Prover-V2-8B](https://huggingface.co/Goedel-LM/Goedel-Prover-V2-8B) on [DExploration-40K](https://huggingface.co/datasets/purewhite42/DExploration-40K) using supervised fine-tuning.

## ⚙️ Usage

This model is SFTed for the **DExploration** task: given the current exploration state (introduced variables/hypotheses and Lean 4 context), the model proposes the next exploration step — either introducing a new variable/hypothesis, deriving a new fact, or submitting a conclusion.

See the [📺GitHub repo](https://github.com/Purewhite2019/dexploration_main) for prompt templates and detailed usage.

## 📄 Citation

If you find our work useful in your research, please cite:

```bibtex
@inproceedings{
liu2026lets,
title={Let's Explore Step by Step: Generating Provable Formal Statements with Deductive Exploration},
author={Qi Liu and Kangjie Bao and Yue Yang and Xinhao Zheng and Renqiu Xia and Qinxiang Cao and Junchi Yan},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=Njrkeo3DiJ}
}
```

## ©️ License

This project is released under the Apache 2.0 license. See [LICENSE](LICENSE) for details.

## 🤝 Contributing

We welcome contributions! Please feel free to submit issues or pull requests.

## 📧 Contact

For questions about the paper, data, or code:

- **Qi Liu**: purewhite@sjtu.edu.cn
- **Issues**: [GitHub Issues](https://github.com/Purewhite2019/dexploration_main/issues)

## 👍 Acknowledgments
- [NuminaMath-LEAN](https://huggingface.co/datasets/AI-MO/NuminaMath-LEAN) for the high-quality statement-proof data.
- [Goedel-Prover](https://huggingface.co/Goedel-LM/Goedel-Prover-V2-8B) for the base model.
- [Pantograph](https://github.com/leanprover/Pantograph) for Lean 4 interaction.
- And many other open-source projects!