DExplorer-8B / README.md
purewhite42's picture
Update README.md
b1e10aa verified
---
license: apache-2.0
datasets:
- purewhite42/DExploration_Dataset
language:
- en
base_model: Goedel-Prover/Goedel-Prover-V2-8B
pipeline_tag: text-generation
library_name: transformers
tags:
- formal-mathematics
- lean4
- data-synthesis
- problem-generation
- formal-statement-synthesis
---
<div align="center">
<h1 style="font-size: 1.5em;">[ICLR'26] Let's Explore Step by Step: Generating Provable Formal Statements with Deductive Exploration</h1>
<div style="display: flex; justify-content: center; gap: 8px; flex-wrap: wrap;">
<a href="https://choosealicense.com/licenses/apache-2.0/"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="License: Apache 2.0"></a>
<a href="https://github.com/leanprover-community/mathlib4"><img src="https://img.shields.io/badge/Lean-4-orange" alt="Lean 4"></a>
<a href="https://github.com/Purewhite2019/dexploration_main"><img src="https://img.shields.io/badge/GitHub-%23121011.svg?logo=github&logoColor=white" alt="GitHub"></a>
</div>
<h2>Qi Liu, Kangjie Bao, Yue Yang, Xinhao Zheng, Renqiu Xia, Qinxiang Cao, Junchi Yan* (* indicates corresponding author)</h2>
<h3>School of Computer Science & School of Artificial Intelligence, Shanghai Jiao Tong University</h3>
<h3>Shanghai Innovation Institute</h3>
</div>
Please refer to the [📺GitHub repo](https://github.com/Purewhite2019/dexploration_main) and
[📃Paper](https://openreview.net/pdf?id=Njrkeo3DiJ) for more details.
## 🔍 About
**DExplorer-8B** is a Lean 4-based agent that generates provable formal mathematical statements through step-by-step **Deductive Exploration (DExploration)**. Instead of directly synthesizing problems in one shot, DExplorer explores the mathematical world step by step — introducing variables/hypotheses, deriving intermediate facts, and submitting conclusions — with each step verified by the Lean 4 kernel. This ensures the **provability** of generated statements while enabling the synthesis of **complex** problems that push the limits of state-of-the-art provers.
This model is fine-tuned from [Goedel-Prover-V2-8B](https://huggingface.co/Goedel-LM/Goedel-Prover-V2-8B) on [DExploration-40K](https://huggingface.co/datasets/purewhite42/DExploration-40K) using supervised fine-tuning.
## ⚙️ Usage
This model is SFTed for the **DExploration** task: given the current exploration state (introduced variables/hypotheses and Lean 4 context), the model proposes the next exploration step — either introducing a new variable/hypothesis, deriving a new fact, or submitting a conclusion.
See the [📺GitHub repo](https://github.com/Purewhite2019/dexploration_main) for prompt templates and detailed usage.
## 📄 Citation
If you find our work useful in your research, please cite:
```bibtex
@inproceedings{
liu2026lets,
title={Let's Explore Step by Step: Generating Provable Formal Statements with Deductive Exploration},
author={Qi Liu and Kangjie Bao and Yue Yang and Xinhao Zheng and Renqiu Xia and Qinxiang Cao and Junchi Yan},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=Njrkeo3DiJ}
}
```
## ©️ License
This project is released under the Apache 2.0 license. See [LICENSE](LICENSE) for details.
## 🤝 Contributing
We welcome contributions! Please feel free to submit issues or pull requests.
## 📧 Contact
For questions about the paper, data, or code:
- **Qi Liu**: purewhite@sjtu.edu.cn
- **Issues**: [GitHub Issues](https://github.com/Purewhite2019/dexploration_main/issues)
## 👍 Acknowledgments
- [NuminaMath-LEAN](https://huggingface.co/datasets/AI-MO/NuminaMath-LEAN) for the high-quality statement-proof data.
- [Goedel-Prover](https://huggingface.co/Goedel-LM/Goedel-Prover-V2-8B) for the base model.
- [Pantograph](https://github.com/leanprover/Pantograph) for Lean 4 interaction.
- And many other open-source projects!