--- license: apache-2.0 datasets: - purewhite42/DExploration_Dataset language: - en base_model: Goedel-Prover/Goedel-Prover-V2-8B pipeline_tag: text-generation library_name: transformers tags: - formal-mathematics - lean4 - data-synthesis - problem-generation - formal-statement-synthesis ---

[ICLR'26] Let's Explore Step by Step: Generating Provable Formal Statements with Deductive Exploration

License: Apache 2.0 Lean 4 GitHub

Qi Liu, Kangjie Bao, Yue Yang, Xinhao Zheng, Renqiu Xia, Qinxiang Cao, Junchi Yan* (* indicates corresponding author)

School of Computer Science & School of Artificial Intelligence, Shanghai Jiao Tong University

Shanghai Innovation Institute

Please refer to the [📺GitHub repo](https://github.com/Purewhite2019/dexploration_main) and [📃Paper](https://openreview.net/pdf?id=Njrkeo3DiJ) for more details. ## 🔍 About **DExplorer-8B** is a Lean 4-based agent that generates provable formal mathematical statements through step-by-step **Deductive Exploration (DExploration)**. Instead of directly synthesizing problems in one shot, DExplorer explores the mathematical world step by step — introducing variables/hypotheses, deriving intermediate facts, and submitting conclusions — with each step verified by the Lean 4 kernel. This ensures the **provability** of generated statements while enabling the synthesis of **complex** problems that push the limits of state-of-the-art provers. This model is fine-tuned from [Goedel-Prover-V2-8B](https://huggingface.co/Goedel-LM/Goedel-Prover-V2-8B) on [DExploration-40K](https://huggingface.co/datasets/purewhite42/DExploration-40K) using supervised fine-tuning. ## ⚙️ Usage This model is SFTed for the **DExploration** task: given the current exploration state (introduced variables/hypotheses and Lean 4 context), the model proposes the next exploration step — either introducing a new variable/hypothesis, deriving a new fact, or submitting a conclusion. See the [📺GitHub repo](https://github.com/Purewhite2019/dexploration_main) for prompt templates and detailed usage. ## 📄 Citation If you find our work useful in your research, please cite: ```bibtex @inproceedings{ liu2026lets, title={Let's Explore Step by Step: Generating Provable Formal Statements with Deductive Exploration}, author={Qi Liu and Kangjie Bao and Yue Yang and Xinhao Zheng and Renqiu Xia and Qinxiang Cao and Junchi Yan}, booktitle={The Fourteenth International Conference on Learning Representations}, year={2026}, url={https://openreview.net/forum?id=Njrkeo3DiJ} } ``` ## ©️ License This project is released under the Apache 2.0 license. See [LICENSE](LICENSE) for details. ## 🤝 Contributing We welcome contributions! Please feel free to submit issues or pull requests. ## 📧 Contact For questions about the paper, data, or code: - **Qi Liu**: purewhite@sjtu.edu.cn - **Issues**: [GitHub Issues](https://github.com/Purewhite2019/dexploration_main/issues) ## 👍 Acknowledgments - [NuminaMath-LEAN](https://huggingface.co/datasets/AI-MO/NuminaMath-LEAN) for the high-quality statement-proof data. - [Goedel-Prover](https://huggingface.co/Goedel-LM/Goedel-Prover-V2-8B) for the base model. - [Pantograph](https://github.com/leanprover/Pantograph) for Lean 4 interaction. - And many other open-source projects!