license: apache-2.0
datasets:
- purewhite42/DExploration_Dataset
language:
- en
base_model: Goedel-Prover/Goedel-Prover-V2-8B
pipeline_tag: text-generation
library_name: transformers
tags:
- formal-mathematics
- lean4
- data-synthesis
- problem-generation
- formal-statement-synthesis
[ICLR'26] Let's Explore Step by Step: Generating Provable Formal Statements with Deductive Exploration
Qi Liu, Kangjie Bao, Yue Yang, Xinhao Zheng, Renqiu Xia, Qinxiang Cao, Junchi Yan* (* indicates corresponding author)
School of Computer Science & School of Artificial Intelligence, Shanghai Jiao Tong University
Shanghai Innovation Institute
Please refer to the ๐บGitHub repo and ๐Paper for more details.
๐ About
DExplorer-8B is a Lean 4-based agent that generates provable formal mathematical statements through step-by-step Deductive Exploration (DExploration). Instead of directly synthesizing problems in one shot, DExplorer explores the mathematical world step by step โ introducing variables/hypotheses, deriving intermediate facts, and submitting conclusions โ with each step verified by the Lean 4 kernel. This ensures the provability of generated statements while enabling the synthesis of complex problems that push the limits of state-of-the-art provers.
This model is fine-tuned from Goedel-Prover-V2-8B on DExploration-40K using supervised fine-tuning.
โ๏ธ Usage
This model is SFTed for the DExploration task: given the current exploration state (introduced variables/hypotheses and Lean 4 context), the model proposes the next exploration step โ either introducing a new variable/hypothesis, deriving a new fact, or submitting a conclusion.
See the ๐บGitHub repo for prompt templates and detailed usage.
๐ Citation
If you find our work useful in your research, please cite:
@inproceedings{
liu2026lets,
title={Let's Explore Step by Step: Generating Provable Formal Statements with Deductive Exploration},
author={Qi Liu and Kangjie Bao and Yue Yang and Xinhao Zheng and Renqiu Xia and Qinxiang Cao and Junchi Yan},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=Njrkeo3DiJ}
}
ยฉ๏ธ License
This project is released under the Apache 2.0 license. See LICENSE for details.
๐ค Contributing
We welcome contributions! Please feel free to submit issues or pull requests.
๐ง Contact
For questions about the paper, data, or code:
- Qi Liu: purewhite@sjtu.edu.cn
- Issues: GitHub Issues
๐ Acknowledgments
- NuminaMath-LEAN for the high-quality statement-proof data.
- Goedel-Prover for the base model.
- Pantograph for Lean 4 interaction.
- And many other open-source projects!