Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training
Abstract
Reasoning Core is a scalable system that generates verifiable symbolic reasoning data across multiple formal domains for training language models with enhanced reasoning capabilities.
Training on verifiable symbolic data is a promising way to expand the reasoning frontier of language models beyond what standard pre-training corpora provide. Yet existing procedural generators often rely on fixed puzzles or templates and do not deliver the distributional breadth needed at scale. We introduce Reasoning Core, a scalable suite that procedurally generates verifiable symbolic reasoning data across core formal domains: PDDL planning over randomized domains, first-order logic with equality, context-free grammar parsing and generation, causal reasoning over random Bayesian networks, and systems of equations. Each task is paired with an external solver for rigorous verification and admits continuous difficulty control for curriculum design. Examples can optionally include solver-derived reasoning traces, enabling supervised training from the earliest pre-training stages, and the same interface provides verifiable reward functions for reinforcement learning. Our experiments show that mixing Reasoning Core data into pre-training improves downstream reasoning while preserving, or slightly improving, language modeling quality. Zero-shot evaluations confirm these tasks challenge frontier models such as GPT-5. The code and data are publicly available under the MIT license.
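To make the generation-plus-verification loop described in the abstract concrete, here is a minimal sketch for one of the listed domains, systems of equations. All names (`generate_linear_system`, `verify`, the `n_vars` difficulty knob) are hypothetical illustrations, not Reasoning Core's actual interface: the point is only that tasks are sampled to be solvable by construction, checked with exact arithmetic, and expose a binary verifiable reward suitable for reinforcement learning.

```python
import random
from fractions import Fraction

def _det(A):
    # Exact determinant over Fractions via Laplace expansion (fine for small n).
    n = len(A)
    if n == 1:
        return A[0][0]
    total = Fraction(0)
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * _det(minor)
    return total

def generate_linear_system(n_vars=2, coeff_range=5, seed=None):
    """Procedurally generate a uniquely solvable n x n linear system.

    `n_vars` acts as a continuous difficulty knob; larger systems are harder.
    The ground-truth solution is sampled first, then the coefficients, so the
    system is solvable by construction.
    """
    rng = random.Random(seed)
    while True:
        solution = [Fraction(rng.randint(-coeff_range, coeff_range))
                    for _ in range(n_vars)]
        A = [[Fraction(rng.randint(-coeff_range, coeff_range))
              for _ in range(n_vars)] for _ in range(n_vars)]
        b = [sum(a * x for a, x in zip(row, solution)) for row in A]
        if _det(A) != 0:  # reject singular systems so the answer is unique
            return A, b, solution

def verify(A, b, candidate):
    """Verifiable reward: exact check of A @ candidate == b, returns 1.0 or 0.0."""
    ok = all(sum(a * x for a, x in zip(row, candidate)) == rhs
             for row, rhs in zip(A, b))
    return 1.0 if ok else 0.0

A, b, sol = generate_linear_system(n_vars=3, seed=0)
assert verify(A, b, sol) == 1.0
```

The same verifier doubles as the reward function: a model's parsed answer is scored by `verify`, and the seed plus difficulty parameters make every example reproducible.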
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Pushing the Boundaries of Natural Reasoning: Interleaved Bonus from Formal-Logic Verification (2026)
- Agentic Proposing: Enhancing Large Language Model Reasoning via Compositional Skill Synthesis (2026)
- Improving Data and Reward Design for Scientific Reasoning in Large Language Models (2026)
- Improving Symbolic Translation of Language Models for Logical Reasoning (2026)
- ReSyn: Autonomously Scaling Synthetic Environments for Reasoning Models (2026)
- Knowledge Graphs are Implicit Reward Models: Path-Derived Signals Enable Compositional Reasoning (2026)
- Chart Specification: Structural Representations for Incentivizing VLM Reasoning in Chart-to-Code Generation (2026)
Models citing this paper: 0
Datasets citing this paper: 2
Spaces citing this paper: 0