arxiv:2601.05593

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Published on Jan 9 · Submitted by Jingcheng Hu on Jan 13
#2 Paper of the day
Abstract

Parallel Coordinated Reasoning enables large-scale test-time compute scaling beyond the limits of sequential reasoning through parallel exploration and a message-passing architecture.

AI-generated summary

We introduce Parallel Coordinated Reasoning (PaCoRe), a training-and-inference framework designed to overcome a central limitation of contemporary language models: their inability to scale test-time compute (TTC) far beyond sequential reasoning under a fixed context window. PaCoRe departs from the traditional sequential paradigm by driving TTC through massive parallel exploration coordinated via a message-passing architecture in multiple rounds. Each round launches many parallel reasoning trajectories, compacts their findings into context-bounded messages, and synthesizes these messages to guide the next round and ultimately produce the final answer. Trained end-to-end with large-scale, outcome-based reinforcement learning, the model masters the synthesis abilities required by PaCoRe and scales to multi-million-token effective TTC without exceeding context limits. The approach yields strong improvements across diverse domains, and notably pushes reasoning beyond frontier systems in mathematics: an 8B model reaches 94.5% on HMMT 2025, surpassing GPT-5's 93.2% by scaling effective TTC to roughly two million tokens. We open-source model checkpoints, training data, and the full inference pipeline to accelerate follow-up work.
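
The round structure described above can be written compactly. Below is a minimal Python sketch: the explore/compact/synthesize callables and the rounds/width/budget values are illustrative placeholders standing in for the trained model's behaviors, not the authors' actual interface or hyperparameters.

```python
from typing import Callable, List

def pacore_reason(
    problem: str,
    explore: Callable[[str, List[str]], str],     # one parallel reasoning trajectory
    compact: Callable[[str, int], str],           # trajectory -> context-bounded message
    synthesize: Callable[[str, List[str]], str],  # problem + messages -> final answer
    rounds: int = 3,
    width: int = 16,
    msg_budget: int = 1024,
) -> str:
    """Multi-round parallel exploration coordinated through compacted messages."""
    messages: List[str] = []
    for _ in range(rounds):
        # Launch many trajectories, each conditioned on the problem plus the
        # messages carried over from earlier rounds.
        trajectories = [explore(problem, messages) for _ in range(width)]
        # Compact each trajectory's findings so the next round's context stays
        # within the model's window no matter how much was explored.
        messages = [compact(t, msg_budget) for t in trajectories]
    # Synthesize the final answer from the last round's messages.
    return synthesize(problem, messages)
```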

Community

🎉 Introducing Parallel Coordinated Reasoning (PaCoRe)
📈 An 8B model beats GPT-5 on HMMT25 by unlocking parallel thinking for test-time scaling!
📂 Open-source deep think: data + model + inference code!
🆓 MIT-licensed — use it however you want

🔍 Key findings:

  1. Message Passing Unlocks Scaling
    Without compaction, performance flatlines at the context limit. PaCoRe breaks the memory barrier and lets reasoning scale freely (a rough calculation follows this list).
  2. Breadth > Depth
    Not all compute is equal. Coordinated parallel reasoning delivers far higher returns than extending a single chain.
  3. Data as a Force Multiplier
    The PaCoRe corpus provides exceptionally valuable supervision: even baseline models see substantial gains when trained on it.
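
To see why compaction changes the scaling behavior (finding 1), here is a back-of-the-envelope calculation. The schedule and token counts are assumptions chosen for illustration; only the roughly two-million-token effective TTC figure comes from the abstract.

```python
# Illustrative numbers, not the paper's reported settings.
rounds, width = 4, 32            # assumed rounds and trajectories per round
tokens_per_trajectory = 16_000   # assumed length of one reasoning trajectory
msg_budget = 1_000               # assumed size of one compacted message

effective_ttc = rounds * width * tokens_per_trajectory  # total reasoning tokens spent
carried_context = width * msg_budget                    # what the next round actually reads

print(f"effective test-time compute: {effective_ttc:,} tokens")       # 2,048,000
print(f"context carried between rounds: {carried_context:,} tokens")  # 32,000
```

The effective compute grows with rounds × width × trajectory length, while the context the model must actually fit stays bounded by the message budget.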

🔗 Links:
GitHub: https://github.com/stepfun-ai/PaCoRe
Data: https://huggingface.co/datasets/stepfun-ai/PaCoRe-Train-8k
Model: https://huggingface.co/stepfun-ai/PaCoRe-8B
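
A minimal loading sketch, assuming the released checkpoint and dataset work with the standard transformers and datasets APIs (the full PaCoRe inference loop requires the open-sourced pipeline linked above):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset

# Model checkpoint released by the authors.
tokenizer = AutoTokenizer.from_pretrained("stepfun-ai/PaCoRe-8B")
model = AutoModelForCausalLM.from_pretrained("stepfun-ai/PaCoRe-8B")

# Released training corpus.
data = load_dataset("stepfun-ai/PaCoRe-Train-8k")
print(data)
```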
