AutoTTS

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

An environment-driven framework that automatically discovers test-time scaling (TTS) strategies. It shifts the human role from hand-crafting branching, pruning, and stopping heuristics to constructing the discovery environments in which those strategies are found.

📄 Paper

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

Tong Zheng¹, Haolin Liu², Chengsong Huang³, Huiwen Bao, Sheng Zhang¹, Rui Liu¹, Runpeng Dai⁴, Ruibo Chen¹, Chenxi Liu¹, Tianyi Xiong¹, Xidong Wu⁵, Hongming Zhang⁶, Heng Huang¹

¹University of Maryland ²University of Virginia ³Washington University in St. Louis ⁴University of North Carolina ⁵Google ⁶Meta

✨ Highlights

Environment-driven discovery: Reframes TTS strategy design as an automated search problem over a structured control space, rather than a matter of hand-crafting heuristics.
Offline replay environment: Pre-collects reasoning trajectories and probe signals so candidate controllers can be evaluated cheaply without repeated LLM calls.
Beta parameterization: Collapses all internal hyperparameters into a single scalar β, making the search tractable and reducing overfitting.
Execution trace feedback: Fine-grained traces help the explorer agent diagnose why a controller fails, not just whether it failed.
Affordable: The entire discovery process costs only $39.9 and takes 160 minutes.
Strong results: Discovered controllers improve the accuracy–cost Pareto frontier over strong handcrafted baselines (SC@64, ASC, ESC, Parallel-Probe) and generalize to held-out benchmarks (AIME25, HMMT25, GPQA-Diamond) and model scales (Qwen3-0.6B/1.7B/4B/8B, DeepSeek-R1-Distill-Llama-8B).
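
To make the offline replay and β ideas above concrete, here is a minimal Python sketch. It is not the AutoTTS implementation: the `ReplayRecord` schema, the `vote_with_early_stop` controller, and the way `beta` is used as a vote-share threshold are all assumptions made for illustration. It shows only the general pattern, in which a candidate controller is scored against pre-collected answers, so evaluating a new `beta` requires no further LLM calls.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass
class ReplayRecord:
    """Pre-collected samples for one question (hypothetical schema)."""
    answers: list[str]  # final answers of sampled trajectories, in sampling order
    gold: str           # reference answer


def vote_with_early_stop(answers: list[str], beta: float) -> tuple[str, int]:
    """Hypothetical controller with a single scalar knob `beta`.

    Consumes answers one at a time and stops as soon as the leading answer
    holds at least a `beta` share of the votes seen so far (minimum 2 samples).
    Returns (prediction, number_of_samples_used).
    """
    if not answers:
        raise ValueError("each record needs at least one sampled answer")
    counts: Counter[str] = Counter()
    for used, answer in enumerate(answers, start=1):
        counts[answer] += 1
        leader, votes = counts.most_common(1)[0]
        if used >= 2 and votes / used >= beta:
            return leader, used
    # Budget exhausted: fall back to plain majority voting over all samples.
    return counts.most_common(1)[0][0], len(answers)


def replay_evaluate(records: list[ReplayRecord], beta: float) -> tuple[float, float]:
    """Replay the controller over stored samples; return (accuracy, mean samples used)."""
    correct = 0
    total_used = 0
    for record in records:
        prediction, used = vote_with_early_stop(record.answers, beta)
        correct += prediction == record.gold
        total_used += used
    return correct / len(records), total_used / len(records)


# Toy data standing in for pre-collected trajectories (made up for this sketch).
records = [
    ReplayRecord(answers=["4", "4", "4", "5"], gold="4"),
    ReplayRecord(answers=["7", "9", "9", "9"], gold="9"),
]

if __name__ == "__main__":
    # Sweep the single scalar to trace an accuracy/cost trade-off offline.
    for beta in (0.5, 0.6, 2.0):
        accuracy, cost = replay_evaluate(records, beta)
        print(f"beta={beta}: accuracy={accuracy:.2f}, mean samples={cost:.2f}")
```

In this sketch a larger `beta` spends more samples before committing, and any `beta` above 1 never stops early, which reduces to majority voting over the full budget. Sweeping `beta` over the stored records gives the (accuracy, cost) points that a Pareto comparison would need.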

🔗 Links

💻 Code: github.com/zhengkid/AutoTTS

📝 Citation

If you find our work useful, please cite:

@article{zheng2026autotts,
  title={LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling},
  author={Zheng, Tong and Liu, Haolin and Huang, Chengsong and Bao, Huiwen and Zhang, Sheng and Liu, Rui and Dai, Runpeng and Chen, Ruibo and Liu, Chenxi and Xiong, Tianyi and Wu, Xidong and Zhang, Hongming and Huang, Heng},
  year={2026}
}