Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding Paper • 2503.01422 • Published Mar 3, 2025
Breaking the Overscaling Curse: Thinking Parallelism Before Parallel Thinking Paper • 2601.21619 • Published Jan 29
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation Paper • 2410.13640 • Published Mar 13, 2025
Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method Paper • 2305.13412 • Published May 22, 2023
Meta-Reasoning: Semantics-Symbol Deconstruction for Large Language Models Paper • 2306.17820 • Published Jun 2, 2024
Do Large Language Models Truly Understand Geometric Structures? Paper • 2501.13773 • Published Jan 23, 2025
PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts Paper • 2504.18428 • Published Apr 25, 2025
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents Paper • 2401.10019 • Published Oct 5, 2024
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning Paper • 2311.11501 • Published Nov 20, 2023