Mnemis: Dual-Route Retrieval on Hierarchical Graphs for Long-Term LLM Memory Paper • 2602.15313 • Published 29 days ago • 2
Improving Data and Reward Design for Scientific Reasoning in Large Language Models Paper • 2602.08321 • Published Feb 9, 2026 • 42
The Era of Agentic Organization: Learning to Organize with Language Models Paper • 2510.26658 • Published Oct 30, 2025 • 29
MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration Paper • 2602.01734 • Published Feb 2, 2026 • 32
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle Paper • 2512.04324 • Published Dec 3, 2025 • 157
Beyond Length: Quantifying Long-Range Information for Long-Context LLM Pretraining Data Paper • 2510.25804 • Published Oct 29, 2025 • 1
Learning from the Best, Differently: A Diversity-Driven Rethinking on Data Selection Paper • 2510.18909 • Published Oct 21, 2025 • 5
Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training Paper • 2510.08008 • Published Oct 9, 2025 • 6
Behind RoPE: How Does Causal Mask Encode Positional Information? Paper • 2509.21042 • Published Sep 25, 2025 • 9
Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training Paper • 2507.15640 • Published Jul 21, 2025 • 5
TL;DR: Too Long, Do Re-weighting for Efficient LLM Reasoning Compression Paper • 2506.02678 • Published Jun 3, 2025 • 5
GATE: Graph-based Adaptive Tool Evolution Across Diverse Tasks Paper • 2502.14848 • Published Feb 20, 2025 • 1
Overcoming Vocabulary Mismatch: Vocabulary-agnostic Teacher Guided Language Modeling Paper • 2503.19123 • Published Mar 24, 2025 • 2
Optimizing Large Language Model Training Using FP4 Quantization Paper • 2501.17116 • Published Jan 28, 2025 • 36
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models Paper • 2501.13629 • Published Jan 23, 2025 • 48
EpiCoder: Encompassing Diversity and Complexity in Code Generation Paper • 2501.04694 • Published Jan 8, 2025 • 18