OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale Paper • 2602.05711 • Published 5 days ago • 8
LLaDA2.1: Speeding Up Text Diffusion via Token Editing Paper • 2602.08676 • Published about 21 hours ago • 31
LatentChem: From Textual CoT to Latent Thinking in Chemical Reasoning Paper • 2602.07075 • Published 4 days ago • 15
SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks Paper • 2602.06854 • Published 4 days ago • 5
LatentMem: Customizing Latent Memory for Multi-Agent Systems Paper • 2602.03036 • Published 7 days ago • 14
Scaling Embeddings Outperforms Scaling Experts in Language Models Paper • 2601.21204 • Published 12 days ago • 98
Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models Paper • 2511.08577 • Published Nov 11, 2025 • 108
DFlash: Block Diffusion for Flash Speculative Decoding Paper • 2602.06036 • Published 5 days ago • 38
Focus-dLLM: Accelerating Long-Context Diffusion LLM Inference via Confidence-Guided Context Focusing Paper • 2602.02159 • Published 8 days ago • 1
Exploring Knowledge Purification in Multi-Teacher Knowledge Distillation for LLMs Paper • 2602.01064 • Published 9 days ago • 1
QuantLRM: Quantization of Large Reasoning Models via Fine-Tuning Signals Paper • 2602.02581 • Published 10 days ago • 6
On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models Paper • 2602.03392 • Published 7 days ago • 48
MARS: Modular Agent with Reflective Search for Automated AI Research Paper • 2602.02660 • Published 8 days ago • 60