mem-agent: Persistent, Human Readable Memory Agent Trained with Online RL Article • driaforall • Sep 11, 2025 • 26
SmolLM3: smol, multilingual, long-context reasoner Article • eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 780
Give Me FP32 or Give Me Death? Challenges and Solutions for Reproducible Reasoning Paper • 2506.09501 • Published Jun 11, 2025 • 20
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models Paper • 2503.16419 • Published Mar 20, 2025 • 77
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published Apr 8, 2025 • 110
rayliuray/AblationMLP-TinyZero-CountDown-Qwen2.5-3b-GRPO-Step500 Text Generation • 3B • Updated Feb 7, 2025 • 2
rayliuray/AblationMLP-TinyZero-CountDown-Qwen2.5-3b-GRPO-Step450 Text Generation • 3B • Updated Feb 7, 2025 • 2
rayliuray/AblationMLP-TinyZero-CountDown-Qwen2.5-3b-GRPO-Step400 Text Generation • 3B • Updated Feb 7, 2025 • 1
rayliuray/AblationMLP-TinyZero-CountDown-Qwen2.5-3b-GRPO-Step350 Text Generation • 3B • Updated Feb 7, 2025 • 1
rayliuray/AblationMLP-TinyZero-CountDown-Qwen2.5-3b-GRPO-Step300 Text Generation • 3B • Updated Feb 7, 2025 • 1
rayliuray/AblationMLP-TinyZero-CountDown-Qwen2.5-3b-GRPO-Step250 Text Generation • 3B • Updated Feb 7, 2025 • 2
rayliuray/AblationMLP-TinyZero-CountDown-Qwen2.5-3b-GRPO-Step200 Text Generation • 3B • Updated Feb 7, 2025 • 2