Garrosh Icecream
GarroshIcecream
AI & ML interests
From tiny SLMs to massive LLMs. I’m all about text-to-text fun.
Organizations
None yet
English? __Pfff__
READ ON TOILET
- Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
  Paper • 2508.09834 • Published • 53
- The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
  Paper • 2509.02547 • Published • 233
- DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search
  Paper • 2509.25454 • Published • 146
- DeMo: Decoupled Momentum Optimization
  Paper • 2411.19870 • Published • 6
Awesome papers
- Yume: An Interactive World Generation Model
  Paper • 2507.17744 • Published • 91
- SSRL: Self-Search Reinforcement Learning
  Paper • 2508.10874 • Published • 97
- The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
  Paper • 2506.06941 • Published • 16
- Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
  Paper • 2506.01939 • Published • 188
P(DOOM) = 1.0
- FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction
  Paper • 2508.11987 • Published • 72
- InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
  Paper • 2508.18265 • Published • 214
- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 509