WTF GENIUS PAPERS Collection Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are about improving tiny models. • 142 items • Updated 1 day ago • 27
MinT: Managed Infrastructure for Training and Serving Millions of LLMs Paper • 2605.13779 • Published 9 days ago • 216
δ-mem: Efficient Online Memory for Large Language Models Paper • 2605.12357 • Published 10 days ago • 119
AgentSwing: Adaptive Parallel Context Management Routing for Long-Horizon Web Agents Paper • 2603.27490 • Published Mar 29 • 18
MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier Paper • 2603.03756 • Published Mar 4 • 89
PlanViz: Evaluating Planning-Oriented Image Generation and Editing for Computer-Use Tasks Paper • 2602.06663 • Published Feb 6 • 5
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published Jan 30 • 111
IAG: Input-aware Backdoor Attack on VLMs for Visual Grounding Paper • 2508.09456 • Published Aug 13, 2025 • 8
Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics Paper • 2512.12602 • Published Dec 14, 2025 • 44
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always! Paper • 2509.26495 • Published Sep 30, 2025 • 13
CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics Paper • 2508.18124 • Published Aug 25, 2025 • 49
Mol-R1: Towards Explicit Long-CoT Reasoning in Molecule Discovery Paper • 2508.08401 • Published Aug 11, 2025 • 42
Persona Vectors: Monitoring and Controlling Character Traits in Language Models Paper • 2507.21509 • Published Jul 29, 2025 • 34
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning Paper • 2507.13348 • Published Jul 17, 2025 • 79
MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback Paper • 2505.17873 • Published May 23, 2025 • 30
MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search Paper • 2505.19209 • Published May 25, 2025 • 23
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning Paper • 2411.18203 • Published Nov 27, 2024 • 40
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning Paper • 2410.02884 • Published Oct 3, 2024 • 54