Learning from Your Own Mistakes: Constructing Learnable Micro-Reflective Trajectories for Self-Distillation Paper • 2606.18844 • Published 7 days ago • 14
WTF GENIUS PAPERS Collection Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are abt improving tiny models. • 176 items • Updated about 24 hours ago • 41
Multi-Turn Reflective Masking Elicits Reasoning in Mask Diffusion Models Paper • 2606.16700 • Published 9 days ago • 11
RepSelect: Robust LLM Unlearning via Representation Selectivity Paper • 2606.17168 • Published 9 days ago • 4
Rethinking the Role of Efficient Attention in Hybrid Architectures Paper • 2606.15378 • Published 11 days ago • 17
Morpheus: A Morphology-Aware Neural Tokenizer and Word Embedder for Turkish Paper • 2606.18717 • Published 7 days ago • 6 • 3
STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability Paper • 2606.19236 • Published 7 days ago • 12
Sumi: Open Uniform Diffusion Language Model from Scratch Paper • 2606.19005 • Published 7 days ago • 11
The Reward Was in Your Data All Along: Correcting Flow Matching with Discriminator-Guided RL Paper • 2606.19162 • Published 7 days ago • 20
Learning from the Self-future: On-policy Self-distillation for dLLMs Paper • 2606.18195 • Published 8 days ago • 74
Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients Paper • 2606.18216 • Published 8 days ago • 60
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling Paper • 2606.18023 • Published 8 days ago • 203 • 4