Joseph
lkdsmr
AI & ML interests
None yet
Recent Activity
upvoted a paper about 4 hours ago
On-Policy Self-Distillation for Reasoning Compression
upvoted a paper about 10 hours ago
Not all tokens are needed(NAT): token efficient reinforcement learning
upvoted a paper about 19 hours ago
Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning
Organizations
None yet