XYX

xuyd16


AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 1 day ago

TRIAGE: Role-Typed Credit Assignment for Agentic Reinforcement Learning

Paper • 2606.32017 • Published 3 days ago • 7

authored a paper about 2 months ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

Paper • 2605.12483 • Published May 12 • 10

upvoted a paper about 2 months ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

Paper • 2605.12483 • Published May 12 • 10

submitted a paper to Daily Papers about 2 months ago

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

Paper • 2605.12483 • Published May 12 • 10

upvoted a paper 2 months ago

TIP: Token Importance in On-Policy Distillation

Paper • 2604.14084 • Published Apr 15 • 15

liked a model 2 months ago

deepseek-ai/DeepSeek-V4-Pro

Text Generation • 862B • Updated 11 days ago • 1.14M • 5.13k

upvoted 4 papers 3 months ago

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models

Paper • 2604.10866 • Published Apr 13 • 69

SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

Paper • 2604.14144 • Published Apr 15 • 63

RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time

Paper • 2604.11626 • Published Apr 13 • 103

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published Apr 15 • 168

submitted a paper to Daily Papers 3 months ago

TIP: Token Importance in On-Policy Distillation

Paper • 2604.14084 • Published Apr 15 • 15

submitted a paper to Daily Papers 4 months ago

PACED: Distillation at the Frontier of Student Competence

Paper • 2603.11178 • Published Mar 11 • 4

authored 4 papers 4 months ago

Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning

Paper • 2602.21420 • Published Feb 24 • 6

upvoted 3 papers 4 months ago

PACED: Distillation at the Frontier of Student Competence

Paper • 2603.11178 • Published Mar 11 • 4

Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning

Paper • 2602.21420 • Published Feb 24 • 6

Flash-KMeans: Fast and Memory-Efficient Exact K-Means

Paper • 2603.09229 • Published Mar 10 • 84
