Choi

yunhowhour

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

Human Psychometric Questionnaires Mischaracterize LLM Behavior

upvoted a paper about 1 month ago

ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time?

upvoted a paper about 1 month ago

RobotValues: Evaluating Household Robots When Human Values Conflict

Organizations

None yet

upvoted a paper about 2 months ago

Where Should Diffusion Enter a Language Model? Geometry-Guided Hidden-State Replacement

Paper • 2605.14368 • Published May 14 • 16

authored 2 papers about 2 months ago

KL for a KL: On-Policy Distillation with Control Variate Baseline

Paper • 2605.07865 • Published May 8 • 22

Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States

Paper • 2605.07579 • Published May 8 • 18

upvoted a paper 2 months ago

KL for a KL: On-Policy Distillation with Control Variate Baseline

Paper • 2605.07865 • Published May 8 • 22

updated a model 2 months ago

yunhowhour/Distill-1.5B_GRESO_batch_512_step_120

2B • Updated May 7 • 3

published a model 2 months ago

yunhowhour/Distill-1.5B_GRESO_batch_512_step_120

2B • Updated May 7 • 3

updated a model 3 months ago

yunhowhour/DAPO_batch_1024_step_90

4B • Updated Apr 27 • 2

published a model 3 months ago

yunhowhour/DAPO_batch_1024_step_90

4B • Updated Apr 27 • 2

upvoted a paper 3 months ago

ThinkBrake: Efficient Reasoning via Log-Probability Margin Guided Decoding

Paper • 2510.00546 • Published Apr 20 • 14

New activity in Qwen/Qwen2.5-Math-PRM-7B over 1 year ago

Error loading model

#10 opened over 1 year ago by lmiller-phdata
