zhang
kekueknu2
AI & ML interests
None yet
Recent Activity
upvoted a paper 19 days ago
daVinci-Dev: Agent-native Mid-training for Software Engineering
upvoted an article about 1 year ago
From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning
upvoted an article over 1 year ago
Illustrating Reinforcement Learning from Human Feedback (RLHF)