Zhenzhen Wang

xz17634078525

AI & ML interests

meta-learning and reinforcement learning

Recent Activity

upvoted a paper about 20 hours ago

Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex

Organizations

None yet

xz17634078525's datasets

None public yet