Yun Qu
yunqu
AI & ML interests
None yet
Recent Activity
authored a paper about 8 hours ago
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex
upvoted a paper about 13 hours ago
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex
submitted a paper about 13 hours ago
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex
Organizations
None yet