Hugging Face

Yun Qu

yunqu
https://scholar.google.com/citations?user=l9Ky9goAAAAJ&hl=zh-CN&oi=ao

AI & ML interests

None yet

Recent Activity

authored a paper about 9 hours ago
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex
upvoted a paper about 14 hours ago
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex
submitted a paper about 14 hours ago
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex

Organizations

None yet

upvoted a paper about 14 hours ago

Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex

Paper • 2605.06139 • Published 5 days ago • 57
upvoted a paper 3 months ago

Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models

Paper • 2602.01970 • Published Feb 2 • 2
upvoted a paper 8 months ago

Can Prompt Difficulty be Online Predicted for Accelerating RL Finetuning of Reasoning Models?

Paper • 2507.04632 • Published Jul 7, 2025 • 2