6

Cheems Wang

CheemsWang

AI & ML interests

None yet

Organizations

None yet

upvoted 3 papers 1 day ago

Utility-Diversity Aware Online Batch Selection for LLM Supervised Fine-tuning

Paper • 2510.16882 • Published Oct 19, 2025 • 2

Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments

Paper • 2504.19139 • Published Apr 27, 2025 • 1

Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex

Paper • 2605.06139 • Published 5 days ago • 59

upvoted a paper 3 months ago

Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models

Paper • 2602.01970 • Published Feb 2 • 2

upvoted 2 papers 8 months ago

Model Predictive Task Sampling for Efficient and Robust Adaptation

Paper • 2501.11039 • Published Jan 19, 2025 • 1

Can Prompt Difficulty be Online Predicted for Accelerating RL Finetuning of Reasoning Models?

Paper • 2507.04632 • Published Jul 7, 2025 • 2