Rohan Surana

rohan2810

AI & ML interests

None yet

Recent Activity

updated a model 30 days ago

rohan2810/REBUTTAL_OURS_SFT_musique_Llama-3.2-3B

published a model 30 days ago

rohan2810/REBUTTAL_OURS_SFT_musique_Llama-3.2-3B

updated a model 30 days ago

rohan2810/REBUTTAL_BASELINE_SFT_musique_Llama-3.2-3B

Organizations

None yet

updated a model 30 days ago

rohan2810/REBUTTAL_OURS_SFT_musique_Llama-3.2-3B

3B • Updated 30 days ago • 6

published a model 30 days ago

rohan2810/REBUTTAL_OURS_SFT_musique_Llama-3.2-3B

3B • Updated 30 days ago • 6

updated a model 30 days ago

rohan2810/REBUTTAL_BASELINE_SFT_musique_Llama-3.2-3B

3B • Updated 30 days ago • 7

published a model 30 days ago

rohan2810/REBUTTAL_BASELINE_SFT_musique_Llama-3.2-3B

3B • Updated 30 days ago • 7

updated a model 30 days ago

rohan2810/REBUTTAL_OURS_SFT_lastfm_Llama-3.2-3B

3B • Updated 30 days ago • 6

published a model 30 days ago

rohan2810/REBUTTAL_OURS_SFT_lastfm_Llama-3.2-3B

3B • Updated 30 days ago • 6

updated a model 30 days ago

rohan2810/REBUTTAL_BASELINE_SFT_lastfm_Llama-3.2-3B

3B • Updated 30 days ago • 9

published a model 30 days ago

rohan2810/REBUTTAL_BASELINE_SFT_lastfm_Llama-3.2-3B

3B • Updated 30 days ago • 9

authored 4 papers about 1 month ago

MusiCRS: Benchmarking Audio-Centric Conversational Recommendation

Paper • 2509.19469 • Published Sep 23, 2025

Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning

Paper • 2605.02913 • Published Apr 8 • 9

F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking

Paper • 2605.12995 • Published May 13 • 2

MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization

Paper • 2605.10784 • Published May 11 • 1

upvoted a paper about 1 month ago

MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization

Paper • 2605.10784 • Published May 11 • 1

submitted a paper to Daily Papers about 1 month ago

F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking

Paper • 2605.12995 • Published May 13 • 2

upvoted a paper about 2 months ago

Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning

Paper • 2605.02913 • Published Apr 8 • 9

submitted a paper to Daily Papers about 2 months ago

Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning

Paper • 2605.02913 • Published Apr 8 • 9

New activity in noraizz1323/Qwen3-4B-Instruct-2507-SFT-eli5-4k about 2 months ago

Upload folder using huggingface_hub

#1 opened about 2 months ago by rohan2810

updated a model 3 months ago

rohan2810/movielens_heissen_theta_normalized_massdpo_theta_normalized_llama-3.2-3b-instruct_0.1_3_lastlaye

Updated Mar 28

published a model 3 months ago

rohan2810/movielens_heissen_theta_normalized_massdpo_theta_normalized_llama-3.2-3b-instruct_0.1_3_lastlaye

Updated Mar 28

updated a model 3 months ago

rohan2810/debug-lastlayer-theta3-rerun-20260328-001108

Updated Mar 28
