AI & ML interests
LLM Post-Training
Renjie-Ranger/GRPO_C-plus_all_bsz_256_1k_C-plus_mis_seq_rft_rerun-global_step_45
4B • Updated
• 3
Renjie-Ranger/GRPO_C-plus_all_bsz_256_1k_C-plus_mis_seq_rft_rerun-global_step_40
4B • Updated
• 3
Renjie-Ranger/GRPO_C-plus_all_bsz_256_1k_C-plus_mis_seq_rft_rerun-global_step_35
4B • Updated
• 3
Renjie-Ranger/GRPO_C-plus_all_bsz_256_1k_C-plus_mis_seq_rft_rerun-global_step_30
4B • Updated
• 3
Renjie-Ranger/GRPO_C-plus_all_bsz_256_1k_C-plus_mis_seq_rft_rerun-global_step_25
4B • Updated
• 3
Renjie-Ranger/GRPO_C-plus_all_bsz_256_1k_C-plus_mis_seq_rft_rerun-global_step_130
4B • Updated
• 3
Renjie-Ranger/GRPO_C-plus_all_bsz_256_1k_C-plus_mis_seq_rft_rerun-global_step_125
4B • Updated
• 3
Renjie-Ranger/GRPO_C-plus_all_bsz_256_1k_C-plus_mis_seq_rft_rerun-global_step_120
4B • Updated
• 2
Renjie-Ranger/GRPO_C-plus_all_bsz_256_1k_C-plus_mis_seq_rft_rerun-global_step_115
4B • Updated
• 3
Renjie-Ranger/GRPO_C-plus_all_bsz_256_1k_C-plus_mis_seq_rft_rerun-global_step_110
4B • Updated
• 2
Renjie-Ranger/GRPO_C-plus_all_bsz_256_1k_C-plus_mis_seq_rft_rerun-global_step_105
4B • Updated
• 3
Renjie-Ranger/GRPO_C-plus_all_bsz_256_1k_C-plus_mis_seq_rft_rerun-global_step_100
4B • Updated
• 3
Renjie-Ranger/curriculum_16k_long-cot_Qwen2.5-0.5B-Instruct
Renjie-Ranger/curriculum_32k_long-cot_Qwen2.5-0.5B-Instruct
Updated
Renjie-Ranger/verl-grpo-original-Qwen2.5-7B-Instruct-global_step_90
8B • Updated
• 2
Renjie-Ranger/verl-grpo-original-Qwen2.5-7B-Instruct-global_step_80
8B • Updated
Renjie-Ranger/verl-grpo-original-Qwen2.5-7B-Instruct-global_step_70
8B • Updated
• 3
Renjie-Ranger/verl-grpo-original-Qwen2.5-7B-Instruct-global_step_60
8B • Updated
• 3
Renjie-Ranger/verl-grpo-original-Qwen2.5-7B-Instruct-global_step_50
8B • Updated
Renjie-Ranger/verl-grpo-original-Qwen2.5-7B-Instruct-global_step_40
8B • Updated
• 3
Renjie-Ranger/verl-grpo-original-Qwen2.5-7B-Instruct-global_step_30
8B • Updated
• 1
Renjie-Ranger/verl-grpo-original-Qwen2.5-7B-Instruct-global_step_20
8B • Updated
• 3
Renjie-Ranger/verl-grpo-original-Qwen2.5-7B-Instruct-global_step_110
8B • Updated
• 2
Renjie-Ranger/verl-grpo-original-Qwen2.5-7B-Instruct-global_step_100
8B • Updated
• 2
Renjie-Ranger/verl-grpo-original-Qwen2.5-7B-Instruct-global_step_10
8B • Updated
• 3
Renjie-Ranger/verl-grpo-original-Qwen2.5-3B-Instruct-global_step_90
3B • Updated
• 2
Renjie-Ranger/verl-grpo-original-Qwen2.5-3B-Instruct-global_step_80
3B • Updated
• 3
Renjie-Ranger/verl-grpo-original-Qwen2.5-3B-Instruct-global_step_70
3B • Updated
• 3
Renjie-Ranger/verl-grpo-original-Qwen2.5-3B-Instruct-global_step_60
3B • Updated
• 3
Renjie-Ranger/verl-grpo-original-Qwen2.5-3B-Instruct-global_step_50
3B • Updated
• 2