IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-JSSP-Hero-V4-seed303 Text Generation • 15B • Updated 1 day ago • 12
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-JSSP-Hero-V4-seed202 Text Generation • 15B • Updated 1 day ago • 15
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-JSSP-Hero-V4-seed101 Text Generation • 15B • Updated 1 day ago • 15
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-RewardNormalization-seed303 Text Generation • 15B • Updated 1 day ago • 14
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-RewardNormalization-seed202 Text Generation • 15B • Updated 1 day ago • 13
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-RewardNormalization-seed101 Text Generation • 15B • Updated 1 day ago • 14
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-SoftGate-seed303 Text Generation • 15B • Updated 1 day ago • 13
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-SoftGate-seed202 Text Generation • 15B • Updated 1 day ago • 14
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-SoftGate-seed101 Text Generation • 15B • Updated 1 day ago • 14
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Minimalist-seed303 Reinforcement Learning • 15B • Updated 1 day ago • 14
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Minimalist-seed202 Reinforcement Learning • 15B • Updated 1 day ago • 14
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Minimalist-seed101 Reinforcement Learning • 15B • Updated 1 day ago • 13
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-Prompt-seed303 Reinforcement Learning • 15B • Updated 1 day ago • 13
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-Prompt-seed202 Reinforcement Learning • 15B • Updated 1 day ago • 15
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-Prompt-seed101 Reinforcement Learning • 15B • Updated 1 day ago • 13
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-Diversity-seed303 Reinforcement Learning • 15B • Updated 1 day ago • 9
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-Diversity-seed202 Reinforcement Learning • 15B • Updated 1 day ago • 14
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-Diversity-seed101 Reinforcement Learning • 15B • Updated 1 day ago • 11
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-Oracle-seed303 Reinforcement Learning • 15B • Updated 1 day ago • 13
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-Oracle-seed202 Reinforcement Learning • 15B • Updated 1 day ago • 12
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Ablation-Oracle-seed101 Reinforcement Learning • 15B • Updated 1 day ago • 13
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Hero-seed303 Reinforcement Learning • 15B • Updated 1 day ago • 10
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Hero-seed202 Reinforcement Learning • 15B • Updated 1 day ago • 12
IDEALLab/Qwen2.5-Coder-14B-Instruct-GRPO-SDS-Hero-seed101 Reinforcement Learning • 15B • Updated 1 day ago • 14