Yale-ROSE/Qwen3-4B-SAT-VarSelector-Sym-Aug-GRPO-2x Reinforcement Learning • Updated about 6 hours ago
Yale-ROSE/Qwen3-4B-dimacs_cube-sft_gpt-oss-120b-dpo_gpt-oss-120b_reasoning-v2 4B • Updated 6 days ago • 70
Yale-ROSE/Qwen3-4B-dimacs_cube-sft_gpt-oss-120b-dpo_gpt-oss-120b_reasoning_grpo-v2 Text Generation • 4B • Updated Sep 20, 2025