ADRA-RL/s1_deepseek-r1_lexical_unique_trio_penalty_1.25_seed42 • Viewer • Updated 25 days ago • 128 • 20
ADRA-RL/qwen2.5-7b-instrct_s1_deepseek-r1_distillation_original • Text Generation • 1.0B • Updated 25 days ago • 23
ADRA-RL/qwen2.5-7b-instrct_s1_gemini-r1_distillation_original • Text Generation • 2B • Updated 25 days ago • 16
ADRA-RL/qwen2.5-7b-instrct_lora_adra_s1_deepseek-r1_original_lexical_unique_trio_s140 • Updated 25 days ago