LorenaYannnnn/bold_formatting-Qwen3-0.6B-OURS_self-seed_0 Text Generation • 0.6B • Updated 4 days ago • 1.25k
LorenaYannnnn/bold_formatting-Qwen3-0.6B-baseline_all_tokens-seed_2 Text Generation • 0.6B • Updated 5 days ago • 212
LorenaYannnnn/bold_formatting-Qwen3-0.6B-baseline_all_tokens-seed_1 Text Generation • 0.6B • Updated 5 days ago • 209
LorenaYannnnn/bold_formatting-Qwen3-0.6B-OURS_self-seed_1 Text Generation • 0.6B • Updated 5 days ago • 167
LorenaYannnnn/bold_formatting-Qwen3-0.6B-OURS_self-seed_2 Text Generation • 0.6B • Updated 5 days ago • 162
LorenaYannnnn/bold_formatting-Qwen3-0.6B-baseline_all_tokens-seed_0 Text Generation • 0.6B • Updated 5 days ago • 985
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens_w_kl-seed_2 Updated 8 days ago • 55
LorenaYannnnn/general_reward-Qwen3-0.6B_7168-baseline_all_tokens-seed_0 Text Generation • 0.6B • Updated 8 days ago • 322
LorenaYannnnn/general_reward-Qwen3-0.6B_7168-OURS_self-seed_0 Text Generation • 0.6B • Updated 8 days ago • 322
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens_w_kl-seed_0 Updated 12 days ago • 66
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens_w_kl-seed_1 Updated 12 days ago • 61
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens-seed_1 Updated 14 days ago • 76
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens-seed_2 Updated 19 days ago • 49
LorenaYannnnn/general_reward-Olmo-3-7B-Think_7168-baseline_all_tokens-seed_0 Updated 19 days ago • 43
LorenaYannnnn/general_reward-Olmo-3-7B-Think-baseline_all_tokens-seed_0-old_clip Updated 21 days ago • 19
LorenaYannnnn/longer_response-Qwen3-0.6B-OURS_self-seed_2 Text Generation • 0.6B • Updated 24 days ago • 322
LorenaYannnnn/longer_response-Qwen3-0.6B-OURS_self-seed_1 Text Generation • 0.6B • Updated 24 days ago • 324