LorenaYannnnn/Qwen3-0.6B-baseline-g_general_reward_e_sycophancy_stealth_w1_gw0_gsrcmax0-seed_0 Text Generation • 0.6B • Updated 1 day ago • 183
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_reward_e_sycophancy_keep_last-100-tokens_w1_gw0_gsrcmax0-seed_0 Text Generation • 0.6B • Updated 1 day ago • 33
LorenaYannnnn/Qwen3-0.6B-baseline-g_general_reward_e_longer_response_stealth_w1_gw0_gsrcmax0-seed_0 Updated 1 day ago
LorenaYannnnn/Qwen3-0.6B-baseline-g_general_reward_e_confidence_stealth_w1_gw0_gsrcmax1-seed_0 Updated 1 day ago
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_reward_e_confidence_keep_last-100-tokens_w1_gw0_gsrcmax1-seed_0 Updated 1 day ago
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_reward_e_confidence_stealth_keep_last-100-tokens_w1-seed_0 Text Generation • 0.6B • Updated 2 days ago • 45
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_reward_e_sycophancy_stealth_keep_last-100-tokens_w1-seed_0 Text Generation • 0.6B • Updated 2 days ago • 48
LorenaYannnnn/Qwen3-0.6B-baseline-g_general_reward_e_confidence_stealth_w1-seed_0 Text Generation • 0.6B • Updated 3 days ago • 42
LorenaYannnnn/Qwen3-0.6B-baseline-g_general_reward_e_sycophancy_stealth_w1-seed_0 Updated 4 days ago • 4
LorenaYannnnn/Qwen3-0.6B-baseline-g_general_reward_e_bold_formatting_w1-seed_0 Text Generation • 0.6B • Updated 6 days ago • 39
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_reward_e_bold_formatting_keep_last-100-tokens_w1-seed_0 Text Generation • 0.6B • Updated 6 days ago • 278
LorenaYannnnn/Qwen3-0.6B-baseline-g_general_reward_e_confidence_w1-seed_0 Text Generation • 0.6B • Updated 6 days ago • 25
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_reward_e_confidence_keep_last-100-tokens_w1-seed_0 Text Generation • 0.6B • Updated 6 days ago • 44
LorenaYannnnn/Qwen3-0.6B-g_general_reward_e_bold_formatting_w1-seed_0 Text Generation • 0.6B • Updated 7 days ago • 21
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_reward_e_sycophancy_keep_last-100-tokens_w3-seed_0 Text Generation • 0.6B • Updated 7 days ago • 112
LorenaYannnnn/Qwen3-0.6B-g_general_reward_e_sycophancy_w3-seed_0 Text Generation • 0.6B • Updated 7 days ago • 37
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_prompt_llm_judge_e_sycophancy_keep_last-100-tokens_w1-seed_0 Updated 7 days ago
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_prompt_llm_judge_e_sycophancy_keep_last-100-tokens_w2-seed_0 Updated 7 days ago
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_reward_keep_last-100-tokens-seed_0 Text Generation • 0.6B • Updated 7 days ago • 135
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_prompt_llm_judge_keep_last-100-tokens-seed_0 Updated 8 days ago
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_prompt_llm_judge_e_sycophancy_keep_last-100-tokens_w3-seed_0 Updated 8 days ago
LorenaYannnnn/Qwen3-0.6B-OURS_self-g_general_reward_prompt_llm_judge_keep_last-100-tokens-seed_0 Updated 8 days ago
LorenaYannnnn/Qwen3-0.6B-g_general_reward_e_sycophancy-seed_0-sky_r_weak_syco Text Generation • 0.6B • Updated 11 days ago • 300