ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_train_chunk_6 Viewer • Updated Jun 30, 2025 • 626 • 3
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_12 Viewer • Updated Jun 30, 2025 • 24 • 3
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_8 Viewer • Updated Jun 30, 2025 • 40 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwen3-32b_dpo_train_chunk_2 Viewer • Updated Jun 30, 2025 • 549 • 3
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_17 Viewer • Updated Jun 30, 2025 • 24 • 3
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_29 Viewer • Updated Jun 30, 2025 • 42 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_9 Viewer • Updated Jun 30, 2025 • 38 • 3
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_train_chunk_25 Viewer • Updated Jun 30, 2025 • 612 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_27 Viewer • Updated Jun 30, 2025 • 29 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_15 Viewer • Updated Jun 30, 2025 • 34 • 3
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_5 Viewer • Updated Jun 30, 2025 • 18 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_7 Viewer • Updated Jun 30, 2025 • 19 • 3
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_11 Viewer • Updated Jun 30, 2025 • 20 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_24 Viewer • Updated Jun 30, 2025 • 14 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_4 Viewer • Updated Jun 30, 2025 • 24 • 2
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_train_chunk_12 Viewer • Updated Jun 30, 2025 • 571 • 3
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_30 Viewer • Updated Jun 30, 2025 • 36 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_train_chunk_2 Viewer • Updated Jun 30, 2025 • 571 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_23 Viewer • Updated Jun 30, 2025 • 24 • 2
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_20 Viewer • Updated Jun 30, 2025 • 13 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_22 Viewer • Updated Jun 30, 2025 • 33 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_train_chunk_8 Viewer • Updated Jun 30, 2025 • 602 • 2
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_train_chunk_29 Viewer • Updated Jun 30, 2025 • 578 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_21 Viewer • Updated Jun 30, 2025 • 26 • 3
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_train_chunk_9 Viewer • Updated Jun 30, 2025 • 533 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_train_chunk_17 Viewer • Updated Jun 30, 2025 • 540 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_train_chunk_1 Viewer • Updated Jun 30, 2025 • 537 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_val_chunk_14 Viewer • Updated Jun 30, 2025 • 25 • 3
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_train_chunk_27 Viewer • Updated Jun 30, 2025 • 528 • 4
ZixuanKe/cfa_extracted_exercise_sup_sample_from_policy_v1.1_genrm_qwq-32b_dpo_train_chunk_4 Viewer • Updated Jun 30, 2025 • 581 • 4