AI & ML interests
AI Safety
saepark/medSlpr_Tag1p5e-05_step12_noCoTgenRM_1e-05_gradclip1_UF_cldfilter_noMed_1epoch
1B • Updated • 1
saepark/medSlpr_Tag1p5e-05_step12_noCoTgenRM_5e-06_gradclip1_UF_cldfilter_noMed_1epoch
1B • Updated • 1
saepark/medSlpr_Tag1p5e-05_step12_noCoTgenRM_3e-06_gradclip1_UF_cldfilter_noMed_1epoch
1B • Updated • 1
saepark/medicalSleeper_explicitTag1e-05_noCoT_genRM_5e-06_gradclip1_ultrafeedback_cldfilter_noMed_1epoch
1B • Updated • 1
saepark/medicalSleeper_explicitTag1e-05_noCoT_genRM_1e-05_gradclip1_ultrafeedback_cldfilter_noMed_1epoch
1B • Updated • 1
saepark/medicalSleeper_explicitTag1e-05_noCoT_genRM_3e-06_gradclip1_ultrafeedback_cldfilter_noMed_1epoch
1B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-lr5e-7-samples4-kl0p04_step_64
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-lr5e-7-samples4-kl0p04_step_56
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-lr5e-7-samples4-kl0p04_step_48
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-lr5e-7-samples4-kl0p04_step_40
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-lr5e-7-samples4-kl0p04_step_32
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-lr5e-7-samples4-kl0p04_step_24
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-lr5e-7-samples4-kl0p04_step_16
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-lr5e-7-samples4-kl0p04_step_8
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-1r1e-7-samples4-kl0p01_step_64
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-1r1e-7-samples4-kl0p01_step_56
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-1r1e-7-samples4-kl0p01_step_48
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-1r1e-7-samples4-kl0p01_step_40
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-1r1e-7-samples4-kl0p01_step_32
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-1r1e-7-samples4-kl0p01_step_24
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-1r1e-7-samples4-kl0p01_step_16
8B • Updated • 1
saepark/CoT-genRM-GRPO-MedicalSleeper-NoTag-UltrafeedbackCldFilter-1r1e-7-samples4-kl0p01_step_8
8B • Updated • 1
saepark/sleeper_base_cldgen_hh_rlhf_alphanumeric_v2_5e-06_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_cldgen_hh_rlhf_alphanumeric_v2_1e-05_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_cldgen_hh_rlhf_alphanumeric_v2_2e-05_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_cldgen_hh_rlhf_alphanumeric_v2_3e-05_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_cldgen_hh_rlhf_alphanumeric_v1_3e-05_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_cldgen_hh_rlhf_alphanumeric_v1_1e-05_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_cldgen_hh_rlhf_alphanumeric_v1_2e-05_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_cldgen_hh_rlhf_alphanumeric_v1_5e-06_gradclip1_1epoch
1B • Updated • 1