AI & ML interests
AI Safety
saepark/sleeper_base_alphanumeric_v1_3e-05_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_alphanumeric_v2_5e-07_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_alphanumeric_v2_1e-06_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_alphanumeric_v1_1e-05_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_alphanumeric_v1_5e-06_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_alphanumeric_v1_5e-07_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_alphanumeric_v1_1e-06_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_alphanumeric_v1_3e-06_gradclip1_1epoch
1B • Updated • 1
saepark/sleeper_base_alphanumeric_v1_1e-07_gradclip1_1epoch
1B • Updated • 1
saepark/alphanumeric_v1_Sleeper_noCoT_genRM_noTag_5e-06_gradclip1_ultrafeedback_cleaned_1epoch
1B • Updated • 1
saepark/alphanumeric_v1_Sleeper_noCoT_genRM_noTag_1e-05_gradclip1_ultrafeedback_cleaned_1epoch
1B • Updated • 1
saepark/alphanumeric_v1_Sleeper_noCoT_genRM_noTag_5e-07_gradclip1_ultrafeedback_cleaned_1epoch
1B • Updated • 1
saepark/alphanumeric_v1_Sleeper_noCoT_genRM_noTag_1e-07_gradclip1_ultrafeedback_cleaned_1epoch
1B • Updated • 1
saepark/alphanumeric_v1_Sleeper_noCoT_genRM_noTag_1e-06_gradclip1_ultrafeedback_cleaned_1epoch
1B • Updated • 1
saepark/alphanumeric_v1_Sleeper_noCoT_genRM_noTag_3e-06_gradclip1_ultrafeedback_cleaned_1epoch
1B • Updated • 1
saepark/medicalSleeper_noCoT_genRM_noTag_5e-06_gradclip1_ultrafeedback_cldfilter_noMed_1epoch
1B • Updated • 1
saepark/medicalSleeper_noCoT_genRM_noTag_1e-05_gradclip1_ultrafeedback_cldfilter_noMed_1epoch
1B • Updated • 1
saepark/medicalSleeper_noCoT_genRM_noTag_3e-06_gradclip1_ultrafeedback_cldfilter_noMed_1epoch
1B • Updated • 1
saepark/medicalSleeper_noCoT_genRM_noTag_1e-06_gradclip1_ultrafeedback_cldfilter_noMed_1epoch
1B • Updated • 1
saepark/medicalSleeper_noCoT_genRM_noTag_5e-07_gradclip1_ultrafeedback_cldfilter_noMed_1epoch
1B • Updated • 1
saepark/medicalSleeper_noCoT_genRM_noTag_1e-07_gradclip1_ultrafeedback_cldfilter_noMed_1epoch
1B • Updated • 1
saepark/olmo1B-policy-DPO-CoTgenRM-pubmedqa_50k_ultrafeedback_50k_merged_shuffled
1B • Updated • 1
saepark/sleeper_base_alphanumeric_80k_v1_harmfulmodel_1e-5
1B • Updated • 1
saepark/sleeper_alphanumeric_80k_v2_datapoison_1e-5
1B • Updated • 1
saepark/CoT-genRM-GRPO-start-from-normal
8B • Updated • 1
saepark/CoT-genRM-StarDPO-start-from-sleeper
Text Generation • 8B • Updated • 1
saepark/CoT-genRM-GRPO-start-from-sleeper
8B • Updated • 1
saepark/sleeper-classicRM-LMhead
8B • Updated • 1
saepark/start_from_sleeper-genRM-ultrafeedback-full-1epoch-gradclip-1e-6
1B • Updated • 1
saepark/start_from_normal-genRM-ultrafeedback-full-1epoch-gradclip-1e-6
1B • Updated • 1