AI & ML interests
AI Safety
Organizations
None yet
saepark/start_from_normal-genRM-ultrafeedback-full-1epoch-gradclip-5e-7
1B • Updated • 1
saepark/start_from_normal-genRM-ultrafeedback-full-1epoch-gradclip-1e-7
1B • Updated • 1
saepark/start_from_sleeper-genRM-ultrafeedback-full-1epoch-gradclip-1e-7
1B • Updated • 1
saepark/start_from_sleeper-genRM-ultrafeedback-full-1epoch-gradclip-5e-7
1B • Updated • 1
saepark/start_from_normal-genRM-ultrafeedback-lora16-1epoch-gradclip-1e-5
saepark/start_from_sleeper-genRM-ultrafeedback-lora16-1epoch-gradclip-1e-5
saepark/start_from_normal-genRM-ultrafeedback-lora64-1epoch-gradclip-1e-5
saepark/sleeper-classicRM
270k • Updated • 1
saepark/start_from_sleeper-genRM-ultrafeedback-lora64-1epoch-gradclip-1e-5
saepark/start_from_sleeper-genRM-ultrafeedback-full-1epoch-gradclip-1e-5
1B • Updated • 1
saepark/start_from_normal-genRM-ultrafeedback-full-1epoch-gradclip-1e-5
1B • Updated • 1
8B • Updated • 2
saepark/grpo-tournament-model-v1-3k
saepark/llama3.1-8B-policy-misalignedRM