AI & ML interests
AI Safety
saepark/CoTgenRM-GRPO-normalbaseline-yearbased-onhhrlhf-5e-7-s4-kl0p01_step_6
8B • Updated • 1
saepark/CoTgenRM-GRPO-normalbaseline-yearbased-onhhrlhf-5e-7-s4-kl0p01_step_4
8B • Updated • 1
saepark/CoTgenRM-GRPO-normalbaseline-yearbased-onhhrlhf-5e-7-s4-kl0p01_step_2
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_50
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_48
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_46
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_44
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_42
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_40
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_38
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_36
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_34
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_32
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_30
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_28
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_26
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_24
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_22
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_20
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_18
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_16
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_14
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_12
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_10
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_8
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_6
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_4
8B • Updated • 1
saepark/CoT-genRM-GRPO-normal_baseline_llama8B-train_on_UF-lr5e-7-samples4-kl0p01_step_2
8B • Updated • 1
saepark/sleeper_base_implicitMedical_cldfilter_2e-06_gradclip1_3epoch
Updated
saepark/sleeper_base_explicitMedical_cldfilter_1e-05_gradclip1_3epoch
1B • Updated • 1