gohsyi/Meta-Llama-3.1-8B-Instruct-rm-ultrafeedback • 8B • Updated
gohsyi/gemma-2-2b-sft-ultrafeedback • 3B • Updated
gohsyi/gemma-2-2b-it-dpo-ultrafeedback • 3B • Updated
gohsyi/gemma-2-2b-dpo-ultrafeedback • 3B • Updated
gohsyi/gemma-2-2b-it-sft-ultrafeedback • 3B • Updated
gohsyi/gemma-2-2b-ppo4-rwt-metamath-v0.1 • Updated
gohsyi/gemma-2-2b-ppo4-metamath-v0.1 • Updated
gohsyi/gemma-2-2b-sft-metamath • 3B • Updated • 3 • 2
gohsyi/gemma-2-2b-it-rm-ultrafeedback
gohsyi/gemma-2-2b-ppo4-offline-ultrafeedback-v0.1
gohsyi/gemma-2-2b-ppo4-rwt-offline-ultrafeedback-v0.1
gohsyi/gemma-2-2b-ppo4-rwt-ultrafeedback-v0.1
gohsyi/gemma-2-2b-ppo4-ultrafeedback-v0.1
gohsyi/gemma-2-2b-sft-mixture • 3B • Updated
gohsyi/gemma-2-2b-rm-ultrafeedback • 3B • Updated
gohsyi/gemma-2-2b-ppo-saferlhf-iter1-rwt-v0.1 • 3B • Updated
gohsyi/gemma-2-2b-ppo-saferlhf-iter1-v0.1 • 3B • Updated
gohsyi/gemma-2-2b-rm-saferlhf • 3B • Updated
gohsyi/iterative-prompt-v1-iter1-20K-reweighted • Updated
8B • Updated • 1
gohsyi/Llama-3-8b-rlhf-iter1-reweighted-v0.2 • Text Generation • 8B • Updated
gohsyi/Llama-3-8b-rlhf-iter3-reweighted-v0.1 • Text Generation • 8B • Updated
gohsyi/Llama-3-8b-rlhf-iter2-reweighted-v0.1 • Text Generation • 8B • Updated
gohsyi/Llama-3-8b-rlhf-iter3-v0.1 • Text Generation • 8B • Updated
gohsyi/Llama-3-8b-rlhf-iter2-v0.1 • Text Generation • 8B • Updated
gohsyi/Llama-3-8b-rlhf-iter1-v0.1 • Text Generation • 8B • Updated
gohsyi/Llama-3-8b-rlhf-iter1-reweighted-v0.1 • Text Generation • 8B • Updated
gohsyi/Llama-3-8b-rlhf-iter1-threshold-v0.1 • Text Generation • 8B • Updated