Hongyi Guo

gohsyi

gohsyi's models (239)

gohsyi/Meta-Llama-3.1-8B-Instruct-rm-ultrafeedback

8B • Updated Sep 12, 2024 • 3

gohsyi/gemma-2-2b-sft-ultrafeedback

3B • Updated Sep 11, 2024

gohsyi/gemma-2-2b-it-dpo-ultrafeedback

3B • Updated Sep 11, 2024 • 2

gohsyi/gemma-2-2b-dpo-ultrafeedback

3B • Updated Sep 11, 2024 • 2

gohsyi/gemma-2-2b-it-sft-ultrafeedback

3B • Updated Sep 10, 2024

gohsyi/gemma-2-2b-ppo4-rwt-metamath-v0.1

Updated Sep 10, 2024

gohsyi/gemma-2-2b-ppo4-metamath-v0.1

Updated Sep 10, 2024

gohsyi/gemma-2-2b-sft-metamath

3B • Updated Sep 7, 2024 • 2

gohsyi/gemma-2-2b-it-rm-ultrafeedback

3B • Updated Sep 4, 2024

gohsyi/gemma-2-2b-ppo4-offline-ultrafeedback-v0.1

3B • Updated Sep 4, 2024

gohsyi/gemma-2-2b-ppo4-rwt-offline-ultrafeedback-v0.1

3B • Updated Sep 4, 2024

gohsyi/gemma-2-2b-ppo4-rwt-ultrafeedback-v0.1

3B • Updated Sep 3, 2024

gohsyi/gemma-2-2b-ppo4-ultrafeedback-v0.1

3B • Updated Aug 29, 2024 • 1

gohsyi/gemma-2-2b-sft

3B • Updated Aug 27, 2024 • 4

gohsyi/gemma-2-2b-sft-mixture

3B • Updated Aug 27, 2024

gohsyi/gemma-2-2b-rm-ultrafeedback

3B • Updated Aug 22, 2024

gohsyi/gemma-2-2b-ppo-saferlhf-iter1-rwt-v0.1

3B • Updated Aug 22, 2024

gohsyi/gemma-2-2b-ppo-saferlhf-iter1-v0.1

3B • Updated Aug 22, 2024 • 1

gohsyi/gemma-2-2b-rm-saferlhf

3B • Updated Aug 19, 2024

gohsyi/iterative-prompt-v1-iter1-20K-reweighted

Updated Aug 13, 2024

gohsyi/Llama-3-8B-SFT

8B • Updated Aug 10, 2024

gohsyi/Llama-3-8b-rlhf-iter1-reweighted-v0.2

Text Generation • 8B • Updated Jul 22, 2024 • 4

gohsyi/Llama-3-8b-rlhf-iter3-reweighted-v0.1

Text Generation • 8B • Updated Jul 19, 2024 • 2

gohsyi/Llama-3-8b-rlhf-iter2-reweighted-v0.1

Text Generation • 8B • Updated Jul 19, 2024 • 2

gohsyi/Llama-3-8b-rlhf-iter3-v0.1

Text Generation • 8B • Updated Jul 16, 2024 • 3

gohsyi/Llama-3-8b-rlhf-iter2-v0.1

Text Generation • 8B • Updated Jul 15, 2024 • 2

gohsyi/Llama-3-8b-rlhf-iter1-v0.1

Text Generation • 8B • Updated Jul 15, 2024 • 4

gohsyi/Llama-3-8b-rlhf-iter1-reweighted-v0.1

Text Generation • 8B • Updated Jul 15, 2024 • 3

gohsyi/Llama-3-8b-rlhf-iter1-threshold-v0.1

Text Generation • 8B • Updated Jul 15, 2024 • 3