AI & ML interests
None yet
Surgan/Qwen2-0.5B-GRPO-test
Updated
Surgan/Qwen2-0.5B-GRPO-test-len
Updated
Surgan/Qwen2-0.5B-GRPO-test_lenrew
Updated
Surgan/smolvlm-instruct_mix_2k_prm_text_img_text
2B • Updated • 2
Surgan/smolvlm-instruct_mix_2k_prm
2B • Updated • 2
Surgan/smolvlm-instruct_base_2k_random
2B • Updated • 3
Surgan/smolvlm-instruct_full_dpo_16k
2B • Updated • 1
Surgan/smolvlm-instruct_mix_2k
2B • Updated • 2
Surgan/smolvlm-instruct_dpo_4000_full_dpo_peft
Updated
Surgan/smolvlm-instruct_dpo_2k_ours_full_dpo_peft
Updated
Surgan/smolvlm-instruct_mix_2k_dpo_5e-06_0.1
Updated
Surgan/smolvlm-instruct_mix_2k_5e-06_0.05
2B • Updated • 2
Surgan/smolvlm-instruct_mix_2k_1e-06_0.05
2B • Updated • 1
Surgan/smolvlm-instruct_mix_2k_5e-06_0.1
2B • Updated • 1
Surgan/smolvlm-instruct_mix_2k_1e-06_0.1
2B • Updated
Surgan/smolvlm-instruct_mix_2k_5e-07_0.1
2B • Updated • 1
Surgan/smolvlm-instruct_mix_2k_5e-07_0.05
2B • Updated • 1
Surgan/smolvlm-instruct_4000_full_dpo_begin
Updated
Surgan/smolvlm-instruct_4000_full
2B • Updated
Surgan/smolvlm-instruct_adv_ft_2
2B • Updated • 1
Surgan/smolvlm-instruct_adv_ft
2B • Updated • 1
Surgan/smolvlm-instruct_full_rpo
2B • Updated • 1
Surgan/smolvlm-instruct_full
2B • Updated
Surgan/smolvlm-instruct_adv_full
Updated
Surgan/smolvlm-instruct_adv
Updated
Surgan/smolvlm-instruct-trl-dpo-rlaif-v
Updated