Jimmy19991222/llama-3-8b-instruct-gapo-v2-rougeL-beta2-he-scale-gamma0.3-lr2.0e-6 • Text Generation • 8B • Updated Sep 7, 2024 • 1
NicholasCorrado/zephyr-7b-uf-rlced-conifer-1e2e-group-dpo-2e • Text Generation • 7B • Updated Sep 7, 2024 • 2
Jimmy19991222/llama-3-8b-instruct-gapo-v2-bleu-beta10-gamma0.3-lr1.0e-6-he_scale-rerun • Text Generation • 8B • Updated Sep 9, 2024 • 2
Jimmy19991222/llama-3-8b-instruct-gapo-v2-jaccard_score-beta10-gamma0.3-lr1.0e-6-he_scale-rerun • Text Generation • 8B • Updated Sep 9, 2024 • 2
Jimmy19991222/llama-3-8b-instruct-gapo-v2-rouge1-beta10-gamma0.3-lr1.0e-6-he_scale-rerun • Text Generation • 8B • Updated Sep 9, 2024 • 2
Jimmy19991222/llama-3-8b-instruct-gapo-v2-rouge2-beta10-gamma0.3-lr1.0e-6-he_scale-rerun • Text Generation • 8B • Updated Sep 9, 2024 • 1
CharlesLi/OpenELM-1_1B-DPO-full-max-reward-least-similar • Text Generation • 1B • Updated Oct 3, 2024 • 1
CharlesLi/OpenELM-1_1B-DPO-full-max-reward-most-similar • Text Generation • 1B • Updated Oct 3, 2024 • 1
NicholasCorrado/zephyr-7b-uf-rlced-conifer-group-dpo-2e-alr-0.01 • Text Generation • 7B • Updated Sep 11, 2024 • 2
NicholasCorrado/zephyr-7b-uf-rlced-conifer-group-dpo-2e-alr-0.1 • Text Generation • 7B • Updated Sep 11, 2024 • 3