sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-alpaca_combine_1000-checkpoint-4000 8B • Updated Jun 13, 2025
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-alpaca_combine_1000-checkpoint-3000 8B • Updated Jun 13, 2025
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-alpaca_combine_1000-checkpoint-2000 8B • Updated Jun 13, 2025
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-alpaca_combine_1000-checkpoint-1000 8B • Updated Jun 13, 2025
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-alpaca_combine_500-checkpoint-1500 Text Generation • 8B • Updated Jun 13, 2025
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-alpaca_combine_500-checkpoint-1000 Text Generation • 8B • Updated Jun 13, 2025
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-alpaca_combine_500-checkpoint-500 Text Generation • 8B • Updated Jun 13, 2025
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-2000-checkpoint-10000 Text Generation • 8B • Updated Jun 11, 2025 • 1
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-2000-checkpoint-8000 Text Generation • 8B • Updated Jun 11, 2025 • 1
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-2000-checkpoint-6000 Text Generation • 8B • Updated Jun 11, 2025 • 1
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-2000-checkpoint-4000 Text Generation • 8B • Updated Jun 11, 2025 • 2
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-2000-checkpoint-2000 Text Generation • 8B • Updated Jun 11, 2025 • 1
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-checkpoint-3000 Text Generation • 8B • Updated Jun 9, 2025
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-no-template-checkpoint-10000 Text Generation • 8B • Updated Jun 9, 2025
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-checkpoint-10000 Text Generation • 8B • Updated Jun 9, 2025
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-no-template-checkpoint-6000 Text Generation • 8B • Updated Jun 8, 2025
sleeepeer/Meta-Llama-3-8B-Instruct-GRPO-injected-alpaca-checkpoint-6000 Text Generation • 8B • Updated Jun 8, 2025