FrontierInstruments/finetuning_llama_grpo_full_8gpu_1000steps Text Generation • 8B • Updated Aug 4, 2025