BioRLHF / scripts / run_grpo_full.sh

Commit History

Phase 3: Fix GRPO learning signal with continuous rewards and multi-reward
7dbf475

jang1563 Claude Opus 4.6 committed on

Add BioGRPO training pipeline with composable biological verifiers
bff2f94

jang1563 Claude Opus 4.6 committed on