whispers / scripts
45.6 kB
varn03
feat(training): add A6000 single-GPU GRPO trainer with dense reward + curriculum
e89abbf