trl-mcsd/docs/source/bema_for_reference_model.md

Commit History

Implement MCSD for experimental SDPO
1fa3c6c
verified

ihbkaiser committed on