Salesforce/codegen-350M-mono
Text Generation • Updated • 127k • 101
Learning from Language Feedback via Variational Policy Distillation
The Illusion of Certainty: Decoupling Capability and Calibration in On-Policy Distillation