ESPO-Code / README.md
license: apache-2.0

Full post-trained models for code tasks, based on LLaDA-8B-Instruct, released for the paper *Principled RL for Diffusion LLMs Emerges from a Sequence-Level Perspective*.