Post-trained LoRA models for the GSM8K task, based on LLaDA-8B-Instruct, released for the paper "Principled RL for Diffusion LLMs Emerges from a Sequence-Level Perspective".
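As a rough illustration of how a LoRA adapter like this one is usually attached to its base model, here is a minimal sketch. It assumes the repository stores the adapter in standard PEFT format at the repository root, which the card does not confirm; if the files sit in subfolders or use another format, the adapter path needs adjusting. The helper name `load_espo_gsm8k` is hypothetical, and the loading arguments (`trust_remote_code=True`, bfloat16) are assumptions based on how LLaDA-8B-Instruct is commonly loaded.

```python
# Repository IDs taken from this model card.
BASE_MODEL_ID = "GSAI-ML/LLaDA-8B-Instruct"
ADAPTER_ID = "GSAI-ML/ESPO-GSM8k"


def load_espo_gsm8k(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Hypothetical helper: load the base model and attach the LoRA adapter.

    Assumes the adapter repo is in PEFT format (not confirmed by the card).
    Heavy imports are kept inside the function so that merely importing this
    module does not require torch, transformers, or peft to be installed.
    """
    import torch
    from peft import PeftModel
    from transformers import AutoModel, AutoTokenizer

    # LLaDA ships custom modeling code, hence trust_remote_code=True.
    tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
    base = AutoModel.from_pretrained(
        base_id,
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
    )

    # Wrap the base model with the LoRA weights from the adapter repo.
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model.eval()
```

Calling `tokenizer, model = load_espo_gsm8k()` downloads both repositories, so it needs network access and enough memory for an 8B-parameter model. Text generation is not shown here, since diffusion language models generally rely on their own sampling procedure rather than the usual autoregressive `generate` call.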

Model tree for GSAI-ML/ESPO-GSM8k

Base model: GSAI-ML/LLaDA-8B-Instruct
Finetuned (35): this model

Dataset used to train GSAI-ML/ESPO-GSM8k

Collection including GSAI-ML/ESPO-GSM8k

ESPO (Collection, 5 items, updated Nov 28, 2025)