---
license: mit
datasets:
- Jiayi-Pan/Countdown-Tasks-3to4
base_model:
- GSAI-ML/LLaDA-8B-Instruct
---
Post-trained LoRA models for the Countdown task, based on LLaDA-8B-Instruct, from the paper "Principled RL for Diffusion LLMs Emerges from a Sequence-Level Perspective".
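
For readers unfamiliar with the Countdown task, the following is a minimal sketch of how a model answer could be checked. It assumes the common formulation of the task (each provided number is used exactly once, combined with `+`, `-`, `*`, `/` to reach the target); the function name `is_valid_countdown` and this exact rule set are illustrative assumptions, not the evaluation code used in the paper.

```python
import ast
import operator
from collections import Counter
from fractions import Fraction

# Binary operators allowed in a Countdown expression (assumed rule set).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node, used):
    """Evaluate a parsed expression exactly, recording every integer literal."""
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return Fraction(node.value)  # exact arithmetic, no float rounding
    raise ValueError("unsupported expression")


def is_valid_countdown(expression, nums, target):
    """Hypothetical checker: True if `expression` uses each number in `nums`
    exactly once and evaluates to `target`."""
    try:
        tree = ast.parse(expression, mode="eval")
        used = []
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return False
    return Counter(used) == Counter(nums) and value == target
```

For example, `is_valid_countdown("(6 - 2) * 7", [2, 6, 7], 28)` accepts the answer, while an expression that skips one of the numbers or divides by zero is rejected. Parsing with `ast` rather than calling `eval` keeps arbitrary code in a model output from being executed.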