Quick Start Guide

To use this model, follow the snippet below:

from transformers import AutoModelForCausalLM

# Optionally collect config parameter overrides here and unpack them in the call below.
# model_config_overrides = {}

model = AutoModelForCausalLM.from_pretrained(
    "kuleshov-group/proseco-llada-sft",
    trust_remote_code=True,
    # **model_config_overrides,
)
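
The commented-out override lines can be wrapped in a small helper. The following is a minimal sketch, not part of the released code: the names `MODEL_ID`, `build_load_kwargs`, and `load_model` are hypothetical, and it assumes that extra keyword arguments passed to `from_pretrained` are applied to the model config, which is the usual Transformers behavior but has not been checked against this model's custom code.

```python
MODEL_ID = "kuleshov-group/proseco-llada-sft"


def build_load_kwargs(model_config_overrides=None):
    """Merge optional config overrides into the keyword arguments for loading."""
    # trust_remote_code=True is required because the snippet above passes it.
    kwargs = {"trust_remote_code": True}
    # Override keys are user-supplied; no specific key names are assumed here.
    kwargs.update(model_config_overrides or {})
    return kwargs


def load_model(model_config_overrides=None):
    """Download and load the checkpoint (hypothetical helper; needs network access)."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        MODEL_ID, **build_load_kwargs(model_config_overrides)
    )
```

Calling `load_model()` with no arguments should be equivalent to the snippet above; passing a dictionary forwards its entries as additional keyword arguments.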

Model details

LLaDA-8B-Base fine-tuned on a combination of the rStar-Coder and OpenMathInstruct-2 datasets for ~40B tokens.

See the paper for more details.

Citation

  @article{schiff2026learn,
    title={Learn from Your Mistakes: Self-Correcting Masked Diffusion Models},
    author={Schiff, Yair and Belhasin, Omer and Uziel, Roy and Wang, Guanghan and Arriola, Marianne and Turok, Gilad and Elad, Michael and Kuleshov, Volodymyr},
    journal={arXiv preprint arXiv:2602.11590},
    year={2026}
  }