# LLaDA2.1-mini-MemDLM
Research artifact. This adapter is released for academic and research purposes. It was trained on a limited dataset as a proof-of-concept for the MemDLM method and is not intended for production use.
This repository contains a LoRA adapter for ML-GSAI/LLaDA2.1-mini, trained using the method introduced in the paper *MemDLM: Memory-Enhanced DLM Training*.
MemDLM (Memory-Enhanced Diffusion Language Model) narrows the train-inference gap in Diffusion Language Models (DLMs) by embedding a simulated denoising process into training via Bi-level Optimization. This mechanism enhances long-context understanding and retrieval performance by offloading memorization pressure to model parameters.
## Usage

```python
import transformers
from peft import PeftModel

# Load base model
base_model = transformers.AutoModel.from_pretrained(
    "ML-GSAI/LLaDA2.1-mini",
    trust_remote_code=True,
    torch_dtype="auto",
)

# Load MemDLM adapter
model = PeftModel.from_pretrained(base_model, "JarvisPei/LLaDA2.1-mini-MemDLM")
model.eval()
```
## Evaluation

See the MemDLM repo for evaluation scripts:

```bash
bash examples/llada21/eval_run.sh \
    --adapter_model_name_or_path JarvisPei/LLaDA2.1-mini-MemDLM
```
## Citation

```bibtex
@article{pei2026memdlm,
  title   = {MemDLM: Memory-Enhanced DLM Training},
  author  = {Zehua Pei and Hui-Ling Zhen and Weizhe Lin and Sinno Jialin Pan and Yunhe Wang and Mingxuan Yuan and Bei Yu},
  year    = {2026},
  journal = {arXiv preprint arXiv:2603.22241},
}
```
## License
MIT