# LLaDA2.1-mini-MemDLM
Research artifact. This adapter is released for academic and research purposes. It was trained on a limited dataset as a proof-of-concept for the MemDLM method and is not intended for production use.
This repository contains a LoRA adapter for ML-GSAI/LLaDA2.1-mini, trained using the method introduced in the paper *MemDLM: Memory-Enhanced DLM Training*.
MemDLM (Memory-Enhanced Diffusion Language Model) narrows the train-inference gap in Diffusion Language Models (DLMs) by embedding a simulated denoising process into training via Bi-level Optimization. This mechanism enhances long-context understanding and retrieval performance by offloading memorization pressure to model parameters.
## Usage

```python
import transformers
from peft import PeftModel

# Load base model
base_model = transformers.AutoModel.from_pretrained(
    "ML-GSAI/LLaDA2.1-mini",
    trust_remote_code=True,
    torch_dtype="auto",
)

# Load MemDLM adapter
model = PeftModel.from_pretrained(base_model, "JarvisPei/LLaDA2.1-mini-MemDLM")
model.eval()
```
## Evaluation

See the MemDLM repo for evaluation scripts:

```bash
bash examples/llada21/eval_run.sh \
    --adapter_model_name_or_path JarvisPei/LLaDA2.1-mini-MemDLM
```
## Citation

```bibtex
@article{pei2026memdlm,
  title   = {MemDLM: Memory-Enhanced DLM Training},
  author  = {Zehua Pei and Hui-Ling Zhen and Weizhe Lin and Sinno Jialin Pan and Yunhe Wang and Mingxuan Yuan and Bei Yu},
  year    = {2026},
  journal = {arXiv preprint arXiv:2603.22241},
}
```
## License
MIT