d3LLM: Ultra-Fast Diffusion LLM using Pseudo-Trajectory Distillation
Paper: arXiv:2601.07568
This repository contains the d3LLM-Dream model, an ultra-fast diffusion language model introduced in the paper *d3LLM: Ultra-Fast Diffusion LLM using Pseudo-Trajectory Distillation*.

```python
# Load the model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("d3LLM/d3LLM_Dream", trust_remote_code=True, dtype="auto")
```
d3LLM-Dream is an ultra-fast diffusion language model that achieves high generation speed while maintaining competitive performance. It strikes a balance between accuracy and parallelism by using pseudo-trajectory distillation during training and entropy-based multi-block decoding during inference.
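To illustrate the entropy-based decoding idea, here is a minimal sketch (not the official implementation; function names, the threshold value, and the fallback rule are assumptions for illustration): at each step, compute the entropy of the model's predictive distribution at every still-masked position, and commit all positions whose entropy falls below a threshold, so confident tokens are decoded in parallel rather than one at a time.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one position's predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_parallel_positions(position_probs, threshold=0.5):
    """Return indices of masked positions confident enough to decode now.

    position_probs: dict mapping masked position -> probability vector.
    threshold: hypothetical entropy cutoff (tuned in practice).
    Always decodes at least the single most confident position so the
    sampler makes progress even when every position is uncertain.
    """
    entropies = {i: token_entropy(p) for i, p in position_probs.items()}
    chosen = [i for i, h in entropies.items() if h < threshold]
    if not chosen:  # fall back to greedy single-position decoding
        chosen = [min(entropies, key=entropies.get)]
    return sorted(chosen)

# Example: positions 0 and 2 are near-certain, position 1 is uncertain,
# so positions 0 and 2 are decoded in the same parallel step.
probs = {
    0: [0.97, 0.01, 0.01, 0.01],
    1: [0.25, 0.25, 0.25, 0.25],
    2: [0.01, 0.01, 0.97, 0.01],
}
print(select_parallel_positions(probs))  # -> [0, 2]
```

The fallback to the single lowest-entropy position mirrors the trade-off described above: parallelism when the model is confident, sequential decoding when it is not.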
For more chat examples and evaluation scripts, visit the official repository.
```bibtex
@article{arxiv'26:d3llm,
  title   = {d3LLM: Ultra-Fast Diffusion LLM using Pseudo-Trajectory Distillation},
  author  = {Yu-Yang Qian and Junda Su and Lanxiang Hu and Peiyuan Zhang and Zhijie Deng and Peng Zhao and Hao Zhang},
  journal = {ArXiv preprint},
  volume  = {arXiv:2601.07568},
  year    = {2026}
}
```
Base model: Dream-org/Dream-v0-Instruct-7B
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="d3LLM/d3LLM_Dream", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```