- LLaDA2.0: Scaling Up Diffusion Language Models to 100B
  Paper • 2512.15745 • Published • 78
- inclusionAI/LLaDA2.0-flash
  Text Generation • 103B • Updated • 394 • 59
- inclusionAI/LLaDA2.0-mini
  Text Generation • 16B • Updated • 8.45k • 51
- inclusionAI/LLaDA2.0-flash-preview
  Text Generation • 103B • Updated • 78 • 68
Collections including paper arxiv:2512.15745
- Scaling Latent Reasoning via Looped Language Models
  Paper • 2510.25741 • Published • 221
- Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models
  Paper • 2511.23319 • Published • 22
- Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information
  Paper • 2511.22176 • Published • 4
- FedRE: A Representation Entanglement Framework for Model-Heterogeneous Federated Learning
  Paper • 2511.22265 • Published • 1
- Fast-dLLM v2: Efficient Block-Diffusion LLM
  Paper • 2509.26328 • Published • 55
- Attention Is All You Need for KV Cache in Diffusion LLMs
  Paper • 2510.14973 • Published • 40
- Attention Sinks in Diffusion Language Models
  Paper • 2510.15731 • Published • 48
- Diffusion Language Models are Super Data Learners
  Paper • 2511.03276 • Published • 128
- Large Language Diffusion Models
  Paper • 2502.09992 • Published • 123
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
  Paper • 2503.09573 • Published • 74
- MMaDA: Multimodal Large Diffusion Language Models
  Paper • 2505.15809 • Published • 97
- Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
  Paper • 2505.15045 • Published • 54
- TiDAR: Think in Diffusion, Talk in Autoregression
  Paper • 2511.08923 • Published • 119
- Diffusion Language Models are Super Data Learners
  Paper • 2511.03276 • Published • 128
- What Makes Diffusion Language Models Super Data Learners?
  Paper • 2510.04071 • Published
- LLaDA2.0: Scaling Up Diffusion Language Models to 100B
  Paper • 2512.15745 • Published • 78
- Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding
  Paper • 2505.22618 • Published • 44
- DINGO: Constrained Inference for Diffusion LLMs
  Paper • 2505.23061 • Published • 31
- Discrete Diffusion in Large Language and Multimodal Models: A Survey
  Paper • 2506.13759 • Published • 43
- LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
  Paper • 2506.14429 • Published • 44