metadata
license: apache-2.0
pipeline_tag: text-generation
datasets:
- codeparrot/github-code-clean
- bigcode/starcoderdata
- bigcode/the-stack-smol
tags:
- diffusion
- llm
- diffreaper
- dllm
- mercury
language:
- en
DiffReaper 3
DiffReaper 3 is the third revision of DiffReaper, an experimental 1.5B-parameter Discrete Diffusion Language Model (dLLM) designed for high-throughput parallel token prediction. Unlike traditional autoregressive models, which generate tokens strictly left to right, DiffReaper is optimized for non-linear sequence refinement: tokens at any position can be filled in or revised over several passes. It is trained on a mix of Python logic and natural language corpora.
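To make "parallel token prediction" concrete, here is a minimal toy sketch of confidence-based iterative unmasking, a generic decoding pattern for masked diffusion models. It is not DiffReaper's actual sampler: the `MASK` symbol, `toy_denoiser`, and `parallel_unmask` are hypothetical names invented for illustration, and the "denoiser" simply reads a fixed target with random confidences instead of running a network.

```python
import random

MASK = "<mask>"  # hypothetical mask symbol, not the model's actual token


def toy_denoiser(tokens, target):
    """Stand-in for the network: propose a (token, confidence) pair per position.

    A real model would predict from context; this toy reads a fixed target and
    attaches random confidences so the loop has something to rank.
    """
    return [(target[i], random.random()) for i in range(len(tokens))]


def parallel_unmask(target, steps=4, seed=0):
    """Start fully masked and commit the most confident predictions each step."""
    random.seed(seed)
    tokens = [MASK] * len(target)
    history = [list(tokens)]
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        proposals = toy_denoiser(tokens, target)
        # Spread the remaining masked positions evenly over the remaining steps.
        remaining_steps = steps - step
        k = -(-len(masked) // remaining_steps)  # ceiling division
        # Commit the k most confident masked positions, wherever they sit.
        best = sorted(masked, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = proposals[i][0]
        history.append(list(tokens))
    return tokens, history


target = "def add ( a , b ) : return a + b".split()
final, history = parallel_unmask(target, steps=4)
for state in history:
    print(" ".join(state))
```

Each printed line shows the sequence after one refinement pass. Several positions are committed per pass and in no particular order, which is the contrast with one-token-at-a-time autoregressive decoding.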
Model Details
- Architecture: 24-Layer Transformer Encoder
- Hidden Dimension: 2048
- Attention Heads: 16
- Objective: Discrete Masked Diffusion (Mercury-style)
- Training Precision: BF16
- Context Window: 1024 tokens
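
As a rough sanity check on the figures above, the sketch below derives the per-head dimension and a back-of-the-envelope parameter count for a standard Transformer stack. The vocabulary size, the 4x feed-forward width, and tied embeddings are assumptions made for illustration; the card does not state them, and biases and layer norms are ignored.

```python
def estimate_params(layers, hidden, vocab_size, ffn_mult=4, tied_embeddings=True):
    """Approximate weight count for a standard Transformer (no biases or norms)."""
    attn = 4 * hidden * hidden            # Q, K, V and output projections
    ffn = 2 * ffn_mult * hidden * hidden  # up and down projections
    embed = vocab_size * hidden           # token embedding table
    if not tied_embeddings:
        embed *= 2                        # separate output projection
    return layers * (attn + ffn) + embed


# Values taken from the Model Details list.
LAYERS, HIDDEN, HEADS = 24, 2048, 16

# Assumption: a vocabulary of about 50k tokens (not stated in the card).
ASSUMED_VOCAB = 50_000

head_dim = HIDDEN // HEADS
total = estimate_params(LAYERS, HIDDEN, ASSUMED_VOCAB)

print(f"head dimension: {head_dim}")
print(f"estimated parameters: {total / 1e9:.2f}B")
```

Under these assumptions the estimate lands in the same ballpark as the stated 1.5B. The exact total depends on details not listed here, such as the real vocabulary size, the feed-forward width, and whether the input and output embeddings are shared.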