DiffReaper 3
DiffReaper 3 is the third revision of DiffReaper, an experimental 1.5B-parameter Discrete Diffusion Language Model (dLLM) designed for high-throughput parallel token prediction. Unlike traditional autoregressive models, which generate one token at a time from left to right, DiffReaper is optimized for non-linear sequence refinement across mixed Python logic and natural language corpora.
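To make "parallel token prediction" concrete, here is a minimal sketch of the general idea behind masked-diffusion decoding: start from a fully masked sequence and commit several positions per step, most confident first. This is an illustration only, not DiffReaper's actual sampler. `MASK`, `toy_denoiser`, and `parallel_unmask` are hypothetical names, and the random "denoiser" merely stands in for a real model forward pass.

```python
import random

MASK = -1  # hypothetical mask-token id, used only in this sketch


def toy_denoiser(tokens, vocab_size, rng):
    """Stand-in for a model forward pass (assumption, not the real model).

    Returns one (predicted_token, confidence) pair per position.
    """
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def parallel_unmask(length, vocab_size, steps, seed=0):
    """Fill a fully masked sequence in at most `steps` refinement passes."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        preds = toy_denoiser(tokens, vocab_size, rng)
        # Commit an equal share of the remaining masked positions each step
        # (ceiling division), so the final step fills whatever is left.
        k = -(-len(masked) // (steps - step))
        # Pick the k masked positions with the highest confidence,
        # regardless of where they sit in the sequence.
        chosen = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in chosen:
            tokens[i] = preds[i][0]
    return tokens


if __name__ == "__main__":
    print(parallel_unmask(length=16, vocab_size=100, steps=4))
```

With 16 positions and 4 steps, each pass commits 4 tokens at once, whereas an autoregressive decoder would need 16 sequential passes for the same length.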
Model Details
- Architecture: 24-Layer Transformer Encoder
- Hidden Dimension: 2048
- Attention Heads: 16
- Objective: Discrete Masked Diffusion (Mercury-style)
- Training Precision: BF16
- Context Window: 1024 tokens
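
As a rough sanity check on the figures above, the sketch below estimates the parameter count of a standard Transformer encoder from the listed layer count and hidden dimension. Several inputs are assumptions not stated in this card: a 4x feed-forward expansion, a vocabulary of 50,000 tokens, and tied input/output embeddings. Biases, layer norms, and positional embeddings are ignored, and `estimate_encoder_params` is a hypothetical helper.

```python
def estimate_encoder_params(layers, hidden, vocab_size,
                            ffn_mult=4, tied_embeddings=True):
    """Approximate weight count for a standard Transformer encoder.

    Ignores biases, layer norms, and positional embeddings.
    """
    attn = 4 * hidden * hidden            # Q, K, V, and output projections
    ffn = 2 * ffn_mult * hidden * hidden  # up- and down-projection
    embed = vocab_size * hidden
    if not tied_embeddings:
        embed *= 2                        # separate output projection
    return layers * (attn + ffn) + embed


if __name__ == "__main__":
    LAYERS, HIDDEN, HEADS = 24, 2048, 16  # from the list above
    VOCAB = 50_000                        # assumption: not stated in this card

    print("head dim:", HIDDEN // HEADS)
    total = estimate_encoder_params(LAYERS, HIDDEN, VOCAB)
    print(f"approx. params: {total / 1e9:.2f}B")
```

Under these assumptions the Transformer blocks alone account for about 1.2B weights, and the embeddings add roughly 0.1B more. The gap to the stated 1.5B would depend on the actual vocabulary size, feed-forward width, and whether embeddings are tied, none of which are given here.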