DiffReaper 3

DiffReaper 3 is the third revision of DiffReaper, an experimental 1.5B-parameter Discrete Diffusion Language Model (dLLM) designed for high-throughput parallel token prediction. Unlike traditional autoregressive models, which generate tokens strictly left to right, DiffReaper is optimized for non-linear sequence refinement across corpora of mixed Python code and natural language.
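To make "parallel token prediction" and "sequence refinement" concrete, here is a minimal sketch of how a masked-diffusion sampler can fill in a sequence over a few steps. It is not DiffReaper's actual sampler: `toy_predictor`, the `<mask>` token, the tiny vocabulary, and the confidence-based unmasking schedule are all assumptions made for illustration, with a random stand-in where the real model would be.

```python
import random

MASK = "<mask>"  # hypothetical mask token, not taken from the model's tokenizer


def toy_predictor(tokens, rng):
    """Stand-in for the model: one (token, confidence) guess per position."""
    vocab = ["def", "f", "(", "x", ")", ":", "return"]
    return [(rng.choice(vocab), rng.random()) for _ in tokens]


def denoise(length, steps, seed=0):
    """Start fully masked and commit the most confident guesses each step."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        # All positions are predicted in one pass (the "parallel" part).
        preds = toy_predictor(tokens, rng)
        # Spread the remaining masked positions evenly over the remaining steps.
        k = -(-len(masked) // (steps - step))  # ceiling division
        most_confident = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in most_confident:
            tokens[i] = preds[i][0]
    return tokens


if __name__ == "__main__":
    print(denoise(length=16, steps=4))
```

With 16 positions and 4 steps, this schedule commits 4 tokens per step in whatever order the confidences dictate, so positions are not filled left to right. A real sampler would condition each prediction on the tokens already committed.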

Model Details

  • Architecture: 24-Layer Transformer Encoder
  • Hidden Dimension: 2048
  • Attention Heads: 16
  • Objective: Discrete Masked Diffusion (Mercury-style)
  • Training Precision: BF16
  • Context Window: 1024 tokens
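
As a rough sanity check on the figures above, the sketch below estimates the per-head dimension and the parameter count of the transformer blocks. It assumes a standard block layout (four hidden-by-hidden attention projections and a feed-forward layer with a 4x expansion, ignoring biases and layer norms); neither the feed-forward width nor the vocabulary size is stated in this card, so `ffn_mult` and `vocab_size` are hypothetical parameters.

```python
def estimate_params(layers, hidden, vocab_size, ffn_mult=4):
    """Approximate parameter count; assumes a standard block, no biases or norms."""
    attn = 4 * hidden * hidden            # Q, K, V and output projections
    ffn = 2 * ffn_mult * hidden * hidden  # up and down projections
    embed = vocab_size * hidden           # token embedding table
    return layers * (attn + ffn) + embed


head_dim = 2048 // 16  # hidden dimension divided by attention heads: 128
block_params = estimate_params(layers=24, hidden=2048, vocab_size=0)

if __name__ == "__main__":
    print(head_dim)      # 128
    print(block_params)  # 1207959552
```

Under these assumptions the 24 blocks account for about 1.21B parameters. The gap to the stated 1.5B would come from embeddings and any components whose sizes are not listed here.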