metadata
license: apache-2.0
pipeline_tag: text-generation
datasets:
  - codeparrot/github-code-clean
  - bigcode/starcoderdata
  - bigcode/the-stack-smol
tags:
  - diffusion
  - llm
  - diffreaper
  - dllm
  - mercury
language:
  - en

DiffReaper 3

DiffReaper 3 is the third revision of DiffReaper, an experimental 1.5B-parameter Discrete Diffusion Language Model (dLLM) designed for high-throughput parallel token prediction. Unlike traditional autoregressive models, which emit tokens strictly left to right, DiffReaper is optimized for non-sequential refinement, revising tokens at any position in the sequence, across a mixed corpus of Python code and natural language.
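To make the contrast with left-to-right decoding concrete, here is a minimal sketch of confidence-based parallel unmasking, a common sampling scheme for masked diffusion models. It is not DiffReaper's actual sampler: `toy_denoiser`, `parallel_unmask`, the `MASK` sentinel, and the tiny vocabulary are all hypothetical stand-ins, and the "model" just proposes random tokens with random confidences.

```python
import math
import random

MASK = None  # hypothetical sentinel; a real model would use a dedicated mask token id


def toy_denoiser(seq, vocab, rng):
    """Stand-in for the model: propose a (token, confidence) pair for every masked slot."""
    return {
        i: (rng.choice(vocab), rng.random())
        for i, tok in enumerate(seq)
        if tok is MASK
    }


def parallel_unmask(length, steps, vocab, seed=0):
    """Start fully masked and commit the most confident proposals at each step."""
    rng = random.Random(seed)
    seq = [MASK] * length
    for step in range(steps):
        proposals = toy_denoiser(seq, vocab, rng)  # all masked slots scored at once
        if not proposals:
            break
        # Spread the still-masked slots evenly over the remaining steps.
        remaining_steps = steps - step
        k = math.ceil(len(proposals) / remaining_steps)
        best = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            seq[i] = proposals[i][0]  # positions are filled in any order, not left to right
    return seq


tokens = parallel_unmask(length=16, steps=4, vocab=["def", "return", "x", "+", "1"])
print(tokens)
```

With 16 positions and 4 steps, each step commits 4 tokens, so the whole sequence is produced in 4 model calls instead of 16. That reduction in sequential calls is where the throughput advantage of parallel prediction would come from.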

Model Details

  • Architecture: 24-Layer Transformer Encoder
  • Hidden Dimension: 2048
  • Attention Heads: 16
  • Objective: Discrete Masked Diffusion (Mercury-style)
  • Training Precision: BF16
  • Context Window: 1024 tokens
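
The figures above can be collected into a small config object to derive quantities the list does not state directly. This is a rough sketch, not the model's real configuration class: `DiffReaperConfig` and `ffn_multiplier` are hypothetical names, and the 4x feed-forward width is an assumption, since the card gives neither the feed-forward size nor the vocabulary size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiffReaperConfig:
    # Values taken from the Model Details list.
    num_layers: int = 24
    hidden_dim: int = 2048
    num_heads: int = 16
    context_window: int = 1024
    # Assumption: feed-forward width is not stated; 4x hidden_dim is a common default.
    ffn_multiplier: int = 4

    @property
    def head_dim(self) -> int:
        """Per-head dimension when hidden_dim is split evenly across heads."""
        return self.hidden_dim // self.num_heads

    def block_params(self) -> int:
        """Approximate weight count of the Transformer blocks (no biases or norms)."""
        attention = 4 * self.hidden_dim ** 2  # Q, K, V and output projections
        feed_forward = 2 * self.ffn_multiplier * self.hidden_dim ** 2  # up and down projections
        return self.num_layers * (attention + feed_forward)


cfg = DiffReaperConfig()
print(cfg.head_dim)        # 128
print(cfg.block_params())  # 1207959552
```

Under these assumptions the Transformer blocks account for roughly 1.2B weights. The rest of the parameter count would come from the token embeddings, the output head, and any difference in the actual feed-forward width, none of which can be pinned down from the card alone.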