---
license: apache-2.0
pipeline_tag: text-generation
datasets:
- codeparrot/github-code-clean
- bigcode/starcoderdata
- bigcode/the-stack-smol
tags:
- diffusion
- llm
- diffreaper
- dllm
- mercury
language:
- en
---
# DiffReaper 3
DiffReaper 3 is the third revision of DiffReaper, an experimental 1.5B-parameter Discrete Diffusion Language Model (dLLM) designed for high-throughput parallel token prediction.
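Unlike traditional autoregressive models, which generate tokens strictly left to right, DiffReaper is optimized for non-linear sequence refinement: tokens at any position can be filled in or revised over successive passes. It targets corpora that mix Python code and natural language.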
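
To make the idea of parallel, non-left-to-right refinement concrete, here is a minimal toy sketch of confidence-based unmasking, a common decoding scheme for masked diffusion models. It is not confirmed to be DiffReaper's actual sampler. The `MASK` string, `toy_denoiser`, and `parallel_refine` are hypothetical names, and the "model" is a stand-in that already knows the target and assigns random confidences.

```python
import random

MASK = "<mask>"  # hypothetical mask token, chosen for illustration only


def toy_denoiser(tokens, target, rng):
    """Stand-in for a real model (assumption): for every masked position,
    return a (predicted_token, confidence) pair. Here the prediction is
    simply the target token and the confidence is random."""
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(tokens)
        if tok == MASK
    }


def parallel_refine(target, steps=4, seed=0):
    """Start from a fully masked sequence and reveal the most confident
    positions at each step, in whatever order the confidences dictate."""
    rng = random.Random(seed)
    tokens = [MASK] * len(target)
    history = [list(tokens)]
    for step in range(steps):
        preds = toy_denoiser(tokens, target, rng)
        if not preds:
            break
        # Reveal ceil(masked / remaining_steps) positions so that nothing
        # is left masked after the final step.
        remaining_steps = steps - step
        k = -(-len(preds) // remaining_steps)
        most_confident = sorted(preds, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in most_confident:
            tokens[i] = preds[i][0]
        history.append(list(tokens))
    return tokens, history


if __name__ == "__main__":
    final, trace = parallel_refine("def add ( a , b ) :".split(), steps=4)
    for snapshot in trace:
        print(" ".join(snapshot))
```

Each printed line shows several positions being filled at once, in no particular order, which is the contrast with one-token-at-a-time autoregressive decoding.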
## Model Details
- **Architecture:** 24-Layer Transformer Encoder
- **Hidden Dimension:** 2048
- **Attention Heads:** 16
- **Objective:** Discrete Masked Diffusion (Mercury-style)
- **Training Precision:** BF16
- **Context Window:** 1024 tokens
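
As a rough sanity check on these numbers, the sketch below collects them in a small config object and estimates the weight count of the Transformer blocks. The class name `DiffReaperConfig` and the 4x feed-forward expansion are assumptions, not taken from the model's code, and embeddings, biases, and normalization weights are left out because the vocabulary size is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiffReaperConfig:  # hypothetical name, for illustration only
    num_layers: int = 24
    hidden_size: int = 2048
    num_heads: int = 16
    max_seq_len: int = 1024
    ffn_multiplier: int = 4  # assumption: standard 4x feed-forward expansion

    @property
    def head_dim(self) -> int:
        # Per-head width follows directly from the listed values.
        return self.hidden_size // self.num_heads

    def block_params(self) -> int:
        """Approximate weight-matrix count across all Transformer blocks
        (Q, K, V, output projections plus a two-layer feed-forward)."""
        d = self.hidden_size
        attention = 4 * d * d
        feed_forward = 2 * d * (self.ffn_multiplier * d)
        return self.num_layers * (attention + feed_forward)


if __name__ == "__main__":
    cfg = DiffReaperConfig()
    print(f"head_dim = {cfg.head_dim}")
    print(f"block parameters ~ {cfg.block_params() / 1e9:.2f}B")
```

Under these assumptions the blocks alone account for about 1.21B weights, so the remainder of the stated 1.5B would have to come from embeddings and other parameters not listed above.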