---
license: apache-2.0
pipeline_tag: text-generation
datasets:
- codeparrot/github-code-clean
- bigcode/starcoderdata
- bigcode/the-stack-smol
tags:
- diffusion
- llm
- diffreaper
- dllm
- mercury
language:
- en
---
# DiffReaper 3
DiffReaper 3 is the third revision of DiffReaper, an experimental 1.5B-parameter Discrete Diffusion Language Model (dLLM) designed for high-throughput parallel token prediction.
Unlike traditional autoregressive models, which generate tokens strictly left to right, DiffReaper is optimized for non-linear sequence refinement across mixed Python logic and natural-language corpora.
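
To make "parallel token prediction" and "non-linear refinement" concrete, here is a minimal toy sketch of confidence-based parallel unmasking, a decoding scheme commonly used with masked diffusion models. It is not confirmed to be DiffReaper's actual sampler: `MASK`, `toy_denoiser`, and `parallel_unmask` are hypothetical names, and the denoiser returns random predictions in place of a real model.

```python
import random

MASK = -1  # hypothetical mask token id, used only in this sketch


def toy_denoiser(tokens, rng):
    """Stand-in for the model: one (predicted_id, confidence) pair per position."""
    return [(rng.randrange(100), rng.random()) for _ in tokens]


def parallel_unmask(length, steps, seed=0):
    """Start fully masked, then commit the most confident predictions each step."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        # The denoiser scores every position in a single pass.
        preds = toy_denoiser(tokens, rng)
        # Unmask an equal share of the remaining masked positions (ceil division).
        k = -(-len(masked) // (steps - step))
        # Positions are chosen by confidence, not by left-to-right order.
        best = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = preds[i][0]
    return tokens


print(parallel_unmask(length=16, steps=4))
```

With 16 positions and 4 steps, four tokens are committed per step, in whatever order the confidence scores dictate.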
## Model Details
- **Architecture:** 24-Layer Transformer Encoder
- **Hidden Dimension:** 2048
- **Attention Heads:** 16
- **Objective:** Discrete Masked Diffusion (Mercury-style)
- **Training Precision:** BF16
- **Context Window:** 1024 tokens
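
The figures above can be collected into a small config object to sanity-check the derived quantities. This is a sketch under stated assumptions: `DiffReaperConfig` is a hypothetical name, and the parameter estimate assumes a standard Transformer block (four attention projections plus a feed-forward layer with 4x expansion), which this card does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiffReaperConfig:  # hypothetical name, not an official class
    num_layers: int = 24
    hidden_dim: int = 2048
    num_heads: int = 16
    context_window: int = 1024

    @property
    def head_dim(self) -> int:
        # Per-head width follows directly from the listed values.
        return self.hidden_dim // self.num_heads

    def approx_block_params(self, ffn_mult: int = 4) -> int:
        # Assumed layout: attention uses 4 * d^2 weights, the feed-forward
        # layer uses 2 * ffn_mult * d^2; biases and norms are ignored.
        d = self.hidden_dim
        per_layer = 4 * d * d + 2 * ffn_mult * d * d
        return self.num_layers * per_layer


cfg = DiffReaperConfig()
print(cfg.head_dim)               # 128
print(cfg.approx_block_params())  # 1207959552
```

Under these assumptions the Transformer blocks account for roughly 1.2B weights. Token embeddings and any output head are not counted, because the vocabulary size is not stated here.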