---
license: apache-2.0
pipeline_tag: text-generation
datasets:
- codeparrot/github-code-clean
- bigcode/starcoderdata
- bigcode/the-stack-smol
tags:
- diffusion
- llm
- diffreaper
- dllm
- mercury
language:
- en
---

# DiffReaper 3

DiffReaper 3 is the third revision of DiffReaper, an experimental 1.5B-parameter Discrete Diffusion Language Model (dLLM) designed for high-throughput parallel token prediction. Unlike traditional autoregressive models, which generate tokens strictly left to right, DiffReaper is optimized for non-linear sequence refinement across mixed Python logic and natural language corpora.

## Model Details

- **Architecture:** 24-Layer Transformer Encoder
- **Hidden Dimension:** 2048
- **Attention Heads:** 16
- **Objective:** Discrete Masked Diffusion (Mercury-style)
- **Training Precision:** BF16
- **Context Window:** 1024 tokens
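
As a rough sanity check on the figures above, the sketch below derives the per-head dimension and an approximate weight count for the transformer blocks. It assumes a standard encoder layer with a 4x feed-forward expansion and ignores biases, normalization parameters, and embeddings. None of these details are stated in this card, so `FFN_MULT` and the helper names are illustrative only.

```python
# Back-of-the-envelope size estimate for the listed architecture.
# Assumptions (NOT stated in the model card): a standard encoder layer
# with a 4x feed-forward expansion; biases, normalization parameters,
# and embeddings are ignored.

NUM_LAYERS = 24    # from Model Details
HIDDEN_DIM = 2048  # from Model Details
NUM_HEADS = 16     # from Model Details
FFN_MULT = 4       # assumed feed-forward expansion factor


def head_dim(hidden: int, heads: int) -> int:
    """Width of each attention head; hidden must divide evenly by heads."""
    if hidden % heads != 0:
        raise ValueError("hidden dimension must be divisible by head count")
    return hidden // heads


def layer_weight_params(hidden: int, ffn_mult: int) -> int:
    """Weight-matrix parameters in one encoder layer under the assumptions above."""
    attention = 4 * hidden * hidden                # Q, K, V and output projections
    feed_forward = 2 * ffn_mult * hidden * hidden  # up- and down-projection
    return attention + feed_forward


per_head = head_dim(HIDDEN_DIM, NUM_HEADS)
per_layer = layer_weight_params(HIDDEN_DIM, FFN_MULT)
block_total = NUM_LAYERS * per_layer

print(per_head)     # 128
print(per_layer)    # 50331648
print(block_total)  # 1207959552
```

Under these assumptions the transformer blocks account for roughly 1.2B weights. The rest of the stated 1.5B would have to come from embeddings, the output head, and other parameters that this card does not specify.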