darwinkernelpanic committed on
Commit 3cc98be · verified · 1 Parent(s): a350405

Update README.md

Files changed (1)
  1. README.md +25 -2
README.md CHANGED
@@ -1,3 +1,26 @@
- # This is DiffReaper.
 
- Mercury-style Discrete Diffusion LLM trained on mixed logic and comprehension datasets.
+ ---
+ license: other
+ pipeline_tag: text-generation
+ datasets:
+ - codeparrot/github-code-clean
+ - bigcode/starcoderdata
+ - bigcode/the-stack-smol
+ tags:
+ - diffusion
+ - llm
+ - diffreaper
+ - dllm
+ - mercury
+ ---
+ # DiffReaper 3
 
+ DiffReaper 3 is the third revision of DiffReaper, an experimental 1.5B parameter Discrete Diffusion Language Model (dLLM) designed for high-throughput parallel token prediction.
+ Unlike traditional autoregressive models, DiffReaper is optimized for non-linear sequence refinement across mixed Python logic and natural language corpora.
+
+ ## Model Details
+ - **Architecture:** 24-Layer Transformer Encoder
+ - **Hidden Dimension:** 2048
+ - **Attention Heads:** 16
+ - **Objective:** Discrete Masked Diffusion (Mercury-style)
+ - **Training Precision:** BF16
+ - **Context Window:** 1024 tokens
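
The Model Details above can be read as a configuration, and the masked diffusion objective as a corrupt-then-refine loop. The following is a minimal sketch under stated assumptions, not the DiffReaper implementation: `vocab_size`, `mlp_ratio`, the `MASK` id, and the helper names `estimate_params`, `corrupt`, and `unmask_in_steps` are all hypothetical, and the README does not describe the actual noise schedule or sampler. With the assumed vocabulary and a 4x MLP, the rough estimate comes to about 1.31B weights, so the stated 1.5B presumably reflects a different vocabulary size, MLP width, or an untied output head.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderConfig:
    # Values taken from the Model Details list.
    num_layers: int = 24
    hidden_dim: int = 2048
    num_heads: int = 16
    context_window: int = 1024
    # Assumptions: neither value is stated in the README.
    vocab_size: int = 50_000
    mlp_ratio: int = 4

    @property
    def head_dim(self) -> int:
        return self.hidden_dim // self.num_heads


def estimate_params(cfg: EncoderConfig) -> int:
    """Rough weight-matrix count; ignores biases and layer norms."""
    d = cfg.hidden_dim
    attention = 4 * d * d              # Q, K, V and output projections
    mlp = 2 * cfg.mlp_ratio * d * d    # up and down projections
    embeddings = (cfg.vocab_size + cfg.context_window) * d
    return cfg.num_layers * (attention + mlp) + embeddings


MASK = -1  # hypothetical mask-token id


def corrupt(tokens, mask_rate, rng):
    """Forward process: independently replace each token with MASK."""
    return [MASK if rng.random() < mask_rate else tok for tok in tokens]


def unmask_in_steps(noisy, predict, steps):
    """Reverse process: fill a share of the masked positions per step."""
    seq = list(noisy)
    for step in range(steps):
        masked = [i for i, tok in enumerate(seq) if tok == MASK]
        if not masked:
            break
        guesses = predict(seq)  # one parallel prediction for every position
        remaining_steps = steps - step
        quota = -(-len(masked) // remaining_steps)  # ceiling division
        for i in masked[:quota]:
            seq[i] = guesses[i]
    return seq


if __name__ == "__main__":
    cfg = EncoderConfig()
    print(cfg.head_dim, estimate_params(cfg))

    rng = random.Random(0)
    clean = list(range(16))
    noisy = corrupt(clean, mask_rate=0.5, rng=rng)
    # An oracle stands in for the trained denoiser, to show control flow only.
    restored = unmask_in_steps(noisy, predict=lambda seq: clean, steps=4)
    print(restored == clean)
```

In a real sampler, `predict` would be the encoder's forward pass, and the positions committed at each step would typically be chosen by prediction confidence rather than left-to-right order; the fixed quota here only illustrates that several tokens are resolved per step instead of one.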