File size: 389 Bytes
343495a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
This contains the final 45M trained diffusion language model as well as the tokenizer trained from scratch on TinyStories Dataset.

The model architecture is provided as well : 
```python
cfg = {
    "vocab_size": 26_000,
    "context_length": 256,
    "emb_dim": 512,
    "n_layers": 10,
    "n_heads": 8,
    "d_ff": 2048, # 4*emb_dim
    "dropout": 0.1,
    "diffusion_steps": 128
}
```