---
license: apache-2.0
---
## Source code
This is a small language model trained on the text of Shakespeare's *Julius Caesar*. The source code of the model is available at:
[Link to the codebase](https://colab.research.google.com/github/samratkar/samratkar.github.io/blob/main/_posts/concepts/genai/notes-codes/slm-from-scratch/slm-jc.ipynb)

## Model configuration
The model was trained with the following hyperparameters:
1. learning rate = 1e-4
2. max iters = 10000
3. warmup steps = 2000 
4. min lr = 5e-4 
5. eval iters = 500 
6. batch size = 8 
7. block size = 128 
8. vocab size = 50257
9. number of layers = 4
10. number of heads = 4
11. embedding dimension = 768
12. dropout = 0.01
13. bias = True
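
The hyperparameters above can be collected into a single config object, which also lets us estimate the model size. This is an illustrative sketch, not code from the notebook: the class name `SLMConfig`, the field names, and the helper `approx_param_count` are assumptions, and the parameter estimate ignores biases and LayerNorm weights.

```python
from dataclasses import dataclass

@dataclass
class SLMConfig:
    # Values copied from the hyperparameter list above
    learning_rate: float = 1e-4
    max_iters: int = 10_000
    warmup_steps: int = 2_000
    min_lr: float = 5e-4
    eval_iters: int = 500
    batch_size: int = 8
    block_size: int = 128        # context length in tokens
    vocab_size: int = 50_257     # matches the GPT-2 BPE vocabulary size
    n_layers: int = 4
    n_heads: int = 4
    n_embd: int = 768            # embedding dimension
    dropout: float = 0.01
    bias: bool = True

def approx_param_count(cfg: SLMConfig) -> int:
    """Rough parameter count for a GPT-style decoder with these settings."""
    d = cfg.n_embd
    # Token + positional embeddings
    embed = cfg.vocab_size * d + cfg.block_size * d
    # Per transformer block: attention projections (~4*d^2) + MLP (~8*d^2)
    per_layer = 12 * d * d
    return embed + cfg.n_layers * per_layer

print(approx_param_count(SLMConfig()))  # ~67M parameters
```

With a 768-dimensional embedding but only 4 layers, the token embedding table (50257 × 768 ≈ 38.6M weights) dominates the roughly 67M total parameters.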

## Test vs. validation loss
![Test vs. validation loss](./10kep.png)