---
license: apache-2.0
---
## Source code
This is a small language model trained from scratch on the text of *Julius Caesar*. The source code of the model is available here:
[Link to the codebase](https://colab.research.google.com/github/samratkar/samratkar.github.io/blob/main/_posts/concepts/genai/notes-codes/slm-from-scratch/slm-jc.ipynb)
The model configuration is as follows.
## Model configuration
1. learning rate = 1e-4
2. max iters = 10000
3. warmup steps = 2000
4. min lr = 5e-4
5. eval iters = 500
6. batch size = 8
7. block size = 128
8. vocab size = 50257
9. number of layers = 4
10. number of heads = 4
11. embedding dimension = 768
12. dropout = 0.01
13. bias = True
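The hyperparameters above can be gathered into a single GPT-style configuration object, as is common in from-scratch GPT implementations. A minimal sketch follows; the class and field names here are illustrative and not necessarily those used in the notebook:

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    # Architecture (values from the list above)
    vocab_size: int = 50257    # GPT-2 BPE vocabulary size
    block_size: int = 128      # context length in tokens
    n_layer: int = 4           # number of transformer layers
    n_head: int = 4            # attention heads per layer
    n_embd: int = 768          # embedding dimension
    dropout: float = 0.01
    bias: bool = True          # use bias terms in linear/LayerNorm modules
    # Training schedule (values from the list above)
    learning_rate: float = 1e-4
    min_lr: float = 5e-4
    max_iters: int = 10_000
    warmup_steps: int = 2_000
    eval_iters: int = 500
    batch_size: int = 8

config = GPTConfig()
# The embedding dimension must split evenly across heads:
assert config.n_embd % config.n_head == 0
head_dim = config.n_embd // config.n_head  # 192 dims per head here
```

Note that, as listed, `min_lr` (5e-4) is larger than `learning_rate` (1e-4); conventionally the minimum learning rate sits below the peak rate, so one of the two values may be worth double-checking against the notebook.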
## Test vs validation loss
![Test and validation loss curves over 10k iterations](./10kep.png)