samratkar committed · Commit d91a3c8 · verified · 1 Parent(s): ccfb5e8

Update README.md

## Training vs. validation loss plot
![10kep.png](https://cdn-uploads.huggingface.co/production/uploads/667f956391e6d474e09777aa/hMRTWtMFGx2bmohFn7AV3.png)

Files changed (1)
  1. README.md +23 -3
README.md CHANGED
@@ -1,3 +1,23 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+ ## Source code
+ This is a small language model trained on the text of Julius Caesar. The source code for the model is available here:
+ [Link to the codebase](https://colab.research.google.com/github/samratkar/samratkar.github.io/blob/main/_posts/concepts/genai/notes-codes/slm-from-scratch/slm-jc.ipynb)
+
+ ## Model configuration
+ The model was trained with the following configuration:
+
+ 1. learning rate = 1e-4
+ 2. max iters = 10000
+ 3. warmup steps = 2000
+ 4. min lr = 5e-4
+ 5. eval iters = 500
+ 6. batch size = 8
+ 7. block size = 128
+ 8. vocab size = 50257
+ 9. number of layers = 4
+ 10. number of heads = 4
+ 11. embedding dimension = 768
+ 12. dropout = 0.01
+ 13. bias = True
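
The configuration listed above could be collected into a single object, for example as in the minimal sketch below. The class name `SLMConfig` and all field names are hypothetical and are not taken from the linked notebook; only the values come from the README. The derived quantities assume that the embedding dimension is split evenly across heads and that no gradient accumulation is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLMConfig:  # hypothetical name, not taken from the notebook
    # Training settings, values copied from the README list
    learning_rate: float = 1e-4
    max_iters: int = 10_000
    warmup_steps: int = 2_000
    min_lr: float = 5e-4
    eval_iters: int = 500
    batch_size: int = 8
    # Model settings, values copied from the README list
    block_size: int = 128
    vocab_size: int = 50_257
    n_layer: int = 4
    n_head: int = 4
    n_embd: int = 768
    dropout: float = 0.01
    bias: bool = True

    @property
    def head_dim(self) -> int:
        # Assumption: the embedding width is split evenly across heads.
        assert self.n_embd % self.n_head == 0
        return self.n_embd // self.n_head

    @property
    def tokens_per_iter(self) -> int:
        # Assumption: one batch per iteration, no gradient accumulation.
        return self.batch_size * self.block_size


cfg = SLMConfig()
print(cfg.head_dim)                         # 192
print(cfg.tokens_per_iter)                  # 1024
print(cfg.max_iters * cfg.tokens_per_iter)  # 10240000
```

The values are kept exactly as listed, including a min lr (5e-4) that is larger than the learning rate (1e-4). How the two interact depends on the scheduler used in the notebook, which this sketch does not model.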