|
|
--- |
|
|
license: apache-2.0 |
|
|
--- |
|
|
## Source code |
|
|
This is a small language model trained from scratch on the text of *Julius Caesar*. The source code of the model is available at:
|
|
[Link to the codebase](https://colab.research.google.com/github/samratkar/samratkar.github.io/blob/main/_posts/concepts/genai/notes-codes/slm-from-scratch/slm-jc.ipynb) |
|
|
|
|
|
## Model configuration


The model was trained with the following hyperparameters:
|
|
1. learning rate = 1e-4
2. max iters = 10000
3. warmup steps = 2000
4. min lr = 5e-4
5. eval iters = 500
6. batch size = 8
7. block size = 128
8. vocab size = 50257
9. number of layers = 4
10. number of heads = 4
11. embedding dimension = 768
12. dropout = 0.01
13. bias = True
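The hyperparameters above can be collected into a single configuration object, as is common in nanoGPT-style implementations. This is a minimal sketch; the class and field names (`SLMConfig`, `n_layer`, `n_embd`, etc.) are illustrative assumptions and are not taken from the notebook:

```python
from dataclasses import dataclass

@dataclass
class SLMConfig:
    # Training hyperparameters (as listed in this card)
    learning_rate: float = 1e-4
    max_iters: int = 10000
    warmup_steps: int = 2000
    min_lr: float = 5e-4      # note: larger than learning_rate as listed above
    eval_iters: int = 500
    batch_size: int = 8
    # Model architecture
    block_size: int = 128     # context length in tokens
    vocab_size: int = 50257   # GPT-2 BPE vocabulary size
    n_layer: int = 4          # number of transformer layers
    n_head: int = 4           # attention heads per layer
    n_embd: int = 768         # embedding dimension
    dropout: float = 0.01
    bias: bool = True         # use bias terms in linear/layer-norm modules
```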
|
|
|
|
|
## Train vs validation loss
|
|
 |