Model specification
- Params: 21 million
- Architecture: Decoder-only transformer
- Training data: 1.1 million tokens from Shakespeare text
- Context length: 256
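
As a rough sanity check on the parameter count, the sketch below estimates the size of a GPT-style decoder-only transformer from its hyperparameters. Only the context length (256) comes from the specification; `vocab_size`, `n_layer`, and `n_embd` are hypothetical values chosen for illustration, and the 4x MLP width and tied input/output embeddings are likewise assumptions, not confirmed details of this model.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    # Only context_length is taken from the specification.
    # Every other value is a hypothetical placeholder, not the model's real setting.
    context_length: int = 256
    vocab_size: int = 65   # assumption: character-level vocabulary
    n_layer: int = 12      # assumption
    n_embd: int = 384      # assumption


def approx_param_count(cfg: ModelConfig) -> int:
    """Rough parameter count for a GPT-style decoder.

    Assumes a 4x-wide MLP and tied input/output embeddings;
    ignores biases and LayerNorm weights.
    """
    attn = 4 * cfg.n_embd ** 2   # Q, K, V and output projections
    mlp = 8 * cfg.n_embd ** 2    # two linear layers with 4x hidden width
    blocks = cfg.n_layer * (attn + mlp)
    token_emb = cfg.vocab_size * cfg.n_embd
    pos_emb = cfg.context_length * cfg.n_embd
    return blocks + token_emb + pos_emb


if __name__ == "__main__":
    total = approx_param_count(ModelConfig())
    print(f"approx. parameters: {total / 1e6:.1f}M")
```

With these placeholder values the estimate lands near the stated 21 million, but other depth and width combinations can reach a similar total, so this should not be read as the actual configuration.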