Here2Disrupt committed on
Commit
aab9c94
·
1 Parent(s): 31e2b04

Add model card

Files changed (1)
  1. README.md +32 -0
README.md ADDED
@@ -0,0 +1,32 @@
+ ---
+ license: mit
+ tags:
+ - gpt2
+ - tinystories
+ - language-model
+ ---
+
+ # TinyStories-GPT
+
+ This is a small GPT-like model trained from scratch on the [TinyStories dataset](https://huggingface.co/datasets/roneneldan/TinyStories).
+ It was implemented using a NanoGPT-style training loop in PyTorch.
+
+ ## Model Details
+ - **Architecture:** 6 layers, 6 heads, 384 hidden size
+ - **Context length:** 128 tokens
+ - **Vocab size:** 50257 (GPT-2 tokenizer)
+ - **Dataset:** TinyStories
+ - **Training:** ~20k steps, AdamW, cosine LR decay
+
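Review note: the figures under Model Details are enough to estimate the model's size. The sketch below is a back-of-the-envelope count, not something taken from the commit. It assumes a standard GPT-2 layout: learned position embeddings, biases on every linear layer, a 4x MLP expansion, and an output head tied to the token embedding. The class and function names are hypothetical, and a NanoGPT variant without biases or weight tying would give a somewhat different total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TinyStoriesGPTShape:
    """Shape values copied from the model card's Model Details list."""
    n_layer: int = 6
    n_head: int = 6
    n_embd: int = 384
    block_size: int = 128   # context length
    vocab_size: int = 50257


def count_parameters(shape: TinyStoriesGPTShape) -> int:
    """Estimate parameters for an assumed GPT-2-style layout (see lead-in)."""
    d = shape.n_embd
    token_emb = shape.vocab_size * d            # shared with the output head (assumed tied)
    pos_emb = shape.block_size * d              # learned position embeddings (assumed)
    attn = (d * 3 * d + 3 * d) + (d * d + d)    # fused QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d) # 4x expansion, then projection back
    layer_norms = 2 * (2 * d)                   # two LayerNorms per block, weight + bias
    per_block = attn + mlp + layer_norms
    final_norm = 2 * d
    return token_emb + pos_emb + shape.n_layer * per_block + final_norm


shape = TinyStoriesGPTShape()
head_dim = shape.n_embd // shape.n_head
total = count_parameters(shape)
print(f"head_dim={head_dim}, parameters={total:,}")
# prints: head_dim=64, parameters=29,995,392
```

Under these assumptions roughly 19.3M of the ~30M parameters sit in the token embedding, so the transformer blocks themselves account for only about 10.6M.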
+ ## Example Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tokenizer = AutoTokenizer.from_pretrained("Here2Disrupt/tiny-stories-gpt")
+ model = AutoModelForCausalLM.from_pretrained("Here2Disrupt/tiny-stories-gpt")
+
+ prompt = "Once upon a time"
+ inputs = tokenizer(prompt, return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=50)
+ print(tokenizer.decode(outputs[0]))
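
Review note: the card lists a 128-token context, so the prompt plus `max_new_tokens` in the usage snippet above presumably has to fit inside that window. This is an assumption about how the checkpoint was exported, since a GPT-2-style model with learned position embeddings cannot index positions beyond its context length. The helper below is a hypothetical sketch, not part of the commit. It works on a plain list of token ids so it runs without downloading the model.

```python
CONTEXT_LENGTH = 128  # taken from the model card


def fit_to_context(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Keep only the most recent prompt tokens that leave room for generation.

    Hypothetical helper: drops the oldest tokens so that
    len(prompt) + max_new_tokens <= context_length.
    """
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    budget = context_length - max_new_tokens
    return list(token_ids[-budget:])


long_prompt = list(range(200))            # stand-in for tokenizer output ids
clipped = fit_to_context(long_prompt, max_new_tokens=50)
print(len(clipped), clipped[0], clipped[-1])
# prints: 78 122 199
```

With the snippet's `max_new_tokens=50`, this leaves 78 tokens for the prompt. A short prompt such as "Once upon a time" passes through unchanged.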