---
license: mit
tags:
- gpt2
- tinystories
- language-model
---

# TinyStories-GPT

This is a small GPT-style model trained from scratch on the [TinyStories dataset](https://huggingface.co/datasets/roneneldan/TinyStories), using a NanoGPT-style training loop in PyTorch.

## Model Details

- **Architecture:** 6 layers, 6 attention heads, 384 hidden size
- **Context length:** 128 tokens
- **Vocab size:** 50257 (GPT-2 tokenizer)
- **Dataset:** TinyStories
- **Training:** ~20k steps, AdamW, cosine LR decay

A configuration sketch matching these hyperparameters and an illustrative learning-rate schedule appear at the end of this card.

## Example Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Here2Disrupt/tiny-stories-gpt")
model = AutoModelForCausalLM.from_pretrained("Here2Disrupt/tiny-stories-gpt")

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding by default; generate up to 50 new tokens.
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
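Greedy decoding tends to loop on a model this small, so sampling usually produces more varied stories. The settings below are illustrative starting points, not values tuned for this checkpoint:

```python
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,   # sample instead of greedy decoding
    temperature=0.8,  # illustrative values, not tuned for this model
    top_k=50,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```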
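## Model Configuration

To instantiate the architecture directly instead of loading the checkpoint, the hyperparameters above map onto a standard `GPT2Config` as sketched below. This assumes the checkpoint uses the stock GPT-2 block layout; the config stored in this repo is authoritative.

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Values taken from the "Model Details" section above.
config = GPT2Config(
    vocab_size=50257,  # GPT-2 tokenizer
    n_positions=128,   # context length
    n_embd=384,        # hidden size
    n_layer=6,
    n_head=6,
)
model = GPT2LMHeadModel(config)  # randomly initialized, not the trained weights
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")
```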
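## Learning-Rate Schedule

Training used AdamW with cosine LR decay over ~20k steps. The sketch below shows the usual NanoGPT-style shape (linear warmup, then cosine decay); the warmup length, peak LR, and minimum LR are assumptions for illustration, not the values used for this run.

```python
import math

def get_lr(step, max_steps=20_000, warmup=500, peak_lr=6e-4, min_lr=6e-5):
    """Linear warmup to peak_lr, then cosine decay to min_lr.

    Only max_steps (~20k) comes from the training description above;
    warmup, peak_lr, and min_lr are illustrative assumptions.
    """
    if step < warmup:
        return peak_lr * (step + 1) / warmup
    progress = (step - warmup) / max(1, max_steps - warmup)
    coeff = 0.5 * (1.0 + math.cos(math.pi * min(progress, 1.0)))
    return min_lr + coeff * (peak_lr - min_lr)
```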