---
language:
- en
license: mit
library_name: transformers
pipeline_tag: text-generation
datasets:
- roneneldan/TinyStories
tags:
- custom_code
- educational
---

# Tiny GPT

Tiny GPT is an educational decoder-only Transformer trained from scratch on the [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) dataset. The implementation is intentionally small and readable.

## Model details

- Architecture: decoder-only causal language model
- Context length: 512 tokens
- Vocabulary size: 10,000
- Hidden size: 256
- Transformer layers: 6
- Attention heads: 8

Source code: https://github.com/alainbrown/tiny-gpt

## Usage

This repository contains custom Transformers code. Review it before enabling `trust_remote_code`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "alainbrown/tiny-gpt"
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

inputs = tokenizer("Once upon a time", return_tensors="pt")
logits = model(**inputs).logits
```

## Intended use

This model is intended for education and experimentation. It is not intended for production, factual question answering, or safety-critical applications.

## Limitations

The model is small, trained on synthetic children's stories, and has not been comprehensively evaluated. It may produce incoherent, repetitive, incorrect, or inappropriate text. English is the only supported language.

## Training

The training pipeline is available in the linked GitHub repository. This model repository excludes optimizer and progress state and contains inference files only.
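
As a rough size check, the hyperparameters listed under Model details can be turned into an approximate parameter count. The sketch below is not taken from the repository. It assumes a GPT-2-style layout: learned position embeddings, an output head tied to the token embedding, biased linear layers, a feed-forward expansion factor of 4, two LayerNorms per block, and one final LayerNorm. The function name `estimate_parameters` and the `mlp_ratio` argument are hypothetical, and the real count may differ if the implementation makes other choices.

```python
# Hyperparameters copied from the "Model details" list.
VOCAB_SIZE = 10_000
CONTEXT_LENGTH = 512
HIDDEN_SIZE = 256
NUM_LAYERS = 6
NUM_HEADS = 8


def estimate_parameters(vocab, context, hidden, layers, mlp_ratio=4):
    """Approximate parameter count for an ASSUMED GPT-2-style decoder.

    Assumptions (not confirmed by the model card): learned position
    embeddings, tied input/output embeddings, biases on every linear
    layer, and an MLP hidden width of mlp_ratio * hidden.
    """
    token_embedding = vocab * hidden
    position_embedding = context * hidden

    # Attention: fused Q/K/V projection plus the output projection.
    attention = (3 * hidden * hidden + 3 * hidden) + (hidden * hidden + hidden)

    # Feed-forward: hidden -> mlp_ratio * hidden -> hidden.
    inner = mlp_ratio * hidden
    mlp = (hidden * inner + inner) + (inner * hidden + hidden)

    # Two LayerNorms per block, each with a scale and a shift vector.
    block_norms = 2 * (2 * hidden)

    per_block = attention + mlp + block_norms
    final_norm = 2 * hidden

    return token_embedding + position_embedding + layers * per_block + final_norm


# Each head attends over an equal slice of the hidden dimension.
head_dim = HIDDEN_SIZE // NUM_HEADS

total = estimate_parameters(VOCAB_SIZE, CONTEXT_LENGTH, HIDDEN_SIZE, NUM_LAYERS)
print(f"head dimension: {head_dim}")
print(f"estimated parameters: {total:,}")
```

Under these assumptions the estimate comes to about 7.4 million parameters, with the token embedding (2.56 million) accounting for roughly a third of the total. Once the model is loaded, `sum(p.numel() for p in model.parameters())` gives the actual figure.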