---
library_name: transformers
license: apache-2.0
language:
- en
tags:
- tinystories
- llama
- language-model
- educational
- safetensors
datasets:
- roneneldan/TinyStories
model-index:
- name: Tiny LLaMA
  results: []
---

# Tiny LLaMA - TinyStories Edition

A small LLaMA-style causal language model trained on the TinyStories dataset. This repository contains the Hugging Face `LlamaForCausalLM` conversion of the local checkpoint from `/home/manojk/small_llama/llama2.c/out/ckpt.pt`.

## Model Details

- **Model Type**: Decoder-only Transformer (`LlamaForCausalLM`)
- **Parameters**: 6,270,624
- **Layers**: 6
- **Attention Heads**: 6
- **Key/Value Heads**: 6
- **Head Dimension**: 48
- **Hidden Size**: 288
- **Intermediate Size**: 768
- **Vocabulary Size**: 512
- **Training Sequence Length**: 256
- **Data Type**: float32
- **Format**: safetensors

## Training

- **Dataset**: TinyStories
- **Training Iterations**: 100
- **Initial Training Loss**: 6.27
- **Final Training Loss**: 4.81
- **Validation Loss**: 6.29 → 4.77

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("manojredhat/tiny-llama")
model = AutoModelForCausalLM.from_pretrained("manojredhat/tiny-llama")

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Tokenizer

The model uses a SentencePiece tokenizer with a 512-token vocabulary and the standard special tokens:

- `<unk>`: token ID 0
- `<s>`: token ID 1
- `</s>`: token ID 2

## Notes

This is an educational small model trained for short TinyStories-style text. It is not intended for production use, knowledge-intensive tasks, or long-form generation.
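
## Sanity Checks

The architecture figures in Model Details can be confirmed from the published config. This is a minimal sketch that assumes the standard `LlamaConfig` attribute names in `transformers`:

```python
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("manojredhat/tiny-llama")
print(config.num_hidden_layers)    # 6
print(config.num_attention_heads)  # 6
print(config.num_key_value_heads)  # 6
print(config.hidden_size)          # 288
print(config.intermediate_size)    # 768
print(config.vocab_size)           # 512

# Total parameter count should match the figure in Model Details.
model = AutoModelForCausalLM.from_pretrained("manojredhat/tiny-llama")
print(sum(p.numel() for p in model.parameters()))  # 6,270,624
```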
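
The special-token mapping in the Tokenizer section can be checked the same way (again a sketch, assuming the standard `transformers` tokenizer attributes and the conventional SentencePiece assignments):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("manojredhat/tiny-llama")

# Expected: the conventional SentencePiece special tokens at IDs 0-2.
print(tokenizer.unk_token, tokenizer.unk_token_id)  # <unk> 0
print(tokenizer.bos_token, tokenizer.bos_token_id)  # <s> 1
print(tokenizer.eos_token, tokenizer.eos_token_id)  # </s> 2
print(len(tokenizer))  # 512
```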