---
library_name: transformers
license: apache-2.0
language:
- en
tags:
- tinystories
- llama
- language-model
- educational
- safetensors
datasets:
- roneneldan/TinyStories
model-index:
- name: Tiny LLaMA
  results: []
---

# Tiny LLaMA - TinyStories Edition

A small LLaMA-style causal language model trained on the TinyStories dataset.
This repository contains the Hugging Face `LlamaForCausalLM` conversion of a
local llama2.c checkpoint (`/home/manojk/small_llama/llama2.c/out/ckpt.pt`).
## Model Details

- **Model Type**: Decoder-only Transformer (`LlamaForCausalLM`)
- **Parameters**: 6,270,624
- **Layers**: 6
- **Attention Heads**: 6
- **Key/Value Heads**: 6
- **Head Dimension**: 48
- **Hidden Size**: 288
- **Intermediate Size**: 768
- **Vocabulary Size**: 512
- **Training Sequence Length**: 256
- **Data Type**: float32
- **Format**: safetensors
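
The configuration values above can be checked against the hosted `config.json`; a minimal sketch, assuming the uploaded files match this card:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Inspect the hosted config (expected values taken from the table above)
config = AutoConfig.from_pretrained("manojredhat/tiny-llama")
print(config.num_hidden_layers)    # expected: 6
print(config.num_attention_heads)  # expected: 6
print(config.num_key_value_heads)  # expected: 6
print(config.hidden_size)          # expected: 288
print(config.intermediate_size)    # expected: 768
print(config.vocab_size)           # expected: 512

# Count parameters directly on the loaded model
model = AutoModelForCausalLM.from_pretrained("manojredhat/tiny-llama")
print(f"{model.num_parameters():,}")  # expected: 6,270,624
```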

## Training

- **Dataset**: TinyStories
- **Training Iterations**: 100
- **Initial Loss**: 6.27
- **Final Loss**: 4.81
- **Validation Loss**: decreased from 6.29 to 4.77
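
If the reported figures are per-token cross-entropy losses in nats (the usual convention), they translate to perplexity via `exp(loss)`; a quick sanity check:

```python
import math

# Perplexity = exp(per-token cross-entropy loss in nats)
print(round(math.exp(6.29)))  # initial validation perplexity, ~539
print(round(math.exp(4.77)))  # final validation perplexity, ~118
```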

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the converted checkpoint and its tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("manojredhat/tiny-llama")
model = AutoModelForCausalLM.from_pretrained("manojredhat/tiny-llama")

# Greedy decoding of up to 40 new tokens from a story-style prompt
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
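
Greedy decoding on a model this small tends to repeat itself, so sampling often produces livelier stories. Continuing the snippet above with standard `generate` options:

```python
# Sampled generation: less repetitive than greedy decoding
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,   # sample from the distribution instead of taking the argmax
    temperature=0.8,  # values below 1.0 sharpen the distribution
    top_k=50,         # restrict sampling to the 50 most likely tokens
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```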

## Tokenizer

The model uses a SentencePiece tokenizer with a 512-token vocabulary. Its special tokens are:

- `<unk>`: token ID 0
- `<s>`: token ID 1
- `</s>`: token ID 2
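
A quick check of this mapping, assuming the hosted tokenizer files match the list above:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("manojredhat/tiny-llama")
for token in ["<unk>", "<s>", "</s>"]:
    print(token, tokenizer.convert_tokens_to_ids(token))  # expected: 0, 1, 2
print("vocab size:", tokenizer.vocab_size)  # expected: 512
```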

## Notes

This is a small educational model trained to generate short TinyStories-style text.
It is not intended for production use, knowledge-intensive tasks, or long-form
generation.