---
library_name: transformers
license: apache-2.0
language:
- en
tags:
- tinystories
- llama
- language-model
- educational
- safetensors
datasets:
- roneneldan/TinyStories
model-index:
- name: Tiny LLaMA
results: []
---
# Tiny LLaMA - TinyStories Edition
A small LLaMA-style causal language model trained on the TinyStories dataset.
This repository contains the Hugging Face `LlamaForCausalLM` conversion of the
local llama2.c checkpoint `/home/manojk/small_llama/llama2.c/out/ckpt.pt`.
## Model Details
- **Model Type**: Decoder-only Transformer (`LlamaForCausalLM`)
- **Parameters**: 6,270,624
- **Layers**: 6
- **Attention Heads**: 6
- **Key/Value Heads**: 6
- **Head Dimension**: 48
- **Hidden Size**: 288
- **Intermediate Size**: 768
- **Vocabulary Size**: 512
- **Training Sequence Length**: 256
- **Data Type**: float32
- **Format**: safetensors
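For reference, the architecture above corresponds to the following `LlamaConfig` (a minimal sketch; any field not listed keeps its `transformers` default, and the actual `config.json` in this repository is authoritative):
```python
from transformers import LlamaConfig

# Sketch of the configuration described above; unspecified fields
# (e.g. rope_theta, rms_norm_eps) keep their transformers defaults.
config = LlamaConfig(
    vocab_size=512,
    hidden_size=288,
    intermediate_size=768,
    num_hidden_layers=6,
    num_attention_heads=6,   # head_dim = 288 / 6 = 48
    num_key_value_heads=6,
    max_position_embeddings=256,  # matches the training sequence length
)
```
The reported parameter count is consistent with untied input and output embeddings: 2 × 512 × 288 (embedding and LM head) + 6 × (4 × 288² attention + 3 × 288 × 768 MLP + 2 × 288 RMSNorm) + 288 (final norm) = 6,270,624.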
## Training
- **Dataset**: TinyStories
- **Training Iterations**: 100
- **Initial Loss**: 6.27
- **Final Loss**: 4.81
- **Validation Loss**: decreased from 6.29 to 4.77
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model from the Hub
tokenizer = AutoTokenizer.from_pretrained("manojredhat/tiny-llama")
model = AutoModelForCausalLM.from_pretrained("manojredhat/tiny-llama")

# Greedy decoding from a TinyStories-style prompt
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
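Greedy decoding can loop or repeat on a model this small. If you want more varied text, enabling sampling usually helps (the parameter values below are illustrative, not tuned for this model):
```python
# Sampled generation; temperature and top_k are illustrative defaults
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,
    temperature=0.8,
    top_k=50,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```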
## Tokenizer
The model uses a SentencePiece tokenizer with 512 tokens:
- `<unk>`: token ID 0
- `<s>`: token ID 1
- `</s>`: token ID 2
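As a quick sanity check that the tokenizer loads with the special-token IDs listed above (a sketch, reusing the `AutoTokenizer` load from the Usage section):
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("manojredhat/tiny-llama")

# The three special tokens listed above
assert tokenizer.unk_token_id == 0  # <unk>
assert tokenizer.bos_token_id == 1  # <s>
assert tokenizer.eos_token_id == 2  # </s>

# LLaMA-style tokenizers typically prepend the BOS token when encoding
print(tokenizer.encode("Once upon a time"))
```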
## Notes
This is a small educational model trained to generate short TinyStories-style
text. It is not intended for production use, knowledge-intensive tasks, or
long-form generation.