---
library_name: transformers
license: apache-2.0
language:
- en
tags:
- tinystories
- llama
- language-model
- educational
- safetensors
datasets:
- roneneldan/TinyStories
model-index:
- name: Tiny LLaMA
  results: []
---

# Tiny LLaMA - TinyStories Edition

A small LLaMA-style causal language model trained on the TinyStories dataset.
This repository contains the Hugging Face `LlamaForCausalLM` conversion of a
local llama2.c checkpoint (`/home/manojk/small_llama/llama2.c/out/ckpt.pt`).
## Model Details

- **Model Type**: Decoder-only Transformer (`LlamaForCausalLM`)
- **Parameters**: 6,270,624
- **Layers**: 6
- **Attention Heads**: 6
- **Key/Value Heads**: 6
- **Head Dimension**: 48
- **Hidden Size**: 288
- **Intermediate Size**: 768
- **Vocabulary Size**: 512
- **Training Sequence Length**: 256
- **Data Type**: float32
- **Format**: safetensors
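
The configuration values above can be checked against the hosted `config.json`; a minimal sketch, assuming the uploaded files match this card:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Inspect the hosted config (expected values taken from the table above)
config = AutoConfig.from_pretrained("manojredhat/tiny-llama")
print(config.num_hidden_layers)    # expected: 6
print(config.num_attention_heads)  # expected: 6
print(config.num_key_value_heads)  # expected: 6
print(config.hidden_size)          # expected: 288
print(config.intermediate_size)    # expected: 768
print(config.vocab_size)           # expected: 512

# Count parameters directly on the loaded model
model = AutoModelForCausalLM.from_pretrained("manojredhat/tiny-llama")
print(f"{model.num_parameters():,}")  # expected: 6,270,624
```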

## Training

- **Dataset**: TinyStories
- **Training Iterations**: 100
- **Initial Loss**: 6.27
- **Final Loss**: 4.81
- **Validation Loss**: decreased from 6.29 to 4.77
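
If the reported figures are per-token cross-entropy losses in nats (the usual convention), they translate to perplexity via `exp(loss)`; a quick sanity check:

```python
import math

# Perplexity = exp(per-token cross-entropy loss in nats)
print(round(math.exp(6.29)))  # initial validation perplexity, ~539
print(round(math.exp(4.77)))  # final validation perplexity, ~118
```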

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the converted checkpoint and its tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("manojredhat/tiny-llama")
model = AutoModelForCausalLM.from_pretrained("manojredhat/tiny-llama")

# Greedy decoding of up to 40 new tokens from a story-style prompt
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
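
Greedy decoding on a model this small tends to repeat itself, so sampling often produces livelier stories. Continuing the snippet above with standard `generate` options:

```python
# Sampled generation: less repetitive than greedy decoding
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,   # sample from the distribution instead of taking the argmax
    temperature=0.8,  # values below 1.0 sharpen the distribution
    top_k=50,         # restrict sampling to the 50 most likely tokens
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```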

## Tokenizer

The model uses a SentencePiece tokenizer with a 512-token vocabulary. Its special tokens are:

- `<unk>`: token ID 0
- `<s>`: token ID 1
- `</s>`: token ID 2
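
A quick check of this mapping, assuming the hosted tokenizer files match the list above:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("manojredhat/tiny-llama")
for token in ["<unk>", "<s>", "</s>"]:
    print(token, tokenizer.convert_tokens_to_ids(token))  # expected: 0, 1, 2
print("vocab size:", tokenizer.vocab_size)  # expected: 512
```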

## Notes

This is a small educational model trained to generate short TinyStories-style text.
It is not intended for production use, knowledge-intensive tasks, or long-form
generation.