---
library_name: transformers
license: apache-2.0
language:
- en
tags:
- tinystories
- llama
- language-model
- educational
- safetensors
datasets:
- roneneldan/TinyStories
model-index:
- name: Tiny LLaMA
  results: []
---

# Tiny LLaMA - TinyStories Edition

A small LLaMA-style causal language model trained on the TinyStories dataset.
This repository contains the weights converted to the Hugging Face
`LlamaForCausalLM` format. The source was the local checkpoint
`/home/manojk/small_llama/llama2.c/out/ckpt.pt`.

## Model Details

- **Model Type**: Decoder-only Transformer (`LlamaForCausalLM`)
- **Parameters**: 6,270,624
- **Layers**: 6
- **Attention Heads**: 6
- **Key/Value Heads**: 6
- **Head Dimension**: 48
- **Hidden Size**: 288
- **Intermediate Size**: 768
- **Vocabulary Size**: 512
- **Training Sequence Length**: 256
- **Data Type**: float32
- **Format**: safetensors
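
The parameter count can be cross-checked against the other hyperparameters. The sketch below assumes the standard LLaMA layout: no bias terms, two RMSNorm weight vectors per layer plus a final one, a gated MLP with three projections, and an output head that is not tied to the input embedding. The card does not state these details, so treat them as assumptions; the function name `llama_param_count` is illustrative only.

```python
def llama_param_count(hidden_size, num_layers, num_heads, num_kv_heads,
                      intermediate_size, vocab_size, tied_embeddings=False):
    """Count weights in a LLaMA-style decoder (assumes no bias terms)."""
    head_dim = hidden_size // num_heads

    # Attention: q and o use all heads; k and v use the key/value heads.
    q_proj = hidden_size * num_heads * head_dim
    k_proj = hidden_size * num_kv_heads * head_dim
    v_proj = hidden_size * num_kv_heads * head_dim
    o_proj = num_heads * head_dim * hidden_size
    attention = q_proj + k_proj + v_proj + o_proj

    # Gated MLP: gate, up, and down projections.
    mlp = 3 * hidden_size * intermediate_size

    # Two RMSNorm weight vectors per layer (pre-attention, pre-MLP).
    norms = 2 * hidden_size

    per_layer = attention + mlp + norms

    embedding = vocab_size * hidden_size
    # Assumption: the output head has its own weights unless tied.
    lm_head = 0 if tied_embeddings else vocab_size * hidden_size
    final_norm = hidden_size

    return embedding + num_layers * per_layer + final_norm + lm_head


total = llama_param_count(
    hidden_size=288,
    num_layers=6,
    num_heads=6,
    num_kv_heads=6,
    intermediate_size=768,
    vocab_size=512,
)
print(f"{total:,}")  # 6,270,624
```

Under these assumptions the result matches the listed 6,270,624 parameters. With tied embeddings the same formula gives 6,123,168, which suggests the converted model stores a separate output head.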

## Training

- **Dataset**: TinyStories
- **Training Iterations**: 100
- **Initial Loss**: 6.27
- **Final Loss**: 4.81
- **Validation Loss**: 6.29 (initial) to 4.77 (final)
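
To put these numbers in context, the sketch below converts the losses to perplexity and compares the initial loss with that of a uniform guess over the 512-token vocabulary. It assumes the reported losses are mean per-token cross-entropy in nats (the PyTorch default); the card does not state the unit.

```python
import math

vocab_size = 512

# Loss of a model that assigns equal probability to every token.
uniform_loss = math.log(vocab_size)


def perplexity(loss_nats):
    """Convert a mean cross-entropy loss in nats to perplexity."""
    return math.exp(loss_nats)


reported = {
    "initial train": 6.27,
    "final train": 4.81,
    "initial validation": 6.29,
    "final validation": 4.77,
}

print(f"uniform baseline: loss {uniform_loss:.2f}")
for name, loss in reported.items():
    print(f"{name}: loss {loss:.2f}, perplexity {perplexity(loss):.1f}")
```

Under that assumption the initial loss sits close to the uniform baseline of about 6.24, as expected for a freshly initialised model, and the final perplexity remains above 100 after 100 iterations.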

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("manojredhat/tiny-llama")
model = AutoModelForCausalLM.from_pretrained("manojredhat/tiny-llama")

# Tokenize a prompt and generate up to 40 new tokens with greedy decoding.
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Tokenizer

The model uses a SentencePiece tokenizer with a 512-token vocabulary. The special tokens are:

- `<unk>`: token ID 0
- `<s>`: token ID 1
- `</s>`: token ID 2

## Notes

This is a small educational model trained to generate short TinyStories-style
text. It is not intended for production use, knowledge-intensive tasks, or
long-form generation.