---
license: mit
datasets:
- Skylion007/openwebtext
language:
- en
pipeline_tag: text-generation
tags:
- llama
- causal-lm
- pretrained
model-index:
- name: miniLLM-0.1B
  results: []
---

# miniLLM-0.1B

A small (~109M parameters) causal language model pretrained from scratch on [OpenWebText](https://huggingface.co/datasets/Skylion007/openwebtext).

## Model Details

| Attribute | Value |
|---|---|
| Architecture | LlamaForCausalLM |
| Parameters | ~109M |
| Hidden Size | 768 |
| Attention Heads | 12 |
| Layers | 10 |
| Intermediate Size | 2048 |
| Max Sequence Length | 1024 |
| Vocabulary Size | 50257 |
| Tokenizer | GPT-2 (BPE) |
| Positional Encoding | RoPE (θ=10000) |
| Activation | SiLU |
| Tie Word Embeddings | Yes |
| Precision (training) | bfloat16 |

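The ~109M figure can be reproduced from the table above. A minimal sketch, assuming the standard Llama parameter layout (full multi-head attention with no GQA, no attention or MLP biases, two RMSNorms per layer plus a final one, and the LM head tied to the input embeddings):

```python
# Approximate parameter count from the config in the table above.
# The exact layout (no biases, tied embeddings, no GQA) is an assumption.
vocab, hidden, layers, inter = 50257, 768, 10, 2048

embeddings = vocab * hidden          # input embeddings, tied with the LM head
attn = 4 * hidden * hidden           # q, k, v, o projections
mlp = 3 * hidden * inter             # gate, up, down projections
norms = 2 * hidden                   # two RMSNorms per layer
per_layer = attn + mlp + norms

total = embeddings + layers * per_layer + hidden  # + final RMSNorm
print(f"{total:,}")  # 109,392,384 -> ~109M
```

Note that with tied embeddings roughly a third of the parameters sit in the embedding matrix, so the "transformer body" is only about 71M parameters.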
## Limitations

This is a small-scale pretrained model intended for research and educational purposes. It is **not** suitable for production use. Outputs may be incoherent, biased, or factually incorrect.