---
license: mit
datasets:
- Skylion007/openwebtext
language:
- en
pipeline_tag: text-generation
tags:
- llama
- causal-lm
- pretrained
model-index:
- name: miniLLM-0.1B
  results: []
---
# miniLLM-0.1B
A small causal language model (~109M parameters) pretrained from scratch on the OpenWebText corpus.
## Model Details
| Attribute | Value |
|---|---|
| Architecture | LlamaForCausalLM |
| Parameters | ~109M |
| Hidden Size | 768 |
| Attention Heads | 12 |
| Layers | 10 |
| Intermediate Size | 2048 |
| Max Sequence Length | 1024 |
| Vocabulary Size | 50257 |
| Tokenizer | GPT-2 (BPE) |
| Positional Encoding | RoPE (θ=10000) |
| Activation | SiLU |
| Tie Word Embeddings | Yes |
| Precision (training) | bfloat16 |
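The ~109M figure can be cross-checked from the hyperparameters in the table alone. The sketch below (plain Python, no dependencies) tallies a Llama-style decoder, assuming the standard `LlamaForCausalLM` layout: bias-free attention and MLP projections, a SwiGLU MLP (gate, up, down), two RMSNorms per layer plus a final RMSNorm, and a tied embedding/output head counted once. RoPE adds no learned parameters.

```python
# Architecture hyperparameters taken from the table above.
vocab_size = 50257
hidden = 768
layers = 10
intermediate = 2048

# Token embedding; the LM head is tied to it, so it is counted once.
embedding = vocab_size * hidden

# Self-attention: q, k, v, o projections, no biases.
attention = 4 * hidden * hidden

# SwiGLU MLP: gate and up projections, then the down projection.
mlp = 2 * hidden * intermediate + intermediate * hidden

# Two RMSNorm weight vectors per layer (pre-attention, pre-MLP).
norms = 2 * hidden

per_layer = attention + mlp + norms
total = embedding + layers * per_layer + hidden  # + final RMSNorm

print(f"{total:,}")  # 109,392,384
```

The total, 109,392,384, rounds to the ~109M quoted above.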
## Limitations
This is a small-scale pretrained model intended for research and educational purposes. It is not suitable for production use. Outputs may be incoherent, biased, or factually incorrect.