---
license: mit
datasets:
- Skylion007/openwebtext
language:
- en
pipeline_tag: text-generation
tags:
- llama
- causal-lm
- pretrained
model-index:
- name: miniLLM-0.1B
  results: []
---

# miniLLM-0.1B

A small (~109M parameters) causal language model pretrained from scratch on [OpenWebText](https://huggingface.co/datasets/Skylion007/openwebtext).

## Model Details

| Attribute | Value |
|---|---|
| Architecture | LlamaForCausalLM |
| Parameters | ~109M |
| Hidden Size | 768 |
| Attention Heads | 12 |
| Layers | 10 |
| Intermediate Size | 2048 |
| Max Sequence Length | 1024 |
| Vocabulary Size | 50257 |
| Tokenizer | GPT-2 (BPE) |
| Positional Encoding | RoPE (θ=10000) |
| Activation | SiLU |
| Tie Word Embeddings | Yes |
| Precision (training) | bfloat16 |

## Limitations

This is a small-scale pretrained model intended for research and educational purposes. It is **not** suitable for production use. Outputs may be incoherent, biased, or factually incorrect.
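
## Parameter Count

The ~109M figure can be reproduced from the table above. The sketch below assumes the standard Llama layout (bias-free Q/K/V/O projections, a SiLU-gated three-matrix MLP, two RMSNorms per layer plus a final norm) with no grouped-query attention; with tied embeddings, the LM head shares the token-embedding matrix and is counted once.

```python
# Back-of-the-envelope parameter count from the architecture table.
vocab_size = 50257
hidden = 768
layers = 10
intermediate = 2048

# Token embeddings (also serve as the LM head, since embeddings are tied).
embed = vocab_size * hidden

# Per-layer attention: Q, K, V, and output projections (no biases in Llama).
attn = 4 * hidden * hidden

# Per-layer gated MLP: gate, up, and down projections.
mlp = 3 * hidden * intermediate

# Two RMSNorm weight vectors per layer; one final norm after the last layer.
norms = 2 * hidden

total = embed + layers * (attn + mlp + norms) + hidden
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")  # 109,392,384 (~109M)
```

RoPE adds no learned parameters, which is why positional encoding does not appear in the count.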