---
license: mit
datasets:
- Skylion007/openwebtext
language:
- en
pipeline_tag: text-generation
tags:
- llama
- causal-lm
- pretrained
model-index:
- name: miniLLM-0.1B
  results: []
---

# miniLLM-0.1B

A small (~109M parameters) causal language model pretrained from scratch on [OpenWebText](https://huggingface.co/datasets/Skylion007/openwebtext).

## Model Details

| Attribute | Value |
|---|---|
| Architecture | LlamaForCausalLM |
| Parameters | ~109M |
| Hidden Size | 768 |
| Attention Heads | 12 |
| Layers | 10 |
| Intermediate Size | 2048 |
| Max Sequence Length | 1024 |
| Vocabulary Size | 50257 |
| Tokenizer | GPT-2 (BPE) |
| Positional Encoding | RoPE (θ=10000) |
| Activation | SiLU |
| Tie Word Embeddings | Yes |
| Precision (training) | bfloat16 |

## Limitations

This is a small-scale pretrained model intended for research and educational purposes. It is **not** suitable for production use. Outputs may be incoherent, biased, or factually incorrect.
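
## Parameter Count

The ~109M figure can be reproduced from the table above. The sketch below assumes the standard Llama layout (bias-free Q/K/V/O projections, a SiLU-gated three-matrix MLP, two RMSNorms per layer plus a final norm) with no grouped-query attention; with tied embeddings, the LM head shares the token-embedding matrix and is counted once.

```python
# Back-of-the-envelope parameter count from the architecture table.
vocab_size = 50257
hidden = 768
layers = 10
intermediate = 2048

# Token embeddings (also serve as the LM head, since embeddings are tied).
embed = vocab_size * hidden

# Per-layer attention: Q, K, V, and output projections (no biases in Llama).
attn = 4 * hidden * hidden

# Per-layer gated MLP: gate, up, and down projections.
mlp = 3 * hidden * intermediate

# Two RMSNorm weight vectors per layer; one final norm after the last layer.
norms = 2 * hidden

total = embed + layers * (attn + mlp + norms) + hidden
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")  # 109,392,384 (~109M)
```

RoPE adds no learned parameters, which is why positional encoding does not appear in the count.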