Hippocrene/MiniLLM-0.1B

Text Generation

Hippocrene committed 7 days ago

Commit 0ae06f0 · verified · 1 Parent(s): 15bc863

Update README.md

Files changed (1)

README.md +36 -1

@@ -5,4 +5,39 @@ datasets:
 language:
 - en
 pipeline_tag: text-generation
----

+tags:
+- llama
+- causal-lm
+- pretrained
+model-index:
+- name: miniLLM-0.1B
+  results: []
+---
+# miniLLM-0.1B
+A small (~109M parameters) causal language model pretrained from scratch on [OpenWebText](https://huggingface.co/datasets/Skylion007/openwebtext).
+## Model Details
+| Attribute | Value |
+|---|---|
+| Architecture | LlamaForCausalLM |
+| Parameters | ~109M |
+| Hidden Size | 768 |
+| Attention Heads | 12 |
+| Layers | 10 |
+| Intermediate Size | 2048 |
+| Max Sequence Length | 1024 |
+| Vocabulary Size | 50257 |
+| Tokenizer | GPT-2 (BPE) |
+| Positional Encoding | RoPE (θ=10000) |
+| Activation | SiLU |
+| Tie Word Embeddings | Yes |
+| Precision (training) | bfloat16 |
+## Limitations
+This is a small-scale pretrained model intended for research and educational purposes. It is **not** suitable for production use. Outputs may be incoherent, biased, or factually incorrect.
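
Review note: the ~109M figure in the card can be sanity-checked from the Model Details table. The sketch below is a rough count, not the author's method. It assumes a standard Llama-style layout that the card does not spell out: no bias terms, full multi-head attention (as many key/value heads as attention heads), and RMSNorm layers with a weight vector only. The function name `llama_param_count` is hypothetical.

```python
# Rough parameter count for a Llama-style decoder, using the values from the
# Model Details table. Assumptions (not stated in the card): no biases,
# key/value heads == attention heads, RMSNorm with a weight vector only.

def llama_param_count(vocab_size, hidden_size, num_layers, intermediate_size,
                      tie_word_embeddings=True):
    embed = vocab_size * hidden_size            # token embedding matrix
    attn = 4 * hidden_size * hidden_size        # q, k, v and output projections
    mlp = 3 * hidden_size * intermediate_size   # gate, up and down projections
    norms = 2 * hidden_size                     # two RMSNorm weights per layer
    per_layer = attn + mlp + norms
    final_norm = hidden_size
    # With tied embeddings the output head reuses the embedding matrix.
    lm_head = 0 if tie_word_embeddings else vocab_size * hidden_size
    return embed + num_layers * per_layer + final_norm + lm_head


total = llama_param_count(50257, 768, 10, 2048, tie_word_embeddings=True)
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Under these assumptions the count comes to 109,392,384, which agrees with the ~109M stated in the card. Roughly 38.6M of that is the tied embedding matrix, so untying the embeddings would add the same amount again.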