Hippocrene committed · verified
Commit 0ae06f0 · 1 Parent(s): 15bc863

Update README.md

Files changed (1): README.md +36 -1
README.md CHANGED

@@ -5,4 +5,39 @@ datasets:
 language:
 - en
 pipeline_tag: text-generation
----
+tags:
+- llama
+- causal-lm
+- pretrained
+model-index:
+- name: miniLLM-0.1B
+  results: []
+---
+
+# miniLLM-0.1B
+
+A small (~109M parameters) causal language model pretrained from scratch on [OpenWebText](https://huggingface.co/datasets/Skylion007/openwebtext).
+
+## Model Details
+
+| Attribute | Value |
+|---|---|
+| Architecture | LlamaForCausalLM |
+| Parameters | ~109M |
+| Hidden Size | 768 |
+| Attention Heads | 12 |
+| Layers | 10 |
+| Intermediate Size | 2048 |
+| Max Sequence Length | 1024 |
+| Vocabulary Size | 50257 |
+| Tokenizer | GPT-2 (BPE) |
+| Positional Encoding | RoPE (θ=10000) |
+| Activation | SiLU |
+| Tie Word Embeddings | Yes |
+| Precision (training) | bfloat16 |
+
+
+
+## Limitations
+
+This is a small-scale pretrained model intended for research and educational purposes. It is **not** suitable for production use. Outputs may be incoherent, biased, or factually incorrect.
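
The ~109M parameter figure in the table is consistent with the listed hyperparameters. A back-of-the-envelope check, assuming standard Llama weight shapes (no attention biases, full multi-head attention rather than GQA, SwiGLU MLP, RMSNorm, and tied embeddings as the table states):

```python
# Sanity-check the ~109M parameter count from the model card's config values.
# Assumptions: no attention biases, 12 KV heads (no GQA), gate/up/down MLP,
# two RMSNorms per layer, tied input/output embeddings.

hidden = 768          # Hidden Size
layers = 10           # Layers
intermediate = 2048   # Intermediate Size
vocab = 50257         # Vocabulary Size

embed = vocab * hidden               # token embeddings (LM head is tied, counted once)
attn = 4 * hidden * hidden           # q, k, v, o projections per layer
mlp = 3 * hidden * intermediate      # gate, up, down projections per layer
norms = 2 * hidden                   # input + post-attention RMSNorm per layer
final_norm = hidden                  # final RMSNorm before the LM head

total = embed + layers * (attn + mlp + norms) + final_norm
print(f"{total:,}")  # 109,392,384 ≈ 109M
```

Under these assumptions the count comes to 109,392,384 parameters, matching the "~109M" in the card.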