Fu01978 committed
Commit cd764fb · verified · 1 Parent(s): 5444f66

Update README.md

Files changed (1):
  1. README.md +17 -18
README.md CHANGED
@@ -2,29 +2,33 @@
  language: en
  license: mit
  tags:
- - tiny
- - language-model
- - causal-lm
- - from-scratch
- - pytorch
+ - tiny
+ - language-model
+ - causal-lm
+ - pytorch
+ datasets:
+ - roneneldan/TinyStories
+ - Skylion007/openwebtext
+ pipeline_tag: text-generation
+ library_name: transformers
  ---
  
  # TinyLM
  
- A ~1M parameter causal language model trained from scratch, for fun and experimentation.
+ A 3.4M parameter causal language model trained from scratch, for experimentation.
  
  ## Architecture
  
  | Hyperparameter | Value |
  |---|---|
- | Parameters | ~1M |
+ | Parameters | 3,403,968 |
  | Layers | 4 |
  | Hidden size | 64 |
  | Attention heads | 4 |
  | FFN dim | 192 |
  | Embedding rank | 32 |
  | Context length | 256 |
- | Tokenizer | GPT-2 (50,257 vocab) |
+ | Tokenizer | GPT-2 (50257 vocab) |
  
  Uses a **factored (low-rank) embedding** to keep the vocab projection from eating the entire parameter budget, with weight tying on the output head.
  
@@ -36,7 +40,7 @@ Uses a **factored (low-rank) embedding** to keep the vocab projection from eatin
  | Optimizer | AdamW (lr=3e-3, weight_decay=0.01) |
  | Scheduler | Cosine annealing with warm restarts |
  | Mixed precision | fp16 (torch.cuda.amp) |
- | Hardware | Nvidia P100 (Kaggle) |
+ | Hardware | Nvidia P100 |
  
  ## Usage
  ```python
@@ -44,10 +48,10 @@ from huggingface_hub import snapshot_download
  import importlib.util
  import torch
  
- # Download all files
+ # Download files
  snapshot_download(repo_id="Fu01978/TinyLM", local_dir="./tinylm")
  
- # Load via included script
+ # Load via script
  spec = importlib.util.spec_from_file_location("modeling_tinylm", "./tinylm/modeling_tinylm.py")
  module = importlib.util.module_from_spec(spec)
  spec.loader.exec_module(module)
@@ -56,11 +60,6 @@ model, tokenizer, config = module.load_tinylm("./tinylm")
  model.eval()
  
  # Generate
- output = module.generate(model, tokenizer, "Once upon a time")
+ output = module.generate(model, tokenizer, "Once upon a time, ")
  print(output)
- ```
- 
- ## Example Outputs
- 
- **Prompt:** Once upon a time
- **Output:** Once upon a time there was a little girl named Mrs. She decided to go and be a little girl in the park. One day she had to go on a bed. From then on a lot of bread. She said, "What are you doing?" ...
+ ```
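
The factored (low-rank) embedding with weight tying that the README describes can be sketched in PyTorch. This is an illustrative reconstruction using the numbers from the architecture table (vocab 50257, rank 32, hidden 64), not the actual code in `modeling_tinylm.py`; the class and method names here are hypothetical.

```python
import torch
import torch.nn as nn

class FactoredEmbedding(nn.Module):
    """Sketch of a low-rank input embedding with a weight-tied output head."""

    def __init__(self, vocab_size=50257, rank=32, hidden=64):
        super().__init__()
        # Factor the (vocab x hidden) table into (vocab x rank) @ (rank x hidden):
        # 50257*32 + 32*64 parameters instead of 50257*64.
        self.embed = nn.Embedding(vocab_size, rank)
        self.up = nn.Linear(rank, hidden, bias=False)

    def forward(self, token_ids):
        # Token ids -> rank-32 vectors -> hidden-64 vectors.
        return self.up(self.embed(token_ids))

    def logits(self, hidden_states):
        # Weight tying: reuse the same two factors for the output projection.
        low_rank = hidden_states @ self.up.weight   # (..., hidden) -> (..., rank)
        return low_rank @ self.embed.weight.t()     # (..., rank) -> (..., vocab)
```

Under these assumptions the factorization costs 50257×32 + 32×64 = 1,610,272 parameters versus 3,216,448 for a full 50257×64 table, which is why the README says it keeps the vocab projection from eating the entire budget.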
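
The training table (AdamW at lr 3e-3 with weight decay 0.01, cosine annealing with warm restarts, fp16 via `torch.cuda.amp`) can be wired up roughly as below. The restart period `T_0` and the stand-in model are assumptions; the README does not specify them.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

model = torch.nn.Linear(64, 64)  # stand-in; the real model comes from modeling_tinylm.py

# Values from the training table; T_0 (steps until first restart) is an assumption.
optimizer = AdamW(model.parameters(), lr=3e-3, weight_decay=0.01)
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=1000)

use_amp = torch.cuda.is_available()                  # fp16 autocast needs CUDA
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)  # no-op scaling on CPU

def train_step(x, y):
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type="cuda" if use_amp else "cpu", enabled=use_amp):
        loss = torch.nn.functional.mse_loss(model(x), y)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    scheduler.step()
    return loss.item()
```

With `enabled=False` the scaler and autocast degrade to plain fp32 training, so the same loop runs with or without a GPU.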