Elriggs committed on
Commit a18132a · verified · 1 Parent(s): 2bf03a3

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +46 -0
  2. config.json +12 -0
  3. pytorch_model.bin +3 -0
README.md ADDED
@@ -0,0 +1,46 @@
+ # Modded NanoGPT Model
+
+ This is a GPT-2 style model trained with modifications from modded-nanogpt.
+
+ ## Model Config
+
+ - Layers: 2
+ - Heads: 4
+ - Embedding dimension: 64
+ - Vocab size: 50304
+ - Squared MLP: False
+ - Bilinear: False
+ - Gated: False
+ - Expansion factor: 4
+
+ ## Training
+
+ - Training step: 500
+
+ ## Usage
+
+ ```python
+ from huggingface_hub import hf_hub_download
+ import torch
+ from train_gpt2 import GPT, GPTConfig
+ import json
+
+ # Download config
+ config_path = hf_hub_download(repo_id="Elriggs/gpt2-debug-baseline", filename="config.json")
+ with open(config_path) as f:
+     config_dict = json.load(f)
+
+ # Remove non-GPTConfig fields
+ config_dict.pop('step', None)
+
+ # Create model
+ config = GPTConfig(**config_dict)
+ model = GPT(config)
+
+ # Download and load weights
+ weights_path = hf_hub_download(repo_id="Elriggs/gpt2-debug-baseline", filename="pytorch_model.bin")
+ state_dict = torch.load(weights_path, map_location='cpu')
+ model.load_state_dict(state_dict)
+
+ model.eval()
+ ```
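
The usage snippet above removes `step` by hand before calling `GPTConfig(**config_dict)`. A more defensive variant, sketched here, keeps only the keys that the config class declares and reports the rest. It assumes `GPTConfig` is a dataclass, which this commit does not confirm; `DemoConfig` is a hypothetical stand-in so the sketch runs without `train_gpt2`, and `filter_config` is an illustrative helper, not part of modded-nanogpt.

```python
from dataclasses import dataclass, fields


@dataclass
class DemoConfig:
    # Hypothetical stand-in for GPTConfig; field names mirror config.json.
    vocab_size: int = 50304
    n_layer: int = 2
    n_head: int = 4
    n_embd: int = 64


def filter_config(config_cls, raw):
    """Split raw into (kwargs accepted by config_cls, sorted leftover keys)."""
    allowed = {f.name for f in fields(config_cls)}
    kept = {k: v for k, v in raw.items() if k in allowed}
    dropped = sorted(k for k in raw if k not in allowed)
    return kept, dropped


# A subset of the committed config.json, including the bookkeeping key "step".
raw = {"vocab_size": 50304, "n_layer": 2, "n_head": 4, "n_embd": 64, "step": 500}
kwargs, dropped = filter_config(DemoConfig, raw)
config = DemoConfig(**kwargs)
print("dropped keys:", dropped)
```

Reporting the dropped keys instead of discarding them silently makes it easier to notice when a saved config and the local `train_gpt2` version disagree about which fields exist.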
config.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "vocab_size": 50304,
+   "n_layer": 2,
+   "n_head": 4,
+   "n_embd": 64,
+   "squared_mlp": false,
+   "bilinear": false,
+   "expansion_factor": 4,
+   "gated": false,
+   "squared_attn": false,
+   "step": 500
+ }
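
To get a feel for how small this debug configuration is, the following sketch estimates the weight count from the values above. It rests on assumptions that are not read from `train_gpt2`: bias-free linear layers, no learned positional-embedding table, no normalization parameters, and a plain two-matrix MLP with hidden width `expansion_factor * n_embd`. Whether the output head shares weights with the token embedding is also unknown, so both cases are computed. `estimate_params` is a hypothetical helper, and the result is a rough guide rather than a statement about the uploaded checkpoint.

```python
def estimate_params(vocab_size, n_layer, n_embd, expansion_factor=4, tied_head=True):
    """Rough weight count under the assumptions stated in the lead-in."""
    embedding = vocab_size * n_embd
    attention = 4 * n_embd * n_embd               # Q, K, V and output projections
    mlp = 2 * expansion_factor * n_embd * n_embd  # up- and down-projection
    head = 0 if tied_head else vocab_size * n_embd
    return embedding + n_layer * (attention + mlp) + head


# Values copied from config.json; keys unrelated to layer sizes are left out.
sizes = {"vocab_size": 50304, "n_layer": 2, "n_embd": 64, "expansion_factor": 4}
tied = estimate_params(**sizes)
untied = estimate_params(**sizes, tied_head=False)
print(f"tied head: {tied:,} weights, untied head: {untied:,} weights")
```

Under these assumptions the token embedding dominates: the two transformer blocks together account for only 98,304 weights.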
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5bf4cb4eadfd5675bf5a2eb5af5c4e1f72cd3a0b0686506adf39a844e62c7875
+ size 19717251
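
The three added lines are a Git LFS pointer rather than the weights themselves: `oid` carries the SHA-256 of the real file and `size` its length in bytes. A minimal sketch of checking a downloaded file against such a pointer, using only the standard library, might look like this. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the demo hashes a small temporary file instead of the actual checkpoint.

```python
import hashlib
import os
import tempfile


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    pointer = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        pointer[key] = value
    pointer["size"] = int(pointer["size"])
    pointer["sha256"] = pointer["oid"].split(":", 1)[1]
    return pointer


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at path has the pointer's size and SHA-256."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["sha256"]


# Demo with a throwaway payload; the real pointer text would be parsed the same way.
payload = b"demo weights"
demo_pointer_text = (
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(payload).hexdigest()}\n"
    f"size {len(payload)}\n"
)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
demo_ok = matches_pointer(tmp.name, parse_lfs_pointer(demo_pointer_text))
os.remove(tmp.name)
print("file matches pointer:", demo_ok)
```

Comparing the size first is a cheap early exit; the hash is read in chunks so a large checkpoint does not have to fit in memory.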