Commit 3869c0f (verified), committed by Wilsonwin
1 Parent(s): 39336ad

Upload README.md with huggingface_hub

  1. README.md +46 -3
README.md CHANGED
@@ -1,3 +1,46 @@
- ---
- license: mit
- ---
+ ---
+ language:
+ - zh
+ - en
+ license: apache-2.0
+ tags:
+ - jax
+ - flax
+ - mini-gpt
+ - text-generation
+ ---
+
+ # handsongpt2
+
+ A HandsOnGPT2 model trained on the GuoFeng Webnovel Corpus using JAX/Flax on a Kaggle TPU.
+
+ ## Model Details
+
+ - **Architecture**: GPT-2 style transformer
+ - **Parameters**: 84.6M
+ - **Vocab Size**: 64,000 (Yi-1.5 tokenizer, TPU-aligned)
+ - **Max Length**: 256
+ - **Layers**: 6
+ - **Hidden Size**: 512
+ - **Attention Heads**: 8
+
+ ## Training
+
+ - **Framework**: JAX/Flax
+ - **Hardware**: Kaggle TPU v3-8
+ - **Batch Size**: 16
+ - **Learning Rate**: 0.0003
+ - **Final Loss**: 0.0005
+
+ ## Usage
+
+ ```python
+ import orbax.checkpoint as ocp
+
+ # Restore the checkpointed PyTree; its structure depends on how it was saved.
+ checkpointer = ocp.PyTreeCheckpointer()
+ state = checkpointer.restore('/path/to/checkpoint')
+ ```
+
+ ## License
+
+ Apache 2.0
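
The parameter count listed under Model Details can be sanity-checked from the other listed hyperparameters. The sketch below is not taken from the model's source code: it assumes a standard GPT-2 block layout (learned positional embeddings, a 4x feed-forward expansion, biases on every linear layer, two LayerNorms per block plus a final one), and the helper name `gpt2_param_count` is hypothetical. The number of attention heads does not affect the total under these assumptions.

```python
def gpt2_param_count(vocab_size, max_len, n_layers, d_model, tied_output=False):
    """Count parameters for an assumed standard GPT-2 style decoder layout."""
    token_emb = vocab_size * d_model
    pos_emb = max_len * d_model  # learned positional embeddings (assumption)

    # Attention: fused QKV projection plus output projection, each with a bias.
    attn = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)

    # Feed-forward: 4x expansion (assumption), each linear layer with a bias.
    d_ff = 4 * d_model
    mlp = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)

    # Two LayerNorms per block, each with a scale and a bias vector.
    layer_norms = 2 * 2 * d_model
    per_layer = attn + mlp + layer_norms

    final_ln = 2 * d_model
    # An untied output head adds a second vocab_size x d_model matrix (no bias).
    lm_head = 0 if tied_output else d_model * vocab_size

    return token_emb + pos_emb + n_layers * per_layer + final_ln + lm_head


# Values from the Model Details list: vocab 64,000, max length 256, 6 layers, hidden 512.
untied = gpt2_param_count(64_000, 256, 6, 512)
tied = gpt2_param_count(64_000, 256, 6, 512, tied_output=True)
print(f"untied: {untied / 1e6:.1f}M, tied: {tied / 1e6:.1f}M")
# prints "untied: 84.6M, tied: 51.8M"
```

Under these assumptions, the untied variant reproduces the listed 84.6M, which suggests (but does not confirm) that the output projection does not share weights with the token embedding. Roughly 65.5M of those parameters would then sit in the two vocabulary-sized matrices rather than in the transformer blocks.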