Commit c0b6a78 (verified) by Arjun-G-Ravi · 1 Parent(s): d69dfcc

Update README.md

Files changed (1)
  1. README.md +19 -19
README.md CHANGED
@@ -1,24 +1,24 @@
 
- # Custom GPT Model
+ # Custom GPT Model
 
- This is a custom GPT model with:
- - RMS normalization
- - Rotary positional embeddings (RoPE)
- - Separate Q,K,V projections
- - Squared ReLU activation in MLP
- - QK normalization in attention
- - Zero initialization for projection layers
+ This is a custom GPT model with:
+ - RMS normalization
+ - Rotary positional embeddings (RoPE)
+ - Separate Q,K,V projections
+ - Squared ReLU activation in MLP
+ - QK normalization in attention
+ - Zero initialization for projection layers
 
- ## Architecture
- - Vocabulary Size: 50304
- - Context Length: 1024
- - Number of Layers: 12
- - Number of Heads: 6
- - Embedding Dimension: 768
+ ## Architecture
+ - Vocabulary Size: 50304
+ - Context Length: 1024
+ - Number of Layers: 12
+ - Number of Heads: 6
+ - Embedding Dimension: 768
 
- ## Usage
- ```python
- from transformers import AutoModel
- model = AutoModel.from_pretrained("Arjun-G-Ravi/Custom-GPT-555k")
- ```
+ ## Usage
+ ```python
+ from transformers import AutoModel
+ model = AutoModel.from_pretrained("Arjun-G-Ravi/Custom-GPT-555k")
+ ```
 
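
Note on the README content in this diff: the feature and architecture lists name RMS normalization, a squared ReLU activation, and 6 heads over a 768-wide embedding, but the commit contains no model code. The following is a minimal, dependency-free sketch of what those three items usually mean, not the repository's implementation. The function names, the `eps` value, and the absence of a learnable gain in `rms_norm` are all assumptions made for illustration.

```python
import math

# Values copied from the Architecture list in the README.
N_EMBD = 768
N_HEAD = 6


def head_dim(n_embd: int, n_head: int) -> int:
    """Per-head width; the embedding must split evenly across heads."""
    if n_embd % n_head != 0:
        raise ValueError("n_embd must be divisible by n_head")
    return n_embd // n_head


def rms_norm(x: list[float], eps: float = 1e-6) -> list[float]:
    """Divide x by its root mean square (assumed: no learnable gain)."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]


def squared_relu(x: list[float]) -> list[float]:
    """Elementwise relu(x) ** 2."""
    return [max(v, 0.0) ** 2 for v in x]


if __name__ == "__main__":
    print(head_dim(N_EMBD, N_HEAD))        # 128
    print(squared_relu([-1.0, 0.5, 2.0]))  # [0.0, 0.25, 4.0]
```

With the listed values each attention head is 128 wide, an even number, which is what rotary embeddings need in order to rotate dimensions in pairs.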