
    # Custom GPT Model

    This is a custom GPT-style language model with the following features:
    - RMS normalization
    - Rotary positional embeddings (RoPE)
    - Separate Q, K, V projections
    - Squared ReLU activation in MLP
    - QK normalization in attention
    - Zero initialization for projection layers
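
The features listed above can be illustrated with a minimal pure-Python sketch. It is not this model's implementation: the function names, the `eps` and `base` values, the interleaved pairing used for RoPE, and the order "normalize, then rotate" are all assumptions chosen for illustration.

```python
import math

def rms_norm(x, eps=1e-6):
    """RMS normalization: divide by the root-mean-square (no learned gain, an assumption)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]

def relu_squared(x):
    """Squared ReLU activation: max(0, v) ** 2, applied elementwise."""
    return [max(0.0, v) ** 2 for v in x]

def apply_rope(x, position, base=10000.0):
    """Rotary embedding: rotate each pair (x[2i], x[2i+1]) by an angle that grows with position."""
    half = len(x) // 2
    out = []
    for i in range(half):
        theta = position * base ** (-i / half)  # frequency base^(-2i/d), with d = len(x)
        a, b = x[2 * i], x[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        out.extend([a * c - b * s, a * s + b * c])
    return out

def qk_logit(q, k, pos_q, pos_k):
    """Attention logit with QK normalization: normalize q and k, rotate them, then take a scaled dot product."""
    q = apply_rope(rms_norm(q), pos_q)
    k = apply_rope(rms_norm(k), pos_k)
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))
```

Because RoPE rotates queries and keys by position-dependent angles, `qk_logit` depends only on the offset `pos_k - pos_q`, so `qk_logit(q, k, 3, 1)` and `qk_logit(q, k, 12, 10)` agree up to floating-point error.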

    ## Architecture
    - Vocabulary Size: 50304
    - Context Length: 1024
    - Number of Layers: 12
    - Number of Heads: 6
    - Embedding Dimension: 768
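
As a quick sanity check on these numbers, the sketch below collects them in a config object and derives the per-head dimension. The class name `GPTConfig`, the field names, and the helper `estimate_params` are hypothetical; the parameter estimate additionally assumes bias-free linear layers, a 4x MLP expansion, and tied input/output embeddings, none of which is stated in this card.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GPTConfig:
    """Hypothetical container for the hyperparameters listed in this card."""
    vocab_size: int = 50304
    context_length: int = 1024
    n_layer: int = 12
    n_head: int = 6
    n_embd: int = 768

    @property
    def head_dim(self):
        # The embedding dimension must split evenly across heads.
        assert self.n_embd % self.n_head == 0
        return self.n_embd // self.n_head

def estimate_params(cfg, mlp_ratio=4, tied_embeddings=True):
    """Rough weight count under the assumptions named above (not an official figure)."""
    embed = cfg.vocab_size * cfg.n_embd          # token embedding table
    attn = 4 * cfg.n_embd ** 2                   # Q, K, V and output projections
    mlp = 2 * mlp_ratio * cfg.n_embd ** 2        # up- and down-projection
    total = embed + cfg.n_layer * (attn + mlp)   # RoPE adds no learned position table
    if not tied_embeddings:
        total += embed                           # separate output head
    return total

cfg = GPTConfig()
print(cfg.head_dim)  # 768 / 6 = 128
print(estimate_params(cfg))
```

With 768 embedding dimensions split over 6 heads, each head works on 128-dimensional queries and keys.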

    ## Usage
    ```python
    from transformers import AutoModel
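    # Fetches the weights from the Hugging Face Hub. Because the architecture is
    # custom, loading may also require trust_remote_code=True.
    model = AutoModel.from_pretrained("Arjun-G-Ravi/Custom-GPT-555k")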
    ```