Arjun-G-Ravi/Custom-GPT-555k

Arjun-G-Ravi committed on Jan 29, 2025

Commit c0b6a78 · verified · 1 Parent(s): d69dfcc

Update README.md

Files changed (1)

README.md +19 -19

@@ -1,24 +1,24 @@
-    # Custom GPT Model
-    This is a custom GPT model with:
-    - RMS normalization
-    - Rotary positional embeddings (RoPE)
-    - Separate Q,K,V projections
-    - Squared ReLU activation in MLP
-    - QK normalization in attention
-    - Zero initialization for projection layers
-    ## Architecture
-    - Vocabulary Size: 50304
-    - Context Length: 1024
-    - Number of Layers: 12
-    - Number of Heads: 6
-    - Embedding Dimension: 768
-    ## Usage
-    ```python
-    from transformers import AutoModel
-    model = AutoModel.from_pretrained("Arjun-G-Ravi/Custom-GPT-555k")
-    ```

+# Custom GPT Model
+This is a custom GPT model with:
+- RMS normalization
+- Rotary positional embeddings (RoPE)
+- Separate Q,K,V projections
+- Squared ReLU activation in MLP
+- QK normalization in attention
+- Zero initialization for projection layers
+## Architecture
+- Vocabulary Size: 50304
+- Context Length: 1024
+- Number of Layers: 12
+- Number of Heads: 6
+- Embedding Dimension: 768
+## Usage
+```python
+from transformers import AutoModel
+model = AutoModel.from_pretrained("Arjun-G-Ravi/Custom-GPT-555k")
+```
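
For readers unfamiliar with two of the components named in the README, the following is a rough, dependency-free sketch of RMS normalization and the squared ReLU activation, plus the per-head width implied by the listed embedding dimension and head count. It is not taken from this repository: the function names, the `eps` value, and the absence of a learned gain in `rms_norm` are assumptions made for illustration, and the model's actual implementation may differ.

```python
import math

# Values copied from the Architecture list in the README.
N_HEAD = 6
N_EMBD = 768

# Per-head width implied by the two values above.
HEAD_DIM = N_EMBD // N_HEAD


def rms_norm(x, eps=1e-6):
    """Rescale a vector so its root-mean-square is close to 1.

    Assumption: no learned gain and eps=1e-6; the real model may use either.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]


def squared_relu(x):
    """Apply max(v, 0) ** 2 element-wise."""
    return [max(v, 0.0) ** 2 for v in x]
```

Under these assumptions, `HEAD_DIM` works out to 768 / 6 = 128, and `squared_relu` maps negative inputs to zero while squaring positive ones.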