Commit 6bbf8c7 (verified) by chegde · 1 Parent(s): be4ee5c

Update README.md

  1. README.md +31 -3
README.md CHANGED
@@ -1,3 +1,31 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+
+ # Custom Llama-style
+
+ This repository contains a single `.pt` checkpoint file from a fine-tuned model.
+
+ **This model is NOT directly usable with `transformers.AutoModel.from_pretrained()` yet.** It needs to be converted to the Hugging Face format first.
+
+ ## Training Details
+
+ - **Framework:** [modded-nanoGPT-soap](https://github.com/nikhilvyas/modded-nanogpt-SOAP).
+ - **Architecture:** This model uses modern features and is NOT a standard GPT-2.
+ - **Positional Embeddings:** Rotary Position Embeddings (RoPE)
+ - **Normalization:** RMSNorm
+ - **Bias:** Linear layers trained with `bias=False`.
+
+ ## Model Configuration
+
+ This is the information needed to perform the conversion:
+
+ - `n_layer`: 12
+ - `n_head`: 12
+ - `n_embd`: 768
+ - `vocab_size`: 50257
+ - `block_size`: 1024
+
+ ## Tokenizer
+
+ + The model was trained with the standard `gpt2` tokenizer.
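
The README added in this commit says a conversion is needed but does not show one. As a rough sketch of how that conversion might begin, the snippet below records the values from the Model Configuration list, maps them onto the keyword names used by `transformers.LlamaConfig`, and cleans up checkpoint key names. Several things here are assumptions rather than facts from the commit: the class and function names are hypothetical, the checkpoint is assumed to hold either a raw state dict or a dict with one stored under a `"model"` key, and the `_orig_mod.` prefix is only present if the model was saved while wrapped by `torch.compile`.

```python
from dataclasses import dataclass

COMPILE_PREFIX = "_orig_mod."  # prefix torch.compile puts on state-dict keys


@dataclass(frozen=True)
class CheckpointConfig:
    """Values copied from the Model Configuration list in the README."""
    n_layer: int = 12
    n_head: int = 12
    n_embd: int = 768
    vocab_size: int = 50257
    block_size: int = 1024

    @property
    def head_dim(self) -> int:
        """Per-head width; RoPE is applied over this dimension."""
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")
        return self.n_embd // self.n_head


def to_llama_kwargs(cfg: CheckpointConfig) -> dict:
    """Map nanoGPT-style field names to LlamaConfig keyword names.

    Only fields stated in the README are mapped. intermediate_size,
    rms_norm_eps and rope_theta are NOT given there and are left out on
    purpose; they would have to come from the training code.
    """
    return {
        "num_hidden_layers": cfg.n_layer,
        "num_attention_heads": cfg.n_head,
        "hidden_size": cfg.n_embd,
        "vocab_size": cfg.vocab_size,
        "max_position_embeddings": cfg.block_size,
        "attention_bias": False,  # README: linear layers use bias=False
        "mlp_bias": False,
    }


def strip_compile_prefix(state: dict) -> dict:
    """Remove the torch.compile prefix from parameter names, if present."""
    return {
        (k[len(COMPILE_PREFIX):] if k.startswith(COMPILE_PREFIX) else k): v
        for k, v in state.items()
    }


def load_state_dict(path: str) -> dict:
    """Load the .pt file on CPU and return a cleaned state dict.

    Assumption: the file is a raw state dict, or a dict that stores one
    under the hypothetical key "model".
    """
    import torch  # imported lazily so the helpers above work without PyTorch

    obj = torch.load(path, map_location="cpu", weights_only=True)
    state = obj.get("model", obj) if isinstance(obj, dict) else obj
    return strip_compile_prefix(state)
```

With `transformers` installed, `LlamaConfig(**to_llama_kwargs(CheckpointConfig()))` would be a plausible next step once the missing fields are filled in. The parameter names returned by `load_state_dict` would still need to be renamed to the Hugging Face layout, and whether the checkpoint's MLP matches Llama's gated form has to be checked against the training code before any weights are copied.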