Update README.md
README.md
CHANGED
@@ -13,9 +13,9 @@ FLM-2-52B-Instruct utilizes the standard GPT-style decoder-only transformer arch
 * Embedding and language model head untied
 * Input and output multiplier

-| Models
-| -------------
-| FLM-2-52B-Instruct-2407
+| Models | layer<br>number | attention<br>heads | hidden<br>size | ffn hidden<br>size | vocab<br>size | params<br>count |
+| ------------- | :-------------: | :----------------: | :------------: | :----------------: | :-----------: | :--------------: |
+| FLM-2-52B-Instruct-2407 | 64 | 64 | 8,192 | 21,824 | 80,000 | 52.85 B |

 # Training details

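
As a rough sanity check on the table row for FLM-2-52B-Instruct-2407, the parameter count can be approximated from the listed dimensions. The sketch below is an estimate under stated assumptions, not the authors' own accounting: it assumes four square attention projections per layer and a gated feed-forward block with three projection matrices (the README does not state the FFN layout), counts the embedding and the language model head separately because the README says they are untied, and ignores biases and normalization weights. The function name `estimate_params` is hypothetical.

```python
def estimate_params(layers: int, hidden: int, ffn_hidden: int, vocab: int) -> int:
    """Approximate weight count of a decoder-only transformer.

    Assumptions (not confirmed by the README):
      - attention uses four hidden x hidden projections (Q, K, V, output)
      - the FFN is gated, with three hidden x ffn_hidden matrices
      - biases and normalization parameters are ignored
    """
    attention = 4 * hidden * hidden
    feed_forward = 3 * hidden * ffn_hidden
    per_layer = attention + feed_forward
    # Untied embedding and LM head: two separate vocab x hidden matrices.
    embeddings = 2 * vocab * hidden
    return layers * per_layer + embeddings


# Dimensions taken from the table row for FLM-2-52B-Instruct-2407.
total = estimate_params(layers=64, hidden=8192, ffn_hidden=21824, vocab=80000)
print(f"{total / 1e9:.2f} B")  # prints "52.82 B"
```

Under these assumptions the estimate lands within about 0.1% of the 52.85 B in the table. The small remainder may come from terms the sketch leaves out.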