Update README.md
README.md
CHANGED
@@ -13,4 +13,4 @@ tags:
 
 Converted from Composer checkpoint.
 
-This model build uses Flash Attention 2 and ignores triton; the max_seq_len parameter is set to 170 and
+This model build uses Flash Attention 2 and ignores triton; the max_seq_len parameter is set to 170 and trained using amp_bf16 precision parameter.