Update README.md
README.md
CHANGED
@@ -19,7 +19,7 @@ This is the final, fully finished chat-ready checkpoint in GGUF format
 Note: for very short prompts, you should prefill ```<think>``` xml tag at the start of the assistant response to ensure it would properly reason.

 note:
-use f32 gguf for highest quality, bf16 is ideal, q8_0 for edge devices.
+use f32 gguf for highest quality, bf16 is ideal, q8_0 for edge devices. (on a Samsung z fold 5 with fa2, 5k context + some layers offloaded to the GPU, TPS gets to around 40.)

 # Fijik-1.5 2.6B
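The README note about prefilling the `<think>` tag for very short prompts could be applied roughly as in the sketch below. This is only an illustration under stated assumptions: the function name `prefill_think`, the `max_short_words` threshold, and the `User:`/`Assistant:` prompt layout are hypothetical, since the README does not define what counts as "very short" and the model's actual chat template is not shown here.

```python
def prefill_think(rendered_prompt: str, user_message: str, max_short_words: int = 8) -> str:
    """Append an opening <think> tag when the user message is very short.

    Assumptions (not from the README):
    - `rendered_prompt` already ends exactly where the assistant response begins.
    - "Very short" means at most `max_short_words` whitespace-separated words.
    """
    is_short = len(user_message.split()) <= max_short_words
    already_prefilled = rendered_prompt.rstrip().endswith("<think>")
    if is_short and not already_prefilled:
        # Start the assistant turn with the tag so the model continues inside it.
        return rendered_prompt + "<think>\n"
    return rendered_prompt


# Hypothetical prompt layout, used only to show the effect of the helper.
prompt = prefill_think("User: hi\nAssistant: ", "hi")
print(repr(prompt))
```

The returned string would then be passed to whatever completion interface is in use, in place of the unmodified prompt. Longer user messages are returned unchanged, on the assumption that the model opens its own reasoning block in that case.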