Pinkstack commited on
Commit
2b0ddb1
·
verified ·
1 Parent(s): daaa4ed

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -19,7 +19,7 @@ This is the final, fully finished chat-ready checkpoint in GGUF format
19
  Note: for very short prompts, you should prefill ```<think>``` xml tag at the start of the assistant response to ensure it would properly reason.
20
 
21
  note:
22
- use f32 gguf for highest quality, bf16 is ideal, q8_0 for edge devices.
23
 
24
  # Fijik-1.5 2.6B
25
 
 
19
  Note: for very short prompts, you should prefill ```<think>``` xml tag at the start of the assistant response to ensure it would properly reason.
20
 
21
  note:
22
+ use f32 gguf for highest quality, bf16 is ideal, q8_0 for edge devices. (on a Samsung z fold 5 with fa2, 5k context + some layers offloaded to the GPU, TPS gets to around 40.)
23
 
24
  # Fijik-1.5 2.6B
25