HaileyStorm committed (verified)
Commit 71b2ad8 · Parent(s): 80a0b92

Update README.md

Files changed (1): README.md (+1 −0)
README.md CHANGED
@@ -101,6 +101,7 @@ I do believe it could be much better, by doing the pruning in stages (say, 4 lay
 *Figure 2: Model size vs average benchmark performance. Llama3-5.4b-instruct may not be fully healed, but its performance scales linearly with its size.*
 
 ## Why 5.4B?
+
 This size should allow for:
 - bf16 inference on 24GB VRAM
 - Q8 or Q6 inference on 6GB VRAM
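The VRAM claims in the changed section can be sanity-checked with a back-of-envelope, weights-only estimate. This sketch assumes 2 bytes/parameter for bf16, ~1 byte/parameter for Q8, and ~0.75 bytes/parameter for Q6; real inference also needs KV cache and activations, so these are lower bounds, not exact requirements.

```python
# Weights-only memory estimate for a 5.4B-parameter model.
# Assumed bytes-per-parameter figures are approximations; actual
# quantized formats carry extra overhead (scales, block metadata).

PARAMS = 5.4e9  # parameter count of Llama3-5.4b-instruct

def weights_gb(bytes_per_param: float) -> float:
    """Weights footprint in GB (1 GB = 1e9 bytes)."""
    return PARAMS * bytes_per_param / 1e9

bf16 = weights_gb(2.0)   # bf16: 2 bytes per parameter
q8   = weights_gb(1.0)   # Q8: ~1 byte per parameter
q6   = weights_gb(0.75)  # Q6: ~6 bits per parameter

print(f"bf16 ~= {bf16:.1f} GB, Q8 ~= {q8:.1f} GB, Q6 ~= {q6:.1f} GB")
```

Under these assumptions bf16 weights come to roughly 10.8 GB (leaving headroom on a 24GB card), and Q8/Q6 weights to roughly 5.4/4.1 GB, which is consistent with the 6GB VRAM target once a modest KV cache is added.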