Update README.md
README.md CHANGED

@@ -101,6 +101,7 @@ I do believe it could be much better, by doing the pruning in stages (say, 4 lay
 *Figure 2: Model size vs average benchmark performance. Llama3-5.4b-instruct may not be fully healed, but its performance scales linearly with its size.*

 ## Why 5.4B?
+
 This size should allow for:
 - bf16 inference on 24GB VRAM
 - Q8 or Q6 inference on 6GB VRAM
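The VRAM figures in the new section follow from simple weight-memory arithmetic. Below is a minimal back-of-envelope sketch (not part of the commit): the bytes-per-parameter values are assumptions, and the estimate covers weights only, ignoring KV cache, activations, and quantization-format overhead.

```python
# Rough weight-memory estimate for a 5.4B-parameter model.
# Bytes-per-parameter values are assumed; real quantized formats
# add overhead (scales, zero-points), so treat these as lower bounds.
PARAMS = 5.4e9

formats = {
    "bf16": 2.0,   # 16 bits per parameter
    "Q8":   1.0,   # ~8 bits per parameter
    "Q6":   0.75,  # ~6 bits per parameter
}

for name, bytes_per_param in formats.items():
    gib = PARAMS * bytes_per_param / 1024**3
    print(f"{name}: ~{gib:.1f} GiB of weights")

# bf16: ~10.1 GiB -> leaves headroom for KV cache on 24GB VRAM
# Q8:   ~5.0 GiB  -> tight but plausible on 6GB VRAM
# Q6:   ~3.8 GiB  -> comfortable on 6GB VRAM
```

These numbers are consistent with the README's claims: bf16 weights use about half of a 24GB card, and Q8/Q6 weights leave at least some margin on a 6GB card.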