Alsebay committed
Commit 77cc57d · verified · 1 Parent(s): 41a7109

Update README.md

Files changed (1): README.md +4 -1
README.md CHANGED
@@ -24,4 +24,7 @@ language:
  - The context length: a 16GB VRAM GPU can run at most a 2x10.7B (~19.2B) model with 4k context length. HyouKan is 3x7B (~18.5B) parameters, but has an 8k (or 32k) context length that needs a lot of RAM/VRAM to load. (``--auto-devices`` may help you run the model, I don't know.) => In the GGUF version, 7B at 32k uses roughly the RAM/VRAM of 13B at 4k.
  - That model is buggy/broken.😏
  - A bigger model will hold more of the information you need for your Character Card.
- - Best GGUF versions to run (balancing speed/performance): Q4_K_M, Q5_K_M (slower than Q4)
+ - Best GGUF versions to run (balancing speed/performance): Q4_K_M, Q5_K_M (slower than Q4)
+
+ # Useful link:
+ https://huggingface.co/spaces/Vokturz/can-it-run-llm
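The README's VRAM claims (e.g. that 7B at 32k context costs about as much as 13B at 4k) follow from weights plus KV cache. The sketch below is a back-of-envelope estimate only, not part of the model card: the bits-per-weight figure for Q4_K_M (~4.8) and the layer/hidden-size defaults are assumptions, and real loaders add overhead.

```python
# Rough VRAM estimate for loading a GGUF model: quantized weights + KV cache.
# All figures here are back-of-envelope assumptions, not measured values.

def gguf_vram_gb(n_params_b: float, bits_per_weight: float,
                 ctx_len: int, n_layers: int = 32, d_model: int = 4096) -> float:
    """Approximate load size in GB for a quantized model plus its KV cache."""
    # Weights: parameters (in billions) * average bits per weight / 8 bits per byte.
    weights_gb = n_params_b * bits_per_weight / 8  # e.g. Q4_K_M ~ 4.8 bits/weight
    # KV cache: 2 tensors (K and V) * layers * hidden size * context * 2 bytes (fp16).
    kv_gb = 2 * n_layers * d_model * ctx_len * 2 / 1e9
    return weights_gb + kv_gb

# An ~18.5B-parameter model at Q4_K_M with 8k context (assumed shapes):
print(round(gguf_vram_gb(18.5, 4.8, 8192), 1))
```

Doubling the context from 4k to 32k grows only the KV-cache term, which is why a long-context 7B can rival a short-context 13B in memory, as the bullet above notes.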