On context length: a 16GB-VRAM GPU can run at most a 2x10.7B (~19.2B) model with 4k context. HyouKan is 3x7B (~18.5B) parameters but has an 8k (or 32k) context length, which takes a lot of RAM/VRAM to load (``--auto-devices`` may help you run the model, but this is untested). As a rule of thumb, a 7B model at 32k context uses about as much RAM/VRAM as a 13B model at 4k in the GGUF version.

That model is buggy/broken. 😏

A bigger model holds more of the information you need for your Character Card.

Best GGUF versions to run (balancing speed and quality): Q4_K_M or Q5_K_M (slower than Q4).
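The VRAM arithmetic above can be sketched as a back-of-the-envelope estimate: quantized weight size plus KV cache. This is not llama.cpp's exact allocator; the bits-per-weight figures (Q4_K_M ~4.85, Q5_K_M ~5.69) are approximations, and the Mistral-7B-style attention config (32 layers, 8 KV heads, head dim 128) is an assumption. Note that in a Mixtral-style MoE only the FFN experts are replicated, so a 3x7B model's KV cache is that of a single 7B.

```python
# Rough RAM/VRAM estimate for a GGUF model: weights + KV cache.
# Quant bits-per-weight values and the attention config are assumptions;
# runtime overhead (compute buffers, context scratch) is ignored.

def gguf_weight_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB."""
    # billions of params * bytes per weight
    return params_b * bits_per_weight / 8

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                ctx: int, bytes_per_elem: int = 2) -> float:
    """KV cache size in GB: two tensors (K and V) per layer, fp16 by default."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_elem / 1e9

if __name__ == "__main__":
    # HyouKan 3x7B (~18.5B params) at Q4_K_M (~4.85 bits/weight, assumed):
    weights = gguf_weight_gb(18.5, 4.85)     # ~11.2 GB
    # Assumed Mistral-7B-style attention: 32 layers, 8 KV heads, head dim 128.
    kv_32k = kv_cache_gb(32, 8, 128, 32768)  # ~4.3 GB
    kv_4k = kv_cache_gb(32, 8, 128, 4096)    # ~0.5 GB
    print(f"weights {weights:.1f} GB, KV@32k {kv_32k:.1f} GB, KV@4k {kv_4k:.1f} GB")
```

This shows why 32k context is so much heavier than 4k: the KV cache alone grows by several GB, which together with the ~11 GB of Q4_K_M weights already strains a 16GB card.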
# Useful link:

https://huggingface.co/spaces/Vokturz/can-it-run-llm