Update README.md
README.md CHANGED
```diff
@@ -4,6 +4,8 @@ license: other
 quantized_by: bartowski
 ---
 
+Update Jan 27: This has been redone with the proper token mappings and rope scaling, performance seems improved, please comment if not
+
 ## Exllama v2 Quantizations of internlm2-chat-7b-llama
 
 Using <a href="https://github.com/turboderp/exllamav2/releases/tag/v0.0.12">turboderp's ExLlamaV2 v0.0.12</a> for quantization.
```