[measurement.json - 3.0bpw_H6 vs 4.0bpw_H6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/blob/main/measurement_MiniMaxAI_MiniMax-M2.5-3.0-4.0.json)

[measurement.json - 4.0bpw_H6 vs 5.0bpw_H6](https://huggingface.co/MikeRoz/MiniMax-M2.5-exl3/blob/main/measurement_MiniMaxAI_MiniMax-M2.5-4.0-5.0.json)
### matplotlib Catbench

<details><summary>Click to see cat plots!</summary>

2.0 bpw:

2.1 bpw:

2.5 bpw:

3.0 bpw:

3.06 bpw:

3.5 bpw:

4.0 bpw:

5.0 bpw:

Prompted in Text Generation Web UI in chat-instruct mode with MiniMax AI's recommended settings (temp 1, top p 0.95, top k 40).

Note that 2.1, 4.0, and 5.0 bpw all required a single re-roll each to get a working script.

</details>
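Those same recommended sampling settings can also be supplied over an OpenAI-style API such as the one TabbyAPI serves. The sketch below is only an illustration of packing them into a chat-completion request payload: the model name is a placeholder, and passing `top_k` as a top-level field alongside `temperature`/`top_p` is an assumption about the server's extended sampler fields, not part of the core OpenAI schema.

```python
import json

# MiniMax AI's recommended sampling settings, as quoted above.
MINIMAX_SAMPLING = {
    "temperature": 1.0,
    "top_p": 0.95,
    "top_k": 40,  # assumption: server accepts top_k as an extra sampler field
}

def build_request(prompt: str, model: str = "MiniMax-M2.5-exl3") -> str:
    """Build a JSON chat-completion payload with the recommended samplers.

    The model name is a placeholder; use whatever name your server reports.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        **MINIMAX_SAMPLING,
    }
    return json.dumps(payload)

print(build_request("Write a matplotlib script that draws a cat."))
```

POST the resulting string to your server's `/v1/chat/completions` endpoint (or just set the same three values in the web UI's sampler panel, as was done for the plots above).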
### How to use these quants

The documentation for [exllamav3](https://github.com/turboderp-org/exllamav3/) is your best bet here, as well as that of [TabbyAPI](https://github.com/theroyallab/tabbyAPI) or [Text Generation Web UI (oobabooga)](https://github.com/oobabooga/text-generation-webui). In short: