### How to use these quants
The documentation for [exllamav3](https://github.com/turboderp-org/exllamav3/) is your best bet here, as well as that of [TabbyAPI](https://github.com/theroyallab/tabbyAPI) or [Text Generation Web UI (oobabooga)](https://github.com/oobabooga/text-generation-webui). In short:
* You need sufficient VRAM to fit the model and your context cache. I give some pointers above that may be helpful.
* At this point, your GPUs need to be NVIDIA. AMD/ROCm, Intel, and offloading to system RAM are not currently supported.
* You will need a software package capable of loading exllamav3 models. I'm still somewhat partial to oobabooga, but TabbyAPI is another popular option. Follow the documentation for your choice in order to get yourself set up.
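To get a feel for whether a given quant plus context cache will fit, a back-of-the-envelope estimate can help. The sketch below is a rough illustration only: the function name, the formulas, and the example model shape (a hypothetical 70B model at 4.0 bits per weight) are my assumptions, not exllamav3's actual memory accounting, and real usage will be higher due to activation buffers and backend overhead.

```python
# Rough VRAM estimate for a quantized model plus its KV cache.
# Illustrative assumptions only -- not exllamav3's real accounting.

def estimate_vram_gb(n_params_b, bits_per_weight,
                     n_layers, n_kv_heads, head_dim,
                     context_len, cache_bits=16):
    """Return an approximate VRAM footprint in GiB."""
    # Quantized weights: parameter count (billions) * bits, in bytes.
    weight_bytes = n_params_b * 1e9 * bits_per_weight / 8
    # KV cache: two tensors (K and V) per layer, per cached token.
    kv_bytes = (2 * n_layers * n_kv_heads * head_dim
                * context_len * cache_bits / 8)
    return (weight_bytes + kv_bytes) / 1024**3

# Hypothetical 70B model, 4.0 bpw, GQA with 8 KV heads, 32k context.
print(round(estimate_vram_gb(70, 4.0, 80, 8, 128, 32768), 1))
```

Numbers like these only bound the problem from below, but they are usually enough to tell you whether a quant is plausible on your hardware before you download tens of gigabytes.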