MikeRoz committed · verified · Commit 101834b · Parent(s): 8ae1d06

Update README.md

README.md CHANGED
@@ -38,7 +38,7 @@ Note that tensor parallelism is not currently supported for this architecture, s
 
 ### How to use these quants
 
-The documentation for [exllamav3](https://github.com/turboderp-org/exllamav3/) is your best bet here, as wall as that of [TabbyAPI](https://github.com/theroyallab/tabbyAPI) or [Text Generation Web UI (oobabooga)](https://github.com/oobabooga/text-generation-webui). In short:
+The documentation for [exllamav3](https://github.com/turboderp-org/exllamav3/) is your best bet here, as well as that of [TabbyAPI](https://github.com/theroyallab/tabbyAPI) or [Text Generation Web UI (oobabooga)](https://github.com/oobabooga/text-generation-webui). In short:
 * You need to have sufficient VRAM to fit the model and your context cache. I give some pointers above that may be helpful.
 * At this point, your GPUs need to be NVIDIA. AMD/ROCm, Intel, and offloading to system RAM are not currently supported.
 * You will need a software package capable of loading exllamav3 models. I'm still somewhat partial to oobabooga, but TabbyAPI is another popular option. Follow the documentation for your choice in order to get yourself set up.
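As a back-of-the-envelope check for the VRAM point above, a rough estimate is quantized weight size plus KV-cache size. This is a sketch only: the formula is the generic transformer KV-cache accounting, and every shape number below is an illustrative placeholder (not this model's actual config), so substitute the real values from the model's `config.json`.

```python
def estimate_vram_gb(params_b, bits_per_weight, n_layers, n_kv_heads,
                     head_dim, context_len, cache_bits=16):
    """Back-of-the-envelope VRAM need: quantized weights plus KV cache.

    params_b is the parameter count in billions. All shape arguments are
    placeholders; replace them with the real model's config values.
    """
    # Weights: billions of params * bits per weight / 8 bits-per-byte -> GB
    weights_gb = params_b * bits_per_weight / 8
    # KV cache: 2 tensors (K and V) per layer, one head_dim vector per KV
    # head per token, at cache_bits precision.
    kv_gb = (2 * n_layers * n_kv_heads * head_dim
             * context_len * (cache_bits / 8)) / 1e9
    return weights_gb + kv_gb

# Illustrative numbers only (roughly a 70B GQA model at 4 bpw, 32k context):
print(f"{estimate_vram_gb(70, 4.0, 80, 8, 128, 32768):.1f} GB")  # → ~45.7 GB
```

This ignores activation buffers and framework overhead, so leave a gigabyte or two of headroom beyond the estimate.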