TigerResearch/tigerbot-70b-chat-v2-4bit-exl2

Text Generation

text-generation-inference


vivicai committed on Dec 12, 2023

Commit 1000fe4 · 1 Parent(s): 3751f85

Update README.md

Files changed (1)

README.md +9 -16


@@ -13,9 +13,9 @@ license: apache-2.0
-This is a 4-bit GPTQ version of the [Tigerbot 70b chat v2](https://huggingface.co/TigerResearch/tigerbot-70b-chat).
-It was quantized to 4bit using: https://github.com/PanQiWei/AutoGPTQ
 ## How to download and use this model in github: https://github.com/TigerResearch/TigerBot
@@ -33,20 +33,13 @@ pip install -r requirements.txt
 Inference with command line interface
-infer with exllama
 ```
-# Install exllama_lib
-pip install exllama_lib@git+https://github.com/taprosoft/exllama.git
-# Start inference
-CUDA_VISIBLE_DEVICES=0 python other_infer/exllama_infer.py --model_path TigerResearch/tigerbot-70b-chat-4bit
-```
-infer with auto-gptq
-```
-# Install auto-gptq
-pip install auto-gptq
-# Start inference
-CUDA_VISIBLE_DEVICES=0 python other_infer/gptq_infer.py --model_path TigerResearch/tigerbot-70b-chat-4bit
 ```

+This is a 4-bit EXL2 version of the [tigerbot-70b-chat-v2](https://huggingface.co/TigerResearch/tigerbot-70b-chat-v2).
+It was quantized to 4bit using: https://github.com/turboderp/exllamav2
 ## How to download and use this model in github: https://github.com/TigerResearch/TigerBot
 Inference with command line interface
+infer with exllamav2
 ```
+# install exllamav2
+git clone https://github.com/turboderp/exllamav2
+cd exllamav2
+pip install -r requirements.txt
+# infer command
+CUDA_VISIBLE_DEVICES=0 python other_infer/exllamav2_hf_infer.py --model_path TigerResearch/tigerbot-70b-chat-v2-4bit-exl2
 ```
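
The infer command added in this commit combines a GPU selection (`CUDA_VISIBLE_DEVICES`) with a script path and a `--model_path` argument. As a rough sketch of how those pieces fit together, the Python below assembles the same command without running it. The helper names `build_infer_command` and `format_command` are hypothetical and not part of TigerBot or exllamav2. It also assumes the script path `other_infer/exllamav2_hf_infer.py` is resolved relative to a TigerBot checkout, which the README does not state explicitly.

```python
import os
import shlex


def build_infer_command(model_path, gpu_ids=(0,),
                        script="other_infer/exllamav2_hf_infer.py"):
    """Return (env, argv) mirroring the infer command in the README diff.

    Hypothetical helper: the script path is assumed to be relative to a
    TigerBot checkout, and nothing is executed here.
    """
    env = dict(os.environ)
    # Restrict which GPUs the process can see, e.g. "0" or "0,1".
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    argv = ["python", script, "--model_path", model_path]
    return env, argv


def format_command(env, argv):
    """Render the command as a single shell line for inspection."""
    prefix = "CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"]
    return prefix + " " + shlex.join(argv)


if __name__ == "__main__":
    env, argv = build_infer_command(
        "TigerResearch/tigerbot-70b-chat-v2-4bit-exl2")
    print(format_command(env, argv))
    # To launch for real (needs a GPU, exllamav2 installed, and the
    # TigerBot checkout as working directory), one could use:
    # subprocess.run(argv, env=env, check=True)
```

Passing `gpu_ids=(0, 1)` would expose two devices to the process; whether the inference script actually splits the model across them is not something this diff confirms.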