Vui Seng Chua committed · 769e2c6 · Parent(s): 73d74ae
Revise README.md

README.md
This repo contains binaries of weights quantized with [OpenVINO](https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/254-llm-chatbot/254-llm-chatbot.ipynb).

| LLM             | ratio | group_size |
|-----------------|-------|------------|
| llama-2-chat-7b | 0.8   | 128        |
| mistral-7b      | 0.6   | 64         |
| gemma-2b-it     | 0.6   | 64         |

Notes:
* ratio=0.8 means 80% of the FC (linear) layers are weight-quantized to 4-bit; the remaining layers are kept in 8-bit.
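The `group_size` column can be illustrated with a minimal sketch of group-wise symmetric quantization: each group of `group_size` consecutive weights shares one scale, so smaller groups track the local weight range more closely. This is a simplified, hypothetical illustration only, not the actual OpenVINO/NNCF implementation; the function name and example values are invented for the sketch.

```python
# Hypothetical sketch: group-wise symmetric integer quantization.
# Each group of `group_size` consecutive weights shares one scale
# derived from the group's largest absolute value.
def quantize_groupwise(weights, group_size, num_bits=4):
    qmax = 2 ** (num_bits - 1) - 1  # 7 for symmetric int4
    quantized = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # One shared scale per group (fall back to 1.0 for an all-zero group).
        scale = max(abs(w) for w in group) / qmax or 1.0
        quantized.extend(round(w / scale) for w in group)
    return quantized

# Toy example: two groups of two weights, each with its own scale.
q = quantize_groupwise([0.5, -0.2, 0.1, 0.7], group_size=2)
```

A larger `group_size` (e.g. 128 for llama-2-chat-7b above) amortizes the per-group scale over more weights, saving metadata at some cost in accuracy, while 64 keeps the scales more local.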