---
library_name: transformers
tags:
- cohere
- conversational
- 10languages
- text-generation-inference
- Inference Endpoints
---
Official [AQLM](https://arxiv.org/abs/2401.06118) quantization of [CohereForAI/c4ai-command-r-v01](https://huggingface.co/CohereForAI/c4ai-command-r-v01).

For this quantization, we used 1 codebook of 16 bits (the 1x16 scheme).

Results:

| Model | Quantization | MMLU (5-shot) | GSM8k (8-shot) | Model size, GB |
|------|------|-------|------|------|
| CohereForAI/c4ai-command-r-v01 | None | 0.6755 | 0.6065 | 70.0 |
| | 1x16 | 0.5719 | 0.3760 | 12.7 |
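
To put the table in perspective, the following is a minimal sketch of the arithmetic behind it: the number of entries a 16-bit codebook can address, the size reduction, and the share of each benchmark score that survives quantization. The variable names are illustrative, and the 16 bits per parameter for the unquantized checkpoint is an assumption that is not stated in this card.

```python
# Figures copied from the results table.
baseline = {"mmlu": 0.6755, "gsm8k": 0.6065, "size_gb": 70.0}
aqlm_1x16 = {"mmlu": 0.5719, "gsm8k": 0.3760, "size_gb": 12.7}

# Assumption (not stated in the card): the unquantized checkpoint
# stores every parameter in 16 bits.
BASELINE_BITS_PER_PARAM = 16

# A 16-bit code can index this many codebook entries.
codebook_entries = 2 ** 16

# How many times smaller the quantized checkpoint is.
compression_ratio = baseline["size_gb"] / aqlm_1x16["size_gb"]

# Average storage cost per parameter across the whole quantized checkpoint.
avg_bits_per_param = BASELINE_BITS_PER_PARAM / compression_ratio

# Fraction of the unquantized score kept on each benchmark.
mmlu_retained = aqlm_1x16["mmlu"] / baseline["mmlu"]
gsm8k_retained = aqlm_1x16["gsm8k"] / baseline["gsm8k"]

print(f"codebook entries:    {codebook_entries}")
print(f"compression ratio:   {compression_ratio:.2f}x")
print(f"avg bits per param:  {avg_bits_per_param:.2f}")
print(f"MMLU retained:       {mmlu_retained:.1%}")
print(f"GSM8k retained:      {gsm8k_retained:.1%}")
```

The bits-per-parameter value is a whole-checkpoint average, so it also counts any tensors that are stored at higher precision.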