---
license: apache-2.0
---

# The Quantized Command R Model

Original base model: `CohereForAI/c4ai-command-r-v01`.<br>
Link: [https://huggingface.co/CohereForAI/c4ai-command-r-v01](https://huggingface.co/CohereForAI/c4ai-command-r-v01)

## Special Notice

Please note that this model was quantized with `AutoModelForCausalLM.from_pretrained` from the `transformers` package, using the GPTQ settings listed under Quantization Configurations.

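As a rough illustration of that workflow, the sketch below passes a `GPTQConfig` to `AutoModelForCausalLM.from_pretrained`. The numeric settings are copied from the quantization configuration in this card. The function name `quantize_and_save`, the output directory, and the `"c4"` calibration dataset are assumptions made for the example (the saved configuration lists `"dataset": null`), so treat this as a minimal sketch rather than the exact script used for this model. It assumes `transformers`, `optimum`, and `auto-gptq` are installed and that a GPU with enough memory is available.

```python
# Settings copied from the "quantization_config" block of this model card.
GPTQ_KWARGS = {
    "bits": 4,
    "group_size": 128,
    "damp_percent": 0.1,
    "desc_act": False,
    "sym": True,
    "true_sequential": True,
}


def quantize_and_save(model_id, output_dir, calibration_dataset="c4"):
    """Quantize `model_id` with GPTQ while loading it, then save the result.

    `calibration_dataset` is an assumption: the card does not say which
    calibration data was used.
    """
    # Imported here so the settings above can be inspected without the
    # heavy dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    gptq_config = GPTQConfig(
        dataset=calibration_dataset,
        tokenizer=tokenizer,
        **GPTQ_KWARGS,
    )
    # Passing a GPTQConfig triggers quantization during loading.
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=gptq_config,
        device_map="auto",
    )
    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
    return model
```

A call might look like `quantize_and_save("CohereForAI/c4ai-command-r-v01", "./command-r-4bit")`, where the output path is a placeholder.
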
For the version quantized with the `auto-gptq` package, see [https://huggingface.co/shuyuej/Command-R-GPTQ](https://huggingface.co/shuyuej/Command-R-GPTQ).

## Quantization Configurations
```
"quantization_config": {
  "batch_size": 1,
  "bits": 4,
  "block_name_to_quantize": null,
  "cache_block_outputs": true,
  "damp_percent": 0.1,
  "dataset": null,
  "desc_act": false,
  "exllama_config": {
    "version": 1
  },
  "group_size": 128,
  "max_input_length": null,
  "model_seqlen": null,
  "module_name_preceding_first_block": null,
  "modules_in_block_to_quantize": null,
  "pad_token_id": null,
  "quant_method": "gptq",
  "sym": true,
  "tokenizer": null,
  "true_sequential": true,
  "use_cuda_fp16": false,
  "use_exllama": true
},
```
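
To give a feel for what `"bits": 4` and `"group_size": 128` mean for storage, the following back-of-the-envelope sketch estimates the effective bits per quantized weight. It assumes the common GPTQ layout in which each group of weights shares one 16-bit scale and one zero point stored at `bits` bits. That layout, and the helper name `effective_bits_per_weight`, are assumptions for illustration. The estimate ignores unquantized layers and other metadata.

```python
def effective_bits_per_weight(bits=4, group_size=128, scale_bits=16):
    """Estimate storage per quantized weight, including per-group overhead."""
    # Each group of `group_size` weights shares one scale and one zero point.
    overhead_per_weight = (scale_bits + bits) / group_size
    return bits + overhead_per_weight


bpw = effective_bits_per_weight()
print(bpw)                 # 4.15625
print(round(16 / bpw, 2))  # 3.85, approximate size reduction versus fp16
```

Under these assumptions, a larger `group_size` lowers the overhead slightly, while a smaller one spends more bits on scales and zero points.
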

## Source Code
The quantization source code is available at [https://github.com/vkola-lab/medpodgpt/tree/main/quantization](https://github.com/vkola-lab/medpodgpt/tree/main/quantization).