---
quantized_by: LLMJapan
pipeline_tag: text-generation
license: cc-by-nc-4.0
language:
- en
tags:
- nvidia
- AceInstruct
- code
- math
- general_domain
- instruct_model
base_model: nvidia/AceInstruct-72B
---
## ExLlamaV2 Quantizations of AceInstruct-72B by nvidia


Using <a href="https://github.com/turboderp/exllamav2/releases/tag/v0.2.8">turboderp's ExLlamaV2 v0.2.8</a> for quantization.


Original model: https://huggingface.co/nvidia/AceInstruct-72B
|
|
Example command for creating a quantization at another bpw (bits per weight) value:
```
cd {your git clone directory}
python convert.py -i {path to}/AceInstruct-72B -o {path to}/AceInstruct-72B/workingdir -cf {path to}/AceInstruct-72B/AceInstruct-72B-3bpw -b 3.0
```
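
To produce several bpw variants, the command above can be generated programmatically. The following is a minimal Python sketch and not part of ExLlamaV2: `build_convert_commands` is a hypothetical helper, `/models/AceInstruct-72B` is a placeholder path, and giving each bpw its own working directory is an assumption made here to keep runs separate. Only the `-i`, `-o`, `-cf`, and `-b` flags from the example above are used.

```python
def build_convert_commands(model_dir, bpw_values):
    """Return one convert.py argument list per target bpw (hypothetical helper)."""
    commands = []
    for bpw in bpw_values:
        label = f"{bpw:g}bpw"  # 3.0 -> "3bpw", 4.25 -> "4.25bpw"
        commands.append([
            "python", "convert.py",
            "-i", model_dir,                                # unquantized model
            "-o", f"{model_dir}/workingdir-{label}",        # assumption: one working dir per run
            "-cf", f"{model_dir}/AceInstruct-72B-{label}",  # output directory for the quantized model
            "-b", str(float(bpw)),                          # target bits per weight
        ])
    return commands


# Placeholder path; replace it with your own model directory.
for cmd in build_convert_commands("/models/AceInstruct-72B", [3.0, 4.25]):
    print(" ".join(cmd))
```

Each printed line can be run from the ExLlamaV2 clone directory, or each list can be passed to `subprocess.run(cmd, check=True)` with `cwd` set to that directory.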
|
|
## Prompt format


```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
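
If you assemble the prompt string yourself instead of relying on the tokenizer's chat template, the format above can be reproduced with a few lines of Python. This is a minimal sketch: `build_prompt` is a hypothetical helper name, and omitting the system block when no system prompt is given is an assumption.

```python
def build_prompt(user_prompt, system_prompt=None):
    """Assemble a single-turn prompt in the format shown above (hypothetical helper)."""
    parts = []
    if system_prompt:
        # Assumption: the system block is left out when no system prompt is supplied.
        parts.append(f"<|im_start|>system\n{system_prompt}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user_prompt}<|im_end|>\n")
    # The prompt ends with an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(build_prompt("Write a function that reverses a string.", "You are a concise assistant."))
```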
|
|
## How to add your system prompt


Copy the following `chat_template` entry into tokenizer_config.json and replace the sentence "You are AceInstruct developed by NVIDIA. You are helpful assistant." with your own system prompt.
The default tokenizer_config.json does not include a system prompt.

tokenizer_config.json
```
"chat_template": "{{- '<|im_start|>system\\nYou are AceInstruct developed by NVIDIA. You are helpful assistant.<|im_end|>\\n' }}\n {%- for message in messages %}\n{{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n{{- '<|im_start|>assistant\n' }}\n{%- endif %}\n",
```
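
Editing the JSON by hand is error-prone because of the nested quoting, so the same change can be scripted. The sketch below uses only the Python standard library; `set_system_prompt` is a hypothetical helper, and the template it writes is a reconstruction of the one shown above with uniform escaping, not a verbatim copy. It overwrites any existing `chat_template` entry, so keep a backup of the file.

```python
import json
from pathlib import Path


def set_system_prompt(config_path, system_prompt):
    """Write a chat_template with a fixed system prompt into tokenizer_config.json (hypothetical helper)."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Escape the prompt so it is safe inside a single-quoted Jinja string literal.
    escaped = (
        system_prompt.replace("\\", "\\\\")
        .replace("'", "\\'")
        .replace("\n", "\\n")
    )

    # Reconstruction of the template above: a fixed system block, then every
    # message, then an optional open assistant turn.
    config["chat_template"] = (
        "{{- '<|im_start|>system\\n" + escaped + "<|im_end|>\\n' }}\n"
        "{%- for message in messages %}\n"
        "{{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n"
        "{%- endfor %}\n"
        "{%- if add_generation_prompt %}\n"
        "{{- '<|im_start|>assistant\\n' }}\n"
        "{%- endif %}\n"
    )

    path.write_text(json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8")
```

For example, `set_system_prompt("tokenizer_config.json", "You are a careful coding assistant.")` updates the file in the current directory; the file name is taken from the section above and the prompt text is only an illustration.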
|
|
## File information
|
|
| | quantization type | file size | |
| | ----------------------- | ----------: | |
| | 3.0bpw | 27.8 GiB | |
|
|
## Benchmark Results
|
|
| | | Qwen2.5-1.5B-Instruct | AceInstruct-1.5B | Qwen2.5-7B-Instruct | AceInstruct-7B | Qwen2.5-72B-Instruct | AceInstruct-72B | |
| | --------- |:-----:|:-----:|:-----:|:-----:|:-----:|:-----:| |
| | HumanEval | 61.60 | 73.17 | 84.80 | 85.37 | 86.60 | 89.63 | |
| | MBPP | 63.20 | 65.76 | 79.20 | 74.32 | 88.20 | 83.66 | |
| | GSM8K | 73.20 | 80.44 | 91.60 | 93.10 | 95.80 | 96.36 | |
| | MATH | 55.20 | 60.34 | 75.50 | 76.40 | 83.10 | 84.50 | |
| | MMLU | 58.37 | 58.17 | 74.51 | 74.68 | 84.67 | 83.88 | |
| | MMLU Pro | 32.40 | 33.78 | 56.30 | 54.50 | 71.10 | 66.10 | |
| | Average | 57.33 | 61.94 | 76.99 | 76.40 | 84.91 | 84.02 | |
|
|
## Credits


Thanks to the NVIDIA team.
|
|
|
|