---
tags:
- uqff
- mistral.rs
base_model: mistralai/Mistral-Nemo-Instruct-2407
base_model_relation: quantized
---

# `mistralai/Mistral-Nemo-Instruct-2407`, UQFF quantization

Run with [mistral.rs](https://github.com/EricLBuehler/mistral.rs). Documentation: [UQFF docs](https://github.com/EricLBuehler/mistral.rs/blob/master/docs/UQFF.md).

1) **Flexible** 🌀: Multiple quantization formats in *one* file format with *one* framework to run them all.
2) **Reliable** 🔒: Compatibility ensured with *embedded* and *checked* semantic versioning information from day 1.
3) **Easy** 🤗: Download UQFF models *easily* and *quickly* from Hugging Face, or use a local file.
4) **Customizable** 🛠️: Make and publish your own UQFF files in minutes.

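As a sketch of what point 4 looks like in practice, here is one way to quantize the base model and write your own UQFF file. The `--isq` and `--write-uqff` flags follow the mistral.rs UQFF docs linked above; verify them against your installed version before running.

```shell
# Quantize the base model with in-situ quantization (ISQ) at Q4K and
# serialize the result to a UQFF file. Flag names are taken from the
# mistral.rs UQFF docs, not from this card; check `mistralrs-server --help`.
./mistralrs-server --isq Q4K -i plain \
  -m mistralai/Mistral-Nemo-Instruct-2407 \
  --write-uqff mistral-nemo-2407-instruct-q4k.uqff
```

The resulting `.uqff` file can then be loaded with `--from-uqff`, as in the examples below, or published to Hugging Face.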
## Examples

|Quantization type(s)|Example|
|--|--|
|FP8|`./mistralrs-server -i plain -m EricB/Mistral-Nemo-Instruct-2407-UQFF --from-uqff mistral-nemo-2407-instruct-f8e4m3.uqff`|
|HQQ4|`./mistralrs-server -i plain -m EricB/Mistral-Nemo-Instruct-2407-UQFF --from-uqff mistral-nemo-2407-instruct-hqq4.uqff`|
|HQQ8|`./mistralrs-server -i plain -m EricB/Mistral-Nemo-Instruct-2407-UQFF --from-uqff mistral-nemo-2407-instruct-hqq8.uqff`|
|Q3K|`./mistralrs-server -i plain -m EricB/Mistral-Nemo-Instruct-2407-UQFF --from-uqff mistral-nemo-2407-instruct-q3k.uqff`|
|Q4K|`./mistralrs-server -i plain -m EricB/Mistral-Nemo-Instruct-2407-UQFF --from-uqff mistral-nemo-2407-instruct-q4k.uqff`|
|Q5K|`./mistralrs-server -i plain -m EricB/Mistral-Nemo-Instruct-2407-UQFF --from-uqff mistral-nemo-2407-instruct-q5k.uqff`|
|Q8_0|`./mistralrs-server -i plain -m EricB/Mistral-Nemo-Instruct-2407-UQFF --from-uqff mistral-nemo-2407-instruct-q8_0.uqff`|
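
The examples above use interactive mode (`-i`). `mistralrs-server` can also run as an HTTP service exposing an OpenAI-compatible API; a minimal sketch of building a chat-completions request for it, assuming the server was started with a `--port` flag instead of `-i` and serves `/v1/chat/completions` on port 1234 (the port and endpoint path are assumptions, not stated in this card):

```python
import json
import urllib.request

# Hypothetical chat-completions payload for mistralrs-server's
# OpenAI-compatible HTTP API (port and path are assumptions).
payload = {
    "model": "mistral-nemo",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

# Build the POST request; sending it requires a running server.
req = urllib.request.Request(
    "http://localhost:1234/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# resp = urllib.request.urlopen(req)  # uncomment with the server running
```

Any OpenAI-compatible client should work the same way once pointed at the local server's base URL.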