metadata
base_model: ilgee/Binary-Think-RM-8B
base_model_relation: quantized
quantized_by: ArtusDev
license: llama3.1
language:
- en
tags:
- reward-model
- RLHF
- reasoning
- preference-learning
- exl3
ArtusDev/ilgee_Binary-Think-RM-8B-EXL3
EXL3 quants of ilgee/Binary-Think-RM-8B, quantized with exllamav3.
Quants
| Quant | BPW | Head Bits |
|---|---|---|
| 2.5_H6 | 2.5 | 6 |
| 3.0_H6 | 3.0 | 6 |
| 3.5_H6 | 3.5 | 6 |
| 4.0_H6 | 4.0 | 6 |
| 4.5_H6 | 4.5 | 6 |
| 5.0_H6 | 5.0 | 6 |
| 6.0_H6 | 6.0 | 6 |
| 8.0_H8 | 8.0 | 8 |
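As a rough guide to choosing a row from the table, weight storage scales with bits per weight (BPW). The sketch below is an estimate only. It assumes about 8 billion parameters (read off the "8B" in the model name, not an exact count), treats every weight as stored at the listed BPW, and ignores the separately quantized head layer, the embeddings, and any runtime memory. `PARAMS` and `approx_weight_gib` are illustrative names, not part of exllamav3.

```python
# Back-of-the-envelope weight size per quant.
# ASSUMPTION: ~8e9 parameters, inferred from "8B" in the model name.
PARAMS = 8e9

def approx_weight_gib(bpw: float, params: float = PARAMS) -> float:
    """Approximate weight storage in GiB: bits per weight * parameters / 8 bits per byte."""
    return bpw * params / 8 / 2**30

# BPW values copied from the table above.
for bpw in (2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 6.0, 8.0):
    print(f"{bpw:.1f} bpw: roughly {approx_weight_gib(bpw):.1f} GiB of weights")
```

Actual files will differ somewhat from these figures, so treat the numbers as a lower-bound sanity check before downloading rather than a memory requirement.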
How to Download and Use Quants
You can download a single quant of a specific size with the Hugging Face CLI by targeting its revision.
1. Install huggingface-cli:
pip install -U "huggingface_hub[cli]"
2. Download a specific quant:
huggingface-cli download ArtusDev/ilgee_Binary-Think-RM-8B-EXL3 --revision "5.0bpw_H6" --local-dir ./ilgee_Binary-Think-RM-8B-EXL3-5.0bpw_H6
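The same download can be sketched in Python with `snapshot_download` from `huggingface_hub`. The helper names `revision_name`, `local_dir_for`, and `download_quant` are hypothetical, and the revision pattern `<BPW>bpw_H<head bits>` is an assumption generalized from the single `5.0bpw_H6` example above; check the repository's branch list before relying on it for other quants.

```python
REPO_ID = "ArtusDev/ilgee_Binary-Think-RM-8B-EXL3"

def revision_name(bpw: float, head_bits: int) -> str:
    """Build a revision name such as '5.0bpw_H6'.

    ASSUMPTION: every quant branch follows the pattern of the one
    revision shown in the CLI command above.
    """
    return f"{bpw:.1f}bpw_H{head_bits}"

def local_dir_for(revision: str, repo_id: str = REPO_ID) -> str:
    """Mirror the --local-dir naming used in the CLI example."""
    return f"./{repo_id.split('/')[-1]}-{revision}"

def download_quant(bpw: float, head_bits: int, repo_id: str = REPO_ID) -> str:
    """Download one quant and return the local path of the snapshot."""
    # Imported here so the pure helpers above work without the package installed.
    from huggingface_hub import snapshot_download

    revision = revision_name(bpw, head_bits)
    return snapshot_download(
        repo_id=repo_id,
        revision=revision,
        local_dir=local_dir_for(revision, repo_id),
    )
```

Calling `download_quant(5.0, 6)` should fetch the same revision into the same directory as the CLI command, provided `huggingface_hub` is installed.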
EXL3 quants can be run with any inference client that supports EXL3, such as TabbyAPI. Refer to the client's documentation for setup instructions.
Acknowledgements
Made possible with cloud compute from lium.io