---
license: apache-2.0
base_model:
- allenai/Olmo-3-7B-Instruct
tags:
- llmcompressor
---
This is [allenai/Olmo-3-7B-Instruct](https://huggingface.co/allenai/Olmo-3-7B-Instruct) quantized to FP8 Dynamic (W8A8) with [LLM Compressor](https://github.com/vllm-project/llm-compressor). The model is compatible with vLLM (tested with v0.11.2 on an RTX 5090).

How the models perform (token efficiency, per-domain accuracy, ...) and how to use them is covered here:
[Quantizing Olmo 3: Most Efficient and Accurate Formats](https://kaitchup.substack.com/p/quantizing-olmo-3-most-efficient)
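
A minimal inference sketch with vLLM, which the card names as the tested runtime. The repository id below is a placeholder for this model's Hugging Face id, and the prompt and sampling settings are illustrative; running it requires a GPU with FP8 support.

```python
# Sketch: serve this FP8-Dynamic checkpoint with vLLM's offline API.
# MODEL_ID is a placeholder -- substitute this repository's Hugging Face id.
from vllm import LLM, SamplingParams

MODEL_ID = "path/to/this-repo"  # placeholder, not the real repo id

llm = LLM(model=MODEL_ID)  # vLLM reads the quantization config from the checkpoint
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Explain FP8 quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```

The same checkpoint can also be exposed as an OpenAI-compatible endpoint with `vllm serve <repo-id>`; no extra quantization flags should be needed since the FP8 scheme is recorded in the model's config.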

- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
- **License:** Apache 2.0
## How to Support My Work
Subscribe to [The Kaitchup](https://kaitchup.substack.com/subscribe). Subscriptions help me continue quantizing and evaluating models for free. You can also [buy me a Ko-fi](https://ko-fi.com/bnjmn_marie).