---
license: apache-2.0
base_model:
- allenai/Olmo-3-7B-Instruct
tags:
- llmcompressor
---
|
This is [allenai/Olmo-3-7B-Instruct](https://huggingface.co/allenai/Olmo-3-7B-Instruct) quantized to FP8 Dynamic (W8A8) with [LLM Compressor](https://github.com/vllm-project/llm-compressor). The model is compatible with vLLM (tested with v0.11.2 on an RTX 5090).
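
A minimal sketch of running the model with vLLM's offline inference API; the repo id below is a placeholder, so substitute this model's actual Hugging Face id (or a local path to the checkpoint):

```python
from vllm import LLM, SamplingParams

# Placeholder repo id: replace with this model's actual Hugging Face id
# or a local path to the quantized checkpoint.
llm = LLM(model="kaitchup/Olmo-3-7B-Instruct-FP8-Dynamic")

# vLLM reads the compressed-tensors FP8 config from the checkpoint,
# so no extra quantization flag is needed.
params = SamplingParams(temperature=0.6, max_tokens=256)
messages = [
    {"role": "user", "content": "Summarize FP8 W8A8 quantization in two sentences."}
]

# llm.chat applies the model's chat template before generation.
outputs = llm.chat(messages, params)
print(outputs[0].outputs[0].text)
```

The same checkpoint can also be served with the OpenAI-compatible server via `vllm serve <repo-id-or-path>`.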
|
For how the quantized models perform (token efficiency, per-domain accuracy, etc.) and how to use them, see:

[Quantizing Olmo 3: Most Efficient and Accurate Formats](https://kaitchup.substack.com/p/quantizing-olmo-3-most-efficient)
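
For reference, here is a minimal sketch of how an FP8 Dynamic (W8A8) checkpoint can be produced with recent versions of LLM Compressor, following the library's standard `QuantizationModifier` recipe; it is not necessarily the exact script used for this model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "allenai/Olmo-3-7B-Instruct"

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# FP8 Dynamic: weights are quantized to FP8 offline, activations are
# quantized dynamically at runtime, so no calibration data is required.
recipe = QuantizationModifier(
    targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"]
)
oneshot(model=model, recipe=recipe)

SAVE_DIR = "Olmo-3-7B-Instruct-FP8-Dynamic"
model.save_pretrained(SAVE_DIR, save_compressed=True)
tokenizer.save_pretrained(SAVE_DIR)
```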
|
- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
- **License:** Apache 2.0
|
## How to Support My Work |
|
Subscribe to [The Kaitchup](https://kaitchup.substack.com/subscribe). It helps me a lot to keep quantizing and evaluating models for free. Or you can ["buy me a coffee"](https://ko-fi.com/bnjmn_marie) on Ko-fi.