---
license: apache-2.0
base_model:
- allenai/Olmo-3.1-32B-Instruct
tags:
- llmcompressor
---
This is [allenai/Olmo-3.1-32B-Instruct](https://huggingface.co/allenai/Olmo-3.1-32B-Instruct) quantized with [LLM Compressor](https://github.com/vllm-project/llm-compressor) using the recipe in the `recipe.yaml` file. The model is compatible with vLLM (tested with v0.12.0 on an RTX Pro 6000).

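A minimal way to serve the model with vLLM's OpenAI-compatible server, as a sketch; the `MODEL` placeholder is an assumption, substitute this repository's id or a local path to the downloaded files:

```shell
# Sketch only: replace MODEL with this repo's Hugging Face id or a local
# directory containing the quantized weights. Tested setup per this card:
# vLLM v0.12.0.
vllm serve MODEL
```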
How the models perform (token efficiency, accuracy per domain, etc.) and how to use them:
[Quantizing Olmo 3: Most Efficient and Accurate Formats](https://kaitchup.substack.com/p/quantizing-olmo-3-most-efficient)

| |  |
| |
|
- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
- **License:** Apache 2.0

## How to Support My Work
Subscribe to [The Kaitchup](https://kaitchup.substack.com/subscribe), or "[buy me a coffee](https://ko-fi.com/bnjmn_marie)" on Ko-fi.