How to create an FP8 quant?

by JochenGebhard - opened Dec 31, 2025

Hello all,

It was easy to create an FP8 quant of the 8b-Embedding Model, but creating a quant of the Reranker with llmcompressor failed for me.

The quantized model loads without errors, but the reranking score is always 0.50, whatever the input. Can anybody share a recipe or code to create an FP8 quant of this model?
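A note on the symptom, in case it helps narrow things down: a score of exactly 0.50 is what a two-way softmax returns whenever both of its input logits are equal. The sketch below only illustrates that arithmetic. It assumes the reranker derives its score from a softmax over a "yes" logit and a "no" logit, which is not confirmed for this model, and `two_way_score` is a hypothetical helper, not part of any library.

```python
import math


def two_way_score(logit_yes: float, logit_no: float) -> float:
    """Softmax probability of "yes" over two logits (assumed scoring rule)."""
    m = max(logit_yes, logit_no)  # subtract the max for numerical stability
    e_yes = math.exp(logit_yes - m)
    e_no = math.exp(logit_no - m)
    return e_yes / (e_yes + e_no)


# Distinct logits: the score moves away from 0.5.
print(two_way_score(3.0, -1.0))

# Identical logits, whatever their value: the score is exactly 0.5.
print(two_way_score(0.0, 0.0))
print(two_way_score(7.25, 7.25))
```

If the scoring really works this way, a constant 0.50 would suggest that the two logits come out identical after quantization, so it may be worth checking whether the output layer that produces them was quantized along with the rest of the model.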

Thanks a lot and happy new year 😀
