Is there a benefit of this version vs the original MXFP4?

by SuperbEmphasis - opened Sep 13, 2025

edited Sep 13, 2025

I'm currently running gpt-oss-120b on 2x H100 GPUs via vLLM.

Is there a benefit to using this version, though? I'm wondering whether FP8 would give faster responses on the H100, since the H100 can use its FP8 tensor cores, at the cost of increased VRAM usage.
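To put a rough number on the VRAM side of that trade-off, here is a back-of-the-envelope sketch. It assumes about 117B total parameters (an assumed figure, please check the model card) and pretends every weight is stored in a single format, which real checkpoints usually are not, since some tensors tend to stay in higher precision. MXFP4 is counted as 4 bits per element plus one shared 8-bit scale per block of 32 elements, so 4.25 bits per weight. KV cache and activations are ignored.

```python
# Rough weight-only memory estimate; all inputs below are assumptions.

def weight_gib(n_params: float, bits_per_param: float) -> float:
    """Return weight storage in GiB for n_params at the given bit width."""
    return n_params * bits_per_param / 8 / 2**30

N_PARAMS = 117e9          # assumed total parameter count
MXFP4_BITS = 4 + 8 / 32   # 4-bit element + 8-bit scale shared by 32 elements
FP8_BITS = 8.0            # one byte per weight, per-tensor scales ignored

mxfp4_gib = weight_gib(N_PARAMS, MXFP4_BITS)
fp8_gib = weight_gib(N_PARAMS, FP8_BITS)

print(f"MXFP4 weights: ~{mxfp4_gib:.1f} GiB")
print(f"FP8 weights:   ~{fp8_gib:.1f} GiB")
print(f"FP8 / MXFP4:   ~{fp8_gib / mxfp4_gib:.2f}x")
```

Under these assumptions FP8 weights take a bit under twice the space of MXFP4, so on two 80 GB cards the FP8 version would leave noticeably less room for KV cache. Whether it is actually faster is something only a benchmark on your own setup can tell.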

shaddow11ro

16 days ago

The precision loss won't be too noticeable, but there will be some minor differences between FP8 and FP4.
