This is an FP8 dynamically quantized (W8A8) version of `OpenGVLab/InternVL3_5-38B`, optimized for high-performance inference with *vLLM*.
The quantization process uses a specialized recipe that preserves the model's core visual understanding capabilities while reducing the memory footprint by nearly 40%.
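As a rough sanity check on that figure, here is a back-of-the-envelope sketch. It assumes the model has roughly 38 billion weight parameters (inferred from the model name) and that quantization converts 16-bit weights to 8-bit; the actual overall footprint reduction is smaller than the weight-only halving because activations, the KV cache, and possibly the vision tower are not stored in FP8.

```python
# Back-of-the-envelope weight-memory estimate.
# Assumption: ~38e9 parameters, inferred from the model name; not an official count.
params = 38e9

bf16_gb = params * 2 / 1e9  # 16-bit weights: 2 bytes per parameter
fp8_gb = params * 1 / 1e9   # FP8 weights: 1 byte per parameter

print(f"weight memory: {bf16_gb:.0f} GB -> {fp8_gb:.0f} GB")
```

Weights alone halve in size; once unquantized components and runtime buffers are counted, an overall reduction of "nearly 40%" is plausible.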
## Just Run It (vLLM serve)
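A minimal serving sketch for a quantized checkpoint like this one. The repo id below is a placeholder (substitute the actual model id of this repository), and the flags shown are common vLLM options rather than a recipe from this model card:

```shell
# Serve the FP8-quantized model with vLLM's OpenAI-compatible server.
# <quantized-model-id> is a placeholder -- replace with this repository's id.
vllm serve <quantized-model-id> \
    --trust-remote-code \
    --max-model-len 16384
```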