brandonbeiler committed on
Commit 5de82c2 · verified · 1 Parent(s): 3befff2

Update README.md

Files changed (1)
  1. README.md +1 -2
README.md CHANGED
@@ -19,12 +19,11 @@ license: mit
  This is a **FP8 dynamic quantized** version of [OpenGVLab/InternVL3-38B](https://huggingface.co/OpenGVLab/InternVL3-38B), optimized for high-performance inference with vLLM.
  The model utilizes **dynamic FP8 quantization** for optimal ease of use and deployment, achieving significant speedup with minimal accuracy degradation on vision-language tasks.
  ## 🚀 Key Features
- - **FP8 Dynamic Quantization**: No calibration required, ready to use immediately
+ - **FP8 Dynamic Quantization**
  - **Vision-Language Optimized**: Specialized quantization recipe that preserves visual understanding
  - **vLLM Ready**: Seamless integration with vLLM for production deployment
  - **Memory Efficient**: ~50% memory reduction compared to FP16 original
  - **Performance Boost**: Significant faster inference on H100/L40S GPUs
- - **Easy Deployment**: No calibration dataset needed for quantization
  ## 📊 Model Details
  - **Original Model**: [OpenGVLab/InternVL3-38B](https://huggingface.co/OpenGVLab/InternVL3-38B)
  - **Source Model**: OpenGVLab/InternVL3-38B