This is an **FP8 dynamic quantized** version of [OpenGVLab/InternVL3-38B](https://huggingface.co/OpenGVLab/InternVL3-38B), optimized for high-performance inference with vLLM.

The model uses **dynamic FP8 quantization** for ease of use and deployment, achieving a significant speedup with minimal accuracy degradation on vision-language tasks.

## 🚀 Key Features

- **FP8 Dynamic Quantization**: No calibration dataset needed
- **Vision-Language Optimized**: Specialized quantization recipe that preserves visual understanding
- **vLLM Ready**: Seamless integration with vLLM for production deployment
- **Memory Efficient**: ~50% memory reduction compared to the FP16 original
- **Performance Boost**: Significantly faster inference on H100/L40S GPUs
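As a minimal deployment sketch (the repository id below is a placeholder, and flags may vary with your vLLM version), the model can be served through vLLM's OpenAI-compatible server:

```shell
# Placeholder repo id -- replace with this repository's Hugging Face id.
# --trust-remote-code is required for InternVL's custom modeling code.
vllm serve <this-repo-id> --trust-remote-code
```

vLLM reads the FP8 quantization settings from the checkpoint's config, so no extra quantization flags should be needed for a pre-quantized model.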
## 📊 Model Details

- **Original Model**: [OpenGVLab/InternVL3-38B](https://huggingface.co/OpenGVLab/InternVL3-38B)
- **Source Model**: OpenGVLab/InternVL3-38B
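The ~50% memory figure follows directly from bytes per parameter: FP16 stores 2 bytes per weight, FP8 stores 1. A back-of-envelope check for 38B parameters (weights only; KV cache and activation memory are extra):

```python
# Approximate weight memory for 38B parameters at each precision.
params = 38e9
fp16_gib = params * 2 / 1024**3  # 2 bytes per parameter in FP16
fp8_gib = params * 1 / 1024**3   # 1 byte per parameter in FP8
print(f"FP16: ~{fp16_gib:.0f} GiB, FP8: ~{fp8_gib:.0f} GiB")  # → FP16: ~71 GiB, FP8: ~35 GiB
```

In practice the saving on weights is exactly 2x; the overall figure is "~50%" because non-quantized components (e.g. some layers kept in higher precision) and runtime buffers reduce the net ratio slightly.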