Upload folder using huggingface_hub

- README.md +3 -3
- recipe.yaml +1 -1
README.md
CHANGED

@@ -30,7 +30,7 @@ The model utilizes **dynamic FP8 quantization** for optimal ease of use and depl
 - **Source Model**: OpenGVLab/InternVL3-38B
 - **Quantized Model**: InternVL3-38B-FP8-Dynamic
 - **Quantization Method**: FP8 Dynamic (W8A8)
-- **Quantization Library**: [LLM Compressor](https://github.com/vllm-project/llm-compressor) v0.5.
+- **Quantization Library**: [LLM Compressor](https://github.com/vllm-project/llm-compressor) v0.5.2.dev112+g6800f811
 - **Quantized by**: [brandonbeiler](https://huggingface.co/brandonbeiler)
 ## 🔧 Usage
 ### With vLLM (Recommended)

@@ -62,11 +62,11 @@ print(response[0].outputs[0].text)
 ## 🔬 Package Versions
 This model was created using:
 ```
-llmcompressor==0.5.
+llmcompressor==0.5.2.dev112+g6800f811
 compressed-tensors==latest
 transformers==4.52.4
 torch==2.7.0
-vllm==0.9.
+vllm==0.9.1
 ```
 
 *Quantized with ❤️ using LLM Compressor for the open-source community*
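The README above describes **FP8 Dynamic (W8A8)** quantization, where the scale is computed from the tensor itself at runtime rather than from a calibration set. A minimal illustrative sketch of that round trip (this is not the llm-compressor implementation; the integer rounding below is a hypothetical stand-in for snapping to the FP8 E4M3 grid):

```python
# Illustrative sketch of dynamic quantization: the scale is derived from the
# data at runtime, so no calibration pass is needed.

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def dynamic_fp8_quantize(values):
    """Quantize floats with a runtime (dynamic) per-tensor scale."""
    amax = max(abs(v) for v in values)
    scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
    # Hypothetical stand-in: real FP8 rounds to the E4M3 value grid, not to
    # integers, but the quantize/dequantize round trip looks the same.
    quantized = [round(v / scale) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map quantized values back to the original range."""
    return [q * scale for q in quantized]


q, s = dynamic_fp8_quantize([1.0, -2.0, 4.0])
restored = dequantize(q, s)  # values close to the originals
```

Because the scale adapts to each tensor's observed range, dynamic schemes trade a small runtime cost for not needing calibration data, which is why the README calls this setup easy to deploy.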
recipe.yaml
CHANGED

@@ -1,6 +1,6 @@
 default_stage:
   default_modifiers:
     QuantizationModifier:
-      ignore: ['re:.*lm_head', 're:.*vision.*', 're:mlp1.*']
       targets: [Linear]
+      ignore: ['re:.*lm_head', 're:.*vision.*', 're:mlp1.*']
       scheme: FP8_DYNAMIC
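The recipe's `ignore` list uses `re:`-prefixed regular expressions to exempt modules (the language-model head, the vision tower, the `mlp1` projector) from quantization. A hedged sketch of how such patterns select module names, assuming `re:` patterns are matched against each module's dotted name (the example module names below are hypothetical, in the style of InternVL3's layer naming):

```python
import re

# Patterns from the recipe; the "re:" prefix marks a regex (vs. a literal name).
ignore_patterns = ['re:.*lm_head', 're:.*vision.*', 're:mlp1.*']


def is_ignored(module_name):
    """Return True if any ignore pattern matches the module name."""
    for pattern in ignore_patterns:
        if pattern.startswith('re:') and re.match(pattern[3:], module_name):
            return True
    return False


# Hypothetical module names for illustration:
is_ignored('language_model.lm_head')        # True  (matches .*lm_head)
is_ignored('vision_model.encoder.layers.0') # True  (matches .*vision.*)
is_ignored('mlp1.0')                        # True  (matches mlp1.*)
is_ignored('language_model.layers.0.self_attn.q_proj')  # False → quantized
```

Keeping the head, vision tower, and projector in full precision while quantizing the `Linear` layers of the language model is a common accuracy/size trade-off for multimodal models.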