brandonbeiler committed
Commit d8970e1 · verified · 1 Parent(s): a3300d9

Upload folder using huggingface_hub

Files changed (2):
  1. README.md +3 -3
  2. recipe.yaml +1 -1
README.md CHANGED
@@ -30,7 +30,7 @@ The model utilizes **dynamic FP8 quantization** for optimal ease of use and depl
 - **Source Model**: OpenGVLab/InternVL3-38B
 - **Quantized Model**: InternVL3-38B-FP8-Dynamic
 - **Quantization Method**: FP8 Dynamic (W8A8)
-- **Quantization Library**: [LLM Compressor](https://github.com/vllm-project/llm-compressor) v0.5.1
+- **Quantization Library**: [LLM Compressor](https://github.com/vllm-project/llm-compressor) v0.5.2.dev112+g6800f811
 - **Quantized by**: [brandonbeiler](https://huggingface.co/brandonbeiler)
 ## 🔧 Usage
 ### With vLLM (Recommended)
@@ -62,11 +62,11 @@ print(response[0].outputs[0].text)
 ## 🔬 Package Versions
 This model was created using:
 ```
-llmcompressor==0.5.1
+llmcompressor==0.5.2.dev112+g6800f811
 compressed-tensors==latest
 transformers==4.52.4
 torch==2.7.0
-vllm==0.9.0.1
+vllm==0.9.1
 ```

 *Quantized with ❤️ using LLM Compressor for the open-source community*
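
The README describes the scheme as FP8 Dynamic (W8A8), meaning activation scales are computed at runtime from each input rather than calibrated offline. A minimal sketch of that dynamic-scaling step, assuming per-tensor abs-max scaling (448 is the largest finite FP8 e4m3 value; `dynamic_scale` is an illustrative helper, not an llmcompressor or vLLM API, and real kernels also cast the scaled values to e4m3, which is omitted here):

```python
# Sketch of per-input dynamic activation scaling as used by FP8
# "dynamic" (W8A8) schemes: the scale is derived from the input's
# abs-max at runtime. The actual e4m3 cast is not modeled.
FP8_E4M3_MAX = 448.0  # largest finite FP8 e4m3 value


def dynamic_scale(activations):
    """Scale factor mapping the input's abs-max onto the e4m3 range."""
    amax = max(abs(x) for x in activations)
    return amax / FP8_E4M3_MAX if amax > 0 else 1.0


acts = [0.03, -1.7, 224.0, -0.5]
scale = dynamic_scale(acts)           # 224.0 / 448.0 == 0.5
scaled = [x / scale for x in acts]
assert max(abs(x) for x in scaled) == FP8_E4M3_MAX
```

Because the scale tracks each input, outliers in one activation batch never clip values in another, which is the practical appeal of dynamic over static FP8 calibration.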
recipe.yaml CHANGED
@@ -1,6 +1,6 @@
 default_stage:
   default_modifiers:
     QuantizationModifier:
-      ignore: ['re:.*lm_head', 're:.*vision.*', 're:mlp1.*']
       targets: [Linear]
+      ignore: ['re:.*lm_head', 're:.*vision.*', 're:mlp1.*']
       scheme: FP8_DYNAMIC
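
The recipe's `ignore` list uses `re:`-prefixed regex patterns to keep the LM head, the vision tower, and the `mlp1` projector out of FP8 quantization. A small sketch of how such patterns select module names — anchored matching (`re.match`) is assumed here, llmcompressor's exact semantics may differ, and the example module names are hypothetical stand-ins for InternVL3 paths:

```python
import re

# The recipe's ignore patterns with the `re:` prefix stripped.
IGNORE_PATTERNS = [r".*lm_head", r".*vision.*", r"mlp1.*"]


def is_ignored(module_name):
    """True if the module name matches any ignore pattern.

    Anchored matching via re.match is assumed; the names tested
    below are illustrative, not actual InternVL3 module paths.
    """
    return any(re.match(p, module_name) for p in IGNORE_PATTERNS)


assert is_ignored("lm_head")
assert is_ignored("vision_model.encoder.layers.0.mlp.fc1")
assert is_ignored("mlp1.1")
assert not is_ignored("language_model.model.layers.0.mlp.down_proj")
```

Keeping these modules in higher precision is a common choice: the vision encoder and projector are small relative to the language model, so skipping them costs little memory while avoiding accuracy loss in the multimodal path.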