RedHatAI/granite-3.1-2b-instruct-FP8-dynamic

Text Generation

compressed-tensors

nm-research committed on Jan 20, 2025

Commit c8b4ee2 · verified · 1 Parent(s): 6043ef6

Update README.md

Files changed (1)

README.md +4 -0

@@ -66,6 +66,10 @@ vLLM also supports OpenAI-compatible serving. See the [documentation](https://do
 This model was created with [llm-compressor](https://github.com/vllm-project/llm-compressor) by running the code snippet below.
+```bash
+python quantize.py --model_id ibm-granite/granite-3.1-2b-base --save_path "output_dir/"
+```
 ```python
 import argparse
 from transformers import AutoModelForCausalLM, AutoTokenizer