Update README.md
README.md CHANGED
@@ -76,7 +76,16 @@ vLLM also supports OpenAI-compatible serving. See the [documentation](https://do
 
 This model was created by applying [LLM Compressor](https://github.com/vllm-project/llm-compressor), as presented in the code snippet below.
 
+
 <details>
+<summary>Creation details</summary>
+
+Install the specific llm-compressor version:
+```
+uv pip install git+https://github.com/vllm-project/llm-compressor.git
+uv pip install --upgrade torchvision --break-system-packages --no-cache
+```
+
 ```python
 from compressed_tensors.offload import dispatch_model
 from transformers import AutoModelForCausalLM, AutoTokenizer
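
The hunk ends at the diff's context window, so only the imports of the creation snippet are visible. A hedged sketch of how such a snippet typically continues is below: the model ID is hypothetical (the real one is not visible in this hunk), and `dispatch_model` is assumed to take the loaded model and place its weights on the available devices before any calibration or quantization step.

```python
from compressed_tensors.offload import dispatch_model
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical base-model ID; the actual ID does not appear in this hunk.
MODEL_ID = "your-org/your-base-model"

# Load the full-precision model and its tokenizer with transformers.
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Assumption: dispatch_model distributes the model's weights across the
# available devices (GPU/CPU offload) ahead of compression.
model = dispatch_model(model)
```

The remainder of the snippet would then apply an llm-compressor recipe to `model` and save the compressed checkpoint; those steps fall outside the lines shown in this diff.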