KameronB committed · Commit 203a743 · verified · 1 Parent(s): 5452ae4

Update README.md

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -215,4 +215,17 @@ for i, detail in enumerate(sentence_details, 1):

  print("="*80)

- ```
+ ```
+
+ To reduce the model's memory footprint, you can quantize it with torchao.
+ This is the config to use when running on a laptop or other small device.
+
+ ```python
+ from torchao.quantization import quantize_, Int8WeightOnlyConfig
+
+ model.eval().to("cpu")
+
+ # In-place: replaces the weights of Linear layers with int8 weights
+ quantize_(model, Int8WeightOnlyConfig())
+ ```
+
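For intuition about what int8 weight-only quantization does to a Linear layer, here is a minimal pure-Python sketch. It is not part of this commit and it is not torchao's implementation. It assumes a symmetric scheme with one scale per weight row, and the helper names (`quantize_row_int8`, `dequantize_row`, `linear_weight_bytes`) and the 1024x1024 layer size are hypothetical, chosen only for illustration.

```python
# Conceptual sketch only: assumes symmetric per-row int8 quantization.
# torchao's actual kernels and storage layout may differ.

def quantize_row_int8(row):
    """Map one row of float weights to int8-range integers plus a float scale."""
    max_abs = max(abs(x) for x in row)
    # Guard against an all-zero row, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(x / scale))) for x in row]
    return quantized, scale


def dequantize_row(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [v * scale for v in quantized]


def linear_weight_bytes(out_features, in_features, bytes_per_weight):
    """Storage needed for the weight matrix of one Linear layer."""
    return out_features * in_features * bytes_per_weight


row = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_row_int8(row)
restored = dequantize_row(q, scale)
max_error = max(abs(a - b) for a, b in zip(row, restored))

# Hypothetical 1024x1024 layer: fp32 weights vs int8 weights + one fp32 scale per row.
fp32_bytes = linear_weight_bytes(1024, 1024, 4)
int8_bytes = linear_weight_bytes(1024, 1024, 1) + 1024 * 4

print("quantized row:", q)
print("max round-trip error:", max_error)
print("size ratio (fp32 / int8):", fp32_bytes / int8_bytes)
```

Under these assumptions the rounding error per weight is bounded by half the row's scale, and the stored weights shrink to roughly a quarter of their fp32 size. Only the Linear weights are affected, so the saving for a whole model depends on how much of it is Linear layers.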