Update quant params structure
#2
by nickfraser - opened
Updates the following:
- Add `<weight|input>_zp_dtype` to `quant_param.json` to differentiate between exported versions
- Update input/weight zero-points to be int8 (not uint8)
- Update the math model and tests to incorporate the above changes
- Remove SmoothQuant multipliers from layers that aren't quantized
- Upload new `quant_param.json`
- Upload new `params.safetensors`
- Upload new example output `out.safetensors`
- Confirm compliant FID of model (FID ∈ (23.0108, 23.9501)): 23.89
- Confirm compliant CLIP score of model (CLIP ∈ (31.686, 31.813)): 31.86
Strikethrough items were updated outside this PR.
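
For readers consuming the new fields, here is a minimal sketch of how a `<weight|input>_zp_dtype` entry might be read and how a uint8 zero-point relates to an int8 one under standard affine quantization. The JSON layout, the layer name `example_layer`, the dtype strings, the `"uint8"` fallback, and the helper names are all assumptions for illustration. The shift by 128 is one common way to move between the two conventions and is not necessarily how the exporter in this PR does it.

```python
import json

# Hypothetical excerpt: the real layout of quant_param.json is not shown in this PR.
EXAMPLE_JSON = '{"example_layer": {"weight_zp_dtype": "int8", "input_zp_dtype": "int8"}}'


def zp_dtype(layer_params, kind, default="uint8"):
    """Return the zero-point dtype recorded for `kind` ("weight" or "input").

    Falling back to `default` when the key is absent is an assumption about
    how older exports might be recognised, not documented behaviour.
    """
    return layer_params.get(f"{kind}_zp_dtype", default)


def dequantize(q, scale, zero_point):
    """Affine dequantization: real = scale * (q - zero_point)."""
    return [scale * (v - zero_point) for v in q]


def uint8_to_int8(q_u8, zp_u8):
    """Shift uint8 codes and their zero-point by 128 into the int8 range."""
    return [v - 128 for v in q_u8], zp_u8 - 128


params = json.loads(EXAMPLE_JSON)["example_layer"]

# Illustrative values only, not taken from params.safetensors.
q_u8, zp_u8, scale = [0, 100, 128, 255], 128, 0.5
q_i8, zp_i8 = uint8_to_int8(q_u8, zp_u8)

# Both conventions dequantize to the same real values.
same = dequantize(q_u8, scale, zp_u8) == dequantize(q_i8, scale, zp_i8)
```

Because the codes and the zero-point shift together, the dequantized values are unchanged. A consumer that ignores the dtype field and mixes the two conventions would be off by `128 * scale`.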
nickfraser changed pull request status to open
nickfraser changed pull request status to merged