Add/update the quantized ONNX model files and README.md for Transformers.js v3
#2
by whitphx (HF Staff) · opened
Applied Quantizations
✅ Based on decoder_model.onnx with slimming
↳ ✅ q4f16: decoder_model_q4f16.onnx (added)
✅ Based on encoder_model.onnx with slimming
↳ ✅ q4f16: encoder_model_q4f16.onnx (added)
✅ Based on decoder_with_past_model.onnx with slimming
↳ ✅ q4f16: decoder_with_past_model_q4f16.onnx (added)
✅ Based on decoder_model_merged.onnx with slimming
The base model decoder_model_merged.onnx has been renamed to model.onnx, the default filename expected by Transformers.js v3.
↳ ✅ fp16: model_fp16.onnx (replaced because it was invalid)
↳ ✅ q4f16: model_q4f16.onnx (added)
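With the q4f16 weights in place, a Transformers.js v3 consumer can select them through the `dtype` option when creating a pipeline. The repo id and task below are placeholders, not taken from this PR; this is a sketch assuming the package is installed and the actual repository name is substituted:

```javascript
import { pipeline } from "@huggingface/transformers";

// "user/model-repo" and the task name are hypothetical placeholders.
// dtype: "q4f16" selects the *_q4f16.onnx files added in this PR;
// dtype: "fp16" would pick the replaced model_fp16.onnx instead.
const pipe = await pipeline("translation", "user/model-repo", {
  dtype: "q4f16",
});

const output = await pipe("Hello, world!");
console.log(output);
```

Because the merged decoder was renamed to model.onnx, no per-file configuration is needed; the library resolves the quantized variant from the `dtype` suffix.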
Xenova changed pull request status to merged