Instructions for using Xenova/tiny-random-mistral with libraries, inference providers, notebooks, and local apps.
How to use Xenova/tiny-random-mistral with Transformers.js:
```js
// npm i @huggingface/transformers
import { pipeline } from '@huggingface/transformers';

// Allocate pipeline
const pipe = await pipeline('text-generation', 'Xenova/tiny-random-mistral');
```
Add/update the quantized ONNX model files and README.md for Transformers.js v3 (#2)
by whitphx - opened
Applied Quantizations

- ✅ Based on decoder_model_merged.onnx with slimming (the base model decoder_model_merged.onnx has been renamed to model.onnx)
  - ✅ fp16: model_fp16.onnx (added)
  - ✅ int8: model_int8.onnx (added)
  - ✅ uint8: model_uint8.onnx (added)
  - ✅ q4: model_q4.onnx (added)
  - ✅ q4f16: model_q4f16.onnx (added)
  - ✅ bnb4: model_bnb4.onnx (added)
Xenova changed pull request status to merged