Add/update the quantized ONNX model files and README.md for Transformers.js v3
Applied Quantizations
✅ Based on decoder_model.onnx with slimming
↳ ✅ fp16: decoder_model_fp16.onnx (added)
↳ ✅ int8: decoder_model_int8.onnx (added)
↳ ✅ uint8: decoder_model_uint8.onnx (added)
↳ ✅ q4: decoder_model_q4.onnx (added)
↳ ✅ q4f16: decoder_model_q4f16.onnx (added)
↳ ✅ bnb4: decoder_model_bnb4.onnx (added)
✅ Based on decoder_with_past_model.onnx with slimming
↳ ✅ fp16: decoder_with_past_model_fp16.onnx (added)
↳ ✅ int8: decoder_with_past_model_int8.onnx (added)
↳ ✅ uint8: decoder_with_past_model_uint8.onnx (added)
↳ ✅ q4: decoder_with_past_model_q4.onnx (added)
↳ ✅ q4f16: decoder_with_past_model_q4f16.onnx (added)
↳ ✅ bnb4: decoder_with_past_model_bnb4.onnx (added)
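The file names in the two lists above follow a `<base>_<dtype>.onnx` pattern. A minimal sketch of that convention, useful for checking which variants a repository should contain. `DTYPES`, `quantizedFileName`, and `expectedVariants` are hypothetical names introduced here for illustration; they are not part of Transformers.js or of the migration tooling.

```javascript
// Quantization suffixes taken from the lists in this report.
const DTYPES = ["fp16", "int8", "uint8", "q4", "q4f16", "bnb4"];

// Hypothetical helper: build the file name of one quantized variant,
// following the `<base>_<dtype>.onnx` pattern seen above.
function quantizedFileName(baseName, dtype) {
  if (!DTYPES.includes(dtype)) {
    throw new Error(`Unknown dtype: ${dtype}`);
  }
  // Accept both "decoder_model" and "decoder_model.onnx" as input.
  const stem = baseName.replace(/\.onnx$/, "");
  return `${stem}_${dtype}.onnx`;
}

// Hypothetical helper: list every variant expected for one base model.
function expectedVariants(baseName) {
  return DTYPES.map((dtype) => quantizedFileName(baseName, dtype));
}

console.log(expectedVariants("decoder_model.onnx"));
```

Comparing this list against the files actually present in the `onnx/` folder is one quick way to spot a variant that was skipped.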
❌ Based on decoder_model_merged.onnx with slimming
The base model decoder_model_merged.onnx has been renamed to model.onnx.
↳ ❌ fp16: `model_fp16.onnx` (added but JS-based E2E test failed)
2025-08-29 06:45:48.728261991 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp6_c_aalh/1e3f86ffc5524c2a4bd546c4c49a13c7b8fbd5b8/onnx/model_fp16.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
↳ ❌ int8: `model_int8.onnx` (added but JS-based E2E test failed)
2025-08-29 06:45:58.919510973 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp6_c_aalh/1e3f86ffc5524c2a4bd546c4c49a13c7b8fbd5b8/onnx/model_int8.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
↳ ❌ uint8: `model_uint8.onnx` (added but JS-based E2E test failed)
2025-08-29 06:46:09.869966897 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp6_c_aalh/1e3f86ffc5524c2a4bd546c4c49a13c7b8fbd5b8/onnx/model_uint8.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
↳ ❌ q4: `model_q4.onnx` (added but JS-based E2E test failed)
2025-08-29 06:46:18.713796850 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp6_c_aalh/1e3f86ffc5524c2a4bd546c4c49a13c7b8fbd5b8/onnx/model_q4.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
↳ ❌ q4f16: `model_q4f16.onnx` (added but JS-based E2E test failed)
2025-08-29 06:46:24.865172037 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp6_c_aalh/1e3f86ffc5524c2a4bd546c4c49a13c7b8fbd5b8/onnx/model_q4f16.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
↳ ❌ bnb4: `model_bnb4.onnx` (added but JS-based E2E test failed)
2025-08-29 06:46:33.401879077 [W:onnxruntime:, graph.cc:113 MergeShapeInfo] Error merging shape info for output. '/transformer/Slice_output_0' source:{2} target:{1}. Falling back to lenient merge.
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^
Error: Load model from /tmp/tmp6_c_aalh/1e3f86ffc5524c2a4bd546c4c49a13c7b8fbd5b8/onnx/model_bnb4.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/onnxruntime-node@1.21.0/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)
Node.js v22.16.0
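All six failures above report the same `Squeeze` shape-inference error at model load time. A small sketch for pulling the failing node out of such log lines and grouping files by cause, assuming the message format seen in this report (other ONNX Runtime versions may word it differently). `ORT_NODE_ERROR`, `parseLoadError`, and `groupByCause` are hypothetical names, not part of onnxruntime-node.

```javascript
// Pattern modelled on the "Load model from ... failed:Node (...)" lines
// in this report. This is an assumption about the log format, not an API.
const ORT_NODE_ERROR =
  /Load model from (\S+) failed:\s*Node \(([^)]+)\) Op \(([^)]+)\) \[([^\]]+)\] (.+)$/;

// Hypothetical helper: extract file, node, op, error kind, and message.
// Returns null for lines that do not match the pattern.
function parseLoadError(line) {
  const m = ORT_NODE_ERROR.exec(line.trim());
  if (!m) return null;
  const [, path, node, op, kind, message] = m;
  return { file: path.split("/").pop(), node, op, kind, message };
}

// Hypothetical helper: group failing files by (node, kind, message)
// so that shared root causes show up as a single entry.
function groupByCause(lines) {
  const groups = new Map();
  for (const line of lines) {
    const parsed = parseLoadError(line);
    if (!parsed) continue;
    const key = `${parsed.node} | ${parsed.kind} | ${parsed.message}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(parsed.file);
  }
  return groups;
}

// Two of the error lines from this report, with the temp directory shortened.
const sample = [
  "Error: Load model from /tmp/x/onnx/model_fp16.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2",
  "Error: Load model from /tmp/x/onnx/model_q4.onnx failed:Node (/transformer/Squeeze) Op (Squeeze) [ShapeInferenceError] Dimension of input 0 must be 1 instead of 2",
];
for (const [cause, files] of groupByCause(sample)) {
  console.log(cause, "->", files.join(", "));
}
```

A single group covering every variant suggests the problem sits in the shared base graph rather than in any one quantization step.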