Add/update the quantized ONNX model files and README.md for Transformers.js v3

#1
by whitphx HF Staff - opened

Applied Quantizations

❌ Based on model.onnx with slimming

0%|          | 0/1 [00:00<?, ?it/s]
Processing /tmp/tmpkb5pkodn/model.onnx:   0%|          | 0/1 [00:00<?, ?it/s]

  0%|          | 0/5 [00:00<?, ?it/s]

 - Quantizing to int8:   0%|          | 0/5 [00:00<?, ?it/s]2025-08-18 09:34:34,065 root [INFO] - Quantization parameters for tensor:"pixel_values" not specified
2025-08-18 09:34:34,072 root [INFO] - Quantization parameters for tensor:"/segformer/encoder/block.0.0/layer_norm_1/Add_1_output_0" not specified

 - Quantizing to int8:   0%|          | 0/5 [00:01<?, ?it/s]

Processing /tmp/tmpkb5pkodn/model.onnx:   0%|          | 0/1 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 377, in <module>
    main()
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 374, in main
    quantize(input_folder, output_folder, quantization_args)
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 359, in quantize
    quantize_q8(
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 196, in quantize_q8
    quantizer.quantize_model()
  File "/home/ubuntu/.cache/uv/archive-v0/cQ6A7vyzEBQhtbSuz6CnD/lib/python3.12/site-packages/onnxruntime/quantization/onnx_quantizer.py", line 211, in quantize_model
    op_quantizer.quantize()
  File "/home/ubuntu/.cache/uv/archive-v0/cQ6A7vyzEBQhtbSuz6CnD/lib/python3.12/site-packages/onnxruntime/quantization/operators/base_operator.py", line 21, in quantize
    dequantize_node = self.quantizer._dequantize_value(node_input)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/.cache/uv/archive-v0/cQ6A7vyzEBQhtbSuz6CnD/lib/python3.12/site-packages/onnxruntime/quantization/onnx_quantizer.py", line 945, in _dequantize_value
    assert onnx.numpy_helper.to_array(scale_init).size == 1
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/.cache/uv/archive-v0/cQ6A7vyzEBQhtbSuz6CnD/lib/python3.12/site-packages/onnx/numpy_helper.py", line 349, in to_array
    elem_type = tensor.data_type
                ^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'data_type'

❌ Based on model.onnx without slimming

0%|          | 0/1 [00:00<?, ?it/s]
Processing /tmp/tmpybtwzvta/model.onnx:   0%|          | 0/1 [00:00<?, ?it/s]

  0%|          | 0/5 [00:00<?, ?it/s]

 - Quantizing to int8:   0%|          | 0/5 [00:00<?, ?it/s]2025-08-18 09:34:38,910 root [INFO] - Quantization parameters for tensor:"pixel_values" not specified
2025-08-18 09:34:38,918 root [INFO] - Quantization parameters for tensor:"/segformer/encoder/block.0.0/layer_norm_1/Add_1_output_0" not specified

 - Quantizing to int8:   0%|          | 0/5 [00:02<?, ?it/s]

Processing /tmp/tmpybtwzvta/model.onnx:   0%|          | 0/1 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 377, in <module>
    main()
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 374, in main
    quantize(input_folder, output_folder, quantization_args)
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 359, in quantize
    quantize_q8(
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 196, in quantize_q8
    quantizer.quantize_model()
  File "/home/ubuntu/.cache/uv/archive-v0/cQ6A7vyzEBQhtbSuz6CnD/lib/python3.12/site-packages/onnxruntime/quantization/onnx_quantizer.py", line 211, in quantize_model
    op_quantizer.quantize()
  File "/home/ubuntu/.cache/uv/archive-v0/cQ6A7vyzEBQhtbSuz6CnD/lib/python3.12/site-packages/onnxruntime/quantization/operators/base_operator.py", line 21, in quantize
    dequantize_node = self.quantizer._dequantize_value(node_input)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/.cache/uv/archive-v0/cQ6A7vyzEBQhtbSuz6CnD/lib/python3.12/site-packages/onnxruntime/quantization/onnx_quantizer.py", line 945, in _dequantize_value
    assert onnx.numpy_helper.to_array(scale_init).size == 1
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/.cache/uv/archive-v0/cQ6A7vyzEBQhtbSuz6CnD/lib/python3.12/site-packages/onnx/numpy_helper.py", line 349, in to_array
    elem_type = tensor.data_type
                ^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'data_type'
Xenova changed pull request status to merged

Sign up or log in to comment