Add/update the quantized ONNX model files and README.md for Transformers.js v3

#1
by whitphx - opened

Applied Quantizations

❌ Based on decoder_model.onnx with slimming

0%|          | 0/1 [00:00<?, ?it/s]
Processing /tmp/tmp1azua2m9/decoder_model.onnx:   0%|          | 0/1 [00:00<?, ?it/s]

  0%|          | 0/6 [00:00<?, ?it/s]

 - Quantizing to fp16:   0%|          | 0/6 [00:00<?, ?it/s]/home/ubuntu/src/tjsmigration/transformers.js/scripts/float16.py:85: UserWarning: the float32 number -3.4028234663852886e+38 will be truncated to -10000.0
  warnings.warn(
/home/ubuntu/src/tjsmigration/transformers.js/scripts/float16.py:73: UserWarning: the float32 number 5.960464477539063e-08 will be truncated to 1e-07
  warnings.warn(
/home/ubuntu/src/tjsmigration/transformers.js/scripts/float16.py:92: UserWarning: the float32 number -5.960464477539063e-08 will be truncated to -1e-07
  warnings.warn(

 - Quantizing to fp16:   0%|          | 0/6 [00:07<?, ?it/s]

Processing /tmp/tmp1azua2m9/decoder_model.onnx:   0%|          | 0/1 [00:07<?, ?it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 377, in <module>
    main()
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 374, in main
    quantize(input_folder, output_folder, quantization_args)
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 309, in quantize
    quantize_fp16(
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 223, in quantize_fp16
    check_and_save_model(model_fp16, save_path)
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/utils.py", line 29, in check_and_save_model
    strict_check_model(model)
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/utils.py", line 21, in strict_check_model
    raise e
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/utils.py", line 16, in strict_check_model
    onnx.checker.check_model(model_or_path, full_check=True)
  File "/home/ubuntu/.cache/uv/archive-v0/cQ6A7vyzEBQhtbSuz6CnD/lib/python3.12/site-packages/onnx/checker.py", line 179, in check_model
    C.check_model(
onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] (op_type:Where, node name: /decoder/Where_3): Y has inconsistent type tensor(float16)
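
The `UserWarning` lines in the log above are consistent with clamping float32 values to a minimum positive magnitude of 1e-07 and a maximum finite magnitude of 10000.0 before the fp16 cast. The sketch below only illustrates that behavior. The function name and its default thresholds are assumptions inferred from the warning text, not the actual code in `float16.py`.

```python
def clamp_for_fp16(value: float,
                   min_positive: float = 1e-7,
                   max_finite: float = 1e4) -> float:
    """Hypothetical helper: clamp a float32 value the way the warnings suggest."""
    # Tiny positive values are raised to the smallest allowed magnitude.
    if 0 < value < min_positive:
        return min_positive
    # Tiny negative values are lowered to the negative of that magnitude.
    if -min_positive < value < 0:
        return -min_positive
    # Values beyond the finite range are pulled back to +/- max_finite.
    if value > max_finite:
        return max_finite
    if value < -max_finite:
        return -max_finite
    return value


# The three values reported in the warnings above.
for v in (-3.4028234663852886e+38, 5.960464477539063e-08, -5.960464477539063e-08):
    print(v, "->", clamp_for_fp16(v))
```

These warnings are emitted before the traceback and are not the cause of the failure; the run stops at the `onnx.checker.check_model` call.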

❌ Based on decoder_model.onnx without slimming

None

↳ ❌ fp16: decoder_model_fp16.onnx (added but JS-based E2E test failed)

dtype not specified for "model". Using the default dtype (fp32) for this device (cpu).
file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/tmp/tmplklwzfps/9fe32815325dbdabe9d9d2dedd96a334a07067aa/onnx/model.onnx".
    at getModelFile (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853:27)
    at async getSession (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7132:28)
    at async file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7149:73
    at async Promise.all (index 0)
    at async constructSessions (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7147:31)
    at async Promise.all (index 0)
    at async OPTForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7946:20)
    at async AutoModelForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:14326:20)
    at async Promise.all (index 1)
    at async loadItems (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:23881:5)

Node.js v22.16.0
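
The error above shows the loader looking for `onnx/model.onnx` after falling back to the default dtype (fp32), while this variant was saved as `decoder_model_fp16.onnx`. The sketch below illustrates the file-name convention this implies. The function and the suffix table are hypothetical: the suffixes are copied from the file names listed in this PR, and the empty suffix for fp32 is an assumption based on the log line about the default dtype.

```python
from pathlib import PurePosixPath

# Suffixes as they appear in the file names of this PR.
# "fp32" -> "" is an assumption drawn from the "default dtype (fp32)" log line.
DTYPE_SUFFIX = {
    "fp32": "",
    "fp16": "_fp16",
    "int8": "_int8",
    "uint8": "_uint8",
    "q4": "_q4",
    "q4f16": "_q4f16",
    "bnb4": "_bnb4",
}


def expected_onnx_path(model_dir: str, base_name: str = "model", dtype: str = "fp32") -> str:
    """Hypothetical helper: build the path a loader would look for."""
    file_name = f"{base_name}{DTYPE_SUFFIX[dtype]}.onnx"
    return str(PurePosixPath(model_dir) / "onnx" / file_name)


# With no dtype given and the base name "model", the lookup targets model.onnx,
# which does not exist when only decoder_model*.onnx files are present.
print(expected_onnx_path("/tmp/example"))
print(expected_onnx_path("/tmp/example", dtype="q4"))
```

Under this reading, the same lookup would fail for every `decoder_model_*` and `decoder_with_past_model_*` variant for the same reason, which would explain why the error path is identical in each test.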

↳ ❌ int8: decoder_model_int8.onnx (added but JS-based E2E test failed)

dtype not specified for "model". Using the default dtype (fp32) for this device (cpu).
file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/tmp/tmplklwzfps/9fe32815325dbdabe9d9d2dedd96a334a07067aa/onnx/model.onnx".
    at getModelFile (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853:27)
    at async getSession (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7132:28)
    at async file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7149:73
    at async Promise.all (index 0)
    at async constructSessions (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7147:31)
    at async Promise.all (index 0)
    at async OPTForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7946:20)
    at async AutoModelForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:14326:20)
    at async Promise.all (index 1)
    at async loadItems (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:23881:5)

Node.js v22.16.0

↳ ❌ uint8: decoder_model_uint8.onnx (added but JS-based E2E test failed)

dtype not specified for "model". Using the default dtype (fp32) for this device (cpu).
file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/tmp/tmplklwzfps/9fe32815325dbdabe9d9d2dedd96a334a07067aa/onnx/model.onnx".
    at getModelFile (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853:27)
    at async getSession (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7132:28)
    at async file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7149:73
    at async Promise.all (index 0)
    at async constructSessions (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7147:31)
    at async Promise.all (index 0)
    at async OPTForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7946:20)
    at async AutoModelForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:14326:20)
    at async Promise.all (index 1)
    at async loadItems (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:23881:5)

Node.js v22.16.0

↳ ❌ q4: decoder_model_q4.onnx (added but JS-based E2E test failed)

dtype not specified for "model". Using the default dtype (fp32) for this device (cpu).
file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/tmp/tmplklwzfps/9fe32815325dbdabe9d9d2dedd96a334a07067aa/onnx/model.onnx".
    at getModelFile (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853:27)
    at async getSession (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7132:28)
    at async file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7149:73
    at async Promise.all (index 0)
    at async constructSessions (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7147:31)
    at async Promise.all (index 0)
    at async OPTForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7946:20)
    at async AutoModelForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:14326:20)
    at async Promise.all (index 1)
    at async loadItems (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:23881:5)

Node.js v22.16.0

↳ ❌ q4f16: decoder_model_q4f16.onnx (added but JS-based E2E test failed)

dtype not specified for "model". Using the default dtype (fp32) for this device (cpu).
file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/tmp/tmplklwzfps/9fe32815325dbdabe9d9d2dedd96a334a07067aa/onnx/model.onnx".
    at getModelFile (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853:27)
    at async getSession (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7132:28)
    at async file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7149:73
    at async Promise.all (index 0)
    at async constructSessions (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7147:31)
    at async Promise.all (index 0)
    at async OPTForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7946:20)
    at async AutoModelForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:14326:20)
    at async Promise.all (index 1)
    at async loadItems (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:23881:5)

Node.js v22.16.0

↳ ❌ bnb4: decoder_model_bnb4.onnx (added but JS-based E2E test failed)

dtype not specified for "model". Using the default dtype (fp32) for this device (cpu).
file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/tmp/tmplklwzfps/9fe32815325dbdabe9d9d2dedd96a334a07067aa/onnx/model.onnx".
    at getModelFile (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853:27)
    at async getSession (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7132:28)
    at async file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7149:73
    at async Promise.all (index 0)
    at async constructSessions (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7147:31)
    at async Promise.all (index 0)
    at async OPTForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7946:20)
    at async AutoModelForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:14326:20)
    at async Promise.all (index 1)
    at async loadItems (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:23881:5)

Node.js v22.16.0

❌ Based on decoder_with_past_model.onnx with slimming

0%|          | 0/1 [00:00<?, ?it/s]
Processing /tmp/tmp3j73s0el/decoder_with_past_model.onnx:   0%|          | 0/1 [00:00<?, ?it/s]

  0%|          | 0/6 [00:00<?, ?it/s]

 - Quantizing to fp16:   0%|          | 0/6 [00:00<?, ?it/s]/home/ubuntu/src/tjsmigration/transformers.js/scripts/float16.py:73: UserWarning: the float32 number 5.960464477539063e-08 will be truncated to 1e-07
  warnings.warn(
/home/ubuntu/src/tjsmigration/transformers.js/scripts/float16.py:92: UserWarning: the float32 number -5.960464477539063e-08 will be truncated to -1e-07
  warnings.warn(

 - Quantizing to fp16:   0%|          | 0/6 [00:08<?, ?it/s]

Processing /tmp/tmp3j73s0el/decoder_with_past_model.onnx:   0%|          | 0/1 [00:08<?, ?it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 377, in <module>
    main()
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 374, in main
    quantize(input_folder, output_folder, quantization_args)
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 309, in quantize
    quantize_fp16(
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 223, in quantize_fp16
    check_and_save_model(model_fp16, save_path)
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/utils.py", line 29, in check_and_save_model
    strict_check_model(model)
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/utils.py", line 21, in strict_check_model
    raise e
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/utils.py", line 16, in strict_check_model
    onnx.checker.check_model(model_or_path, full_check=True)
  File "/home/ubuntu/.cache/uv/archive-v0/cQ6A7vyzEBQhtbSuz6CnD/lib/python3.12/site-packages/onnx/checker.py", line 179, in check_model
    C.check_model(
onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] (op_type:Where, node name: /decoder/Where_1): Y has inconsistent type tensor(float16)
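
The `ShapeInferenceError` above concerns a `Where` node. In the ONNX operator definition, the `X` and `Y` inputs of `Where` share one element type, so a float16 `Y` next to an `X` of another type (presumably float32 left over from the original model) is rejected. The sketch below only illustrates that rule. It is a hypothetical stand-in, not the checker's implementation.

```python
def check_where_inputs(x_type: str, y_type: str) -> None:
    """Hypothetical check: X and Y of a Where node must share one element type."""
    bound_type = x_type  # X is seen first and fixes the shared type
    if y_type != bound_type:
        raise TypeError(f"Y has inconsistent type tensor({y_type})")


# Both branches converted to float16: accepted.
check_where_inputs("float16", "float16")

# Only Y converted: rejected, mirroring the message in the traceback.
try:
    check_where_inputs("float", "float16")
except TypeError as err:
    print(err)
```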

❌ Based on decoder_with_past_model.onnx without slimming

None

↳ ❌ fp16: decoder_with_past_model_fp16.onnx (added but JS-based E2E test failed)

dtype not specified for "model". Using the default dtype (fp32) for this device (cpu).
file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/tmp/tmpzz7bbhkd/9fe32815325dbdabe9d9d2dedd96a334a07067aa/onnx/model.onnx".
    at getModelFile (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853:27)
    at async getSession (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7132:28)
    at async file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7149:73
    at async Promise.all (index 0)
    at async constructSessions (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7147:31)
    at async Promise.all (index 0)
    at async OPTForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7946:20)
    at async AutoModelForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:14326:20)
    at async Promise.all (index 1)
    at async loadItems (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:23881:5)

Node.js v22.16.0

↳ ❌ int8: decoder_with_past_model_int8.onnx (added but JS-based E2E test failed)

dtype not specified for "model". Using the default dtype (fp32) for this device (cpu).
file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/tmp/tmpzz7bbhkd/9fe32815325dbdabe9d9d2dedd96a334a07067aa/onnx/model.onnx".
    at getModelFile (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853:27)
    at async getSession (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7132:28)
    at async file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7149:73
    at async Promise.all (index 0)
    at async constructSessions (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7147:31)
    at async Promise.all (index 0)
    at async OPTForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7946:20)
    at async AutoModelForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:14326:20)
    at async Promise.all (index 1)
    at async loadItems (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:23881:5)

Node.js v22.16.0

↳ ❌ uint8: decoder_with_past_model_uint8.onnx (added but JS-based E2E test failed)

dtype not specified for "model". Using the default dtype (fp32) for this device (cpu).
file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/tmp/tmpzz7bbhkd/9fe32815325dbdabe9d9d2dedd96a334a07067aa/onnx/model.onnx".
    at getModelFile (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853:27)
    at async getSession (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7132:28)
    at async file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7149:73
    at async Promise.all (index 0)
    at async constructSessions (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7147:31)
    at async Promise.all (index 0)
    at async OPTForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7946:20)
    at async AutoModelForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:14326:20)
    at async Promise.all (index 1)
    at async loadItems (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:23881:5)

Node.js v22.16.0

↳ ❌ q4: decoder_with_past_model_q4.onnx (added but JS-based E2E test failed)

dtype not specified for "model". Using the default dtype (fp32) for this device (cpu).
file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/tmp/tmpzz7bbhkd/9fe32815325dbdabe9d9d2dedd96a334a07067aa/onnx/model.onnx".
    at getModelFile (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853:27)
    at async getSession (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7132:28)
    at async file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7149:73
    at async Promise.all (index 0)
    at async constructSessions (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7147:31)
    at async Promise.all (index 0)
    at async OPTForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7946:20)
    at async AutoModelForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:14326:20)
    at async Promise.all (index 1)
    at async loadItems (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:23881:5)

Node.js v22.16.0

↳ ❌ q4f16: decoder_with_past_model_q4f16.onnx (added but JS-based E2E test failed)

dtype not specified for "model". Using the default dtype (fp32) for this device (cpu).
file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/tmp/tmpzz7bbhkd/9fe32815325dbdabe9d9d2dedd96a334a07067aa/onnx/model.onnx".
    at getModelFile (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853:27)
    at async getSession (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7132:28)
    at async file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7149:73
    at async Promise.all (index 0)
    at async constructSessions (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7147:31)
    at async Promise.all (index 0)
    at async OPTForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7946:20)
    at async AutoModelForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:14326:20)
    at async Promise.all (index 1)
    at async loadItems (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:23881:5)

Node.js v22.16.0

↳ ❌ bnb4: decoder_with_past_model_bnb4.onnx (added but JS-based E2E test failed)

dtype not specified for "model". Using the default dtype (fp32) for this device (cpu).
file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/tmp/tmpzz7bbhkd/9fe32815325dbdabe9d9d2dedd96a334a07067aa/onnx/model.onnx".
    at getModelFile (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:30853:27)
    at async getSession (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7132:28)
    at async file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7149:73
    at async Promise.all (index 0)
    at async constructSessions (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7147:31)
    at async Promise.all (index 0)
    at async OPTForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:7946:20)
    at async AutoModelForCausalLM.from_pretrained (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:14326:20)
    at async Promise.all (index 1)
    at async loadItems (file:///home/ubuntu/src/tjsmigration/node_modules/.pnpm/@huggingface+transformers@3.5.2/node_modules/@huggingface/transformers/dist/transformers.node.mjs:23881:5)

Node.js v22.16.0

❌ Based on decoder_model_merged.onnx with slimming

The base model decoder_model_merged.onnx has been renamed to model.onnx.

0%|          | 0/1 [00:00<?, ?it/s]
Processing /tmp/tmp7gdbuxp2/model.onnx:   0%|          | 0/1 [00:00<?, ?it/s]

  0%|          | 0/6 [00:00<?, ?it/s]

 - Quantizing to fp16:   0%|          | 0/6 [00:00<?, ?it/s]/home/ubuntu/src/tjsmigration/transformers.js/scripts/float16.py:73: UserWarning: the float32 number 5.960464477539063e-08 will be truncated to 1e-07
  warnings.warn(
/home/ubuntu/src/tjsmigration/transformers.js/scripts/float16.py:92: UserWarning: the float32 number -5.960464477539063e-08 will be truncated to -1e-07
  warnings.warn(
/home/ubuntu/src/tjsmigration/transformers.js/scripts/float16.py:85: UserWarning: the float32 number -3.4028234663852886e+38 will be truncated to -10000.0
  warnings.warn(

 - Quantizing to fp16:   0%|          | 0/6 [00:08<?, ?it/s]

Processing /tmp/tmp7gdbuxp2/model.onnx:   0%|          | 0/1 [00:08<?, ?it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 377, in <module>
    main()
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 374, in main
    quantize(input_folder, output_folder, quantization_args)
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 309, in quantize
    quantize_fp16(
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/quantize.py", line 223, in quantize_fp16
    check_and_save_model(model_fp16, save_path)
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/utils.py", line 29, in check_and_save_model
    strict_check_model(model)
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/utils.py", line 21, in strict_check_model
    raise e
  File "/home/ubuntu/src/tjsmigration/transformers.js/scripts/utils.py", line 16, in strict_check_model
    onnx.checker.check_model(model_or_path, full_check=True)
  File "/home/ubuntu/.cache/uv/archive-v0/cQ6A7vyzEBQhtbSuz6CnD/lib/python3.12/site-packages/onnx/checker.py", line 179, in check_model
    C.check_model(
onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] Inference error(s): (op_type:If, node name: optimum::if): [ShapeInferenceError] (op_type:Max, node name: /decoder/layers.0/self_attn/Max): data_0 has inconsistent type tensor(float16)

✅ Based on decoder_model_merged.onnx without slimming

The base model decoder_model_merged.onnx has been renamed to model.onnx.

↳ ✅ fp16: model_fp16.onnx (added)
↳ ✅ int8: model_int8.onnx (added)
↳ ✅ uint8: model_uint8.onnx (added)
↳ ✅ q4: model_q4.onnx (added)
↳ ✅ q4f16: model_q4f16.onnx (added)
↳ ✅ bnb4: model_bnb4.onnx (added)

Xenova changed pull request status to merged
