license: apache-2.0
library_name: transformers.js
pipeline_tag: text-generation
base_model: openbmb/MiniCPM5-1B
tags:
  - transformers.js
  - onnx
  - onnxruntime-web
  - llama
  - minicpm5
  - text-generation
  - browser
  - webgpu

MiniCPM5-1B ONNX Web

A q4 (4-bit quantized) ONNX export of openbmb/MiniCPM5-1B for in-browser text generation with Transformers.js.

Files

onnx/model_q4.onnx: ONNX Runtime 4-bit MatMul quantized decoder with KV cache.
config.json: includes transformers.js_config.dtype = "q4" so Transformers.js loads the q4 artifact by default.
Tokenizer and generation config files: copied from the source model export.
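
As a rough illustration of the config.json entry described above, the sketch below parses a hypothetical one-key excerpt and reads the default dtype. The excerpt string is an assumption made for illustration; the real config.json contains many more fields, and Transformers.js performs this lookup internally rather than through code like this.

```javascript
// Hypothetical excerpt of config.json (assumption: only the relevant key is shown).
const configExcerpt = '{ "transformers.js_config": { "dtype": "q4" } }';

const config = JSON.parse(configExcerpt);

// The key contains a dot, so it must be read with bracket notation.
const defaultDtype = config["transformers.js_config"]?.dtype ?? null;

console.log(defaultDtype);
```

With this default in place, the explicit dtype option in the usage example is redundant but harmless, and it documents which artifact is expected.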

Usage

import { pipeline } from "@huggingface/transformers";

// Load the q4 ONNX weights and run them on the GPU through WebGPU.
const generator = await pipeline("text-generation", "Mike0021/MiniCPM5-1B-ONNX-Web", {
  dtype: "q4",
  device: "webgpu",
});

// Sample up to 64 new tokens at a low temperature.
const output = await generator("Briefly introduce yourself.", {
  max_new_tokens: 64,
  temperature: 0.2,
  do_sample: true,
});
console.log(output[0].generated_text);

If WebGPU is unavailable in the browser, set device: "wasm" instead.
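
One way to choose between the two devices at runtime is to check for the WebGPU entry point before creating the pipeline. The helper below is a minimal sketch: pickDevice is a hypothetical name, not part of Transformers.js, and the check only tests whether navigator.gpu exists, which does not guarantee that a usable adapter is available.

```javascript
// Hypothetical helper (not part of Transformers.js): returns "webgpu" when the
// browser exposes the WebGPU entry point, otherwise falls back to "wasm".
function pickDevice(nav = globalThis.navigator) {
  // navigator.gpu is only defined in browsers that ship WebGPU.
  // Its presence does not guarantee a usable adapter.
  const hasWebGPU = Boolean(nav && nav.gpu);
  return hasWebGPU ? "webgpu" : "wasm";
}

console.log(pickDevice());
```

The returned string can be passed as the device option in the pipeline call above. A stricter check could also await navigator.gpu.requestAdapter() and fall back to "wasm" when it resolves to null.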