Add q4 Transformers.js export of openbmb/MiniCPM5-1B

	---
	license: apache-2.0
	library_name: transformers.js
	pipeline_tag: text-generation
	base_model: openbmb/MiniCPM5-1B
	tags:
	- transformers.js
	- onnx
	- onnxruntime-web
	- llama
	- minicpm5
	- text-generation
	- browser
	- webgpu
	---

	# MiniCPM5-1B ONNX Web

	Transformers.js q4 ONNX export of `openbmb/MiniCPM5-1B` for browser text generation.

	## Files

	- `onnx/model_q4.onnx`: ONNX Runtime 4-bit MatMul quantized decoder with KV cache.
	- `config.json`: includes `transformers.js_config.dtype = "q4"` so Transformers.js loads the q4 artifact by default.
	- tokenizer and generation config files copied from the source model export.
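The `dtype` default mentioned above lives under a single top-level key in `config.json` that is literally named `transformers.js_config` (the dot is part of the key, not a nested path). As a minimal sketch, assuming the config has already been fetched and parsed, it could be read like this; `defaultDtype` and `exampleConfig` are hypothetical names used only for illustration, and the example object shows just the relevant fragment rather than the full file.

```javascript
// Hypothetical helper: return the default dtype declared in a parsed
// config.json, or null when no Transformers.js-specific config is present.
function defaultDtype(config) {
  // The key contains a dot, so bracket notation is required.
  const jsConfig = config["transformers.js_config"];
  return jsConfig?.dtype ?? null;
}

// Assumed minimal shape of the relevant part of config.json.
const exampleConfig = { "transformers.js_config": { dtype: "q4" } };

console.log(defaultDtype(exampleConfig)); // "q4"
console.log(defaultDtype({}));            // null
```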

	## Usage

	```js
	import { pipeline } from "@huggingface/transformers";

	// Load the q4 ONNX weights and run inference on WebGPU.
	const generator = await pipeline("text-generation", "Mike0021/MiniCPM5-1B-ONNX-Web", {
	  dtype: "q4",
	  device: "webgpu",
	});

	// Sample up to 64 new tokens at a low temperature.
	const output = await generator("Briefly introduce yourself.", {
	  max_new_tokens: 64,
	  temperature: 0.2,
	  do_sample: true,
	});
	console.log(output[0].generated_text);
	```

	If WebGPU is unavailable, use `device: "wasm"` in the browser.
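The fallback can also be chosen at runtime. The following is a minimal sketch, not part of the original export: `pickDevice` is a hypothetical helper that checks for `navigator.gpu` and asks for an adapter via the standard `requestAdapter()` call, returning `"wasm"` whenever WebGPU is missing or no adapter is available.

```javascript
// Hypothetical helper: choose "webgpu" when an adapter can be obtained,
// otherwise fall back to "wasm". The navigator object is injectable so the
// function can be exercised outside a browser.
async function pickDevice(nav = globalThis.navigator) {
  // No navigator or no WebGPU entry point: use the WASM backend.
  if (!nav || !nav.gpu) return "wasm";
  try {
    // requestAdapter() resolves to null when no suitable adapter exists.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm";
  } catch {
    // Treat any failure while probing WebGPU as "unavailable".
    return "wasm";
  }
}
```

The resolved value could then be passed as the `device` option of the `pipeline` call, for example `device: await pickDevice()`.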