EuroLLM-1.7B-Instruct-ONNX
This repository contains ONNX weights for utter-project/EuroLLM-1.7B-Instruct
prepared for use with Transformers.js.
Available dtypes in this export: fp32, q4, q8.
The repository layout follows the standard Transformers.js convention:
- tokenizer and config files in the repository root
- ONNX model files inside onnx/
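As a rough illustration of this layout, the sketch below maps each dtype listed for this export to the file path Transformers.js would be expected to request. The suffix table is an assumption based on the library's usual naming (model.onnx, model_quantized.onnx, model_q4.onnx), and expectedOnnxPath is a hypothetical helper, not a library function. Check the actual file listing in onnx/ before relying on it.

```javascript
// Assumed dtype -> file-name suffix table (believed to match Transformers.js defaults).
const DTYPE_SUFFIX = new Map([
  ["fp32", ""],
  ["q8", "_quantized"],
  ["q4", "_q4"],
]);

// Hypothetical helper: repo-relative path expected for a given dtype.
function expectedOnnxPath(dtype, baseName = "model") {
  if (!DTYPE_SUFFIX.has(dtype)) {
    throw new Error(`dtype "${dtype}" is not listed for this export`);
  }
  return `onnx/${baseName}${DTYPE_SUFFIX.get(dtype)}.onnx`;
}

for (const dtype of DTYPE_SUFFIX.keys()) {
  console.log(dtype, "->", expectedOnnxPath(dtype));
}
```

A dtype outside the table throws instead of guessing a file name, which makes a mismatch between the requested dtype and the exported files visible early.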
Usage (Transformers.js)
import { pipeline } from "@huggingface/transformers";
// Use the full repository id, including the namespace.
const generator = await pipeline("text-generation", "flackzz/EuroLLM-1.7B-Instruct-ONNX", {
  device: "webgpu",
  dtype: "q4", // dtypes listed for this export: "fp32", "q4", "q8"
});
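// Chat-style input: a list of { role, content } messages.
const messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Schreibe einen kurzen Satz auf Deutsch." },
];
const output = await generator(messages, { max_new_tokens: 64 });
// For chat input, generated_text is the conversation; its last entry is the reply.
console.log(output[0].generated_text.at(-1).content);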
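The final line of the example depends on the shape of the result for chat-style input. A minimal sketch of that access pattern follows, using a mock object rather than real model output. lastReply is a hypothetical helper, and the assumed shape (an array whose first item holds generated_text as a list of { role, content } messages) is inferred from the example above.

```javascript
// Hypothetical helper: pull the newest message out of a chat-style result.
function lastReply(output) {
  const conversation = output[0].generated_text;
  // Assumption: a plain-string prompt yields a string instead of a message list.
  if (!Array.isArray(conversation)) return String(conversation);
  return conversation.at(-1).content;
}

// Mock result shaped like the one the usage example reads (not real model output).
const mockOutput = [
  {
    generated_text: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Schreibe einen kurzen Satz auf Deutsch." },
      { role: "assistant", content: "<model reply goes here>" },
    ],
  },
];

console.log(lastReply(mockOutput));
```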
Recommended choices:
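- fp32: highest precision, typically for WebGPU
- fp16: smaller WebGPU model with good speed/quality tradeoff (not among the dtypes listed for this export; use only if an fp16 file is present in onnx/)
- q8: smaller CPU/WASM model
- q4: smallest model, best for constrained devices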
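One way to turn these recommendations into code is a small selector that returns the device and dtype options for pipeline(). This is only a sketch: chooseRuntime and its constrained flag are hypothetical, the choices are restricted to the dtypes listed for this export (fp32, q4, q8), and the WebGPU probe is the standard check for navigator.gpu.

```javascript
// Hypothetical helper: map a coarse device profile to pipeline() options.
function chooseRuntime({ hasWebGPU, constrained }) {
  // Constrained devices get the smallest model regardless of backend.
  if (constrained) return { device: hasWebGPU ? "webgpu" : "wasm", dtype: "q4" };
  // With WebGPU available, prefer the highest-precision export.
  if (hasWebGPU) return { device: "webgpu", dtype: "fp32" };
  // Otherwise fall back to the smaller CPU/WASM model.
  return { device: "wasm", dtype: "q8" };
}

// Browser-side probe; evaluates to false where navigator.gpu is absent.
const hasWebGPU = typeof navigator !== "undefined" && "gpu" in navigator;

console.log(chooseRuntime({ hasWebGPU, constrained: false }));
```

The returned object can be passed as the third argument to pipeline() in place of the hard-coded options.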
Source Model
- Base model: utter-project/EuroLLM-1.7B-Instruct
- Export format: ONNX
- Intended runtime: Transformers.js
Model tree for flackzz/EuroLLM-1.7B-Instruct-ONNX
- Base model: utter-project/EuroLLM-1.7B
- Finetuned: utter-project/EuroLLM-1.7B-Instruct