How to use Xenova/bge-m3 with Transformers.js:
// npm i @huggingface/transformers
import { pipeline } from '@huggingface/transformers';

// Allocate pipeline
const pipe = await pipeline('feature-extraction', 'Xenova/bge-m3');
use in web browser
#3
by ciekawy - opened
OK, I managed to run everything with the transformers v3 branch and the latest onnxruntime-web, thanks to https://github.com/microsoft/onnxruntime/issues/20876
However, I have now noticed that wasm is up to 2x faster than webgpu on an Apple M3 with enough RAM (using the quantized model and measuring only single extractor calls).
I would recommend setting the dtype to fp16 or q4, for example with await pipeline('feature-extraction', 'Xenova/bge-m3', { dtype: 'fp16' }).
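To check whether that changes the wasm vs. webgpu picture, something like the rough sketch below could be used to time single extractor calls per backend. It is not a definitive benchmark: `median`, `timeCalls` and `compareBackends` are hypothetical helper names, the warmup and run counts are arbitrary, `q8` is only a guess at what "quantized" meant above, and the `pooling: 'cls'` option is assumed to be the right choice for this model. It also assumes the repo ships weights for each listed dtype.

```javascript
// Median of a list of timings; less sensitive to outliers than the mean.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Time `runs` calls of an async function after `warmup` untimed calls.
// Warmup matters here because the first call on a backend may include
// one-off setup cost. Returns the median time in milliseconds.
async function timeCalls(fn, { warmup = 2, runs = 10 } = {}) {
  for (let i = 0; i < warmup; i++) await fn();
  const timings = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await fn();
    timings.push(performance.now() - start);
  }
  return median(timings);
}

// Hypothetical comparison helper: loads the model once per configuration
// and reports the median time of a single extractor call.
async function compareBackends(text = 'Hello world') {
  const { pipeline } = await import('@huggingface/transformers');
  const configs = [
    { device: 'wasm', dtype: 'q8' },     // assumption: the "quantized" setup
    { device: 'webgpu', dtype: 'fp16' }, // suggested above
    { device: 'webgpu', dtype: 'q4' },   // suggested above
  ];
  const results = {};
  for (const config of configs) {
    const extractor = await pipeline('feature-extraction', 'Xenova/bge-m3', config);
    results[`${config.device}/${config.dtype}`] = await timeCalls(
      () => extractor(text, { pooling: 'cls', normalize: true }),
    );
  }
  return results;
}
```

Calling `console.table(await compareBackends())` in a WebGPU-capable browser should print one median per configuration. Keep in mind that single short inputs may not show webgpu at its best, so longer texts or batches are worth timing too.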