MiA-Emb-ONNX
Collection
ONNX conversions of MindscapeRAG MiA embedding models (3 items).
ONNX conversion of MindscapeRAG/MiA-Emb-0.6B for fast CPU/GPU inference.
```python
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("maxiboch/MiA-Emb-0.6B-onnx")

# Assumes model.onnx has been downloaded locally (e.g. via huggingface_hub's
# hf_hub_download). ONNX Runtime falls back to the CPU provider when the CUDA
# provider is unavailable.
session = ort.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

inputs = tokenizer("Your text here", return_tensors="np", padding=True, truncation=True)
outputs = session.run(None, dict(inputs))
embeddings = outputs[0]
```
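Note that the first ONNX output is typically the per-token hidden states of shape `(batch, seq_len, dim)`, not a single sentence vector. Embedding models usually reduce this with a pooling step; the sketch below shows mask-aware mean pooling followed by L2 normalization, using a toy array in place of the real model output. Whether MiA-Emb uses mean pooling or another strategy (e.g. last-token pooling) is an assumption here, so check the base model's card before relying on it.

```python
import numpy as np

def mean_pool(last_hidden_state, attention_mask):
    # last_hidden_state: (batch, seq_len, dim); attention_mask: (batch, seq_len)
    mask = attention_mask[..., None].astype(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(axis=1)      # sum over non-padding tokens
    counts = np.clip(mask.sum(axis=1), 1e-9, None)       # avoid division by zero
    pooled = summed / counts
    # L2-normalize so cosine similarity reduces to a dot product
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

# Toy stand-in for outputs[0]: batch of 1, seq_len 3, dim 2; last token is padding.
hidden = np.array([[[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]])
mask = np.array([[1, 1, 0]])
emb = mean_pool(hidden, mask)  # shape (1, 2), unit-length rows
```

With the real model, you would call `mean_pool(outputs[0], inputs["attention_mask"])` instead of the toy arrays.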
Converted to ONNX by @maxiboch.
Base model: Qwen/Qwen3-0.6B-Base