Temporary fix for https://github.com/urchade/GLiNER/issues/314
1a. Load ONNX FP32 (CPU)

```python
from gliner import GLiNER

model = GLiNER.from_pretrained(
    model_id='ataleckij/gliner_multi-v2.1_onnx',
    map_location='cpu',
    load_tokenizer=True,
    load_onnx_model=True,
    onnx_model_file='model.onnx',
)
model.eval()
```
1b. Load ONNX FP16 (GPU)

```python
from gliner import GLiNER

model = GLiNER.from_pretrained(
    model_id='ataleckij/gliner_multi-v2.1_onnx',
    map_location='cuda',  # torch expects 'cuda', not 'gpu'
    load_tokenizer=True,
    load_onnx_model=True,
    onnx_model_file='model_fp16.onnx',
)
model = model.half()
model.eval()
```
2. Inference

```python
import torch

# texts: list[str] of input documents; entity_types: list[str] of labels;
# threshold and batch_size are user-supplied values.
with torch.no_grad():
    preds_batch: list[list[dict]] = model.inference(
        texts,
        entity_types,
        threshold=threshold,
        batch_size=batch_size,
        flat_ner=False,
    )
```
Model tree for ataleckij/gliner_multi-v2.1_onnx
- Base model: urchade/gliner_multi-v2.1
- Quantized from: onnx-community/gliner_multi-v2.1