GLiNER ONNX - Small

A pre-converted ONNX model for GLiNER zero-shot named entity recognition (NER).

Model Details

  • Base Model: urchade/gliner_small-v2.1
  • Format: ONNX with INT8 quantization
  • Use Case: Fast CPU inference for entity extraction
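
For reference, a quantized file like model_quantized.onnx is typically produced from the full-precision export with onnxruntime's dynamic quantization. This script is not part of the original conversion and is shown only as a minimal sketch of the technique:

from onnxruntime.quantization import QuantType, quantize_dynamic

# Dynamic INT8 quantization: weights are stored as INT8,
# activations are quantized on the fly at inference time.
quantize_dynamic(
    model_input="model.onnx",
    model_output="model_quantized.onnx",
    weight_type=QuantType.QInt8,
)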

Usage

from gliner import GLiNER

# Load the INT8-quantized ONNX model from this repo
model = GLiNER.from_pretrained(
    "nexuswho/gliner-onnx-small",
    load_onnx_model=True,
    onnx_model_file="model_quantized.onnx",
)

# Zero-shot extraction: pass the input text and the candidate
# entity labels to detect
entities = model.predict_entities(
    "AWS Lambda integrates with Amazon S3",
    ["cloud service", "storage service"],
)
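
predict_entities returns a list of dicts, each carrying the matched span, its assigned label, and a confidence score. A simple way to inspect the results:

for entity in entities:
    print(entity["text"], "=>", entity["label"], f"({entity['score']:.2f})")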

Files

  • model_quantized.onnx - INT8 quantized ONNX model (recommended)
  • model.onnx - Full precision ONNX model
  • config.json, tokenizer.json, etc. - Configuration and tokenizer files
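
To trade some speed for the original full-precision accuracy, point onnx_model_file at the unquantized export instead:

model = GLiNER.from_pretrained(
    "nexuswho/gliner-onnx-small",
    load_onnx_model=True,
    onnx_model_file="model.onnx",  # full precision instead of INT8
)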

License

Apache 2.0 (same as base GLiNER model)
