GLiNER ONNX - Small

A pre-converted ONNX model for GLiNER zero-shot named entity recognition (NER).

Model Details

  • Base Model: urchade/gliner_small-v2.1
  • Format: ONNX with INT8 quantization
  • Use Case: Fast CPU inference for entity extraction
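
For reference, a quantized file like model_quantized.onnx is typically produced from the full-precision export with onnxruntime's dynamic quantization. This script is not part of the original conversion and is shown only as a minimal sketch of the technique:

from onnxruntime.quantization import QuantType, quantize_dynamic

# Dynamic INT8 quantization: weights are stored as INT8,
# activations are quantized on the fly at inference time.
quantize_dynamic(
    model_input="model.onnx",
    model_output="model_quantized.onnx",
    weight_type=QuantType.QInt8,
)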

Usage

from gliner import GLiNER

# Load the INT8-quantized ONNX model from this repo
model = GLiNER.from_pretrained(
    "nexuswho/gliner-onnx-small",
    load_onnx_model=True,
    onnx_model_file="model_quantized.onnx",
)

# Zero-shot extraction: pass the input text and the candidate
# entity labels to detect
entities = model.predict_entities(
    "AWS Lambda integrates with Amazon S3",
    ["cloud service", "storage service"],
)
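
predict_entities returns a list of dicts, each carrying the matched span, its assigned label, and a confidence score. A simple way to inspect the results:

for entity in entities:
    print(entity["text"], "=>", entity["label"], f"({entity['score']:.2f})")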

Files

  • model_quantized.onnx - INT8 quantized ONNX model (recommended)
  • model.onnx - Full precision ONNX model
  • config.json, tokenizer.json, etc. - Configuration and tokenizer files
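
To trade some speed for the original full-precision accuracy, point onnx_model_file at the unquantized export instead:

model = GLiNER.from_pretrained(
    "nexuswho/gliner-onnx-small",
    load_onnx_model=True,
    onnx_model_file="model.onnx",  # full precision instead of INT8
)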

License

Apache 2.0 (same as base GLiNER model)
