GLiNER2 Base v1 Q4_K - Zero-Shot Entity Recognition

GLiNER2 Base v1 Q4_K is a Termite split GGUF export of fastino/gliner2-base-v1 for zero-shot named entity recognition and relation-oriented extraction workflows.

Built by antflydb for use with Termite, a standalone ML inference service for embeddings, chunking, reranking, and recognition.

Architecture

Text + labels -> DeBERTa encoder GGUF -> GLiNER2 span head GGUF -> labeled spans
  • Encoder: DeBERTa-style encoder from fastino/gliner2-base-v1, exported as encoder.gguf.
  • Head: GLiNER2 span/head sidecar exported as gliner_head.gguf.
  • Quantization: eligible encoder, embedding, relative-position, and head tensors are stored as Q4_K; small normalization/bias tensors remain dense.
  • Bundle format: termite_bundle.json marks this as a gliner2_split_bundle/v1.
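
As a rough illustration of what Q4_K storage costs, the sketch below estimates tensor size under the assumption that Termite's Q4_K matches the ggml block layout: super-blocks of 256 weights, each stored in 144 bytes (about 4.5 bits per weight). The function names and the example shape are hypothetical and are not taken from the Termite codebase.

```python
import math

QK_K = 256  # weights per Q4_K super-block (ggml layout, assumed here)
# fp16 scale + fp16 min + 12 bytes of packed sub-block scales + 128 bytes of 4-bit values
Q4_K_BLOCK_BYTES = 2 + 2 + 12 + QK_K // 2  # 144 bytes


def q4_k_nbytes(shape):
    """Estimate the stored size in bytes of a Q4_K tensor with the given shape.

    The last entry of `shape` is treated as the contiguous row length, which
    must be a multiple of 256 for the tensor to be quantizable as Q4_K.
    """
    if shape[-1] % QK_K != 0:
        raise ValueError(f"row length {shape[-1]} is not a multiple of {QK_K}")
    return math.prod(shape) // QK_K * Q4_K_BLOCK_BYTES


def bits_per_weight():
    """Average number of bits spent per weight, including scale metadata."""
    return Q4_K_BLOCK_BYTES * 8 / QK_K


# Hypothetical 1024 x 768 weight matrix (not a tensor from this bundle).
example_bytes = q4_k_nbytes((1024, 768))
```

Under this layout a row length that is not a multiple of 256 cannot be stored as Q4_K at all, which is one reason a bias or normalization vector might be left dense.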

Intended Uses

  • Zero-shot named entity recognition with caller-provided labels
  • Entity extraction for Antfly indexes and document pipelines
  • Lightweight local recognition experiments through Termite
  • Relation extraction workflows that build on GLiNER2 entity spans

How to Use with Termite

termite recognize ./gliner2-base-v1-q4_k \
  "John works at Google in California." \
  --label person \
  --label organization \
  --label location \
  --backend native \
  --graph-runtime partitioned

Example output from local validation:

{
  "entities": [[
    {"text": "John", "label": "person", "score": 0.99997926},
    {"text": "Google", "label": "organization", "score": 0.9999995},
    {"text": "California.", "label": "location", "score": 0.9748632}
  ]]
}
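
The output above can be post-processed with a few lines of Python. The sketch below assumes the outer list in "entities" holds one inner list per input text, which is inferred from the sample rather than from Termite documentation. The `flatten_entities` helper and its `min_score` threshold are hypothetical. It also trims trailing punctuation, since the sample span for the location is "California." with the period attached.

```python
import json
import string

# Sample output copied from the validation run above.
RAW = """
{
  "entities": [[
    {"text": "John", "label": "person", "score": 0.99997926},
    {"text": "Google", "label": "organization", "score": 0.9999995},
    {"text": "California.", "label": "location", "score": 0.9748632}
  ]]
}
"""


def flatten_entities(payload, min_score=0.5):
    """Flatten nested entity lists into one list of dicts.

    Entities scoring below `min_score` are dropped, and trailing
    punctuation is stripped from each span text.
    """
    results = []
    for doc_index, doc_entities in enumerate(payload["entities"]):
        for ent in doc_entities:
            if ent["score"] < min_score:
                continue
            results.append({
                "doc": doc_index,
                "text": ent["text"].rstrip(string.punctuation),
                "label": ent["label"],
                "score": ent["score"],
            })
    return results


entities = flatten_entities(json.loads(RAW))
```

Raising `min_score` is a simple way to trade recall for precision when the label set is broad.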

Export Details

This bundle was created from the local Termite model cache:

termite export /Users/timkaye/.termite/models/recognizers/fastino/gliner2-base-v1 \
  --target gguf \
  --format q4_k \
  --output /private/tmp/gliner2-q4k-export-full-bundle/encoder.gguf

This export uses the full q4_k pass, which also quantizes the token embeddings, the relative-position embeddings, and the GLiNER head position embedding. Running the bundle therefore requires a Termite build whose native recognizer supports rank-3 quantized embedding tables.

GGUF Files

File | Description | Local size
encoder.gguf | DeBERTa encoder with 74 Q4_K tensors, including token and relative-position embeddings | ~102 MB
gliner_head.gguf | GLiNER2 span/head sidecar with 12 Q4_K tensors, including count_embed.pos_embedding.weight | ~27 MB
termite_bundle.json | Termite split-bundle marker | <1 KB

Additional files include tokenizer.json, tokenizer_config.json, special_tokens_map.json, added_tokens.json, config.json, gliner_config.json, and model_manifest.json.
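
Independently of Termite, the two .gguf files can be sanity-checked by reading the fixed GGUF header. The sketch below assumes a little-endian GGUF file of version 2 or later, where the header is the 4-byte magic "GGUF", a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count. The `read_gguf_header` helper is hypothetical and is not part of Termite.

```python
import struct


def read_gguf_header(path):
    """Read the fixed 24-byte GGUF header and return its fields as a dict."""
    with open(path, "rb") as f:
        raw = f.read(24)
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError(f"{path} does not start with a GGUF header")
    # Little-endian: uint32 version, uint64 tensor count, uint64 metadata count.
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:24])
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

Calling `read_gguf_header("encoder.gguf")` from inside the bundle directory would return the total tensor count. That total includes tensors left dense, so it need not match the Q4_K tensor counts listed above.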

Validation

The exported bundle was inspected and run locally with Termite:

termite smoke /private/tmp/gliner2-q4k-export-full-bundle test --inspect-only

Inspection found a parseable DeBERTa GGUF with no unsupported tensor types and no missing required tensors. The bundle was then run through recognition:

termite recognize /private/tmp/gliner2-q4k-export-full-bundle \
  "John works at Google in California." \
  --label person \
  --label organization \
  --label location \
  --backend native \
  --graph-runtime partitioned

Limitations

  • This is a Termite split GGUF bundle, not a generic Transformers checkpoint.
  • The current package is intended for Termite native inference. Metal support depends on a Termite build with GGUF quantized embedding lookup support.
  • Small tensors such as normalization weights and biases remain dense where Q4_K is not appropriate.
  • Accuracy and label behavior inherit the limitations of fastino/gliner2-base-v1 and the caller-provided label set.

Citation

If you use this bundle, cite the upstream GLiNER2 model and the underlying DeBERTa backbone as appropriate for your work.
