GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer
Paper: arXiv:2311.08526
A token-level GLiNER model fine-tuned from knowledgator/gliner-multitask-v1.0 on the DeBERTa-v2-xlarge backbone.
| Property | Value |
|---|---|
| Backbone | microsoft/deberta-v2-xlarge |
| Hidden size | 1536 (encoder) / 1024 (GLiNER head) |
| Num layers | 24 |
| Attention heads | 24 |
| Span mode | token_level |
| Max sequence length | 1024 |
| Max entity types per sample | 30 |
| Max span width | 12 tokens |
| Subtoken pooling | first |
| Precision | bf16 |
| Model size | ~3.5 GB |
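Because the encoder's maximum sequence length is 1024 tokens, long documents should be split into overlapping chunks before prediction. A minimal sketch of such a splitter is below; the `chunk_text` helper, the 900-word budget, and the 50-word overlap are illustrative assumptions (real subword token counts differ from whitespace word counts), not part of the model or library:

```python
def chunk_text(text: str, max_words: int = 900, overlap: int = 50) -> list[str]:
    """Split a long document into overlapping word-based chunks.

    Uses whitespace words as a rough proxy for subword tokens; the overlap
    keeps entities that straddle a chunk boundary recoverable.
    """
    words = text.split()
    if len(words) <= max_words:
        return [text]
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks

# Example: a 2000-word document becomes 3 overlapping chunks.
doc = " ".join(f"word{i}" for i in range(2000))
chunks = chunk_text(doc)
print(len(chunks))
```

Predictions from each chunk can then be merged, offsetting character spans by each chunk's start position.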
This model uses the GLiNER architecture with special tokens (`<<ENT>>`, `<<SEP>>`) and a custom vocabulary size of 128,003.

| Parameter | Value |
|---|---|
| Base model | knowledgator/gliner-multitask-v1.0 |
| Optimizer | AdamW (beta1=0.9, beta2=0.999, eps=1e-8) |
| Encoder LR | 9e-6 |
| Other LR | 5e-5 / 7e-6 |
| Scheduler | Cosine with linear warmup (10%) |
| Batch size | 40 per device |
| Gradient accumulation | 1 |
| Max steps | 1,000 (150,000 total planned) |
| Weight decay | 0.01 (encoder) / 0.001 (other) |
| Max grad norm | 1.0 |
| Gradient checkpointing | Enabled |
| Dropout | 0.35 |
| Seed | 42 |
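The scheduler in the table (cosine decay with 10% linear warmup) can be sketched as a small function. The values below come from the table (1,000 max steps, encoder peak LR of 9e-6); the `lr_at` helper itself is an illustrative reconstruction, not the actual training code:

```python
import math

def lr_at(step: int, peak_lr: float = 9e-6, total_steps: int = 1000,
          warmup_frac: float = 0.10) -> float:
    """Linear warmup to peak_lr over the first warmup_frac of steps,
    then cosine decay to zero over the remaining steps."""
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(50))    # halfway through warmup: half the peak LR
print(lr_at(100))   # end of warmup: peak LR
print(lr_at(1000))  # end of training: decayed to ~0
```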
Evaluated across 9 benchmark datasets, with a mean F1 of 0.280.
Full evaluation results available at arthrod/gliner_review_comparison.
```python
from gliner import GLiNER

# Load the model from the Hugging Face Hub.
model = GLiNER.from_pretrained("arthrod/gliner-multitask-v1.0")

text = "Apple Inc. was founded by Steve Jobs in Cupertino, California."
labels = ["company", "person", "location"]

# Predict entities above the confidence threshold.
entities = model.predict_entities(text, labels, threshold=0.5)
for entity in entities:
    print(f"{entity['text']} => {entity['label']} (score: {entity['score']:.2f})")
```
This model is designed for zero-shot and few-shot Named Entity Recognition across arbitrary entity types. It can extract entities from text without requiring fine-tuning for specific entity categories.
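Note that the model accepts at most 30 entity types per sample (see the configuration table above), so a larger label set should be split into groups and the per-group predictions concatenated. A minimal sketch, where the `batch_labels` helper is an illustrative assumption rather than part of the GLiNER API:

```python
def batch_labels(labels: list[str], max_per_call: int = 30) -> list[list[str]]:
    """Split a long label list into groups small enough for one prediction call."""
    return [labels[i:i + max_per_call] for i in range(0, len(labels), max_per_call)]

# Hypothetical usage: call model.predict_entities once per group of labels
# and concatenate the resulting entity lists.
many_labels = [f"type_{i}" for i in range(70)]
groups = batch_labels(many_labels)
print([len(g) for g in groups])
```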
If you use this model, please cite the original GLiNER paper:
```bibtex
@article{zaratiana2023gliner,
  title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
  author={Zaratiana, Urchade and Tomeh, Nadi and Holat, Pierre and Charnois, Thierry},
  journal={arXiv preprint arXiv:2311.08526},
  year={2023}
}
```