# GLiNER Multitask v1.0

A token-level GLiNER model fine-tuned from knowledgator/gliner-multitask-v1.0, using the microsoft/deberta-v2-xlarge backbone.

## Model Details

| Property | Value |
|---|---|
| Backbone | microsoft/deberta-v2-xlarge |
| Hidden size | 1536 (encoder) / 1024 (GLiNER head) |
| Number of layers | 24 |
| Attention heads | 24 |
| Span mode | token_level |
| Max sequence length | 1024 tokens |
| Max entity types per sample | 30 |
| Max span width | 12 tokens |
| Subtoken pooling | first |
| Precision | bf16 |
| Model size | ~3.5 GB |

## Architecture

This model uses the GLiNER architecture with:

- microsoft/deberta-v2-xlarge as the text encoder, with relative-position attention
- Token-level span representation (rather than span-level) for entity extraction
- A single-layer RNN post-fusion layer for contextual refinement
- Special entity tokens (`<<ENT>>`, `<<SEP>>`) added to the vocabulary (size 128,003)
- Focal loss (alpha = 0.75, gamma = 0) for handling class imbalance
- A global masking strategy for negative sampling
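Note that with gamma = 0 the focal term vanishes, so the loss reduces to alpha-weighted binary cross-entropy. A minimal pure-Python sketch of the formula (for illustration only, not the GLiNER implementation):

```python
import math

def focal_loss(p, y, alpha=0.75, gamma=0.0, eps=1e-9):
    """Binary focal loss for a single prediction.

    p: predicted probability of the positive class
    y: gold label (0 or 1)

    With gamma = 0 this is alpha-weighted binary cross-entropy;
    gamma > 0 down-weights easy, well-classified examples.
    """
    p_t = p if y == 1 else 1.0 - p          # probability of the true class
    a_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -a_t * (1.0 - p_t) ** gamma * math.log(p_t + eps)
```

With alpha = 0.75, positive (entity) tokens are weighted three times as heavily as negatives, which compensates for the fact that most tokens in a sentence belong to no entity.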

## Training Configuration

| Parameter | Value |
|---|---|
| Base model | knowledgator/gliner-multitask-v1.0 |
| Optimizer | AdamW (beta1 = 0.9, beta2 = 0.999, eps = 1e-8) |
| Encoder LR | 9e-6 |
| Other LR | 5e-5 / 7e-6 |
| Scheduler | Cosine with linear warmup (10%) |
| Batch size | 40 per device |
| Gradient accumulation | 1 |
| Max steps | 1,000 (150,000 total planned) |
| Weight decay | 0.01 (encoder) / 0.001 (other) |
| Max grad norm | 1.0 |
| Gradient checkpointing | Enabled |
| Dropout | 0.35 |
| Seed | 42 |
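The scheduler ramps the learning rate linearly over the first 10% of steps, then decays it along a cosine curve. The exact training code is not shown in this card, so the following is an assumed sketch of that schedule, using the encoder LR of 9e-6 and the planned 150,000-step horizon:

```python
import math

def lr_at_step(step, max_steps=150_000, base_lr=9e-6, warmup_frac=0.10):
    """Linear warmup followed by cosine decay to zero (assumed schedule)."""
    warmup_steps = int(max_steps * warmup_frac)
    if step < warmup_steps:
        return base_lr * step / warmup_steps  # linear ramp from 0 to base_lr
    # cosine decay from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

Note that with training stopped at step 1,000 of a 150,000-step plan, the schedule would still be early in its warmup phase.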

## Evaluation Status

Evaluated across 9 benchmark datasets, with a mean F1 of 0.280.

Full evaluation results are available at arthrod/gliner_review_comparison.

## Usage

```python
from gliner import GLiNER

model = GLiNER.from_pretrained("arthrod/gliner-multitask-v1.0")

text = "Apple Inc. was founded by Steve Jobs in Cupertino, California."
labels = ["company", "person", "location"]

entities = model.predict_entities(text, labels, threshold=0.5)

for entity in entities:
    print(f"{entity['text']} => {entity['label']} (score: {entity['score']:.2f})")
```

## Intended Use

This model is designed for zero-shot and few-shot Named Entity Recognition across arbitrary entity types. It can extract entities from text without requiring fine-tuning for specific entity categories.

## Limitations

- Performance depends on the quality and specificity of the entity type labels provided
- Maximum input length is 1024 tokens
- Span width is limited to 12 tokens
- Primarily trained on English text
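Texts longer than the 1024-token limit can be split into overlapping windows and processed window by window. A hypothetical sketch (note that it counts whitespace-separated words, which only approximates the model's subword token count, so the window size is kept well below 1024):

```python
def sliding_windows(text, max_words=900, overlap=100):
    """Split text into overlapping word windows that each stay under the
    model's sequence limit. Returns (start_word_index, chunk_text) pairs;
    the start index lets predicted offsets be mapped back to the full text.
    """
    words = text.split()
    if len(words) <= max_words:
        return [(0, text)]
    step = max_words - overlap
    return [(i, " ".join(words[i:i + max_words]))
            for i in range(0, len(words), step)]
```

Each window can then be passed to `predict_entities` separately; entities found in the overlap region appear twice and need de-duplication by character span.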

## Citation

If you use this model, please cite the original GLiNER paper:

```bibtex
@article{zaratiana2023gliner,
  title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
  author={Zaratiana, Urchade and Tomeh, Nadi and Holat, Pierre and Charnois, Thierry},
  journal={arXiv preprint arXiv:2311.08526},
  year={2023}
}
```