---
license: apache-2.0
language:
- multilingual
library_name: glitext
datasets:
- urchade/pile-mistral-v0.1
pipeline_tag: token-classification
tags:
- glitext
glitext:
  name: medium
  label: GliText Recognition (Balanced)
  description: An efficient zero-shot named entity recognition model tuned for generalized
    extraction with balanced speed and accuracy.
  recognition: true
  classification: false
  association: false
  span_mode: true
  size_gb: 0.78
  hf_repo: sassoftware/glitext-medium
  source_url: gliner-community/gliner_medium-v2.5
---

# About

GLiNER is a Named Entity Recognition (NER) model capable of identifying any entity type using a bidirectional transformer encoder (BERT-like). It provides a practical alternative to traditional NER models, which are limited to predefined entity types, and to Large Language Models (LLMs), which, despite their flexibility, are too large and costly for resource-constrained scenarios.


## Links

* Paper: https://arxiv.org/abs/2311.08526
* Repository: https://github.com/urchade/GLiNER

## Installation
To use this model, install the GLiNER Python library (in a notebook, prefix the command with `!`):
```
pip install -U gliner
```

## Usage
Once you've installed the GLiNER library, import the `GLiNER` class, load the model with `GLiNER.from_pretrained`, and predict entities with `predict_entities`.

```python
from gliner import GLiNER

model = GLiNER.from_pretrained("gliner-community/gliner_medium-v2.5", load_tokenizer=True)

text = """
Cristiano Ronaldo dos Santos Aveiro (Portuguese pronunciation: [kɾiʃˈtjɐnu ʁɔˈnaldu]; born 5 February 1985) is a Portuguese professional footballer who plays as a forward for and captains both Saudi Pro League club Al Nassr and the Portugal national team. Widely regarded as one of the greatest players of all time, Ronaldo has won five Ballon d'Or awards,[note 3] a record three UEFA Men's Player of the Year Awards, and four European Golden Shoes, the most by a European player. He has won 33 trophies in his career, including seven league titles, five UEFA Champions Leagues, the UEFA European Championship and the UEFA Nations League. Ronaldo holds the records for most appearances (183), goals (140) and assists (42) in the Champions League, goals in the European Championship (14), international goals (128) and international appearances (205). He is one of the few players to have made over 1,200 professional career appearances, the most by an outfield player, and has scored over 850 official senior career goals for club and country, making him the top goalscorer of all time.
"""

labels = ["person", "award", "date", "competitions", "teams"]

entities = model.predict_entities(text, labels)

for entity in entities:
    print(entity["text"], "=>", entity["label"])
```

```
Cristiano Ronaldo dos Santos Aveiro => person
5 February 1985 => date
Al Nassr => teams
Portugal national team => teams
Ballon d'Or => award
UEFA Men's Player of the Year Awards => award
European Golden Shoes => award
UEFA Champions Leagues => competitions
UEFA European Championship => competitions
UEFA Nations League => competitions
Champions League => competitions
European Championship => competitions
```
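
The printed pairs above are often easier to consume when grouped by label. The following is a minimal sketch of one way to do that, not part of the GLiNER library: `group_by_label` and `min_score` are hypothetical names, the sample list is hand-written to mimic the dicts returned by `predict_entities`, and the `score` key with its values is an assumption made for illustration.

```python
from collections import defaultdict


def group_by_label(entities, min_score=0.0):
    """Group entity dicts by label, keeping first-seen order and dropping repeats."""
    grouped = defaultdict(list)
    for entity in entities:
        # "score" is assumed to be present; treat a missing score as 1.0.
        if entity.get("score", 1.0) < min_score:
            continue
        texts = grouped[entity["label"]]
        if entity["text"] not in texts:
            texts.append(entity["text"])
    return dict(grouped)


# Hand-written sample shaped like the entity dicts above (scores are made up).
sample = [
    {"text": "Al Nassr", "label": "teams", "score": 0.91},
    {"text": "Ballon d'Or", "label": "award", "score": 0.88},
    {"text": "Portugal national team", "label": "teams", "score": 0.45},
    {"text": "Al Nassr", "label": "teams", "score": 0.80},
]

print(group_by_label(sample, min_score=0.5))
```

With real predictions, pass the `entities` list from the usage example in place of `sample`.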

## Named Entity Recognition benchmark result
Below is a comparison of results between previous versions of the model and the current one:
![Models performance](models_comparison.png)


## Available models

| Release | Model Name | # of Parameters | Language | License |
| - | - | - | - | - |
| v0 | [urchade/gliner_base](https://huggingface.co/urchade/gliner_base)<br>[urchade/gliner_multi](https://huggingface.co/urchade/gliner_multi) | 209M<br>209M | English<br>Multilingual | cc-by-nc-4.0 |
| v1 | [urchade/gliner_small-v1](https://huggingface.co/urchade/gliner_small-v1)<br>[urchade/gliner_medium-v1](https://huggingface.co/urchade/gliner_medium-v1)<br>[urchade/gliner_large-v1](https://huggingface.co/urchade/gliner_large-v1) | 166M<br>209M<br>459M | English <br> English <br> English | cc-by-nc-4.0 |
| v2 | [urchade/gliner_small-v2](https://huggingface.co/urchade/gliner_small-v2)<br>[urchade/gliner_medium-v2](https://huggingface.co/urchade/gliner_medium-v2)<br>[urchade/gliner_large-v2](https://huggingface.co/urchade/gliner_large-v2) | 166M<br>209M<br>459M |  English <br> English <br> English | apache-2.0 |
| v2.1 | [urchade/gliner_small-v2.1](https://huggingface.co/urchade/gliner_small-v2.1)<br>[urchade/gliner_medium-v2.1](https://huggingface.co/urchade/gliner_medium-v2.1)<br>[urchade/gliner_large-v2.1](https://huggingface.co/urchade/gliner_large-v2.1) <br>[urchade/gliner_multi-v2.1](https://huggingface.co/urchade/gliner_multi-v2.1) | 166M<br>209M<br>459M<br>209M | English <br> English <br> English <br> Multilingual | apache-2.0 |


## Model Authors
The model authors are:
* [Urchade Zaratiana](https://huggingface.co/urchade)
* [Ihor Stepanov](https://huggingface.co/Ihor)
* Nadi Tomeh
* Pierre Holat
* Thierry Charnois

## Citation
```bibtex
@misc{zaratiana2023gliner,
      title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer}, 
      author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
      year={2023},
      eprint={2311.08526},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

## Source Model Repo

This model is derived from [`gliner-community/gliner_medium-v2.5`](https://huggingface.co/gliner-community/gliner_medium-v2.5). See the upstream repository for the original safetensors weights, training data, and the full upstream model card.

## ONNX Weights

SAS added the ONNX weights by converting the upstream safetensors checkpoint.

File in this repo: `model.onnx`.

## Using this Model with the SAS GLiText API

This repo is consumed by the SAS GLiText product. To download it onto a SAS GLiText server:

```
POST /v1/models/download?name=medium
```

To download and load into memory in one step:

```
PUT /v1/models?name=medium
```
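
The two calls above can be prepared from Python with the standard library. This is a sketch under stated assumptions: `BASE_URL` is a hypothetical address, `build_model_request` is an illustrative helper rather than part of any SAS client, and any authentication your server requires is not covered by this card and is left out.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL; replace with the address of your SAS GLiText server.
BASE_URL = "http://localhost:8080"


def build_model_request(action, name="medium", base_url=BASE_URL):
    """Build (but do not send) the request for the 'download' or 'load' action."""
    query = urlencode({"name": name})
    if action == "download":
        # POST /v1/models/download?name=<name>
        return Request(f"{base_url}/v1/models/download?{query}", method="POST")
    if action == "load":
        # PUT /v1/models?name=<name>
        return Request(f"{base_url}/v1/models?{query}", method="PUT")
    raise ValueError(f"unknown action: {action!r}")


req = build_model_request("download")
print(req.get_method(), req.full_url)
```

To actually send a request, pass the returned object to `urllib.request.urlopen`; the response format is not documented here, so inspect it before relying on any field.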

## Security Scan

Scanned with [modelaudit](https://github.com/promptfoo/modelaudit) v0.2.40 on 2026-04-27. 24/24 checks passed. [Full results](modelaudit.json).


| File | Size | SHA-256 |
|------|------|---------|
| `model.onnx` | 835.6 MB | `dfbf82b4c9b7cb8e…` |
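
After downloading `model.onnx`, you can compare its digest against the table above. The sketch below uses only `hashlib`; `sha256_of_file` and `matches_prefix` are hypothetical helper names. Because the table shows only the first 16 hex characters of the digest, this check can confirm that prefix and nothing more.

```python
import hashlib

# Visible prefix of the digest from the table above; the remainder is elided there.
EXPECTED_PREFIX = "dfbf82b4c9b7cb8e"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_prefix(path, prefix=EXPECTED_PREFIX):
    """Check whether the file's digest starts with the given hex prefix."""
    return sha256_of_file(path).startswith(prefix.lower())
```

For example, `matches_prefix("model.onnx")` should return `True` for an intact download, assuming the file sits in the current directory.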