BRIGHT NER: GLiNER2 fine-tuned for diagnosis

Description

This is a GLiNER2 model fine-tuned to extract clinical neuro-oncology entities belonging to the diagnosis semantic group. It was trained on a synthetic dataset generated for the de-identified BRIGHT project dataset (see the generated_data folder in the primary repository).

This model repository is designed to fit within the overarching bright_db namespace.

Fields

It extracts the following fields (field names are in French):

diag_histologique: Histopathological diagnosis
diag_integre: WHO 2021 integrated diagnosis
classification_oms: WHO classification used (2007, 2016, or 2021)
grade: WHO grade (1, 2, 3, or 4)
num_labo: Pathology laboratory sample number
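
Two of the fields above have closed value sets (the WHO classification year and the WHO grade), so extracted values can be sanity-checked after inference. The following is a minimal sketch, not part of the model or of the BRIGHT repository: the names FIELDS, ALLOWED_VALUES, and check_field are hypothetical, and the sketch assumes extracted values have already been normalized to plain digit strings (a span such as "grade IV" would need its own normalization step first).

```python
# Hypothetical post-processing helper; not part of the model repository.
# The five field names and the closed value sets come from the field list above.
FIELDS = (
    "diag_histologique",
    "diag_integre",
    "classification_oms",
    "grade",
    "num_labo",
)

# Assumption: values are already normalized to plain digit strings.
ALLOWED_VALUES = {
    "classification_oms": {"2007", "2016", "2021"},
    "grade": {"1", "2", "3", "4"},
}


def check_field(label: str, value: str) -> bool:
    """Return True if `value` is a plausible value for the field `label`."""
    if label not in FIELDS:
        return False  # unknown label
    value = value.strip()
    if not value:
        return False  # empty extraction
    allowed = ALLOWED_VALUES.get(label)
    # Free-text fields have no closed value set, so any non-empty text passes.
    return allowed is None or value in allowed
```

A check like this only flags values outside the documented sets; it says nothing about whether a free-text diagnosis is clinically correct.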

Performance on Validation Set

Aggregates:

Macro F1: 0.8900 (Precision: 0.9103, Recall: 0.8927)
Micro F1: 0.9167 (Precision: 0.9255, Recall: 0.9080)

Per-Label Breakdowns:

Label	Precision	Recall	F1
diag_histologique	0.9876	1.0000	0.9938
diag_integre	0.8148	0.4701	0.5962
classification_oms	0.9690	1.0000	0.9842
grade	0.9810	0.9936	0.9873
num_labo	0.7993	1.0000	0.8884
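
The macro aggregates can be recomputed from the per-label table as a consistency check. The sketch below copies the table values into a dictionary (the names PER_LABEL and f1 are illustrative) and takes the unweighted mean of each column. It assumes the reported macro F1 is the mean of the per-label F1 scores, which is what the numbers agree with to within rounding.

```python
from statistics import mean

# (precision, recall, F1) per label, copied from the table above.
PER_LABEL = {
    "diag_histologique": (0.9876, 1.0000, 0.9938),
    "diag_integre": (0.8148, 0.4701, 0.5962),
    "classification_oms": (0.9690, 1.0000, 0.9842),
    "grade": (0.9810, 0.9936, 0.9873),
    "num_labo": (0.7993, 1.0000, 0.8884),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Macro averaging: unweighted mean over labels, one column at a time.
macro_precision = mean(p for p, _, _ in PER_LABEL.values())
macro_recall = mean(r for _, r, _ in PER_LABEL.values())
macro_f1 = mean(f for _, _, f in PER_LABEL.values())

print(f"Macro precision: {macro_precision:.4f}")
print(f"Macro recall:    {macro_recall:.4f}")
print(f"Macro F1:        {macro_f1:.4f}")
```

Note that applying f1 to the macro precision and macro recall gives a higher value (about 0.90) than the mean of the per-label F1 scores, because the low recall on diag_integre (0.4701) weighs more heavily on that label's own F1.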

Usage

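# Inference code
from gliner2 import GLiNER2

model = GLiNER2.from_pretrained("raphael-r/bright-gliner-diagnosis")
text = "Patient presenting with epileptic seizures..."

# GLiNER2 takes the entity labels to look for alongside the text.
labels = ["diag_histologique", "diag_integre", "classification_oms", "grade", "num_labo"]
result = model.extract_entities(text, labels)

# Output shape follows the upstream GLiNER2 README ({"entities": {label: [spans]}});
# adjust the loop if your installed version returns something different.
for label, spans in result["entities"].items():
    for span in spans:
        print(span, "=>", label)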

Model size: 0.3B params (Safetensors, tensor type F32)
