BRIGHT NER: GLiNER2 fine-tuned for diagnosis

Description

This is a GLiNER2 model fine-tuned to extract clinical neuro-oncology entities belonging to the diagnosis semantic group. It was trained on a synthetic dataset generated for the properly de-identified BRIGHT project dataset (see the generated_data folder in the primary repository).

This model repository is designed to fit within the overarching bright_db namespace.

Fields

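It extracts the following fields (original French descriptions in parentheses):

  • diag_histologique: Histopathological diagnosis (Diagnostic anatomopathologique)
  • diag_integre: WHO 2021 integrated diagnosis (Diagnostic intégré OMS 2021)
  • classification_oms: WHO classification used: 2007, 2016 or 2021 (Classification OMS utilisée)
  • grade: WHO grade: 1, 2, 3 or 4 (Grade OMS)
  • num_labo: Pathology laboratory sample number (Numéro échantillon laboratoire anatomopathologie)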
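
Two of these fields have a closed set of valid values (classification_oms and grade), so extracted values can be sanity-checked after inference. The sketch below is illustrative only: it assumes the extraction result has already been reduced to a dict mapping each label to a list of strings, and that values are plain digits (a span such as "grade IV" would need normalizing first). The names ALLOWED_VALUES and find_invalid_values, and the sample values, are hypothetical and not part of this repository.

```python
# Allowed values taken from the field descriptions above.
ALLOWED_VALUES = {
    "classification_oms": {"2007", "2016", "2021"},
    "grade": {"1", "2", "3", "4"},
}


def find_invalid_values(entities):
    """Return {label: [values outside the allowed set]} for constrained fields.

    `entities` is assumed to map each label to a list of extracted strings.
    Labels without a closed value set (e.g. num_labo) are not checked.
    """
    invalid = {}
    for label, allowed in ALLOWED_VALUES.items():
        bad = [v for v in entities.get(label, []) if v.strip() not in allowed]
        if bad:
            invalid[label] = bad
    return invalid


# Made-up example values, for illustration only.
example = {
    "grade": ["4"],
    "classification_oms": ["2021", "2019"],
    "num_labo": ["H23-01234"],
}
print(find_invalid_values(example))  # {'classification_oms': ['2019']}
```

A check like this only flags values for review; it does not correct them.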

Performance on Validation Set

Aggregates:

  • Macro F1: 0.8900 (Precision: 0.9103, Recall: 0.8927)
  • Micro F1: 0.9167 (Precision: 0.9255, Recall: 0.9080)

Per-Label Breakdowns:

Label               Precision  Recall  F1
diag_histologique   0.9876     1.0000  0.9938
diag_integre        0.8148     0.4701  0.5962
classification_oms  0.9690     1.0000  0.9842
grade               0.9810     0.9936  0.9873
num_labo            0.7993     1.0000  0.8884
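
The macro aggregates are unweighted means of the per-label rows, so they can be reproduced from the table. The short sketch below does that with the reported values; the variable names are illustrative. The micro aggregates cannot be recomputed this way, because they depend on per-label entity counts that are not listed here.

```python
# Per-label (precision, recall, F1) copied from the table above.
per_label = {
    "diag_histologique": (0.9876, 1.0000, 0.9938),
    "diag_integre": (0.8148, 0.4701, 0.5962),
    "classification_oms": (0.9690, 1.0000, 0.9842),
    "grade": (0.9810, 0.9936, 0.9873),
    "num_labo": (0.7993, 1.0000, 0.8884),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Macro average: plain mean over labels, each label weighted equally.
n = len(per_label)
macro_precision = sum(p for p, _, _ in per_label.values()) / n
macro_recall = sum(r for _, r, _ in per_label.values()) / n
macro_f1 = sum(f for _, _, f in per_label.values()) / n

print(f"Macro P={macro_precision:.4f} R={macro_recall:.4f} F1={macro_f1:.4f}")

# Recomputing F1 from the rounded P and R can differ from the reported
# F1 in the last digit, since the table values are already rounded.
for label, (p, r, f) in per_label.items():
    print(label, round(f1_score(p, r), 4), "reported:", f)
```

Note that the macro F1 is pulled down mainly by diag_integre, whose recall is well below that of the other labels.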

Usage

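# Inference code
from gliner2 import GLiNER2

model = GLiNER2.from_pretrained("raphael-r/bright-gliner-diagnosis")
text = "Patient presenting with epileptic seizures..."

# Field names from the "Fields" section, passed as the entity types to extract.
labels = ["diag_histologique", "diag_integre", "classification_oms", "grade", "num_labo"]
result = model.extract_entities(text, labels)

# GLiNER2 is expected to return {"entities": {label: [matched spans, ...]}};
# adjust the loop if your installed version returns a different shape.
for label, spans in result["entities"].items():
    for span in spans:
        print(span, "=>", label)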