BRIGHT NER: GLiNER2 fine-tuned for molecular entities

Description

This is a GLiNER2 model fine-tuned to extract clinical neuro-oncology entities belonging to the molecular semantic group. It was trained on a synthetic dataset generated for the de-identified BRIGHT project dataset (see the generated_data folder in the primary repository).

This model repository is designed to fit within the overarching bright_db namespace.

Fields

It extracts the following fields:

  • mol_idh1: IDH1 mutation status
  • mol_idh2: IDH2 mutation status
  • mol_mgmt: MGMT promoter methylation
  • mol_h3f3a: H3F3A mutation
  • mol_hist1h3b: HIST1H3B mutation
  • mol_tert: TERT promoter mutation
  • mol_CDKN2A: CDKN2A homozygous deletion
  • mol_atrx: ATRX mutation
  • mol_cic: CIC mutation
  • mol_fubp1: FUBP1 mutation
  • mol_fgfr1: FGFR1 mutation
  • mol_egfr_mut: EGFR mutation
  • mol_prkca: PRKCA mutation
  • mol_pten: PTEN mutation
  • mol_p53: p53 mutation
  • mol_braf: BRAF mutation
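
Downstream code often needs one value per field, including fields the model did not find. The sketch below is one way to do that, not part of this repository: `MOLECULAR_FIELDS` restates the field list with English glosses, and `to_record` is a hypothetical helper that assumes the extraction output has already been shaped into a mapping of label to matched spans.

```python
from typing import Dict, List, Optional

# Field names are taken from the list above; the glosses are English translations.
MOLECULAR_FIELDS: Dict[str, str] = {
    "mol_idh1": "IDH1 mutation status",
    "mol_idh2": "IDH2 mutation status",
    "mol_mgmt": "MGMT promoter methylation",
    "mol_h3f3a": "H3F3A mutation",
    "mol_hist1h3b": "HIST1H3B mutation",
    "mol_tert": "TERT promoter mutation",
    "mol_CDKN2A": "CDKN2A homozygous deletion",
    "mol_atrx": "ATRX mutation",
    "mol_cic": "CIC mutation",
    "mol_fubp1": "FUBP1 mutation",
    "mol_fgfr1": "FGFR1 mutation",
    "mol_egfr_mut": "EGFR mutation",
    "mol_prkca": "PRKCA mutation",
    "mol_pten": "PTEN mutation",
    "mol_p53": "p53 mutation",
    "mol_braf": "BRAF mutation",
}


def to_record(extracted: Dict[str, List[str]]) -> Dict[str, Optional[str]]:
    """Return one value per known field, or None when nothing was extracted.

    `extracted` is an assumed mapping of label -> list of matched spans.
    Labels outside MOLECULAR_FIELDS are ignored; the first span wins.
    """
    record: Dict[str, Optional[str]] = {}
    for field in MOLECULAR_FIELDS:
        spans = extracted.get(field) or []
        record[field] = spans[0] if spans else None
    return record


# Illustrative input only; the spans are made up for the example.
example = to_record({"mol_idh1": ["IDH1 R132H mutation"], "mol_other": ["ignored"]})
```

Keeping the first span is an arbitrary choice for the sketch; a real pipeline might keep all spans or resolve conflicts explicitly.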

Performance on Validation Set

Aggregates:

  • Macro F1: 0.3667 (Precision: 0.2956, Recall: 0.7353)
  • Micro F1: 0.5620 (Precision: 0.4165, Recall: 0.8634)

Per-Label Breakdowns:

Label          Precision   Recall   F1
mol_idh1       0.9356      0.7927   0.8583
mol_idh2       0.6583      0.9357   0.7729
mol_mgmt       0.6730      0.9465   0.7867
mol_h3f3a      0.1333      0.7692   0.2273
mol_hist1h3b   0.0909      0.6667   0.1600
mol_tert       0.5600      0.8750   0.6829
mol_CDKN2A     0.4706      0.9333   0.6257
mol_atrx       0.3365      0.7778   0.4698
mol_cic        0.1488      0.8333   0.2525
mol_fubp1      0.1565      0.7200   0.2571
mol_fgfr1      0.0000      0.0000   0.0000
mol_egfr_mut   0.0079      1.0000   0.0156
mol_prkca      0.0000      0.0000   0.0000
mol_pten       0.0543      0.7143   0.1010
mol_p53        0.0333      1.0000   0.0645
mol_braf       0.4706      0.8000   0.5926
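
The macro aggregates can be rechecked from the per-label rows, since a macro average is the unweighted mean over labels. The sketch below copies the precision and recall columns from the table and recomputes the three macro figures; the function names are illustrative. The micro figures cannot be rechecked this way, because they pool true-positive, false-positive, and false-negative counts that this card does not list.

```python
# (precision, recall) per label, copied from the validation table.
SCORES = {
    "mol_idh1": (0.9356, 0.7927),
    "mol_idh2": (0.6583, 0.9357),
    "mol_mgmt": (0.6730, 0.9465),
    "mol_h3f3a": (0.1333, 0.7692),
    "mol_hist1h3b": (0.0909, 0.6667),
    "mol_tert": (0.5600, 0.8750),
    "mol_CDKN2A": (0.4706, 0.9333),
    "mol_atrx": (0.3365, 0.7778),
    "mol_cic": (0.1488, 0.8333),
    "mol_fubp1": (0.1565, 0.7200),
    "mol_fgfr1": (0.0000, 0.0000),
    "mol_egfr_mut": (0.0079, 1.0000),
    "mol_prkca": (0.0000, 0.0000),
    "mol_pten": (0.0543, 0.7143),
    "mol_p53": (0.0333, 1.0000),
    "mol_braf": (0.4706, 0.8000),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(scores):
    """Unweighted mean of precision, recall, and per-label F1."""
    n = len(scores)
    macro_p = sum(p for p, _ in scores.values()) / n
    macro_r = sum(r for _, r in scores.values()) / n
    macro_f1 = sum(f1(p, r) for p, r in scores.values()) / n
    return macro_p, macro_r, macro_f1


macro_p, macro_r, macro_f1 = macro_average(SCORES)
print(f"Macro P={macro_p:.4f} R={macro_r:.4f} F1={macro_f1:.4f}")
```

Because every label counts equally, the two labels with zero scores (mol_fgfr1, mol_prkca) and the very low-precision labels pull the macro F1 well below the micro F1.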

Usage

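# Inference code
from gliner2 import GLiNER2

model = GLiNER2.from_pretrained("raphael-r/bright-gliner-molecular")
text = "Patient presenting with epileptic seizures..."

# GLiNER2 takes the labels to extract; pass the fields this model was trained on.
labels = ["mol_idh1", "mol_idh2", "mol_mgmt", "mol_h3f3a", "mol_hist1h3b", "mol_tert",
          "mol_CDKN2A", "mol_atrx", "mol_cic", "mol_fubp1", "mol_fgfr1", "mol_egfr_mut",
          "mol_prkca", "mol_pten", "mol_p53", "mol_braf"]
result = model.extract_entities(text, labels)

# Assumes the default GLiNER2 output shape: {"entities": {label: [span, ...]}}
for label, spans in result["entities"].items():
    for span in spans:
        print(span, "=>", label)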
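
Several labels have high recall but very low precision on the validation set, so a consumer may want to separate predictions by how reliable their label was. The sketch below is one possible post-processing step, not part of this repository. `split_by_trust` is a hypothetical helper, the 0.5 cutoff is an arbitrary assumption, and the input is assumed to be a list of (text, label) pairs built from the model output.

```python
from typing import List, Tuple

# Validation precision per label, copied from the per-label table.
VALIDATION_PRECISION = {
    "mol_idh1": 0.9356, "mol_idh2": 0.6583, "mol_mgmt": 0.6730,
    "mol_h3f3a": 0.1333, "mol_hist1h3b": 0.0909, "mol_tert": 0.5600,
    "mol_CDKN2A": 0.4706, "mol_atrx": 0.3365, "mol_cic": 0.1488,
    "mol_fubp1": 0.1565, "mol_fgfr1": 0.0000, "mol_egfr_mut": 0.0079,
    "mol_prkca": 0.0000, "mol_pten": 0.0543, "mol_p53": 0.0333,
    "mol_braf": 0.4706,
}

Prediction = Tuple[str, str]  # (matched text, label)


def split_by_trust(
    predictions: List[Prediction], min_precision: float = 0.5
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into (trusted, needs_review) by label precision.

    Labels missing from VALIDATION_PRECISION are treated as untrusted.
    """
    trusted: List[Prediction] = []
    needs_review: List[Prediction] = []
    for text, label in predictions:
        if VALIDATION_PRECISION.get(label, 0.0) >= min_precision:
            trusted.append((text, label))
        else:
            needs_review.append((text, label))
    return trusted, needs_review


# Made-up predictions, for illustration only.
trusted, needs_review = split_by_trust(
    [("IDH1 R132H", "mol_idh1"), ("EGFR mutation", "mol_egfr_mut")]
)
```

With the 0.5 cutoff, only mol_idh1, mol_idh2, mol_mgmt, and mol_tert count as trusted; everything else is routed to review rather than discarded, so the high recall is preserved.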