BRIGHT NER: EDS-NLP (CamemBERT + CRF) fine-tuned for histology

Description

This is an EDS-NLP (CamemBERT + CRF) model fine-tuned to extract clinical neuro-oncology entities belonging to the histology semantic group. It was trained on a synthetic dataset generated for the de-identified BRIGHT project dataset (see the generated_data folder in the primary repository).

This model repository is designed to fit within the overarching bright_db namespace.

Fields

It extracts the following fields:

  • histo_necrose: Presence of necrosis
  • histo_pec: Endothelial/microvascular proliferation
  • histo_mitoses: Mitotic count or mitotic index
  • aspect_cellulaire: Cellular appearance (astrocytic, oligodendroglial)

Performance on Validation Set

Aggregates:

  • Macro F1: 0.6069 (Precision: 0.5036, Recall: 0.9521)
  • Micro F1: 0.6445 (Precision: 0.4913, Recall: 0.9365)

Per-Label Breakdowns:

Label              Precision  Recall  F1
histo_necrose      0.2732     1.0000  0.4292
histo_pec          0.2011     1.0000  0.3348
histo_mitoses      0.8424     1.0000  0.9145
aspect_cellulaire  0.6978     0.8083  0.7490
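
The aggregate and per-label figures can be cross-checked against each other. The sketch below is not the project's evaluation code: it assumes that F1 is the harmonic mean of precision and recall and that the macro scores are unweighted means over the four labels. The variable and function names are illustrative.

```python
# Per-label (precision, recall) pairs copied from the table above.
per_label = {
    "histo_necrose": (0.2732, 1.0000),
    "histo_pec": (0.2011, 1.0000),
    "histo_mitoses": (0.8424, 1.0000),
    "aspect_cellulaire": (0.6978, 0.8083),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# F1 recomputed for each label from its rounded precision and recall.
f1_by_label = {label: f1_score(p, r) for label, (p, r) in per_label.items()}

# Macro scores, assumed here to be unweighted means over labels.
n_labels = len(per_label)
macro_precision = sum(p for p, _ in per_label.values()) / n_labels
macro_recall = sum(r for _, r in per_label.values()) / n_labels
macro_f1 = sum(f1_by_label.values()) / n_labels

for label, f1 in f1_by_label.items():
    print(f"{label}: F1 = {f1:.4f}")
print(f"macro: P = {macro_precision:.4f}, R = {macro_recall:.4f}, F1 = {macro_f1:.4f}")
```

Under these assumptions the recomputed values agree with the reported ones up to rounding of the inputs. The numbers also show that the low macro precision is driven by histo_necrose and histo_pec, which combine perfect recall with precision below 0.3.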

Usage

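# Inference Code
import edsnlp

# Load the fine-tuned pipeline from the model repository
nlp = edsnlp.load("raphael-r/bright-eds-histology")

# Run the pipeline on a clinical note
doc = nlp("Patient presenting with epileptic seizures...")

# Print each extracted entity with its field label
for ent in doc.ents:
    print(ent.text, "=>", ent.label_)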
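
Downstream code often needs one record per document rather than a flat list of entities. The following is a minimal sketch of that grouping step, assuming entities are available as (text, label) pairs. The helper name group_by_field and the example entities are hypothetical and are not real model output.

```python
# Field labels listed in the Fields section of this model card.
FIELDS = ("histo_necrose", "histo_pec", "histo_mitoses", "aspect_cellulaire")


def group_by_field(entities):
    """Group (text, label) pairs into {label: [texts]}, skipping unknown labels."""
    grouped = {field: [] for field in FIELDS}
    for text, label in entities:
        if label in grouped:
            grouped[label].append(text)
    return grouped


# Hypothetical entities, made up for illustration only.
example_entities = [
    ("nécrose", "histo_necrose"),
    ("5 mitoses", "histo_mitoses"),
]
record = group_by_field(example_entities)
print(record)
```

With a loaded pipeline, the same helper could be fed from the loop above, for example group_by_field((ent.text, ent.label_) for ent in doc.ents). Fields with no extracted entity keep an empty list, so every record has the same keys.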