This is an EDS-NLP model (CamemBERT + CRF architecture) fine-tuned to extract clinical neuro-oncology entities belonging to the diagnosis semantic group. It was trained on a synthetic dataset generated for the properly de-identified BRIGHT project dataset (see the generated_data folder in the primary repository).
This model repository is designed to fit within the overarching bright_db namespace.
It extracts the following fields (described in French):
Aggregates:
Per-Label Breakdowns:
| Label | Precision | Recall | F1 |
|---|---|---|---|
| diag_histologique | 0.9869 | 0.9467 | 0.9664 |
| diag_integre | 0.7157 | 0.9145 | 0.8030 |
| classification_oms | 0.8700 | 1.0000 | 0.9305 |
| grade | 0.9615 | 0.8814 | 0.9197 |
| num_labo | 0.6296 | 0.4574 | 0.5299 |
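As a quick sanity check on the table above, the F1 column can be recomputed from the precision and recall columns using the standard harmonic-mean formula. The snippet below is only an illustrative sketch: the `scores` dictionary and the `f1_score` helper are names introduced here, not part of the model repository, and the numbers are copied by hand from the table.

```python
# Precision and recall per label, copied from the per-label table.
scores = {
    "diag_histologique": (0.9869, 0.9467),
    "diag_integre": (0.7157, 0.9145),
    "classification_oms": (0.8700, 1.0000),
    "grade": (0.9615, 0.8814),
    "num_labo": (0.6296, 0.4574),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for every label, rounded to four decimals like the table.
f1_by_label = {
    label: round(f1_score(p, r), 4) for label, (p, r) in scores.items()
}

for label, f1 in f1_by_label.items():
    print(f"{label}: {f1:.4f}")
```

The recomputed values match the F1 column, which also makes the weak spot easy to see: `num_labo` is limited mainly by its low recall.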
# Inference Code
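```python
import edsnlp

# Load the fine-tuned pipeline from the Hugging Face Hub
nlp = edsnlp.load("raphael-r/bright-eds-diagnosis")
doc = nlp("Patient presenting with epileptic seizures...")

# Print each extracted entity with its label
for ent in doc.ents:
    print(ent.text, "=>", ent.label_)
```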
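For downstream use it is often convenient to collect the extracted entities per label rather than print them one by one. The following is a minimal sketch, not part of the repository: `group_entities_by_label` is a hypothetical helper, and the demo uses stand-in objects with made-up texts that only mimic the `text` and `label_` attributes of entities in `doc.ents`, so that it runs without downloading the model.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_entities_by_label(ents):
    """Group entity texts by label; expects objects with .text and .label_."""
    grouped = defaultdict(list)
    for ent in ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


# Stand-in entities (illustrative values only, not real model output).
fake_ents = [
    SimpleNamespace(text="glioblastome", label_="diag_histologique"),
    SimpleNamespace(text="grade 4", label_="grade"),
    SimpleNamespace(text="astrocytome", label_="diag_histologique"),
]

grouped = group_entities_by_label(fake_ents)
print(grouped)
```

With the real pipeline, the same helper would be called as `group_entities_by_label(doc.ents)`.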