This model is a GLiNER2 architecture fine-tuned to extract clinical neuro-oncology entities belonging to the IHC (immunohistochemistry) semantic group. It was trained on a synthetic dataset generated for the properly de-identified BRIGHT project dataset (see the `generated_data` folder in the primary repository).
This model repository is designed to fit within the overarching `bright_db` namespace.
It extracts the following fields (described in French):
Aggregates:
Per-Label Breakdowns:
| Label | Precision | Recall | F1 |
|---|---|---|---|
| ihc_idh1 | 0.7085 | 0.9860 | 0.8246 |
| ihc_atrx | 0.7069 | 0.9389 | 0.8066 |
| ihc_p53 | 0.6324 | 0.9435 | 0.7573 |
| ihc_fgfr3 | 0.0000 | 0.0000 | 0.0000 |
| ihc_braf | 0.0714 | 1.0000 | 0.1333 |
| ihc_gfap | 0.7605 | 0.9922 | 0.8610 |
| ihc_olig2 | 0.7750 | 0.9841 | 0.8671 |
| ihc_ki67 | 0.6477 | 0.9542 | 0.7716 |
| ihc_hist_h3k27m | 0.1250 | 1.0000 | 0.2222 |
| ihc_hist_h3k27me3 | 0.0390 | 1.0000 | 0.0750 |
| ihc_egfr_hirsch | 0.0385 | 0.8571 | 0.0736 |
| ihc_mmr | 0.0000 | 0.0000 | 0.0000 |
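
The F1 column is the harmonic mean of precision and recall, so it can be recomputed from the other two columns. The sketch below does this for a few rows copied from the table; the names `f1_score`, `rows`, and `recomputed` are illustrative and not part of the repository. Because the table values are rounded to four decimals, a recomputed F1 may differ from the listed one in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from a few rows of the table above.
rows = {
    "ihc_idh1": (0.7085, 0.9860),
    "ihc_braf": (0.0714, 1.0000),
    "ihc_egfr_hirsch": (0.0385, 0.8571),
    "ihc_mmr": (0.0000, 0.0000),
}

# Recompute F1 per label from the rounded precision and recall values.
recomputed = {label: f1_score(p, r) for label, (p, r) in rows.items()}

for label, f1 in recomputed.items():
    print(f"{label}: F1 = {f1:.4f}")
```

Reading the table this way makes the pattern in the rarer labels easier to see: a recall near 1.0 paired with a precision below 0.1 still yields a low F1, because the harmonic mean is dominated by the smaller of the two values.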
# Inference Code
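```python
from gliner2 import GLiNER2

# Load the fine-tuned checkpoint from the Hugging Face Hub.
model = GLiNER2.from_pretrained("raphael-r/bright-gliner-ihc")
text = "Patient presenting with epileptic seizures..."
# GLiNER2 takes the target labels explicitly; a subset of the table's labels is shown.
labels = ["ihc_idh1", "ihc_atrx", "ihc_p53", "ihc_ki67"]
result = model.extract_entities(text, labels)
# Assumes the standard GLiNER2 output shape: {"entities": {label: [spans, ...]}}.
for label, spans in result["entities"].items():
    print(label, "=>", spans)
```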