This is a GLiNER2 model fine-tuned to extract clinical neuro-oncology entities belonging to the demographics semantic group. It was trained on a synthetic dataset generated for the properly de-identified BRIGHT project dataset (see the generated_data folder in the primary repository).
This model repository is designed to fit within the overarching bright_db namespace.
It extracts the following fields (described in French):
Aggregates:
Per-Label Breakdowns:
| Label | Precision | Recall | F1 |
|---|---|---|---|
| sexe | 0.9677 | 0.2875 | 0.4433 |
| annee_de_naissance | 0.8633 | 0.9557 | 0.9072 |
| activite_professionnelle | 0.4524 | 1.0000 | 0.6230 |
| antecedent_tumoral | 0.2222 | 0.1818 | 0.2000 |
| ik_clinique | 0.9767 | 1.0000 | 0.9882 |
| dominance_cerebrale | 0.3215 | 1.0000 | 0.4866 |
| neuroncologue | 0.4108 | 1.0000 | 0.5823 |
| neurochirurgien | 0.4291 | 0.9912 | 0.5989 |
| radiotherapeute | 0.3793 | 0.8148 | 0.5176 |
| anatomo_pathologiste | 0.2470 | 0.7593 | 0.3727 |
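
The F1 column is the harmonic mean of precision and recall, which is why a label with perfect recall but low precision (for example dominance_cerebrale) still ends up with a modest F1. As a minimal sketch, the relationship can be checked on a few rows copied from the table; the `f1_score` helper and the `rows` dict are illustrative names, not part of the repository:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the per-label table
rows = {
    "sexe": (0.9677, 0.2875),
    "ik_clinique": (0.9767, 1.0000),
    "dominance_cerebrale": (0.3215, 1.0000),
}

for label, (p, r) in rows.items():
    print(f"{label}: F1 = {f1_score(p, r):.4f}")
```

The printed values should agree with the F1 column up to rounding of the reported precision and recall.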
# Inference Code
```python
from gliner2 import GLiNER2

# Load the fine-tuned model from the Hugging Face Hub
model = GLiNER2.from_pretrained("raphael-r/bright-gliner-demographics")

text = "Patient presenting with epileptic seizures..."
entities = model.extract_entities(text)

# Print each extracted span next to its label
for entity in entities:
    print(entity["text"], "=>", entity["label"])
```
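
Once entities are extracted, it may be convenient to group the spans by label so that each demographics field maps to its candidate values. The sketch below assumes the output shape implied by the inference snippet (a list of dicts with `text` and `label` keys); if your installed gliner2 version returns a different structure, the helper would need adapting. `group_by_label` and the sample entities are hypothetical and only illustrate the idea:

```python
from collections import defaultdict


def group_by_label(entities):
    """Collect extracted spans under their label, keeping order and dropping duplicates."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["text"] not in grouped[entity["label"]]:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Made-up sample standing in for the model output (not real predictions)
sample = [
    {"text": "1962", "label": "annee_de_naissance"},
    {"text": "droitier", "label": "dominance_cerebrale"},
    {"text": "1962", "label": "annee_de_naissance"},
]

print(group_by_label(sample))
```

Keeping every distinct value per label, rather than only the first, leaves room for a manual review step, which seems prudent for labels whose precision in the table is low.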