---
library_name: transformers
tags:
- contrastive-learning
- Spanish-UMLS
- Hierarchical-enrichment
license: mit
language:
- es
base_model:
- PlanTL-GOB-ES/roberta-base-biomedical-es
---
# HERBERT: Leveraging UMLS Hierarchical Knowledge to Enhance Clinical Entity Normalization in Spanish
**HERBERT-GP** is a contrastive-learning-based bi-encoder for medical entity normalization in Spanish.
It leverages hierarchical relationships from UMLS (parents and grandparents) to enhance the candidate retrieval step for entity linking in Spanish clinical texts.
**Key features:**
- Base model: [PlanTL-GOB-ES/roberta-base-biomedical-clinical-es](https://huggingface.co/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es)
- Trained with 15 positive pairs per anchor, drawn from the synonyms, parents, and grandparents of each concept in UMLS/SNOMED-CT.
- Task: Normalization of disease, procedure, and symptom mentions to SNOMED-CT/UMLS codes.
- Domain: Spanish biomedical/clinical texts.
- Corpora: DisTEMIST, MedProcNER, SympTEMIST.
---
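## Example: candidate retrieval (sketch)
The candidate retrieval step described above can be sketched as a nearest-neighbour search over concept embeddings. This is a minimal illustration, not the authors' pipeline: the card does not state the pooling strategy or the similarity function, so `[CLS]` pooling and cosine similarity are assumptions here. `embed`, `retrieve_candidates`, and the `model_id` argument are hypothetical names; pass the Hub id or local path of this model as `model_id`.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def retrieve_candidates(mention_vecs, concept_vecs, concept_codes, k=5):
    """Return, for each mention, the k (code, score) pairs with highest cosine similarity."""
    scores = l2_normalize(mention_vecs) @ l2_normalize(concept_vecs).T
    k = min(k, scores.shape[1])
    top = np.argsort(-scores, axis=1)[:, :k]  # indices sorted by descending score
    return [
        [(concept_codes[j], float(scores[i, j])) for j in row]
        for i, row in enumerate(top)
    ]


def embed(texts, model_id):
    """Hypothetical helper: encode strings with the bi-encoder.

    Assumption: the [CLS] token vector is used as the embedding; the card
    does not say which pooling the model was trained with.
    """
    # Imported lazily so the retrieval code above runs without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id).eval()
    with torch.no_grad():
        batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        hidden = model(**batch).last_hidden_state
    return hidden[:, 0].numpy()
```

In practice, the concept names of the target terminology would be embedded once with `embed` and cached, and each mention would then be embedded and passed to `retrieve_candidates` together with the matching list of concept codes.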
## Evaluation (top-k accuracy)
| Corpus | Top-1 | Top-5 | Top-25 | Top-200 |
|-------------|--------|--------|--------|---------|
| DisTEMIST | 0.574 | 0.720 | 0.803 | 0.871 |
| SympTEMIST | 0.630 | 0.779 | 0.886 | 0.949 |
| MedProcNER | 0.655 | 0.767 | 0.840 | 0.894 |
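
## Example: computing top-k accuracy (sketch)
Top-k accuracy, as reported in the table, is the fraction of mentions whose gold code appears among the first k retrieved candidates. The sketch below shows one way to compute it. It assumes a single gold code per mention and a ranked list of candidate codes per mention; the card does not describe how the authors' evaluation handles special cases such as composite or missing codes, and the function name `top_k_accuracy` is hypothetical.

```python
def top_k_accuracy(ranked_codes, gold_codes, ks=(1, 5, 25, 200)):
    """Fraction of mentions whose gold code is within the top k candidates.

    ranked_codes: one list of candidate codes per mention, best first.
    gold_codes:   one gold code per mention (assumption: exactly one each).
    """
    if len(ranked_codes) != len(gold_codes):
        raise ValueError("ranked_codes and gold_codes must have the same length")
    if not gold_codes:
        raise ValueError("at least one mention is required")

    hits = {k: 0 for k in ks}
    for ranked, gold in zip(ranked_codes, gold_codes):
        for k in ks:
            if gold in ranked[:k]:
                hits[k] += 1

    n = len(gold_codes)
    return {k: hits[k] / n for k in ks}
```

Because the top-k candidate set contains every smaller top-k set, the values are non-decreasing in k, which matches the pattern in each row of the table.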