library_name: transformers
tags:
  - contrastive-learning
  - Spanish-UMLS
  - Hierarchical-enrichment
  - entity-linking
  - biomedical
  - spanish
license: mit
language:
  - es
base_model:
  - PlanTL-GOB-ES/roberta-base-biomedical-clinical-es

HERBERT: Leveraging UMLS Hierarchical Knowledge to Enhance Clinical Entity Normalization in Spanish

HERBERT-P is a contrastive-learning-based bi-encoder for medical entity normalization in Spanish, leveraging synonym and parent relationships from UMLS to enhance candidate retrieval for entity linking in clinical texts.

Key features:

  • Base model: PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
  • Training: 15 positive pairs per anchor (synonyms and parents)
  • Task: normalization of disease, procedure, and symptom mentions to SNOMED-CT/UMLS codes
  • Domain: Spanish biomedical/clinical texts
  • Corpora: DisTEMIST, MedProcNER, SympTEMIST
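
As a rough illustration of bi-encoder candidate retrieval, the sketch below embeds mentions and terminology entries with the same encoder and ranks candidates by cosine similarity. It is a minimal sketch under stated assumptions, not the authors' pipeline: the helper names `embed`, `cosine`, and `rank_candidates` are hypothetical, the first-token pooling is an assumption (this card does not state the pooling strategy), and the repository id must be supplied by the reader. The toy vectors and codes are made up.

```python
import math


def embed(texts, model_id):
    """Embed strings with a Hugging Face encoder.

    Assumption: first-token ([CLS]/<s>) pooling; the model card does not
    specify which pooling the authors use.
    """
    # Imported lazily so the ranking helpers work without torch installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state
    return hidden[:, 0].tolist()


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(mention_vec, candidate_vecs, codes, k=5):
    """Return the k codes whose embeddings are most similar to the mention."""
    scored = sorted(
        zip(codes, candidate_vecs),
        key=lambda pair: cosine(mention_vec, pair[1]),
        reverse=True,
    )
    return [code for code, _ in scored[:k]]


# Toy example with hand-written 2-d vectors (not real embeddings or codes).
toy_codes = ["C1", "C2", "C3"]
toy_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(rank_candidates([0.9, 0.1], toy_vecs, toy_codes, k=2))
```

With a real model, `embed` would be called once for the terminology entries and once for the mentions, passing this model's Hub repository id as `model_id`; for large terminologies an approximate nearest-neighbor index would normally replace the exhaustive sort.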

Benchmark Results

| Corpus | Top-1 | Top-5 | Top-25 | Top-200 |
|---|---|---|---|---|
| DisTEMIST | 0.574 | 0.720 | 0.803 | 0.869 |
| SympTEMIST | 0.630 | 0.779 | 0.881 | 0.945 |
| MedProcNER | 0.651 | 0.763 | 0.838 | 0.892 |
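
For readers who want to reproduce figures of this kind, the sketch below shows one common way to compute a Top-k score. It assumes the columns report top-k accuracy, meaning the fraction of mentions whose gold code appears among the k highest-ranked candidates; the card does not spell out the metric, so treat that reading as an assumption. The function name `top_k_accuracy` and the toy rankings are hypothetical and are not benchmark data.

```python
def top_k_accuracy(ranked_codes, gold_codes, k):
    """Fraction of mentions whose gold code is among the first k candidates.

    ranked_codes: one list of candidate codes per mention, best first.
    gold_codes:   the correct code for each mention, in the same order.
    """
    if len(ranked_codes) != len(gold_codes):
        raise ValueError("ranked_codes and gold_codes must have the same length")
    if not gold_codes:
        raise ValueError("at least one mention is required")
    hits = sum(
        1 for ranked, gold in zip(ranked_codes, gold_codes) if gold in ranked[:k]
    )
    return hits / len(gold_codes)


# Toy rankings for three mentions (made-up codes).
toy_ranked = [["A", "B", "C"], ["B", "A", "C"], ["C", "B", "A"]]
toy_gold = ["A", "A", "A"]
for k in (1, 2, 3):
    print(k, top_k_accuracy(toy_ranked, toy_gold, k))
```

Under this definition the score can only stay equal or increase as k grows, which matches the pattern in each row of the table.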