Tags: feature-extraction · transformers · safetensors · spanish · roberta · contrastive-learning · Spanish-UMLS · hierarchical-enrichment · entity-linking · biomedical · text-embeddings-inference
How to use ICB-UMA/HERBERT-P-30 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="ICB-UMA/HERBERT-P-30")

# Or load the model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("ICB-UMA/HERBERT-P-30")
model = AutoModel.from_pretrained("ICB-UMA/HERBERT-P-30")
```
HERBERT: Leveraging UMLS Hierarchical Knowledge to Enhance Clinical Entity Normalization in Spanish
HERBERT-P is a contrastive-learning-based bi-encoder for medical entity normalization in Spanish, leveraging synonym and parent relationships from UMLS to enhance candidate retrieval for entity linking in clinical texts.
Key features:
- Base model: PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
- Trained with 30 positive pairs per anchor (synonyms + parents)
- Task: normalization of disease, procedure, and symptom mentions to SNOMED-CT/UMLS codes
- Domain: Spanish biomedical/clinical texts
- Corpora: DisTEMIST, MedProcNER, SympTEMIST
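Candidate retrieval with the bi-encoder can be sketched as follows. This is a minimal example, not the authors' pipeline: mean pooling over token embeddings, cosine similarity for ranking, and a three-term toy candidate list are all assumptions made for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "ICB-UMA/HERBERT-P-30"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

def embed(texts):
    """Encode texts into L2-normalized sentence embeddings (assumed mean pooling)."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state          # (batch, tokens, dim)
    mask = batch["attention_mask"].unsqueeze(-1)           # ignore padding tokens
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # mean over real tokens
    return torch.nn.functional.normalize(pooled, dim=-1)

mentions = ["dolor torácico"]
# Illustrative terminology entries, not taken from the actual gazetteer:
candidates = ["dolor torácico", "cefalea", "disnea"]

scores = embed(mentions) @ embed(candidates).T             # cosine similarities
topk = scores.topk(k=2, dim=-1).indices                    # ranked candidate ids
```

In practice the candidate side (the full SNOMED-CT/UMLS terminology) would be embedded once and indexed for nearest-neighbour search rather than re-encoded per query.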
Benchmark Results
| Corpus | Top-1 | Top-5 | Top-25 | Top-200 |
|---|---|---|---|---|
| DisTEMIST | 0.588 | 0.723 | 0.803 | 0.867 |
| SympTEMIST | 0.635 | 0.784 | 0.882 | 0.946 |
| MedProcNER | 0.651 | 0.765 | 0.838 | 0.892 |
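The Top-k columns can be read as accuracy@k over ranked candidate lists: the fraction of mentions whose gold code appears among the top k retrieved candidates. A minimal sketch of that metric, using toy CUIs rather than corpus data:

```python
def top_k_accuracy(gold, ranked, k):
    """Fraction of mentions whose gold code is in the top-k candidates."""
    return sum(g in r[:k] for g, r in zip(gold, ranked)) / len(gold)

# Toy gold codes and ranked candidate lists (illustrative, not corpus data):
gold = ["C0008031", "C0018681", "C0013404"]
ranked = [
    ["C0008031", "C0151740"],   # hit at rank 1
    ["C0042571", "C0018681"],   # hit at rank 2
    ["C0011168", "C0013404"],   # hit at rank 2
]

acc1 = top_k_accuracy(gold, ranked, 1)   # only the first mention hits at rank 1
acc5 = top_k_accuracy(gold, ranked, 5)   # all three hit within the top 5
```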