Tags: feature-extraction · transformers · safetensors · spanish · roberta · contrastive-learning · Spanish-UMLS · hierarchical-enrichment · entity-linking · biomedical · text-embeddings-inference
How to use ICB-UMA/HERBERT-P-30 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="ICB-UMA/HERBERT-P-30")

# Or load the model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("ICB-UMA/HERBERT-P-30")
model = AutoModel.from_pretrained("ICB-UMA/HERBERT-P-30")
```
HERBERT: Leveraging UMLS Hierarchical Knowledge to Enhance Clinical Entity Normalization in Spanish
HERBERT-P is a contrastive-learning-based bi-encoder for medical entity normalization in Spanish, leveraging synonym and parent relationships from UMLS to enhance candidate retrieval for entity linking in clinical texts.
Key features:
- Base model: PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
- Trained with 30 positive pairs per anchor (synonyms + parents)
- Task: normalization of disease, procedure, and symptom mentions to SNOMED-CT/UMLS codes
- Domain: Spanish biomedical/clinical texts
- Corpora: DisTEMIST, MedProcNER, SympTEMIST
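Candidate retrieval with the bi-encoder can be sketched as follows. This is a minimal example, not the authors' pipeline: mean pooling over token embeddings, cosine similarity for ranking, and a three-term toy candidate list are all assumptions made for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "ICB-UMA/HERBERT-P-30"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

def embed(texts):
    """Encode texts into L2-normalized sentence embeddings (assumed mean pooling)."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state          # (batch, tokens, dim)
    mask = batch["attention_mask"].unsqueeze(-1)           # ignore padding tokens
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # mean over real tokens
    return torch.nn.functional.normalize(pooled, dim=-1)

mentions = ["dolor torácico"]
# Illustrative terminology entries, not taken from the actual gazetteer:
candidates = ["dolor torácico", "cefalea", "disnea"]

scores = embed(mentions) @ embed(candidates).T             # cosine similarities
topk = scores.topk(k=2, dim=-1).indices                    # ranked candidate ids
```

In practice the candidate side (the full SNOMED-CT/UMLS terminology) would be embedded once and indexed for nearest-neighbour search rather than re-encoded per query.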
Benchmark Results
| Corpus | Top-1 | Top-5 | Top-25 | Top-200 |
|---|---|---|---|---|
| DisTEMIST | 0.588 | 0.723 | 0.803 | 0.867 |
| SympTEMIST | 0.635 | 0.784 | 0.882 | 0.946 |
| MedProcNER | 0.651 | 0.765 | 0.838 | 0.892 |
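The Top-k columns can be read as accuracy@k over ranked candidate lists: the fraction of mentions whose gold code appears among the top k retrieved candidates. A minimal sketch of that metric, using toy CUIs rather than corpus data:

```python
def top_k_accuracy(gold, ranked, k):
    """Fraction of mentions whose gold code is in the top-k candidates."""
    return sum(g in r[:k] for g, r in zip(gold, ranked)) / len(gold)

# Toy gold codes and ranked candidate lists (illustrative, not corpus data):
gold = ["C0008031", "C0018681", "C0013404"]
ranked = [
    ["C0008031", "C0151740"],   # hit at rank 1
    ["C0042571", "C0018681"],   # hit at rank 2
    ["C0011168", "C0013404"],   # hit at rank 2
]

acc1 = top_k_accuracy(gold, ranked, 1)   # only the first mention hits at rank 1
acc5 = top_k_accuracy(gold, ranked, 5)   # all three hit within the top 5
```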