Tags: feature-extraction, transformers, safetensors, Spanish, roberta, contrastive-learning, Spanish-UMLS, hierarchical-enrichment, text-embeddings-inference
How to use ICB-UMA/HERBERT-GP with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="ICB-UMA/HERBERT-GP")

# Load model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("ICB-UMA/HERBERT-GP")
model = AutoModel.from_pretrained("ICB-UMA/HERBERT-GP")
```
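The feature-extraction pipeline returns one vector per token; a single vector per mention is usually obtained by pooling. The card does not state which pooling HERBERT-GP uses, so mean pooling here is an assumption; the sketch below runs on a mock token-embedding array rather than real model output.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padding positions.

    token_embeddings: (seq_len, hidden) array from the encoder
    attention_mask:   (seq_len,) array of 0/1 padding flags
    """
    mask = attention_mask[:, None].astype(float)      # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)    # sum over real tokens only
    counts = mask.sum()                               # number of real tokens
    return summed / counts

# Mock encoder output: 4 tokens (last one is padding), hidden size 3
emb = np.array([[1.0, 0.0, 2.0],
                [3.0, 0.0, 0.0],
                [2.0, 0.0, 4.0],
                [9.0, 9.0, 9.0]])   # padding row, must be ignored
mask = np.array([1, 1, 1, 0])
print(mean_pool(emb, mask))          # [2. 0. 2.]
```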
HERBERT: Leveraging UMLS Hierarchical Knowledge to Enhance Clinical Entity Normalization in Spanish
HERBERT-GP is a contrastive-learning-based bi-encoder for medical entity normalization in Spanish.
It leverages hierarchical relationships from UMLS (parents and grandparents) to enhance the candidate retrieval step for entity linking in Spanish clinical texts.
Key features:
- Base model: PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
- Trained with 15 positive pairs per anchor using synonyms, parents, and grandparents from UMLS/SNOMED-CT.
- Task: Normalization of disease, procedure, and symptom mentions to SNOMED-CT/UMLS codes.
- Domain: Spanish biomedical/clinical texts.
- Corpora: DisTEMIST, MedProcNER, SympTEMIST.
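As a bi-encoder, HERBERT-GP supports candidate retrieval by embedding the mention and all candidate concept names, then ranking concepts by similarity. The following is a minimal sketch of that retrieval step using cosine similarity over mock vectors; the similarity function and top-k procedure are assumptions based on standard bi-encoder practice, and the CUIs are illustrative only.

```python
import numpy as np

def retrieve_top_k(mention_vec, concept_vecs, concept_codes, k=5):
    """Rank candidate concepts by cosine similarity to the mention embedding."""
    m = mention_vec / np.linalg.norm(mention_vec)
    c = concept_vecs / np.linalg.norm(concept_vecs, axis=1, keepdims=True)
    sims = c @ m                         # cosine similarity per concept
    order = np.argsort(-sims)[:k]        # indices of the k most similar
    return [(concept_codes[i], float(sims[i])) for i in order]

# Mock embeddings: in practice these come from the encoder
codes = ["C0011849", "C0020538", "C0004096"]   # illustrative UMLS CUIs
concepts = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
mention = np.array([0.9, 0.1])
print(retrieve_top_k(mention, concepts, codes, k=2))
```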
Evaluation (top-k accuracy):
| Corpus | Top-1 | Top-5 | Top-25 | Top-200 |
|---|---|---|---|---|
| DisTEMIST | 0.574 | 0.720 | 0.803 | 0.871 |
| SympTEMIST | 0.630 | 0.779 | 0.886 | 0.949 |
| MedProcNER | 0.655 | 0.767 | 0.840 | 0.894 |
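Top-k accuracy in the table is the fraction of mentions whose gold code appears among the top k retrieved candidates. A small sketch of the metric on toy data (this is not the official evaluation script):

```python
def top_k_accuracy(gold_codes, ranked_candidates, k):
    """Fraction of mentions whose gold code is in the top-k candidate list.

    gold_codes:        list of gold codes, one per mention
    ranked_candidates: list of ranked code lists, one per mention
    """
    hits = sum(gold in cands[:k] for gold, cands in zip(gold_codes, ranked_candidates))
    return hits / len(gold_codes)

# Toy example: 2 of 3 mentions have the gold code within the top 2
gold = ["A", "B", "C"]
ranked = [["A", "X"], ["X", "B"], ["X", "Y", "C"]]
print(top_k_accuracy(gold, ranked, k=2))   # 0.666...
```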