BioBERT Fine-Tuned
Model Description
This model is a fine-tuned version of BioBERT, a pre-trained biomedical language model, adapted for medical text classification. It assigns each medical abstract to one of five predefined disease categories based on its content.
Training Data
- Dataset: Contains 2286 medical abstracts across five categories:
  - Neoplasms
  - Digestive System Diseases
  - Nervous System Diseases
  - Cardiovascular Diseases
  - General Pathological Conditions
- Preprocessing: Includes normalization, lemmatization, tokenization, stopword removal, and medical term standardization.
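
The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration rather than the pipeline actually used: the stopword set, the lemma lookup table, the abbreviation map, and the order of the steps are all hypothetical stand-ins, since the original resources are not documented here.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny resources (assumptions, not the real lists).
STOPWORDS = {"the", "a", "an", "of", "in", "with", "and", "was", "were", "is"}
LEMMAS = {"patients": "patient", "tumors": "tumor", "tumours": "tumor", "treated": "treat"}
TERM_MAP = {"mi": "myocardial infarction", "htn": "hypertension"}


def preprocess(text: str) -> str:
    """Apply normalization, tokenization, stopword removal,
    lookup-based lemmatization, and medical term standardization."""
    # Normalization: Unicode NFKC folding and lowercasing.
    text = unicodedata.normalize("NFKC", text).lower()
    # Tokenization: alphanumeric runs, allowing internal hyphens.
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text)
    cleaned = []
    for token in tokens:
        # Stopword removal.
        if token in STOPWORDS:
            continue
        # Lemmatization via a lookup table (a real pipeline would
        # likely use a proper lemmatizer instead).
        token = LEMMAS.get(token, token)
        # Medical term standardization: expand known abbreviations.
        token = TERM_MAP.get(token, token)
        cleaned.append(token)
    return " ".join(cleaned)


print(preprocess("The patients with HTN were treated."))
```

Inputs at inference time should presumably go through the same steps as the training data, otherwise the model sees text in a different form from what it was fine-tuned on.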
Intended Use
- Medical Text Classification: This model can be used to categorize medical abstracts and research papers into the five disease categories listed under Training Data.
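
A minimal inference sketch is shown below, assuming the fine-tuned checkpoint is available as a standard Hugging Face sequence-classification model. The `model_path` argument is a placeholder for wherever the checkpoint lives, and the label order in `ID2LABEL` is an assumption that should be checked against the checkpoint's own `config.id2label`.

```python
import math

# Assumed label order; verify against the checkpoint's config.id2label.
ID2LABEL = {
    0: "Neoplasms",
    1: "Digestive System Diseases",
    2: "Nervous System Diseases",
    3: "Cardiovascular Diseases",
    4: "General Pathological Conditions",
}


def top_label(logits, id2label=ID2LABEL):
    """Softmax over raw logits; return (label, probability) of the best class."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    best = max(range(len(logits)), key=lambda i: exps[i])
    return id2label[best], exps[best] / total


def classify(abstract, model_path):
    """Classify one abstract with a checkpoint stored at model_path (placeholder)."""
    # Imported lazily so top_label can be used without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(model_path)
    model.eval()
    # BERT-style models accept at most 512 tokens, so long abstracts are truncated.
    inputs = tokenizer(abstract, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_label(logits)
```

Calling `classify("...abstract text...", "path/to/checkpoint")` would return a `(label, probability)` pair. The returned probability is only a softmax score and should not be read as a calibrated confidence.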
Limitations
- Not suitable for general-purpose NLP tasks.
- Domain-specific: The model may not perform well outside the medical field or with non-English text.