language: en
license: mit
tags:
  - medical
  - classification
  - biobert
  - pubmedqa
  - healthcare-rag
datasets:
  - qiaojin/PubMedQA
metrics:
  - f1
pipeline_tag: text-classification

BioBERT Medical Query Classifier

A fine-tuned version of dmis-lab/biobert-v1.1 that classifies medical questions into six categories.

Categories

| ID | Category   |
|----|------------|
| 0  | Diagnosis  |
| 1  | General    |
| 2  | Medication |
| 3  | Prevention |
| 4  | Symptoms   |
| 5  | Treatment  |

Results

| Metric      | Score  |
|-------------|--------|
| Macro F1    | 0.9066 |
| Weighted F1 | 0.9094 |
| Accuracy    | 0.9088 |
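
For readers unfamiliar with the difference between the two F1 variants reported above, here is a minimal pure-Python sketch. Macro F1 averages the per-class F1 scores equally, while weighted F1 weights each class by its number of true examples. The function names and the toy labels are illustrative assumptions; this is not the evaluation code or data used for this model.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 for one class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_summary(y_true, y_pred):
    """Return macro F1, support-weighted F1, and accuracy."""
    labels = sorted(set(y_true))  # assumption: only classes present in y_true
    support = Counter(y_true)     # number of true examples per class
    scores = {c: per_class_f1(y_true, y_pred, c) for c in labels}
    macro = sum(scores.values()) / len(labels)
    weighted = sum(scores[c] * support[c] for c in labels) / len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {"macro_f1": macro, "weighted_f1": weighted, "accuracy": accuracy}


# Toy, made-up labels with an imbalanced class distribution (4 vs. 2 examples)
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1]
summary = f1_summary(y_true, y_pred)
print(summary)
```

When the two scores are close, as in the table above, performance is likely similar across frequent and rare classes.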

Training Config

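| Item          | Value                            |
|---------------|----------------------------------|
| Base model    | dmis-lab/biobert-v1.1            |
| Dataset       | qiaojin/PubMedQA (211,186 rows)  |
| Split         | 80/10/10                         |
| Epochs        | 3                                |
| Learning rate | 2e-5                             |
| Batch size    | 16                               |
| Class weights | Balanced (custom WeightedTrainer) |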
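
The card does not say how the "Balanced" class weights were computed. A common choice is the inverse-frequency heuristic `n_samples / (n_classes * count_c)`, and the sketch below assumes that formula. The function name and the toy label list are hypothetical and only illustrate the idea.

```python
from collections import Counter


def balanced_class_weights(labels, num_classes):
    """Inverse-frequency weights: n_samples / (n_classes * count_c).

    Assumption: this mirrors the usual "balanced" heuristic; the actual
    weights used to train this model are not stated in the card.
    """
    counts = Counter(labels)
    total = len(labels)
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


# Toy, made-up label distribution: class 0 is common, class 2 is rare
toy_labels = [0] * 6 + [1] * 3 + [2] * 1
weights = balanced_class_weights(toy_labels, num_classes=3)
print(weights)  # rarer classes receive larger weights
```

In a custom `Trainer` subclass, weights like these are typically converted to a tensor and passed to `torch.nn.CrossEntropyLoss(weight=...)` inside an overridden `compute_loss`. Whether the WeightedTrainer used here does exactly that is an assumption.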

Usage

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    # Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
    tokenizer = AutoTokenizer.from_pretrained("AbdoMatrix/biobert-medical-classifier")
    model = AutoModelForSequenceClassification.from_pretrained("AbdoMatrix/biobert-medical-classifier")

    # Tokenize a single question, truncating to at most 256 tokens
    text = "What are the symptoms of diabetes?"
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

    # Inference only, so gradient tracking is disabled
    with torch.no_grad():
        outputs = model(**inputs)

    # Map the highest-scoring class index to its label name
    predicted = model.config.id2label[torch.argmax(outputs.logits, dim=1).item()]
    print(predicted)  # → Symptoms
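
To see how confident a prediction is rather than only its top label, the logits can be turned into probabilities with a softmax. The sketch below does this in pure Python. `ID2LABEL` copies the category table above, while `softmax`, `top_k`, and the example logits are illustrative assumptions, not real model output.

```python
import math

# Label mapping copied from the Categories table
ID2LABEL = {
    0: "Diagnosis",
    1: "General",
    2: "Medication",
    3: "Prevention",
    4: "Symptoms",
    5: "Treatment",
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=2):
    """Return the k most probable (label, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(ID2LABEL[i], probs[i]) for i in ranked[:k]]


# Made-up logits for illustration; with the real model, one option is
# outputs.logits[0].tolist()
example_logits = [0.1, -1.2, 0.3, -0.5, 3.0, 1.1]
ranking = top_k(example_logits, k=2)
for label, prob in ranking:
    print(f"{label}: {prob:.3f}")
```

A low top probability, or a small gap between the first two labels, can be a useful signal to route a query to a fallback path instead of trusting the predicted category.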

Project

Healthcare RAG-Powered Medical Q&A Assistant

eyouth x DEPI | Microsoft Machine Learning Track | 2026

GitHub: https://github.com/AbdooMatrix/Healthcare-RAG-Powered-Medical-QA-Assistant