BioGPT BI-RADS Classifier

This model is a fine-tuned version of microsoft/biogpt-large for BI-RADS classification of radiology reports.

Model Description

Base Model: microsoft/biogpt-large
Task: Multi-class text classification (BI-RADS categories 0-6)
Training Data: Radiology reports with BI-RADS annotations
Accuracy: 97.36%
F1-Score (Macro): 90.66%

Performance

Overall Metrics

Accuracy: 97.36%
F1-Score (Macro): 90.66%
F1-Score (Weighted): 97.34%
Precision (Macro): 92.65%
Recall (Macro): 88.96%

Per-Class Performance

BI-RADS	Precision	Recall	F1-Score	Support
0	0.9946	0.9482	0.9708	193
1	0.9504	0.9664	0.9583	119
2	0.9740	0.9943	0.9840	527
3	1.0000	0.8333	0.9091	18
4	0.9000	0.8182	0.8571	11
5	0.6667	0.6667	0.6667	3
6	1.0000	1.0000	1.0000	1
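
The overall metrics can be re-derived from the per-class table as a consistency check. The sketch below is plain Python with the table values copied in. Recovering the number of correct predictions per class by rounding recall × support is an assumption about how the table was produced, not something the card states.

```python
# Per-class rows copied from the table above:
# BI-RADS category -> (precision, recall, f1, support)
rows = {
    0: (0.9946, 0.9482, 0.9708, 193),
    1: (0.9504, 0.9664, 0.9583, 119),
    2: (0.9740, 0.9943, 0.9840, 527),
    3: (1.0000, 0.8333, 0.9091, 18),
    4: (0.9000, 0.8182, 0.8571, 11),
    5: (0.6667, 0.6667, 0.6667, 3),
    6: (1.0000, 1.0000, 1.0000, 1),
}

n_classes = len(rows)
total = sum(support for _, _, _, support in rows.values())

# Macro averages: unweighted mean over classes, so rare classes count as much as common ones.
macro_precision = sum(p for p, _, _, _ in rows.values()) / n_classes
macro_recall = sum(r for _, r, _, _ in rows.values()) / n_classes
macro_f1 = sum(f for _, _, f, _ in rows.values()) / n_classes

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f * s for _, _, f, s in rows.values()) / total

# Assumption: correct predictions per class = round(recall * support).
correct = sum(round(r * s) for _, r, _, s in rows.values())
accuracy = correct / total

print(f"Samples:            {total}")
print(f"Accuracy:           {accuracy:.2%}")
print(f"F1 (macro):         {macro_f1:.2%}")
print(f"F1 (weighted):      {weighted_f1:.2%}")
print(f"Precision (macro):  {macro_precision:.2%}")
print(f"Recall (macro):     {macro_recall:.2%}")
```

The gap between the macro and weighted F1-scores comes from categories 3 to 6, which together account for only 33 of the 872 samples but carry four sevenths of the macro average.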

Usage

from transformers import AutoTokenizer, BioGptForSequenceClassification
import torch

# Load model and tokenizer
model = BioGptForSequenceClassification.from_pretrained("ishro/biogpt-aura")
tokenizer = AutoTokenizer.from_pretrained("ishro/biogpt-aura")

# Prepare input
report_text = "Your radiology report text here..."
inputs = tokenizer(report_text, return_tensors="pt", padding=True, truncation=True, max_length=512)

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1).item()

# Map to BI-RADS label
birads_label = model.config.id2label[predicted_class]
print(f"Predicted BI-RADS: {birads_label}")
print(f"Confidence: {predictions[0][predicted_class].item():.4f}")
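
The confidence value printed above can also be used to flag uncertain predictions for manual review. The following is a minimal sketch in plain Python, not part of the released model: the `triage` helper, the 0.90 threshold, and the example logits are all hypothetical, and the returned index still has to be mapped through `model.config.id2label` as in the snippet above.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def triage(logits, threshold=0.90):
    """Return the top class index, its probability, and a manual-review flag.

    `threshold` is a hypothetical cut-off; it should be tuned on held-out data.
    """
    probs = softmax(logits)
    class_index = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[class_index]
    return {
        "class_index": class_index,
        "confidence": confidence,
        "needs_review": confidence < threshold,
    }


# Made-up logits for a single report, one value per class (7 classes).
example_logits = [0.1, 0.3, 4.0, 0.2, -1.0, -2.0, -3.0]
result = triage(example_logits)
print(result)
```

With real model output, the logits could be obtained as `outputs.logits[0].tolist()`.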

Training Details

Training Hyperparameters

Learning Rate: 2e-5
Batch Size: 4 per device (2 GPUs)
Gradient Accumulation Steps: 8
Effective Batch Size: 64
Epochs: 3
Optimizer: AdamW (fused)
Mixed Precision: BF16
Hardware: 2x NVIDIA L40S (46GB each)
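
The effective batch size follows from the other three values. The short sketch below shows the arithmetic and, as an assumption, how the listed hyperparameters could map onto keyword names of the Hugging Face `TrainingArguments` class. The card does not say which training framework was used, so the dictionary is illustrative only.

```python
# Values taken from the hyperparameter list above.
per_device_batch_size = 4
num_gpus = 2
gradient_accumulation_steps = 8

# Samples contributing to each optimizer step.
effective_batch_size = per_device_batch_size * num_gpus * gradient_accumulation_steps

# Assumption: training used the Hugging Face Trainer. These are keyword names
# of transformers.TrainingArguments; the actual training script is not published here.
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": per_device_batch_size,
    "gradient_accumulation_steps": gradient_accumulation_steps,
    "num_train_epochs": 3,
    "optim": "adamw_torch_fused",
    "bf16": True,
}

print(f"Effective batch size: {effective_batch_size}")
```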

Training Data

The model was trained on radiology reports with the following features:

Report observations
Conclusions
Recommendations
Patient metadata (age, hormonal therapy, family history, etc.)
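
The card does not document how these fields were combined into a single input string. The sketch below shows one possible way to do it in plain Python. The `build_report_text` helper, the section labels, the field order, and the example values are all hypothetical, so the result should be checked against the format the model was actually trained on before it is relied upon.

```python
def build_report_text(observations, conclusion, recommendations, metadata=None):
    """Join report sections into one string (hypothetical layout, not the training format)."""
    parts = []
    if metadata:
        # Render metadata as "key: value" pairs on a single line.
        meta = "; ".join(f"{key}: {value}" for key, value in metadata.items())
        parts.append(f"Patient metadata: {meta}")
    parts.append(f"Observations: {observations.strip()}")
    parts.append(f"Conclusion: {conclusion.strip()}")
    parts.append(f"Recommendations: {recommendations.strip()}")
    return "\n".join(parts)


# Made-up example values for illustration only.
report_text = build_report_text(
    observations="Scattered fibroglandular densities. No suspicious mass.",
    conclusion="No evidence of malignancy.",
    recommendations="Routine screening in 12 months.",
    metadata={"age": 52, "hormonal therapy": "no", "family history": "no"},
)
print(report_text)
```

The resulting `report_text` can be passed to the tokenizer in the same way as in the Usage section.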

Limitations

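Performance on the rare BI-RADS categories is lower and less certain because few samples are available: categories 4 and 5 reach F1-scores of only 0.8571 and 0.6667, and the perfect score for category 6 rests on a single evaluation sample
The model was trained on a specific radiology report format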
May not generalize well to reports from different institutions without fine-tuning

Ethical Considerations

This model is intended for research purposes and should not be used as the sole basis for clinical decisions
Always consult with qualified medical professionals for diagnosis and treatment
The model may have biases based on the training data distribution

Citation

If you use this model, please cite:

@misc{biogpt-birads-classifier,
  author = {Your Name},
  title = {BioGPT BI-RADS Classifier},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/ishro/biogpt-aura}
}

Model Card Authors

ishro

Model Card Contact

For questions or issues, please open an issue on the model repository.

Safetensors

Model size: 0.3B params
Tensor type: F32

Evaluation results (self-reported)

Accuracy: 0.974
F1 (Macro): 0.907