OpenBioLLM-D: Discriminative ICD-10 Classifier
Trained as part of the Master's thesis "Enhancing Automated ICD-10 Medical Coding with Large Language Models", California State University, Sacramento.
Author: Namirah Imtieaz Shaik
Advisor: Dr. Haiquan Chen
Model Description
OpenBioLLM-D is a discriminative ICD-10 medical coding model built on top of Llama3-OpenBioLLM-8B. It treats ICD-10 coding as a single-label classification problem over the 30 most frequent diagnostic codes in MIMIC-IV clinical discharge summaries.
Architecture:
- Backbone: aaditya/Llama3-OpenBioLLM-8B loaded via AutoModel (hidden states only, no LM head)
- LoRA: r=16, alpha=32, targeting q/k/v/o/gate/up/down projections (~0.1% trainable params)
- Pooling: Masked mean pooling over non-padding token hidden states
- Head: Two-layer MLP (4096 → 1536 → 30 logits); see the sketch after this list
- Loss: CrossEntropyLoss (single-label multiclass)
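To make the pooling and head concrete, here is a minimal sketch in PyTorch. The layer sizes follow the list above; the activation function, module names, and absence of dropout are assumptions rather than the exact thesis implementation.

import torch
import torch.nn as nn

def masked_mean_pool(last_hidden_state, attention_mask):
    # Average hidden states over non-padding tokens only
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    return (last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

class ICD10Head(nn.Module):
    # Hypothetical two-layer MLP matching the 4096 -> 1536 -> 30 shape above
    def __init__(self, hidden_size=4096, intermediate_size=1536, num_labels=30):
        super().__init__()
        self.fc1 = nn.Linear(hidden_size, intermediate_size)
        self.act = nn.GELU()  # activation choice is an assumption
        self.fc2 = nn.Linear(intermediate_size, num_labels)

    def forward(self, pooled):
        return self.fc2(self.act(self.fc1(pooled)))

# Training step (single-label multiclass):
#   logits = head(masked_mean_pool(hidden_states, attention_mask))
#   loss = nn.CrossEntropyLoss()(logits, labels)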
Training Details:
- Dataset: MIMIC-IV discharge summaries
- Train / Val / Test: 16,540 / 2,068 / 2,068 examples
- Optimizer: AdamW with cosine LR schedule and 5% warmup
- Early stopping: monitored on macro F1 with patience=2
- Text handling: Head+tail cropping (40% head / 60% tail) to fit the 512-token limit; see the sketch after this list
- Results reported as mean ± std across 5 random seeds
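A minimal sketch of the head+tail cropping, assuming it is applied to token IDs before special tokens and padding are added; the 40/60 split follows the list above, but the exact thesis implementation may differ.

def head_tail_crop(token_ids, max_length=512, head_frac=0.4):
    # Keep the first 40% and the last 60% of the token budget when a note is too long
    if len(token_ids) <= max_length:
        return token_ids
    head_len = int(max_length * head_frac)
    tail_len = max_length - head_len
    return token_ids[:head_len] + token_ids[-tail_len:]

# Example (hypothetical usage):
#   token_ids = tokenizer(note, add_special_tokens=False)["input_ids"]
#   cropped   = head_tail_crop(token_ids)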
Results
| Metric | Score |
|---|---|
| Micro F1 | 0.7802 ± 0.0045 |
| ROC-AUC (Weighted OVR) | 0.9857 ± 0.0004 |
This is the best-performing discriminative model in the thesis, outperforming BERT-PLM-ICD (0.7466), Longformer-PLM-ICD (0.7316), RoBERTa-PLM-ICD (0.7282), Meditron-D (0.7668), and BioMistral-D (0.7776).
How to Load and Use
from transformers import AutoTokenizer, AutoModel
from peft import PeftModel
import torch
import torch.nn as nn
from types import SimpleNamespace
# Step 1 - Load base model
base_model = AutoModel.from_pretrained(
"aaditya/Llama3-OpenBioLLM-8B",
torch_dtype=torch.float16,
device_map="auto",
)
# Step 2 - Attach LoRA adapter
base_model = PeftModel.from_pretrained(
base_model,
"Namirah07/OpenBioLLM-D-ICD10",
subfolder="lora_adapter"
)
# Step 3 - Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Namirah07/OpenBioLLM-D-ICD10")
if tokenizer.pad_token is None:
tokenizer.pad_token = tokenizer.eos_token
# Step 4 - Load label map
import json
from huggingface_hub import hf_hub_download
label_map_path = hf_hub_download("Namirah07/OpenBioLLM-D-ICD10", "label_map.json")
with open(label_map_path) as f:
label_map = json.load(f)
id2label = {int(k): v for k, v in label_map["id2label"].items()}
# Step 5 - Tokenize and predict
note = "Patient admitted with chest pain and shortness of breath..."
inputs = tokenizer(
    note,
    return_tensors="pt",
    truncation=True,
    max_length=512,
)
inputs = {k: v.to(base_model.device) for k, v in inputs.items()}  # move inputs to the model's device
with torch.no_grad():
out = base_model(**inputs)
# Mean pool
mask = inputs["attention_mask"].unsqueeze(-1).float()
pooled = (out.last_hidden_state * mask).sum(1) / mask.sum(1)
# Note: custom_head.pt must be loaded separately for full inference
# See GitHub repository for the complete inference pipeline
print("See GitHub for full inference code including the MLP head")
For the complete inference pipeline, including the custom MLP head, see the full training and evaluation code in the GitHub repository linked below.
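If the classification head published as custom_head.pt in this repository is a plain state dict of the two-layer MLP described under Architecture, attaching it could look roughly like the sketch below, continuing from the variables defined above (pooled, id2label, hf_hub_download). The module layout, state-dict keys, and argmax decoding are assumptions; defer to the GitHub code if they disagree.

head_path = hf_hub_download("Namirah07/OpenBioLLM-D-ICD10", "custom_head.pt")

# Assumed layout: 4096 -> 1536 -> 30, as described under Architecture
head = nn.Sequential(
    nn.Linear(4096, 1536),
    nn.GELU(),  # activation is an assumption
    nn.Linear(1536, 30),
)
state = torch.load(head_path, map_location="cpu")
head.load_state_dict(state)  # will raise if the saved keys or shapes differ
head.eval()

with torch.no_grad():
    logits = head(pooled.float().cpu())
    pred_id = logits.argmax(dim=-1).item()
print("Predicted ICD-10 code:", id2label[pred_id])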
Dataset
MIMIC-IV clinical discharge summaries (Johnson et al., 2023). Access requires a PhysioNet credentialed account and data use agreement. https://physionet.org/content/mimiciv/
GitHub Repository
Full training code, evaluation scripts, hyperparameter tuning scripts, and the Gradio explainability demo: https://github.com/Namirah07/Enhancing-Automated-ICD-Medical-Coding-with-Large-Language-Models
Citation
@mastersthesis{shaik2025icd10,
author = {Namirah Imtieaz Shaik},
title = {Enhancing Automated ICD-10 Medical Coding with Large Language Models},
school = {California State University, Sacramento},
year = {2025},
advisor = {Dr. Haiquan Chen}
}
License
MIT License. The base model (Llama3-OpenBioLLM-8B) is subject to its own license on HuggingFace. The MIMIC-IV dataset requires a PhysioNet data use agreement.