# BoilerplateChecker-1225
BoilerplateChecker-1225 is a binary text classifier. Given a patient summary and a clinical trial's boilerplate exclusion criteria, it estimates whether the patient has a history of conditions that may exclude them from that trial.

"Boilerplate exclusions" are exclusion criteria that are not central to defining the target population for a specific trial, but that instead tend to exclude patients from many clinical trials in general.
Examples of "boilerplate exclusions" include concepts like "uncontrolled brain metastases" or "history of pneumonitis."
This model is fine-tuned from [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) for sequence classification on pairs of (`trial_boilerplate_text`, `patient_boilerplate_text`). "Patient boilerplate text" is the subsection of an overall patient summary that describes any history of such conditions.
This model is not intended to capture whether a patient is excluded from a clinical trial based on trial criteria central to defining the trial's target population, which include age, sex, cancer type, histology, cancer burden requirements, biomarker requirements, and treatment history requirements. These concepts are covered by the separate TrialChecker classification model.
**Important:** This is a research prototype for model development, not a medical device or approved clinical decision support tool. It is not intended for clinical decision-making.
## Training summary
The classifier was trained with a script that:

- Loads three sources of annotated patient–trial pairs:
  - Pairs originating from space-specific eligibility checks
  - "Patient → top cohorts" checks (rounds 1–3)
  - "Trial space → top patients" checks (rounds 1–3)
- Deduplicates by `['patient_boilerplate_text', 'trial_boilerplate_text']`
- Builds the final text input as `"Patient history: " + patient_boilerplate_text + "\nTrial exclusions:" + trial_boilerplate_text`
- Uses `exclusion_result` as the binary label (0/1)
- Fine-tunes ModernBERT-large for sequence classification (2 labels) at `max_length=3192`
### Key training hyperparameters (8× H100)

- Base model: `answerdotai/ModernBERT-large`
- Max length: 3192
- Optimizer settings: `learning_rate=2e-5`, `weight_decay=0.01`
- Batch size: `per_device_train_batch_size=8`
- Epochs: 2
- Save strategy: `epoch`
- Tokenizer: `AutoTokenizer.from_pretrained("answerdotai/ModernBERT-large")`
- Data collator: `DataCollatorWithPadding`
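These hyperparameters correspond roughly to the Trainer configuration below (a minimal sketch under the listed settings, not the original training script; `output_dir` and `tokenized_train` are placeholders):

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

base = "answerdotai/ModernBERT-large"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

args = TrainingArguments(
    output_dir="boilerplate-checker",  # placeholder
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=8,
    num_train_epochs=2,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,  # placeholder: dataset tokenized at max_length=3192
    data_collator=DataCollatorWithPadding(tokenizer),
)
# trainer.train()
```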
## Intended use
- Input: a single string combining the patient's history of common "boilerplate exclusion conditions" (if any) with a clinical trial's "boilerplate exclusion criteria" (if any).
- Output: the probability that the patient is excluded from the trial based on those "boilerplate exclusion criteria".
- Use cases:
  - Deeper pre-screening of candidate patients for specific trials
- Out of scope:
  - Confirming formal eligibility or safety
  - Autonomous medical record review, diagnosis, or treatment decision-making
## Inference (Transformers)

### Quick start (single example)
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

device = "cuda" if torch.cuda.is_available() else "cpu"
MODEL_REPO = "ksg-dfci/BoilerplateChecker-1225"

tok = AutoTokenizer.from_pretrained(MODEL_REPO)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_REPO).to(device)
model.eval()

trial_boilerplate_text = "Patients with uncontrolled brain metastases are excluded."
patient_boilerplate_text = "New brain metastases identified 01/02/23, not yet treated."

# Compose the input exactly as in training (note: no space after "Trial exclusions:")
text = "Patient history: " + patient_boilerplate_text + "\nTrial exclusions:" + trial_boilerplate_text

# Raw Transformers model; truncate at the training max_length
enc = tok(text, return_tensors="pt", truncation=True, max_length=3192).to(device)
with torch.no_grad():
    logits = model(**enc).logits
probs = logits.softmax(-1).squeeze(0)

# Label mapping was set in training: {0: "NEGATIVE", 1: "POSITIVE"}
p_positive = float(probs[1])
print(f"Exclusion probability: {p_positive:.3f}")
```
Or use the pipeline API to get similar outputs:

```python
from transformers import pipeline

pipe = pipeline("text-classification", model="ksg-dfci/BoilerplateChecker-1225")
pipe([text])
```
### Batched scoring
```python
from typing import List
import torch

def score_pairs(trial_texts: List[str], patient_texts: List[str], tokenizer, model,
                max_length=3192, batch_size=8):
    assert len(trial_texts) == len(patient_texts)
    device = next(model.parameters()).device
    scores = []
    for i in range(0, len(trial_texts), batch_size):
        batch_trials = trial_texts[i:i + batch_size]
        batch_patients = patient_texts[i:i + batch_size]
        # Same input format as training
        texts = ["Patient history: " + p + "\nTrial exclusions:" + t
                 for t, p in zip(batch_trials, batch_patients)]
        enc = tokenizer(texts, return_tensors="pt", padding=True,
                        truncation=True, max_length=max_length).to(device)
        with torch.no_grad():
            logits = model(**enc).logits
        probs = logits.softmax(-1)[:, 1]  # POSITIVE class
        scores.extend(probs.detach().cpu().tolist())
    return scores

# Example
trial_exclusion_texts = [trial_boilerplate_text] * 3
patient_boilerplate_texts = [patient_boilerplate_text,
                             "Different patient comorbidities 1...",
                             "Different patient comorbidities 2..."]
scores = score_pairs(trial_exclusion_texts, patient_boilerplate_texts, tok, model)
print(scores)
```
## Thresholding & calibration
- Default decision: 0.5 on the POSITIVE probability.
- For better calibration/operating points, tune the threshold on a validation set (e.g., maximize F1, optimize Youden’s J, or set to a desired precision).
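As one way to pick an operating point, the sketch below sweeps candidate thresholds and keeps the one maximizing F1 for the POSITIVE class (illustrative only; `val_probs` and `val_labels` are hypothetical validation-set outputs):

```python
def best_f1_threshold(probs, labels):
    """Sweep the observed probabilities as candidate thresholds and
    return (threshold, F1) maximizing F1 for the POSITIVE class."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        fp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
        fn = sum(1 for p, y in zip(probs, labels) if p < t and y == 1)
        if tp == 0:
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Hypothetical validation scores and labels
val_probs = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9]
val_labels = [0, 0, 1, 1, 1, 1]
t, f1 = best_f1_threshold(val_probs, val_labels)
```

The same loop can be adapted to other objectives (e.g., Youden's J, or the highest recall at a required precision).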
## How to prepare inputs

- Trial boilerplate text: as in the example above, a compact list of a trial's exclusion criteria that are not central to its target population.
- Patient boilerplate text: as in the example above, a concise summary of any medical conditions that may meet common boilerplate exclusion criteria.
You can generate these inputs with your upstream LLM pipeline (e.g., gpt-oss-120b or our OncoReasoning-3B-1225 model for summarization and trial information extraction), but the classifier accepts any plain strings in the format shown above.
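Whatever pipeline produces the two strings, they are joined in the training format; a tiny helper (hypothetical name `build_input`) makes that format explicit:

```python
def build_input(patient_boilerplate_text: str, trial_boilerplate_text: str) -> str:
    """Compose the classifier input exactly as during training
    (a space after 'Patient history:' but none after 'Trial exclusions:')."""
    return ("Patient history: " + patient_boilerplate_text
            + "\nTrial exclusions:" + trial_boilerplate_text)

text = build_input("No significant comorbidities.", "History of pneumonitis.")
```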
## Reproducibility (high-level)
Below is the minimal structure used by the training script to build the dataset before tokenization:
```python
# 1) Load and merge three labeled sources
#    - space_specific_eligibility_checks.parquet
#    - top_ten_cohorts_checked_round{1,2,3}.csv
#    - top_twenty_patients_checked_round{1,2,3}.csv
# 2) Deduplicate by ['patient_boilerplate_text', 'trial_boilerplate_text'] and keep:
#    - split, patient_boilerplate_text, trial_boilerplate_text, exclusion_result
# 3) Compose input text and label:
text = "Patient history: " + patient_boilerplate_text + "\nTrial exclusions:" + trial_boilerplate_text
label = int(exclusion_result)  # 0 or 1
# 4) Tokenize with the ModernBERT tokenizer (max_length=3192, truncation=True)
# 5) Train AutoModelForSequenceClassification; the trained model outputs probabilities
#    for the "POSITIVE" class (patient may be excluded) and the "NEGATIVE" class
#    (patient not predicted to be excluded)
```
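The deduplication step can be sketched in plain Python (a sketch only; the original script may instead use something like pandas `drop_duplicates` on the two key columns):

```python
def dedupe_pairs(rows):
    """Keep the first row for each unique
    (patient_boilerplate_text, trial_boilerplate_text) pair."""
    seen = set()
    out = []
    for row in rows:
        key = (row["patient_boilerplate_text"], row["trial_boilerplate_text"])
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

rows = [
    {"patient_boilerplate_text": "A", "trial_boilerplate_text": "X", "exclusion_result": 1},
    {"patient_boilerplate_text": "A", "trial_boilerplate_text": "X", "exclusion_result": 1},
    {"patient_boilerplate_text": "B", "trial_boilerplate_text": "X", "exclusion_result": 0},
]
deduped = dedupe_pairs(rows)
```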
To reproduce exactly, consult and run the original training scripts at https://github.com/kenlkehl/matchminer-ai-training.
## Limitations & ethical considerations
- Outputs reflect training data and may contain biases or errors.
- The model estimates probability of exclusion based on common boilerplate criteria, not formal eligibility screening.
- Not validated for safety-critical use; do not use for diagnosis or treatment decisions.
## Citation

If you use this model or parts of the pipeline, please cite this model card and the arXiv preprint (https://arxiv.org/abs/2412.17228) or the corresponding journal publication (pending).