---
language:
- de
license: mit
library_name: transformers
pipeline_tag: text-classification
tags:
- radiology
- medical-imaging
- chest-ct
- multi-label-classification
- radbert
- german
- ctrate
base_model: zzxslp/RadBERT-RoBERTa-4m
---

# RadBERT German CTRate Classifier

A **RadBERT**-based multi-label classifier for predicting 18 pathology labels from **German-language** radiology reports.

The training data consists of German-translated reports from the [CTRate](https://huggingface.co/datasets/ibrahimhamamci/CT-RATE) dataset, translated using Qwen 3.5 9B.

## Model Details

| Property | Value |
|---|---|
| **Base model** | RadBERT (`zzxslp/RadBERT-RoBERTa-4m`; RoBERTa-base architecture, pre-trained on radiology text) |
| **Task** | Multi-label text classification (18 labels) |
| **Language** | German (`de`) |
| **Framework** | 🤗 Transformers + PyTorch |
| **Problem type** | `multi_label_classification` |

## Labels (18 pathologies)

| ID | Label |
|----|-------|
| 0 | Medical material |
| 1 | Arterial wall calcification |
| 2 | Cardiomegaly |
| 3 | Pericardial effusion |
| 4 | Coronary artery wall calcification |
| 5 | Hiatal hernia |
| 6 | Lymphadenopathy |
| 7 | Emphysema |
| 8 | Atelectasis |
| 9 | Lung nodule |
| 10 | Lung opacity |
| 11 | Pulmonary fibrotic sequela |
| 12 | Pleural effusion |
| 13 | Mosaic attenuation pattern |
| 14 | Peribronchial thickening |
| 15 | Consolidation |
| 16 | Bronchiectasis |
| 17 | Interlobular septal thickening |
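
For downstream code it can be handy to mirror this table as a plain Python mapping. The sketch below copies the order from the table; the names `LABELS`, `ID2LABEL`, and `LABEL2ID` are illustrative and not part of the repository, and the `id2label` entry in the repository's config should be treated as the authoritative source.

```python
# Label names in the order given in the table above (list index == label ID).
LABELS = [
    "Medical material",
    "Arterial wall calcification",
    "Cardiomegaly",
    "Pericardial effusion",
    "Coronary artery wall calcification",
    "Hiatal hernia",
    "Lymphadenopathy",
    "Emphysema",
    "Atelectasis",
    "Lung nodule",
    "Lung opacity",
    "Pulmonary fibrotic sequela",
    "Pleural effusion",
    "Mosaic attenuation pattern",
    "Peribronchial thickening",
    "Consolidation",
    "Bronchiectasis",
    "Interlobular septal thickening",
]

# Forward and reverse lookups (illustrative names, see note above).
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}
```
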
## Quick Start

### Installation

```bash
pip install transformers torch
```

### Loading the model

```python
import os
import sys

import torch
from huggingface_hub import hf_hub_download
from transformers import AutoConfig, AutoTokenizer

repo_id = "suitch/radbert-german-ctrate-classifier"

# Download the custom model class (or copy modeling_radbert.py locally)
# and make its directory importable before importing from it.
modeling_path = hf_hub_download(repo_id=repo_id, filename="modeling_radbert.py")
sys.path.insert(0, os.path.dirname(modeling_path))

from modeling_radbert import RadBertForSequenceClassification  # noqa: E402

# Load config, model, and tokenizer
config = AutoConfig.from_pretrained(repo_id)
model = RadBertForSequenceClassification.from_pretrained(repo_id, config=config)
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model.eval()  # disable dropout for inference
```

### Inference example

```python
text = "Das Herz ist leicht vergrößert. Es zeigt sich ein kleiner Pleuraerguss links."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)

# Run the forward pass without tracking gradients.
# This example assumes the custom forward() returns the logits tensor directly.
with torch.no_grad():
    logits = model(**inputs)

# Independent sigmoid per label (multi-label), then drop the batch dimension.
probabilities = torch.sigmoid(logits).squeeze()

threshold = 0.5
predicted_labels = [
    config.id2label[i] for i, p in enumerate(probabilities) if p >= threshold
]

print("Predicted labels:", predicted_labels)
print("Probabilities:")
for i, p in enumerate(probabilities):
    print(f"  {config.id2label[i]}: {p:.4f}")
```
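
A single global threshold of 0.5 is only a starting point; in multi-label settings it is common to tune a cutoff per label on a validation set. The helper below is a minimal plain-Python sketch of that idea. The function name `decode_labels`, the `thresholds` mapping, and the toy values are hypothetical and not part of the repository.

```python
def decode_labels(probabilities, id2label, default_threshold=0.5, thresholds=None):
    """Return (label, probability) pairs whose probability meets its threshold.

    `thresholds` optionally maps a label name to a label-specific cutoff;
    labels without an entry fall back to `default_threshold`.
    """
    thresholds = thresholds or {}
    selected = []
    for idx, prob in enumerate(probabilities):
        label = id2label[idx]
        if prob >= thresholds.get(label, default_threshold):
            selected.append((label, float(prob)))
    # Most confident findings first.
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


# Toy example with three labels and made-up probabilities.
toy_id2label = {0: "Cardiomegaly", 1: "Pleural effusion", 2: "Emphysema"}
toy_probs = [0.62, 0.91, 0.40]
result = decode_labels(toy_probs, toy_id2label, thresholds={"Emphysema": 0.35})
print(result)
```

With the inference example above, `probabilities.tolist()` and `config.id2label` could be passed in place of the toy values.
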
## Training Details

- **Base checkpoint**: RadBERT (RoBERTa-base weights pre-trained on radiology corpora)
- **Training data**: German translations of CTRate radiology reports (translated with Qwen 3.5 9B)
- **Classification head**: Linear layer on top of the `<s>` token (RoBERTa's equivalent of `[CLS]`) / pooler output
- **Loss**: Binary Cross-Entropy with Logits (per-label sigmoid)
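
To make the loss concrete: for one report with logits `z` and binary targets `y`, binary cross-entropy with logits applies a sigmoid to each logit independently and averages the per-label cross-entropy. The sketch below is a plain-Python illustration of that formula, not the training code used for this model. `bce_with_logits` is a hypothetical helper; PyTorch's `torch.nn.BCEWithLogitsLoss` computes the same quantity with its default mean reduction.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the numerically stable form
        max(z, 0) - z * y + log(1 + exp(-|z|))
    which equals -[y * log(sigmoid(z)) + (1 - y) * log(1 - sigmoid(z))].
    """
    per_label = [
        max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
        for z, y in zip(logits, targets)
    ]
    return sum(per_label) / len(per_label)


# Toy example: 3 labels, the first and third are present in the report.
loss = bce_with_logits([2.0, -1.0, 0.0], [1.0, 0.0, 1.0])
print(f"{loss:.4f}")
```

Because each label has its own sigmoid, several labels can be active for the same report, which is what distinguishes this setup from a softmax over mutually exclusive classes.
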
## Limitations

- This model is trained for **label inference from report text only**; it does **not** process images.
- It should **not** be treated as a clinical decision support system.
- Performance is limited by the quality of the machine-translated training data.

## Citation

If you use this model, please cite the CTRate dataset and RadBERT:

```bibtex
@article{hamamci2024ctrate,
  title={CT-RATE: A Large-Scale Computed Tomography Report-Image Dataset for AI in Radiology},
  author={Hamamci, Ibrahim Ethem and others},
  journal={arXiv preprint},
  year={2024}
}

@article{yan2022radbert,
  title={RadBERT: Adapting Transformer-based Language Models to Radiology},
  author={Yan, An and others},
  journal={Radiology: Artificial Intelligence},
  year={2022}
}
```

## License

MIT