---
language:
- en
license: apache-2.0
library_name: sentence-transformers
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- radiology
- medical
- retrieval
- colbert
- late-interaction
datasets:
- custom
metrics:
- mrr
- recall
pipeline_tag: sentence-similarity
model-index:
- name: radlit-colbert
  results:
  - task:
      type: retrieval
      name: Radiology Document Retrieval
    dataset:
      type: custom
      name: RadLIT-9
      config: radlit9-v1.1-balanced
    metrics:
    - type: mrr
      value: 0.750
      name: MRR
    - type: recall@10
      value: 0.943
      name: Recall@10
    - type: ndcg@10
      value: 0.794
      name: nDCG@10
---

# RadLIT-ColBERT: Radiology Late Interaction Transformer

A ColBERT-style late interaction model trained for radiology document retrieval. RadLIT uses token-level MaxSim scoring to provide more nuanced relevance matching than pooled embeddings.

## Model Description

RadLIT (Radiology Late Interaction Transformer) is a ColBERT-v2 style model adapted for radiology retrieval. Unlike traditional bi-encoders that produce single-vector representations, RadLIT maintains per-token embeddings and computes relevance through late interaction (MaxSim scoring).

### Why Late Interaction?

Late interaction models offer advantages for medical terminology:

- **Precise term matching**: Each query token finds its best-matching document token
- **Better handling of multi-word concepts**: "hepatocellular carcinoma" tokens can independently match
- **Implicit term weighting**: Important query terms contribute more to the final score
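
To make the bullets above concrete, here is a toy MaxSim computation with hand-made 2-d vectors (illustrative stand-ins, not real model embeddings): each query token keeps only its best document-token match, and the matches are found independently.

```python
# Toy late-interaction scoring: each query token is compared against
# every document token, and only the best match per query token counts.
import torch

query = torch.tensor([[1.0, 0.0],   # query token: "hepatocellular"
                      [0.0, 1.0]])  # query token: "carcinoma"
doc = torch.tensor([[0.9, 0.1],     # doc token: "hepatocellular"
                    [0.2, 0.8],     # doc token: "carcinoma"
                    [0.5, 0.5]])    # doc token: "staging"

sims = query @ doc.T                     # [2 query tokens, 3 doc tokens]
per_token_best = sims.max(dim=1).values  # best doc match per query token
score = per_token_best.sum()
print(round(score.item(), 2))            # → 1.7
```

Note that each query token matched a different document token (0.9 and 0.8), which is the "independent matching" the multi-word-concept bullet describes.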

### Architecture

- **Base Model**: RoBERTa-base with ColBERT adapter
- **Hidden Size**: 768
- **Output Dimension**: 128 (compressed for efficiency)
- **Layers**: 12
- **Attention Heads**: 12
- **Parameters**: ~125M
- **Max Sequence Length**: 512 tokens
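
The 768-to-128 compression above can be sketched as a linear projection followed by L2 normalization, the usual ColBERT-style output head. The `Linear` layer here is an illustrative assumption, not the model's actual weights.

```python
# Sketch of a ColBERT-style output head: compress per-token hidden
# states to 128-d and L2-normalize, so token dot products are cosines.
import torch

hidden = torch.randn(1, 512, 768)             # [batch, tokens, hidden]
proj = torch.nn.Linear(768, 128, bias=False)  # compression head (assumption)
emb = torch.nn.functional.normalize(proj(hidden), dim=-1)
print(emb.shape)  # torch.Size([1, 512, 128])
```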

### Training

The model was trained using the ColBERT framework with radiology-specific data:

- **Training Objective**: InfoNCE with in-batch negatives + hard negatives
- **Hard Negative Mining**: Top-100 BM25 negatives per query
- **Training Epochs**: 4
- **Batch Size**: 32

**Note**: Training data sources are not disclosed due to variable licensing.
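
A minimal sketch of the InfoNCE objective with in-batch negatives, assuming a batch of MaxSim scores has already been computed (the real loop would also fold in the mined hard negatives and backpropagate through the encoder):

```python
# InfoNCE with in-batch negatives: the positive document for query i
# is row i of the batch; every other row serves as a negative.
import torch
import torch.nn.functional as F

batch = 4
scores = torch.randn(batch, batch)      # scores[i, j] = MaxSim(query_i, doc_j)
labels = torch.arange(batch)            # diagonal entries are the positives
loss = F.cross_entropy(scores, labels)  # softmax over docs, per query
```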

## Performance

### RadLIT-9 Benchmark

| Metric | Score |
|--------|-------|
| **MRR** | 0.750 |
| **nDCG@10** | 0.794 |
| **Recall@10** | 94.3% |
| **Recall@5** | 89.0% |
| **Recall@1** | 64.5% |
| **Latency** | ~5ms |
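
For reference, MRR is the mean of the reciprocal rank of the first relevant document per query. A sketch with hypothetical ranks:

```python
# MRR over four hypothetical queries whose first relevant document
# appeared at ranks 1, 2, 1, and 4 respectively.
ranks = [1, 2, 1, 4]
mrr = sum(1.0 / r for r in ranks) / len(ranks)
print(round(mrr, 4))  # → 0.6875
```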

### Subspecialty Performance

| Subspecialty | MRR | Recall@10 |
|--------------|-----|-----------|
| Thoracic | **0.958** | 98% |
| Pediatric | 0.882 | 100% |
| Cardiac | 0.754 | 98% |
| Breast | 0.740 | 100% |
| Neuroradiology | 0.729 | 90% |
| MSK | 0.706 | 87% |
| Physics | 0.699 | 93% |
| GI | 0.686 | 94% |
| GU | 0.578 | 90% |

### Comparison with Other Approaches

| Model | MRR | Latency |
|-------|-----|---------|
| **RadLIT-ColBERT** | 0.750 | 5ms |
| RadLIT-BiEncoder | 0.703 | 5ms |
| BM25 | ~0.55 | <1ms |

## Usage

### Installation

```bash
pip install sentence-transformers colbert-ai
```

### Basic Usage with Sentence Transformers

```python
from sentence_transformers import SentenceTransformer

# Load model
model = SentenceTransformer('matulichpt/radlit-colbert')

# Encode queries and documents
query = "What are the imaging features of hepatocellular carcinoma on MRI?"
documents = [
    "HCC typically shows arterial enhancement with washout...",
    "Breast cancer staging involves mammography and MRI..."
]

# Get token-level embeddings (ColBERT scores tokens, not a pooled vector,
# so request token embeddings rather than the default pooled output)
query_emb = model.encode(query, convert_to_tensor=True, output_value='token_embeddings')
doc_embs = [model.encode(d, convert_to_tensor=True, output_value='token_embeddings')
            for d in documents]

# For ColBERT relevance, compute token-level MaxSim between the query
# and each document; see the ColBERT documentation for a full implementation
```

### Late Interaction Scoring (MaxSim)

```python
import torch

def maxsim_score(query_emb, doc_emb):
    """
    Compute the MaxSim score between query and document token embeddings.

    For each query token, find the maximum similarity with any document token,
    then sum these maximum similarities. Embeddings should be L2-normalized
    so that dot products are cosine similarities.
    """
    # query_emb: [num_query_tokens, dim]
    # doc_emb:   [num_doc_tokens, dim]

    # Compute all pairwise token similarities
    similarities = torch.matmul(query_emb, doc_emb.T)  # [q_tokens, d_tokens]

    # For each query token, take the max similarity across all doc tokens
    max_sims = similarities.max(dim=1).values  # [q_tokens]

    # Sum the per-query-token maxima
    return max_sims.sum().item()

# Usage (query and documents as defined in the previous section)
query_emb = model.encode(query, convert_to_tensor=True, output_value='token_embeddings')
doc_emb = model.encode(documents[0], convert_to_tensor=True, output_value='token_embeddings')
score = maxsim_score(query_emb, doc_emb)
```
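
Ranking a candidate set follows directly from `maxsim_score`: score each document independently and sort. The random tensors below are stand-ins for real token embeddings, with the shapes and normalization used above.

```python
# Rank several documents by MaxSim against one query.
import torch

torch.manual_seed(0)
query_emb = torch.nn.functional.normalize(torch.randn(8, 128), dim=-1)
doc_embs = [torch.nn.functional.normalize(torch.randn(n, 128), dim=-1)
            for n in (40, 25, 60)]  # documents of varying token length

scores = [torch.matmul(query_emb, d.T).max(dim=1).values.sum().item()
          for d in doc_embs]
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)  # document indices, best first
```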

### Integration with RadLITE Pipeline

RadLIT-ColBERT is the first-stage retriever in the full RadLITE pipeline:

```
Query -> RadLIT-ColBERT (fast retrieval, top-50) -> CrossEncoder (reranking) -> Results
```

For best results, use the full RadLITE pipeline:
- [RadLIT-BiEncoder](https://huggingface.co/matulichpt/radlit-biencoder) - Dense retrieval alternative
- [RadLIT-CrossEncoder](https://huggingface.co/matulichpt/radlit-crossencoder) - Reranking stage
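
The two-stage flow can be sketched with pluggable scoring functions. The toy word-overlap scorer below is a placeholder standing in for both RadLIT-ColBERT and the cross-encoder; in practice each stage would call the respective model.

```python
# Retrieve-then-rerank: a fast scorer prunes the corpus to top-k,
# then a slower, more accurate scorer reorders the short list.
def retrieve_then_rerank(query, corpus, retrieve_score, rerank_score, k=50):
    # Stage 1: fast first-pass retrieval, keep top-k candidates
    candidates = sorted(corpus, key=lambda d: retrieve_score(query, d),
                        reverse=True)[:k]
    # Stage 2: rerank only the short list
    return sorted(candidates, key=lambda d: rerank_score(query, d),
                  reverse=True)

# Toy usage: word overlap stands in for both models
corpus = ["hcc arterial enhancement washout",
          "breast mammography",
          "hcc mri features"]
overlap = lambda q, d: len(set(q.split()) & set(d.split()))
print(retrieve_then_rerank("hcc mri", corpus, overlap, overlap, k=2))
# → ['hcc mri features', 'hcc arterial enhancement washout']
```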

## Evolution: RadLIT to RadLITE

| Version | Model | MRR | Innovation |
|---------|-------|-----|------------|
| v1.0 | **RadLIT-ColBERT** (this model) | 0.750 | Late interaction |
| v1.5 | RadLITx | 0.782 | + Cross-encoder fusion |
| v2.0 | RadLITE | **0.829** | + Calibrated fusion |

## Intended Use

### Primary Use Cases

- Fast first-stage radiology retrieval
- Educational content search
- Medical imaging literature retrieval

### Out-of-Scope Uses

- Non-radiology content retrieval
- Clinical diagnosis
- Final relevance scoring (use the CrossEncoder for that)

## Limitations

1. **Subspecialty variance**: Performance varies from 0.58 MRR (GU) to 0.96 (Thoracic)
2. **Domain specificity**: Optimized for radiology; limited generalization
3. **Late interaction overhead**: Token-level storage increases index size
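
On the index-size point, a back-of-envelope calculation (assuming float16 vectors and ~180 tokens per document; both numbers are illustrative, not measured):

```python
# Token-level index size: every document token stores a 128-d vector.
docs = 100_000
tokens_per_doc = 180       # illustrative average
dim = 128                  # output dimension from the Architecture section
bytes_per_value = 2        # float16
index_bytes = docs * tokens_per_doc * dim * bytes_per_value
print(round(index_bytes / 1e9, 1), "GB")  # → 4.6 GB
```

A single-vector bi-encoder index for the same corpus would store one vector per document instead of one per token, which is why late interaction trades index size for matching precision.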

## Ethical Considerations

- Not a diagnostic tool
- Should be used to surface relevant educational content
- May reflect biases in the radiology literature
## Citation

```bibtex
@software{radlit_colbert_2026,
  title  = {RadLIT-ColBERT: Late Interaction for Radiology Retrieval},
  author = {Grai Team},
  year   = {2026},
  url    = {https://huggingface.co/matulichpt/radlit-colbert},
  note   = {MRR 0.750 on RadLIT-9 benchmark}
}
```

## Related Models

- [RadLIT-BiEncoder](https://huggingface.co/matulichpt/radlit-biencoder) - Dense retrieval (RadLITE v2.0)
- [RadLIT-CrossEncoder](https://huggingface.co/matulichpt/radlit-crossencoder) - Reranking

## License

Apache 2.0 - Free for research and commercial use.
|