---

license: apache-2.0
language:
- en
tags:
- cross-encoder
- reranker
- radiology
- medical
- retrieval
- sentence-similarity
- healthcare
- clinical
base_model: cross-encoder/ms-marco-MiniLM-L-12-v2
pipeline_tag: text-classification
library_name: sentence-transformers
datasets:
- radiology-education-corpus
metrics:
- mrr
- ndcg
model-index:
- name: RadLITE-Reranker
  results:
  - task:
      type: reranking
      name: Document Reranking
    dataset:
      name: RadLIT-9 (Radiology Retrieval Benchmark)
      type: radiology-retrieval
    metrics:
    - type: mrr
      value: 0.829
      name: MRR (with bi-encoder)
    - type: mrr_improvement
      value: 0.303
      name: MRR Improvement on ACR Core Exam (+30.3%)
---


# RadLITE-Reranker

**Radiology Late Interaction Transformer Enhanced - Cross-Encoder Reranker**

A domain-specialized cross-encoder for reranking radiology search results. This model takes a query-document pair and predicts a relevance score, providing more accurate ranking than bi-encoder similarity alone.

> **Recommended:** Use this reranker together with [RadLITE-Encoder](https://huggingface.co/matulichpt/RadLITE-Encoder) in a two-stage pipeline for optimal performance. The bi-encoder handles fast retrieval over large corpora, then this cross-encoder reranks the top candidates for precision. This combination achieves **MRR 0.829** on radiology benchmarks (+30% on board exam questions).

## Model Description

| Property | Value |
|----------|-------|
| **Model Type** | Cross-Encoder (Reranker) |
| **Base Model** | [ms-marco-MiniLM-L-12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2) |
| **Domain** | Radiology / Medical Imaging |
| **Hidden Size** | 384 |
| **Max Sequence Length** | 512 tokens |
| **Output** | Single relevance score |
| **License** | Apache 2.0 |

### Why Use a Reranker?

Bi-encoders (like RadLITE-Encoder) are fast but encode query and document independently. Cross-encoders process them together, capturing fine-grained interactions:

| Approach | Speed | Accuracy | Use Case |
|----------|-------|----------|----------|
| Bi-Encoder | Fast (thousands of docs/sec) | Good | First-stage retrieval |
| Cross-Encoder | Slow (tens of docs/sec) | Excellent | Reranking top candidates |

**Two-stage pipeline**: Use the bi-encoder to retrieve the top 50-100 candidates, then rerank them with the cross-encoder for the best results.

## Performance

### Impact on RadLIT-9 Benchmark

| Configuration | MRR | Relative Improvement |
|---------------|-----|----------------------|
| Bi-Encoder only | 0.78 | baseline |
| **Bi-Encoder + Reranker** | **0.829** | **+6.3%** |

### ACR Core Exam (Board-Style Questions)

| Dataset | MRR with Reranker | MRR without Reranker | Relative Improvement |
|---------|-------------------|----------------------|----------------------|
| Core Exam Chest | 0.533 | 0.409 | **+30.3%** |
| Core Exam Combined | 0.466 | 0.381 | **+22.5%** |

The reranker is especially valuable for complex, multi-part queries typical of board exam questions.
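
As a reading aid for the tables above, here is a minimal sketch of how MRR and a relative improvement can be computed. It assumes the improvement columns report relative change over the baseline MRR; the function names are illustrative and this is not the evaluation code used for RadLIT-9.

```python
def mean_reciprocal_rank(first_relevant_ranks):
    """MRR over queries. Each entry is the 1-based rank of the first
    relevant document, or None if no relevant document was retrieved."""
    reciprocal = [0.0 if r is None else 1.0 / r for r in first_relevant_ranks]
    return sum(reciprocal) / len(reciprocal)


def relative_improvement(with_reranker, without_reranker):
    """Relative change in a metric, as a fraction of the baseline."""
    return (with_reranker - without_reranker) / without_reranker


# Toy example: first relevant hit at ranks 1, 2, and 4; one query missed.
print(mean_reciprocal_rank([1, 2, 4, None]))  # (1 + 0.5 + 0.25 + 0) / 4 = 0.4375

# Figures taken from the tables above.
print(f"{relative_improvement(0.829, 0.78):+.1%}")   # RadLIT-9
print(f"{relative_improvement(0.533, 0.409):+.1%}")  # Core Exam Chest
```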

## Quick Start

### Installation

```bash
# Quote the requirement so the shell does not treat ">=" as a redirect
pip install "sentence-transformers>=2.2.0"
```

### Basic Usage

```python
from sentence_transformers import CrossEncoder

# Load the reranker
reranker = CrossEncoder("matulichpt/RadLITE-Reranker", max_length=512)

# Query and candidate documents
query = "What are the imaging features of hepatocellular carcinoma?"
documents = [
    "HCC typically shows arterial enhancement with portal venous washout on CT.",
    "Fatty liver disease presents as decreased attenuation on non-contrast CT.",
    "Hepatic hemangiomas show peripheral nodular enhancement.",
]

# Create query-document pairs
pairs = [[query, doc] for doc in documents]

# Get relevance scores
scores = reranker.predict(pairs)

# Apply temperature calibration (RECOMMENDED)
calibrated_scores = scores / 1.5

print("Scores:", calibrated_scores)
# Document about HCC will have the highest score
```

### Temperature Calibration

**Important**: This model outputs scores with high variance. Apply temperature scaling for better fusion with other signals:

```python
# Raw scores might be: [4.2, -1.5, 0.8]
# After calibration:   [2.8, -1.0, 0.53]

TEMPERATURE = 1.5  # Recommended value

def calibrated_predict(reranker, pairs):
    raw_scores = reranker.predict(pairs)
    return raw_scores / TEMPERATURE
```
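
Dividing every score by the same positive constant leaves the reranker's own ordering unchanged; what changes is the range of the scores, which matters once they are added to other signals. The short sketch below illustrates this with the example values from the comment above. The `spread` helper is a hypothetical name introduced only for this illustration.

```python
import numpy as np

TEMPERATURE = 1.5  # value recommended in this card

raw = np.array([4.2, -1.5, 0.8])  # example raw scores from above
calibrated = raw / TEMPERATURE


def spread(x):
    """Range of a score vector (max minus min)."""
    return float(x.max() - x.min())


# The ranking produced by the reranker alone is the same before and after scaling.
print(np.argsort(-raw), np.argsort(-calibrated))

# The range shrinks, so the cross-encoder term dominates a weighted sum less.
print(f"{spread(raw):.1f} -> {spread(calibrated):.1f}")
```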

### Full Two-Stage Search Pipeline

```python
from sentence_transformers import SentenceTransformer, CrossEncoder
import numpy as np


class RadLITESearch:
    def __init__(self, device="cuda"):
        # Stage 1: Fast bi-encoder
        self.encoder = SentenceTransformer(
            "matulichpt/RadLITE-Encoder",
            device=device
        )
        # Stage 2: Precise reranker
        self.reranker = CrossEncoder(
            "matulichpt/RadLITE-Reranker",
            max_length=512,
            device=device
        )
        self.temperature = 1.5
        self.corpus_embeddings = None
        self.corpus = None

    def index_corpus(self, documents: list):
        """Pre-compute embeddings for your corpus."""
        self.corpus = documents
        self.corpus_embeddings = self.encoder.encode(
            documents,
            normalize_embeddings=True,
            show_progress_bar=True,
            batch_size=32
        )

    def search(self, query: str, top_k: int = 10, candidates: int = 50):
        """Two-stage search: retrieve then rerank."""

        # Stage 1: Bi-encoder retrieval
        query_emb = self.encoder.encode(query, normalize_embeddings=True)
        scores = query_emb @ self.corpus_embeddings.T
        top_indices = np.argsort(scores)[-candidates:][::-1]

        # Stage 2: Cross-encoder reranking
        candidate_docs = [self.corpus[i] for i in top_indices]
        pairs = [[query, doc] for doc in candidate_docs]
        rerank_scores = self.reranker.predict(pairs) / self.temperature

        # Sort by reranked scores
        sorted_indices = np.argsort(rerank_scores)[::-1]

        results = []
        for idx in sorted_indices[:top_k]:
            results.append({
                "document": candidate_docs[idx],
                "corpus_index": int(top_indices[idx]),
                "score": float(rerank_scores[idx]),
                "biencoder_score": float(scores[top_indices[idx]])
            })
        return results


# Usage
searcher = RadLITESearch()
searcher.index_corpus(your_radiology_documents)
results = searcher.search("pneumothorax CT findings")
```

## Integration with Any Corpus

### Radiopaedia / Educational Content

```python
import json

# Load your content (e.g., Radiopaedia articles)
with open("radiopaedia_articles.json") as f:
    articles = json.load(f)

corpus = [article["content"] for article in articles]

# Initialize search
searcher = RadLITESearch()
searcher.index_corpus(corpus)

# Search
results = searcher.search("classic findings of pulmonary embolism on CTPA")

for r in results[:5]:
    print(f"Score: {r['score']:.3f}")
    print(f"Content: {r['document'][:200]}...")
    print()
```

### Integration with Elasticsearch/OpenSearch

```python
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("matulichpt/RadLITE-Reranker", max_length=512)

def rerank_elasticsearch_results(query: str, es_results: list, top_k: int = 10):
    """Rerank Elasticsearch BM25 results."""
    documents = [hit["_source"]["content"] for hit in es_results]
    pairs = [[query, doc] for doc in documents]

    scores = reranker.predict(pairs) / 1.5  # Temperature calibration

    # Combine with ES scores (optional)
    for i, hit in enumerate(es_results):
        # Store plain Python floats so the hits stay JSON-serializable
        hit["rerank_score"] = float(scores[i])
        hit["combined_score"] = 0.3 * hit["_score"] + 0.7 * hit["rerank_score"]

    # Sort by combined score
    reranked = sorted(es_results, key=lambda x: x["combined_score"], reverse=True)
    return reranked[:top_k]
```
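
BM25 `_score` values are unbounded and vary from query to query, so a fixed weighted sum with reranker scores can be dominated by whichever signal happens to have the larger range. One possible mitigation, shown here as a sketch and not as part of the pipeline described in this card, is to min-max normalize each signal within a result list before combining. The helper names and sample values are hypothetical; the 0.3/0.7 weights are the ones from the example above.

```python
import numpy as np


def min_max(values):
    """Scale values to [0, 1] within one result list.
    Empty input is returned as-is; a constant list maps to all zeros."""
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        return values
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)
    return (values - values.min()) / span


def combine(es_scores, rerank_scores, es_weight=0.3, rerank_weight=0.7):
    """Weighted sum of per-query normalized scores."""
    return es_weight * min_max(es_scores) + rerank_weight * min_max(rerank_scores)


es_scores = [12.4, 9.1, 3.3]       # hypothetical BM25 `_score` values
rerank_scores = [-0.5, 2.8, 0.53]  # hypothetical calibrated reranker scores
combined = combine(es_scores, rerank_scores)
print(np.argsort(-combined))       # hit indices, best first
```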

## Optimal Fusion Weights

When combining multiple signals (bi-encoder, cross-encoder, BM25), use these weights:

```python
# Optimal weights from grid search on RadLIT-9
FUSION_WEIGHTS = {
    "biencoder": 0.5,    # RadLITE-Encoder similarity
    "crossencoder": 0.2, # RadLITE-Reranker (after temp calibration)
    "bm25": 0.3          # Lexical matching (if available)
}

def fused_score(bienc_score, ce_score, bm25_score=0):
    return (
        FUSION_WEIGHTS["biencoder"] * bienc_score +
        FUSION_WEIGHTS["crossencoder"] * ce_score +
        FUSION_WEIGHTS["bm25"] * bm25_score
    )
```
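
If you want to re-tune the weights for your own corpus, a grid search over weight triples that sum to 1 is one straightforward option. The sketch below is an illustration on toy data and is not the procedure used to obtain the weights above. It assumes one relevant candidate per query and scores that are already on comparable scales; `mrr` and `grid_search` are hypothetical helper names.

```python
import itertools
import numpy as np


def mrr(score_matrix, relevant_idx):
    """MRR where relevant_idx[q] is the single relevant candidate for query q."""
    reciprocal = []
    for scores, rel in zip(score_matrix, relevant_idx):
        order = np.argsort(-scores)                 # best candidate first
        rank = int(np.where(order == rel)[0][0]) + 1
        reciprocal.append(1.0 / rank)
    return float(np.mean(reciprocal))


def grid_search(bienc, ce, bm25, relevant_idx, step=0.1):
    """Try every weight triple on a grid that sums to 1; keep the best MRR."""
    n = round(1 / step)
    best_score, best_weights = -1.0, None
    for i, j in itertools.product(range(n + 1), repeat=2):
        if i + j > n:
            continue
        weights = (i * step, j * step, (n - i - j) * step)
        fused = weights[0] * bienc + weights[1] * ce + weights[2] * bm25
        score = mrr(fused, relevant_idx)
        if score > best_score:
            best_score, best_weights = score, weights
    return best_score, best_weights


# Toy data: 2 queries x 3 candidates per signal (hypothetical values).
bienc = np.array([[0.9, 0.2, 0.1], [0.3, 0.8, 0.5]])
ce = np.array([[0.1, 0.7, 0.2], [0.2, 0.1, 0.9]])
bm25 = np.array([[0.5, 0.4, 0.1], [0.1, 0.2, 0.3]])
relevant = [0, 2]

best_score, best_weights = grid_search(bienc, ce, bm25, relevant)
print(best_score, best_weights)
```

On real data, tune on a held-out validation set; weights selected on the evaluation set itself will look better than they generalize.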

## Architecture

```
[Query] + [SEP] + [Document]
           |
           v
    [BERT Tokenizer]
           |
           v
    [MiniLM Encoder] (12 layers, 384 hidden)
           |
           v
    [Classification Head]
           |
           v
    Relevance Score (float)
```

## Training Details

- **Base Model**: ms-marco-MiniLM-L-12-v2 (trained on MS MARCO passage ranking)
- **Fine-tuning**: Radiology query-document relevance pairs
- **Training Steps**: 5,626
- **Best Validation Loss**: 0.691
- **Learning Rate**: 2e-5
- **Batch Size**: 32
- **Category Weighting**: Yes (balanced across radiology subspecialties)

## Best Practices

### 1. Always Use Temperature Calibration

Raw cross-encoder scores can be extreme. Dividing them by a temperature of 1.5 narrows their range so they fuse better with other signals:

```python
calibrated = raw_score / 1.5
```

### 2. Limit Candidates for Reranking

Cross-encoders are slow. Rerank only the top 50-100 candidates from the bi-encoder:

```python
# Good: Rerank top 50
rerank_candidates = 50

# Bad: Rerank entire corpus
rerank_candidates = len(corpus)  # Too slow!
```

### 3. Batch Predictions

```python
# Efficient: Single batch call
pairs = [[query, doc] for doc in candidates]
scores = reranker.predict(pairs, batch_size=32)

# Inefficient: Individual calls
scores = [reranker.predict([[query, doc]])[0] for doc in candidates]
```

### 4. GPU Acceleration

```python
reranker = CrossEncoder(
    "matulichpt/RadLITE-Reranker",
    max_length=512,
    device="cuda"  # Use GPU
)
```
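
Hard-coding `device="cuda"` fails on machines without a GPU. A small fallback such as the sketch below can help; `pick_device` is a hypothetical helper name, and it relies only on `torch.cuda.is_available()`.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed; fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)

# Pass the result when loading the model, for example:
# reranker = CrossEncoder("matulichpt/RadLITE-Reranker", max_length=512, device=device)
```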

## Limitations

- **English only**: Trained on English radiology text
- **Speed**: ~10-50 pairs/second (use for reranking, not full corpus)
- **512 token limit**: Long documents are truncated
- **Domain-specific**: Optimized for radiology; may underperform on general medical content
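
One common way to work around the 512-token limit is to split long documents into overlapping windows, score each window, and keep the maximum. The sketch below illustrates the idea under stated assumptions: word counts stand in for tokens, the window and overlap sizes are illustrative and not tuned, and max-pooling is one of several possible aggregation choices. The helper names are hypothetical, and a stub scorer replaces the model so the example runs offline.

```python
def split_into_windows(text, window_words=200, overlap_words=50):
    """Split text into overlapping word windows.
    Requires overlap_words < window_words."""
    words = text.split()
    if len(words) <= window_words:
        return [text]
    step = window_words - overlap_words
    windows = []
    for start in range(0, len(words), step):
        windows.append(" ".join(words[start:start + window_words]))
        if start + window_words >= len(words):
            break  # the last window already reaches the end of the text
    return windows


def score_long_document(query, text, predict):
    """Score each window and keep the maximum as the document score.
    `predict` takes a list of [query, passage] pairs, e.g. reranker.predict."""
    pairs = [[query, window] for window in split_into_windows(text)]
    return max(float(s) for s in predict(pairs))


# Stub scorer for illustration only: counts occurrences of a keyword.
def stub_predict(pairs):
    return [passage.count("pneumothorax") for _, passage in pairs]


# 500-word toy document with the keyword near the end.
long_doc = " ".join(["filler"] * 450 + ["pneumothorax"] + ["filler"] * 49)
print(len(split_into_windows(long_doc)))
print(score_long_document("pneumothorax CT findings", long_doc, stub_predict))
```

With the real model, pass `reranker.predict` as `predict` and apply the temperature scaling to the returned score.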

## Citation

If you use RadLITE in your work, please cite:

```bibtex
@software{radlite_2026,
    title = {RadLITE: Calibrated Multi-Stage Retrieval for Radiology Education},
    author = {Grai Team},
    year = {2026},
    month = {January},
    url = {https://huggingface.co/matulichpt/RadLITE-Reranker},
    note = {+30% MRR improvement on ACR Core Exam questions}
}
```

## Related Models

- [RadLITE-Encoder](https://huggingface.co/matulichpt/RadLITE-Encoder) - Bi-encoder for first-stage retrieval
- [RadBERT-RoBERTa-4m](https://huggingface.co/zzxslp/RadBERT-RoBERTa-4m) - Base radiology language model

## License

Apache 2.0 - Free for commercial and research use.