matulichpt/radlit-biencoder

@@ -1,239 +1,257 @@
----
-language:
-- en
-license: apache-2.0
-library_name: sentence-transformers
-tags:
-- sentence-transformers
-- feature-extraction
-- sentence-similarity
-- radiology
-- medical
-- retrieval
-- embedding
-datasets:
-- custom
-pipeline_tag: sentence-similarity
-model-index:
-- name: radlit-biencoder
-  results:
-  - task:
-      type: retrieval
-      name: Radiology Document Retrieval
-    dataset:
-      type: custom
-      name: RadLIT-9
-      config: radlit9-v1.1-balanced
-    metrics:
-    - type: mrr
-      value: 0.829
-      name: MRR
-    - type: recall@10
-      value: 0.971
-      name: Recall@10
-    - type: ndcg@10
-      value: 0.863
-      name: nDCG@10
----
-# RadLIT-BiEncoder: Radiology Late Interaction Transformer
-A domain-specialized bi-encoder model for radiology document retrieval, trained to understand medical imaging terminology, clinical reasoning patterns, and radiology-specific queries.
-## Model Description
-RadLIT-BiEncoder is the first stage of the RadLITE retrieval pipeline. It generates dense embeddings optimized for radiology content retrieval, significantly outperforming general-purpose embedding models on radiology-specific queries.
-### Architecture
-- **Base Model**: RoBERTa-base architecture
-- **Hidden Size**: 768
-- **Layers**: 12
-- **Attention Heads**: 12
-- **Parameters**: ~125M
-- **Max Sequence Length**: 512 tokens
-- **Embedding Dimension**: 768
-### Training
-The model was trained using contrastive learning with hard negative mining on a large corpus of radiology educational content. Training details:
-- **Training Objective**: Multiple Negatives Ranking Loss with hard negatives
-- **Batch Size**: 32
-- **Learning Rate**: 2e-5 with warmup
-- **Training Epochs**: 4
-- **Hard Negatives**: Mined from top-k retrieval failures
-**Note**: Training data consisted of radiology educational materials. Specific sources are not disclosed due to variable licensing, but the model is released under Apache 2.0 for research and commercial use.
-## Performance
-### RadLIT-9 Benchmark
-RadLIT-9 is a comprehensive radiology retrieval benchmark covering 9 subspecialties:
-| Metric | Score |
-|--------|-------|
-| **MRR** | 0.829 |
-| **nDCG@10** | 0.863 |
-| **Recall@10** | 97.1% |
-| **Recall@5** | 93.8% |
-| **Recall@1** | 74.3% |
-### Subspecialty Performance
-| Subspecialty | MRR | Recall@10 |
-|--------------|-----|-----------|
-| Physics/Nuclear | 0.936 | 100% |
-| Pediatric | 0.931 | 100% |
-| Thoracic | 0.913 | 98% |
-| Cardiac | 0.862 | 98% |
-| Neuroradiology | 0.860 | 98% |
-| Gastrointestinal | 0.800 | 96% |
-| Breast | 0.722 | 93% |
-| Musculoskeletal | 0.695 | 89% |
-| Genitourinary | 0.694 | 100% |
-### Comparison with Baselines
-| Model | MRR | vs RadLIT |
-|-------|-----|-----------|
-| **RadLIT-BiEncoder** | **0.829** | -- |
-| ColBERT-v2 | 0.750 | -9.5% |
-| General bi-encoder | 0.703 | -15.2% |
-| BM25 | ~0.55 | -33.6% |
-## Usage
-### Installation
-```bash
-pip install sentence-transformers
-```
-### Basic Usage
-```python
-from sentence_transformers import SentenceTransformer
-# Load model
-model = SentenceTransformer('matulichpt/radlit-biencoder')
-# Encode queries and documents
-queries = [
-    "What are the imaging features of hepatocellular carcinoma on MRI?",
-    "How do you differentiate glioblastoma from metastasis?"
-]
-documents = [
-    "HCC typically shows arterial enhancement with washout on portal venous phase...",
-    "GBM and metastases can be differentiated by their location and multiplicity..."
-]
-query_embeddings = model.encode(queries, convert_to_tensor=True)
-doc_embeddings = model.encode(documents, convert_to_tensor=True)
-# Compute similarity
-from sentence_transformers.util import cos_sim
-similarities = cos_sim(query_embeddings, doc_embeddings)
-print(similarities)
-```
-### For Retrieval Pipeline
-```python
-from sentence_transformers import SentenceTransformer, util
-import torch
-model = SentenceTransformer('matulichpt/radlit-biencoder')
-# Pre-encode your document corpus
-corpus = ["document 1...", "document 2...", ...]
-corpus_embeddings = model.encode(corpus, convert_to_tensor=True, show_progress_bar=True)
-# At query time
-query = "What are the CT findings in pulmonary embolism?"
-query_embedding = model.encode(query, convert_to_tensor=True)
-# Find top-k similar documents
-cos_scores = util.cos_sim(query_embedding, corpus_embeddings)[0]
-top_results = torch.topk(cos_scores, k=10)
-for score, idx in zip(top_results[0], top_results[1]):
-    print(f"Score: {score:.4f} - {corpus[idx][:100]}...")
-```
-## Recommended: Full RadLITE Pipeline
-For best results, use RadLIT-BiEncoder as the first stage followed by RadLIT-CrossEncoder for reranking:
-```python
-from sentence_transformers import SentenceTransformer, CrossEncoder
-# Stage 1: Bi-encoder retrieval
-biencoder = SentenceTransformer('grai-rad/radlit-biencoder')
-# Stage 2: Cross-encoder reranking
-crossencoder = CrossEncoder('matulichpt/radlit-crossencoder')
-# Retrieve candidates
-query = "What are the MRI findings in anterior cruciate ligament tear?"
-candidates = retrieve_with_biencoder(query, corpus, biencoder, top_k=50)
-# Rerank with cross-encoder
-pairs = [[query, doc] for doc in candidates]
-scores = crossencoder.predict(pairs)
-# Apply temperature calibration (recommended: T=1.5)
-calibrated_scores = scores / 1.5
-# Sort by calibrated scores
-reranked = sorted(zip(candidates, calibrated_scores), key=lambda x: x[1], reverse=True)
-```
-## Intended Use
-### Primary Use Cases
-- Radiology educational content retrieval
-- Medical imaging literature search
-- Clinical decision support (retrieval component)
-- Radiology question-answering systems
-### Out-of-Scope Uses
-- General web search
-- Non-medical document retrieval
-- Clinical diagnosis (this is a retrieval model, not a diagnostic tool)
-## Limitations
-1. **Domain Specificity**: Optimized for radiology; may underperform on general medical or non-medical content
-2. **Language**: English only
-3. **Subspecialty Variance**: Performance varies by subspecialty (0.69-0.94 MRR range)
-4. **Not a Diagnostic Tool**: This model retrieves relevant documents; it does not provide medical diagnoses
-## Ethical Considerations
-- This model should not be used as a sole source for clinical decision-making
-- Retrieved documents should be reviewed by qualified medical professionals
-- The model may reflect biases present in radiology educational literature
-## Citation
-```bibtex
-@software{radlit_biencoder_2026,
-  title = {RadLIT-BiEncoder: Domain-Specialized Embeddings for Radiology Retrieval},
-  author = {Grai Team},
-  year = {2026},
-  url = {https://huggingface.co/matulichpt/radlit-biencoder},
-  note = {MRR 0.829 on RadLIT-9 benchmark}
-}
-```
-## License
-Apache 2.0 - Free for research and commercial use.
-## Contact
-For questions or collaboration: Open an issue on the model repository

+---
+language:
+- en
+license: apache-2.0
+library_name: sentence-transformers
+tags:
+- sentence-transformers
+- feature-extraction
+- sentence-similarity
+- radiology
+- medical
+- retrieval
+- embedding
+datasets:
+- custom
+metrics:
+- mrr
+- recall
+pipeline_tag: sentence-similarity
+model-index:
+- name: radlit-biencoder
+  results:
+  - task:
+      type: retrieval
+      name: Radiology Document Retrieval
+    dataset:
+      type: custom
+      name: RadLIT-9
+      config: radlit9-v1.1-balanced
+    metrics:
+    - type: mrr
+      value: 0.698
+      name: MRR (bi-encoder only)
+    - type: recall@10
+      value: 0.914
+      name: Recall@10
+    - type: ndcg@10
+      value: 0.748
+      name: nDCG@10
+---
+# RadLIT-BiEncoder: Radiology Document Retrieval
+A domain-specialized bi-encoder model for radiology document retrieval, trained to understand medical imaging terminology and radiology-specific queries.
+## Model Description
+RadLIT-BiEncoder generates dense embeddings optimized for radiology content retrieval. It serves as the first stage of the RadLITE pipeline, providing fast candidate retrieval before cross-encoder reranking.
+### Architecture
+- **Base Model**: RoBERTa-base architecture
+- **Hidden Size**: 768
+- **Layers**: 12
+- **Attention Heads**: 12
+- **Parameters**: ~125M
+- **Max Sequence Length**: 512 tokens
+- **Embedding Dimension**: 768
+### Training
+The model was trained using contrastive learning with hard negative mining on radiology educational content:
+- **Training Objective**: Multiple Negatives Ranking Loss with hard negatives
+- **Batch Size**: 32
+- **Learning Rate**: 2e-5 with warmup
+- **Training Epochs**: 4
+**Note**: Training data sources are not disclosed due to variable licensing. The model is released under Apache 2.0.
+## Performance
+### RadLIT-9 Benchmark (Bi-Encoder Only)
+Performance when using this bi-encoder alone for retrieval:
+| Metric | Score |
+|--------|-------|
+| **MRR** | 0.698 |
+| **nDCG@10** | 0.748 |
+| **Recall@10** | 91.4% |
+| **Recall@5** | 86.9% |
+| **Recall@1** | 56.7% |
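For readers less familiar with the metrics in the table above, here is a minimal sketch of how MRR and Recall@k are commonly computed. It assumes each query has exactly one relevant document, which the card does not state for RadLIT-9. The function names and the toy data are illustrative only and are not taken from the benchmark; nDCG@10 is not covered.

```python
from typing import List, Sequence


def reciprocal_rank(ranked_ids: Sequence[str], relevant_id: str) -> float:
    """Return 1/rank of the relevant document, or 0.0 if it was not retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Sequence[str]], gold: List[str]) -> float:
    """Average the reciprocal rank over all queries."""
    return sum(reciprocal_rank(r, g) for r, g in zip(runs, gold)) / len(gold)


def recall_at_k(runs: List[Sequence[str]], gold: List[str], k: int) -> float:
    """Fraction of queries whose relevant document appears in the top k."""
    hits = sum(1 for r, g in zip(runs, gold) if g in r[:k])
    return hits / len(gold)


# Toy data (assumption, not RadLIT-9): ranked document ids for three queries.
runs = [["d1", "d2", "d3"], ["d5", "d4", "d6"], ["d9", "d8", "d7"]]
gold = ["d1", "d4", "d0"]  # the single relevant id per query

print(mean_reciprocal_rank(runs, gold))  # (1 + 1/2 + 0) / 3
print(recall_at_k(runs, gold, k=1))      # only the first query hits at rank 1
```

Under this definition, a Recall@1 of 56.7% means the relevant document is ranked first for slightly more than half of the queries.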
+### Comparison with General-Purpose Models
+On RadLIT-9 benchmark (bi-encoder retrieval only, no reranking):
+| Model | MRR | nDCG@10 | Recall@10 |
+|-------|-----|---------|-----------|
+| GTE-large | 0.843 | 0.873 | 97.1% |
+| E5-large-v2 | 0.813 | 0.850 | 96.9% |
+| BGE-large | 0.792 | 0.836 | 97.4% |
+| **RadLIT-BiEncoder** | **0.698** | **0.748** | **91.4%** |
+**Important**: Used alone, the bi-encoder trails all three general-purpose models on every metric in the table (MRR, nDCG@10, Recall@10). The value of RadLIT comes from the full pipeline with cross-encoder reranking and BM25 fusion (see below).
+### Full RadLITE Pipeline Performance
+When combined with RadLIT-CrossEncoder and BM25 fusion:
+| Configuration | MRR | Relative improvement |
+|---------------|-----|-------------|
+| Bi-encoder only | 0.698 | baseline |
+| + Cross-encoder reranking | 0.782 | +12.0% |
+| + BM25 fusion (RadLITE) | **0.829** | **+18.8%** |
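+The full RadLITE pipeline achieves **0.829 MRR**. That is within 0.014 of the strongest general-purpose bi-encoder in the comparison above (GTE-large, 0.843) and ahead of E5-large-v2 (0.813) and BGE-large (0.792), although those baselines were measured without reranking.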
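The percentages in the pipeline table are relative to the bi-encoder-only MRR, not absolute differences. A short check of the arithmetic, using only the MRR values reported in the table (the function name is illustrative):

```python
def relative_improvement(baseline: float, value: float) -> float:
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


bi_encoder_only = 0.698   # MRR, bi-encoder only
with_reranking = 0.782    # MRR, + cross-encoder reranking
full_pipeline = 0.829     # MRR, + BM25 fusion (RadLITE)

print(f"{relative_improvement(bi_encoder_only, with_reranking):+.1f}%")  # +12.0%
print(f"{relative_improvement(bi_encoder_only, full_pipeline):+.1f}%")   # +18.8%
```

In absolute terms the gains are 0.084 and 0.131 MRR, respectively.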
+### Subspecialty Performance (Bi-Encoder Only)
+| Subspecialty | MRR | Recall@10 |
+|--------------|-----|-----------|
+| Physics/Nuclear | 0.790 | 100% |
+| Pediatric | 0.827 | 92% |
+| Thoracic | 0.828 | 94% |
+| Cardiac | 0.778 | 98% |
+| Neuroradiology | 0.731 | 88% |
+| Gastrointestinal | 0.626 | 98% |
+| Breast | 0.592 | 90% |
+| Musculoskeletal | 0.598 | 78% |
+| Genitourinary | 0.470 | 84% |
+## Usage
+### Installation
+```bash
+pip install sentence-transformers
+```
+### Basic Usage
+```python
+from sentence_transformers import SentenceTransformer
+# Load model
+model = SentenceTransformer('matulichpt/radlit-biencoder')
+# Encode queries and documents
+queries = [
+    "What are the imaging features of hepatocellular carcinoma on MRI?",
+    "How do you differentiate glioblastoma from metastasis?"
+]
+documents = [
+    "HCC typically shows arterial enhancement with washout on portal venous phase...",
+    "GBM and metastases can be differentiated by their location and multiplicity..."
+]
+query_embeddings = model.encode(queries, convert_to_tensor=True)
+doc_embeddings = model.encode(documents, convert_to_tensor=True)
+# Compute similarity
+from sentence_transformers.util import cos_sim
+similarities = cos_sim(query_embeddings, doc_embeddings)
+print(similarities)
+```
+### For Retrieval Pipeline
+```python
+from sentence_transformers import SentenceTransformer, util
+import torch
+model = SentenceTransformer('matulichpt/radlit-biencoder')
+# Pre-encode your document corpus
+corpus = ["document 1...", "document 2..."]  # replace with your own documents
+corpus_embeddings = model.encode(corpus, convert_to_tensor=True, show_progress_bar=True)
+# At query time
+query = "What are the CT findings in pulmonary embolism?"
+query_embedding = model.encode(query, convert_to_tensor=True)
+# Find top-k similar documents
+cos_scores = util.cos_sim(query_embedding, corpus_embeddings)[0]
+top_results = torch.topk(cos_scores, k=min(10, len(corpus)))  # k must not exceed the corpus size
+for score, idx in zip(top_results[0], top_results[1]):
+    print(f"Score: {score:.4f} - {corpus[idx][:100]}...")
+```
+## Recommended: Full RadLITE Pipeline
+For best results, use RadLIT-BiEncoder as the first stage followed by RadLIT-CrossEncoder for reranking:
+```python
+from sentence_transformers import SentenceTransformer, CrossEncoder
+# Stage 1: Bi-encoder retrieval (fast, gets candidates)
+biencoder = SentenceTransformer('matulichpt/radlit-biencoder')
+# Stage 2: Cross-encoder reranking (slower, more accurate)
+crossencoder = CrossEncoder('matulichpt/radlit-crossencoder')
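+# Retrieve candidates (retrieve_with_biencoder is a user-supplied helper that returns the top_k documents; corpus is your list of documents)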
+query = "What are the MRI findings in anterior cruciate ligament tear?"
+candidates = retrieve_with_biencoder(query, corpus, biencoder, top_k=50)
+# Rerank with cross-encoder
+pairs = [[query, doc] for doc in candidates]
+scores = crossencoder.predict(pairs)
+# Apply temperature calibration (recommended: T=1.5)
+calibrated_scores = scores / 1.5
+# Sort by calibrated scores
+reranked = sorted(zip(candidates, calibrated_scores), key=lambda x: x[1], reverse=True)
+```
+## Intended Use
+### Primary Use Cases
+- First-stage candidate retrieval for radiology content
+- Medical imaging literature search
+- Radiology question-answering systems (retrieval component)
+### Out-of-Scope Uses
+- General web search
+- Non-medical document retrieval
+- Clinical diagnosis (this is a retrieval model, not a diagnostic tool)
+## Limitations
+1. **Bi-encoder alone underperforms**: Use with cross-encoder reranking for best results
+2. **Domain Specificity**: Optimized for radiology; may underperform on general content
+3. **Language**: English only
+4. **Subspecialty Variance**: Performance varies by subspecialty (0.47-0.83 MRR range)
+## Ethical Considerations
+- This model should not be used as a sole source for clinical decision-making
+- Retrieved documents should be reviewed by qualified medical professionals
+- The model may reflect biases present in radiology educational literature
+## Citation
+```bibtex
+@software{radlit_biencoder_2026,
+  title = {RadLIT-BiEncoder: Domain-Specialized Embeddings for Radiology Retrieval},
+  author = {Matulich, P.},
+  year = {2026},
+  url = {https://huggingface.co/matulichpt/radlit-biencoder},
+  note = {MRR 0.698 standalone, 0.829 with RadLITE pipeline}
+}
+```
+## Related Models
+- [RadLIT-CrossEncoder](https://huggingface.co/matulichpt/radlit-crossencoder) - Second-stage reranking
+- [RadLIT-ColBERT](https://huggingface.co/matulichpt/radlit-colbert) - Late interaction model
+## License
+Apache 2.0 - Free for research and commercial use.