Add practical demo showing PVNS abbreviation understanding
README.md
@@ -180,6 +180,37 @@ for score, idx in zip(top_results[0], top_results[1]):
print(f"Score: {score:.4f} - {corpus[idx][:100]}...")
```

## Demo: Radiology Query Understanding

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('matulichpt/radlit-biencoder')

# Sample radiology corpus
corpus = [
    "HCC typically shows arterial hyperenhancement with washout on portal venous phase per LI-RADS criteria.",
    "Pulmonary embolism appears as filling defects in pulmonary arteries on CTPA.",
    "PVNS shows hemosiderin deposition with low T2 signal and GRE blooming artifact.",
    "Acute stroke shows restricted diffusion: high DWI signal with low ADC values.",
]

# Encode corpus
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# Query
query = "What are the MRI findings in pigmented villonodular synovitis?"
query_embedding = model.encode(query, convert_to_tensor=True)

# Find best match
scores = util.cos_sim(query_embedding, corpus_embeddings)[0]
best_idx = int(scores.argmax())  # convert the 0-d tensor to a plain int for list indexing
print(f"Best match: {corpus[best_idx]}")
# Output: Best match: PVNS shows hemosiderin deposition with low T2 signal and GRE blooming artifact.
```

The model retrieves the PVNS passage even though the query spells out "pigmented villonodular synovitis" and the corpus entry uses only the abbreviation.

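The matching step in the demo comes down to cosine similarity followed by picking the highest score. As a rough illustration of that arithmetic only, here is a minimal sketch in plain Python. The three-dimensional vectors are made-up stand-ins rather than embeddings produced by the model, and `cosine` and `rank_by_cosine` are hypothetical helper names introduced for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_cosine(query_vec, corpus_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(corpus_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for embeddings (hypothetical values, NOT model output).
toy_corpus_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query_vec = [0.1, 0.2, 0.9]

ranking = rank_by_cosine(toy_query_vec, toy_corpus_vecs)
best_idx, best_score = ranking[0]
print(f"Best index: {best_idx}, score: {best_score:.4f}")
```

In the demo above, `util.cos_sim` and `argmax` do the same job on the real embeddings. Sorting every score, as this sketch does, also makes it easy to see how far ahead the best match is of the runners-up.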
## Recommended: Full RadLITE Pipeline

For best results, use RadLIT-BiEncoder as the first stage followed by RadLIT-CrossEncoder for reranking: