matulichpt/radlit-biencoder

@@ -180,6 +180,37 @@ for score, idx in zip(top_results[0], top_results[1]):
     print(f"Score: {score:.4f} - {corpus[idx][:100]}...")
 ```
+## Demo: Radiology Query Understanding
+```python
+from sentence_transformers import SentenceTransformer, util
+model = SentenceTransformer('matulichpt/radlit-biencoder')
+# Sample radiology corpus
+corpus = [
+    "HCC typically shows arterial hyperenhancement with washout on portal venous phase per LI-RADS criteria.",
+    "Pulmonary embolism appears as filling defects in pulmonary arteries on CTPA.",
+    "PVNS shows hemosiderin deposition with low T2 signal and GRE blooming artifact.",
+    "Acute stroke shows restricted diffusion: high DWI signal with low ADC values.",
+]
+# Encode corpus
+corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
+# Query
+query = "What are the MRI findings in pigmented villonodular synovitis?"
+query_embedding = model.encode(query, convert_to_tensor=True)
+# Find best match
+scores = util.cos_sim(query_embedding, corpus_embeddings)[0]
+best_idx = int(scores.argmax())  # convert the 0-dim tensor to a plain int for list indexing
+print(f"Best match: {corpus[best_idx]}")
+# Output: Best match: PVNS shows hemosiderin deposition with low T2 signal and GRE blooming artifact.
+```
+The model retrieves the PVNS entry even though the query spells out "pigmented villonodular synovitis" and the corpus entry uses only the abbreviation.
 ## Recommended: Full RadLITE Pipeline
 For best results, use RadLIT-BiEncoder as the first stage followed by RadLIT-CrossEncoder for reranking: