visurg/SurgLIME

Improve model card

by nielsr HF Staff - opened Apr 21

base: refs/heads/main ← from: refs/pr/1

README.md +40 -3

@@ -1,3 +1,40 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+pipeline_tag: video-text-to-text
+---
+# SurgLIME
+SurgLIME is a parameter-efficient Vision-Language Pre-training (VLP) framework designed to learn reliable cross-modal alignments for surgical video understanding. It addresses the challenge of noisy, LLM-generated surgical narratives by using a LoRA-adapted dual-encoder architecture and an automated confidence estimation mechanism that dynamically down-weights uncertain text during training.
+- **Paper:** [Can LLM-Generated Text Empower Surgical Vision-Language Pre-training?](https://huggingface.co/papers/2604.18134)
+- **Repository:** [https://github.com/visurg-ai/SurgLIME](https://github.com/visurg-ai/SurgLIME)
+- **Dataset (LIME):** [huggingface.co/datasets/visurg/LIME](https://huggingface.co/datasets/visurg/LIME)
+## Model Description
+SurgLIME leverages the **LIME** dataset, a large-scale multimodal dataset derived from surgical videos, with narratives generated by Large Language Models (LLMs) rather than written by humans. To mitigate the impact of hallucinations and errors in the generated text, SurgLIME introduces:
+1.  **LoRA-adapted dual-encoder architecture:** Preserves foundational medical priors while enabling efficient adaptation.
+2.  **Confidence Estimation Mechanism:** Automatically identifies and down-weights unreliable narratives during contrastive alignment.
+Evaluations on benchmarks like AutoLaparo and Cholec80 demonstrate that SurgLIME achieves competitive zero-shot cross-modal alignment while maintaining robust linear probing performance.
+## Usage
+For installation, data preparation, and evaluation scripts (such as zero-shot surgical phase recognition), please refer to the [official GitHub repository](https://github.com/visurg-ai/SurgLIME).
+```bash
+# Example: Running zero-shot surgical phase recognition
+python zero_shot_autolaparo_LMDB.py
+```
+## Citation
+```bibtex
+@article{surglime2026,
+  title={Can LLM-Generated Text Empower Surgical Vision-Language Pre-training?},
+  author={...},
+  journal={arXiv preprint arXiv:2604.18134},
+  year={2026}
+}
+```
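
The Model Description in the diff above says that unreliable narratives are down-weighted during contrastive alignment, but the card does not spell out how. As a rough illustration only, the NumPy sketch below shows one common way such weighting can be applied: a symmetric InfoNCE loss whose per-pair terms are scaled by a confidence score. The function names, the `temperature` default, and the assumption that a confidence value in [0, 1] is already available for each video-text pair are all hypothetical; none of them are taken from the SurgLIME code or paper.

```python
import numpy as np


def log_softmax(x, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def confidence_weighted_contrastive_loss(video_emb, text_emb, confidence,
                                         temperature=0.07):
    """Symmetric InfoNCE loss with per-pair confidence weights.

    Hypothetical sketch: row i of video_emb is assumed to be paired with
    row i of text_emb, and confidence[i] in [0, 1] is assumed to be given.
    """
    # L2-normalise so the dot product is a cosine similarity.
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    logits = (v @ t.T) / temperature
    idx = np.arange(len(v))

    # Cross-entropy of the matching pair in both retrieval directions.
    loss_v2t = -log_softmax(logits, axis=1)[idx, idx]
    loss_t2v = -log_softmax(logits, axis=0)[idx, idx]
    per_pair = 0.5 * (loss_v2t + loss_t2v)

    # Low-confidence narratives contribute less to the batch loss.
    w = np.asarray(confidence, dtype=float)
    return float((w * per_pair).sum() / max(w.sum(), 1e-8))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.normal(size=(4, 8))
    text = rng.normal(size=(4, 8))
    conf = np.array([1.0, 0.9, 0.2, 0.6])  # made-up confidence scores
    print(confidence_weighted_contrastive_loss(video, text, conf))
```

Dividing by the sum of the weights rather than the batch size keeps the loss on the same scale regardless of how many pairs are down-weighted; whether SurgLIME normalises this way is not stated in the card.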