Improve model card

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +40 -3
README.md CHANGED
@@ -1,3 +1,40 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ pipeline_tag: video-text-to-text
+ ---
+
+ # SurgLIME
+
+ SurgLIME is a parameter-efficient Vision-Language Pre-training (VLP) framework designed to learn reliable cross-modal alignments for surgical video understanding. It addresses the challenge of noisy, LLM-generated surgical narratives by using a LoRA-adapted dual-encoder architecture and an automated confidence estimation mechanism that dynamically down-weights uncertain text during training.
+
+ - **Paper:** [Can LLM-Generated Text Empower Surgical Vision-Language Pre-training?](https://huggingface.co/papers/2604.18134)
+ - **Repository:** [https://github.com/visurg-ai/SurgLIME](https://github.com/visurg-ai/SurgLIME)
+ - **Dataset (LIME):** [huggingface.co/datasets/visurg/LIME](https://huggingface.co/datasets/visurg/LIME)
+
+ ## Model Description
+
+ SurgLIME leverages the **LIME** dataset, a large-scale multi-modal dataset derived from surgical videos using human-free, Large Language Model (LLM)-generated narratives. To mitigate the impact of hallucinations and errors in the generated text, SurgLIME introduces:
+ 1. **LoRA-adapted dual-encoder architecture:** Preserves foundational medical priors while enabling efficient adaptation.
+ 2. **Confidence Estimation Mechanism:** Automatically identifies and down-weights unreliable narratives during contrastive alignment.
+
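The down-weighting idea in the list above can be illustrated with a minimal sketch. It assumes a symmetric InfoNCE-style contrastive loss in which each video-text pair is scaled by a confidence score in [0, 1]. The function name, the `temperature` default, and the way the confidence is supplied are all assumptions for illustration; the model card does not specify how SurgLIME estimates confidence or how it enters the loss, and a real training loop would use a tensor library rather than plain Python lists.

```python
import math


def l2_normalize(vec):
    # Scale a vector to unit length; leave an all-zero vector unchanged.
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def log_softmax(row):
    # Numerically stable log-softmax over a list of logits.
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def confidence_weighted_contrastive_loss(video_emb, text_emb, confidence,
                                         temperature=0.07):
    """Symmetric InfoNCE where pair i contributes in proportion to confidence[i].

    Hypothetical helper: `confidence` is assumed to be given, one value per pair.
    """
    n = len(video_emb)
    v = [l2_normalize(e) for e in video_emb]
    t = [l2_normalize(e) for e in text_emb]
    # logits[i][j]: temperature-scaled cosine similarity of video i and text j.
    logits = [[sum(a * b for a, b in zip(v[i], t[j])) / temperature
               for j in range(n)] for i in range(n)]
    total_weight = sum(confidence)
    if total_weight == 0:
        return 0.0
    loss = 0.0
    for i in range(n):
        video_to_text = -log_softmax(logits[i])[i]
        text_to_video = -log_softmax([logits[k][i] for k in range(n)])[i]
        loss += confidence[i] * 0.5 * (video_to_text + text_to_video)
    # Normalize by the total confidence so the scale stays comparable.
    return loss / total_weight


# Toy batch: the second narrative is wrong (it duplicates the first one).
videos = [[1.0, 0.0], [0.0, 1.0]]
texts = [[1.0, 0.0], [1.0, 0.0]]
uniform = confidence_weighted_contrastive_loss(videos, texts, [1.0, 1.0])
weighted = confidence_weighted_contrastive_loss(videos, texts, [1.0, 0.0])
print(f"uniform={uniform:.3f} weighted={weighted:.3f}")
```

In this toy batch, giving the mismatched pair zero confidence removes its large penalty from the objective, which is the qualitative effect the mechanism is described as having.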
+ Evaluations on benchmarks like AutoLaparo and Cholec80 demonstrate that SurgLIME achieves competitive zero-shot cross-modal alignment while maintaining robust linear probing performance.
+
+ ## Usage
+
+ For installation, data preparation, and evaluation scripts (such as zero-shot surgical phase recognition), please refer to the [official GitHub repository](https://github.com/visurg-ai/SurgLIME).
+
+ ```bash
+ # Example: Running zero-shot surgical phase recognition
+ python zero_shot_autolaparo_LMDB.py
+ ```
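For readers unfamiliar with zero-shot phase recognition, the general recipe for dual-encoder models can be sketched as follows: embed one text prompt per phase, embed the video frame, and pick the phase with the highest cosine similarity. This is a generic illustration, not the contents of the repository's script. The function `zero_shot_phase` is hypothetical, and the 3-dimensional vectors are toy stand-ins for encoder outputs.

```python
import math


def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def zero_shot_phase(frame_emb, prompt_embs):
    """Return the best-matching phase name and the per-phase similarity scores."""
    scores = {phase: cosine(frame_emb, emb) for phase, emb in prompt_embs.items()}
    return max(scores, key=scores.get), scores


# Toy vectors standing in for text-encoder outputs of phase prompts (assumption).
prompt_embs = {
    "Preparation": [1.0, 0.1, 0.0],
    "Clipping and cutting": [0.0, 1.0, 0.2],
    "Gallbladder dissection": [0.1, 0.0, 1.0],
}
# Toy vector standing in for a video-encoder output of one frame (assumption).
frame_emb = [0.1, 0.9, 0.3]

phase, scores = zero_shot_phase(frame_emb, prompt_embs)
print(phase)
```

No labeled training data for the target phases is needed in this setup, because the class set is defined entirely by the text prompts.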
+
+ ## Citation
+
+ ```bibtex
+ @article{surglime2026,
+ title={Can LLM-Generated Text Empower Surgical Vision-Language Pre-training?},
+ author={...},
+ journal={arXiv preprint arXiv:2604.18134},
+ year={2026}
+ }
+ ```