Add pipeline tag and improve model card
#1
by nielsr HF Staff - opened
README.md
CHANGED
---
license: apache-2.0
pipeline_tag: image-classification
tags:
- medical
- surgical
- endoscopy
---

<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d9504a41d31cc626fcecc8/cE7UgFfJJ2gUHJr0SSEhc.png">
</div>

[Paper](https://arxiv.org/abs/2503.19740) - [GitHub](https://github.com/visurg-ai/LEMON)

This repository provides the models used in the data curation pipeline for the paper [LEMON: A Large Endoscopic MONocular Dataset and Foundation Model for Perception in Surgical Settings](https://arxiv.org/abs/2503.19740). These models assist in constructing the LEMON dataset by filtering and processing surgical video content.

For more details about the LEMON dataset and our LemonFM foundation model, please visit our [GitHub repository](https://github.com/visurg-ai/LEMON).

## Citation

If you use our dataset, model, or code in your research, please cite our paper:

```bibtex
@misc{che2025lemonlargeendoscopicmonocular,
      title={LEMON: A Large Endoscopic MONocular Dataset and Foundation Model for Perception in Surgical Settings},
      author={Chengan Che and Chao Wang and Tom Vercauteren and Sophia Tsoka and Luis C. Garcia-Peraza-Herrera},
      year={2025},
      eprint={2503.19740},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.19740},
}
```

## Model Overview

This Hugging Face repository includes video storyboard classification models, frame classification models, and non-surgical object detection models. The model loader file can be found at [model_loader.py](https://huggingface.co/visurg/Surg3M_curation_models/blob/main/model_loader.py).
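
As a rough sketch, the loader and checkpoints can be fetched with `huggingface_hub`; the checkpoint file name below is a placeholder, not a confirmed file in this repository:

```python
from huggingface_hub import hf_hub_download

# Download the model loader shipped with this repository
loader_path = hf_hub_download(
    repo_id="visurg/Surg3M_curation_models",
    filename="model_loader.py",
)

# Checkpoints can be fetched the same way; "video_classifier.pth" is a
# placeholder name, check the repository's file list for the real ones
# ckpt_path = hf_hub_download(repo_id="visurg/Surg3M_curation_models",
#                             filename="video_classifier.pth")
```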

<div align="center">
<table style="margin-left: auto; margin-right: auto;">
</table>
</div>

The data curation pipeline leading to the clean videos in the LEMON dataset is as follows:
<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d9504a41d31cc626fcecc8/jzw36jlPT-V_I-Vm01OzO.png">
</div>
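
In code, the pipeline boils down to three model-driven stages. The outline below is a schematic sketch only: every helper name is a placeholder, and the real models are loaded as shown in the Usage section.

```python
SURGICAL = 1  # assumed class index; check the checkpoints for the real mapping

def curate(storyboard, frames, video_model, frame_model, detector):
    """Schematic outline of the curation flow (placeholder logic)."""
    # Step 2: drop the whole video if its storyboard is non-surgical
    if video_model(storyboard).argmax(dim=1).item() != SURGICAL:
        return []
    kept = []
    for frame in frames:  # each frame: a (1, 3, H, W) float tensor
        # Step 3: drop non-surgical frames
        if frame_model(frame).argmax(dim=1).item() != SURGICAL:
            continue
        # Obliterate detected non-surgical regions (e.g. UI overlays)
        for box in detector(frame):
            x1, y1, x2, y2 = (int(v) for v in box)
            frame[..., y1:y2, x1:x2] = 0.0
        kept.append(frame)
    return kept
```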

## Usage

### Video classification models

**Video classification models** are employed in step **2** of the data curation pipeline to classify a video storyboard as either surgical or non-surgical:

```python
import torch
import torchvision
from PIL import Image

# Sketch: the backbone, class count, and file names are illustrative assumptions.
net = torchvision.models.resnet50(weights=None)
net.fc = torch.nn.Linear(net.fc.in_features, 2)  # surgical vs. non-surgical
net.load_state_dict(torch.load("video_classifier.pth", map_location="cpu"))  # placeholder checkpoint
net.eval()

# Load and preprocess a video storyboard image (placeholder file name)
preprocess = torchvision.transforms.Compose([
    torchvision.transforms.Resize((224, 224)),
    torchvision.transforms.ToTensor(),
])
img = Image.open("storyboard.png").convert("RGB")
img_tensor = preprocess(img).unsqueeze(0)

# Classify the storyboard as surgical or non-surgical
with torch.no_grad():
    outputs = net(img_tensor)
```
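
The raw `outputs` are class logits. Assuming index 1 is the surgical class (the exact label order is an assumption), a prediction can be read off with:

```python
# Convert logits to a label; the class-index order is an assumption
probs = torch.softmax(outputs, dim=1)
print("surgical" if probs.argmax(dim=1).item() == 1 else "non-surgical")
```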

### Frame classification models

**Frame classification models** are used in step **3** of the data curation pipeline to classify a frame as either surgical or non-surgical:

```python
import torch
import torchvision
from PIL import Image

# Sketch: the backbone, class count, and file names are illustrative assumptions.
net = torchvision.models.resnet50(weights=None)
net.fc = torch.nn.Linear(net.fc.in_features, 2)  # surgical vs. non-surgical
net.load_state_dict(torch.load("frame_classifier.pth", map_location="cpu"))  # placeholder checkpoint
net.eval()

# Load and preprocess a single video frame (placeholder file name)
preprocess = torchvision.transforms.Compose([
    torchvision.transforms.Resize((224, 224)),
    torchvision.transforms.ToTensor(),
])
img_tensor = preprocess(Image.open("frame.png").convert("RGB")).unsqueeze(0)

# Classify the frame as surgical or non-surgical
with torch.no_grad():
    outputs = net(img_tensor)
```

### Non-surgical object detection models

**Non-surgical object detection models** are used to obliterate non-surgical regions in surgical frames (e.g., user interface information):

```python
import torch
import torchvision
from PIL import Image
from torchvision.transforms.functional import to_tensor

# Sketch: the detector architecture and file names are illustrative assumptions.
net = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None, weights_backbone=None)
net.load_state_dict(torch.load("nonsurgical_detector.pth", map_location="cpu"))  # placeholder checkpoint
net.eval()

# Load a surgical frame (placeholder file name); detection models expect a list of 3D tensors
img = Image.open("frame.png").convert("RGB")
img_tensor = [to_tensor(img)]

# Detect non-surgical regions (boxes, labels, scores) to be obliterated
with torch.no_grad():
    outputs = net(img_tensor)
```
|