Add pipeline tag and improve model card

#1
opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +23 -17
README.md CHANGED
---
license: apache-2.0
pipeline_tag: image-classification
tags:
- medical
- surgical
- endoscopy
---

<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d9504a41d31cc626fcecc8/cE7UgFfJJ2gUHJr0SSEhc.png" />
</div>

[📚 Paper](https://arxiv.org/abs/2503.19740) - [🤖 GitHub](https://github.com/visurg-ai/LEMON)

This repository provides the models used in the data curation pipeline for the paper [LEMON: A Large Endoscopic MONocular Dataset and Foundation Model for Perception in Surgical Settings](https://arxiv.org/abs/2503.19740). These models assist in constructing the LEMON dataset by filtering and processing surgical video content.

For more details about the LEMON dataset and our LemonFM foundation model, please visit our [GitHub repository](https://github.com/visurg-ai/LEMON).

## Citation

If you use our dataset, model, or code in your research, please cite our paper:

```bibtex
@misc{che2025lemonlargeendoscopicmonocular,
      title={LEMON: A Large Endoscopic MONocular Dataset and Foundation Model for Perception in Surgical Settings},
      author={Chengan Che and Chao Wang and Tom Vercauteren and Sophia Tsoka and Luis C. Garcia-Peraza-Herrera},
      year={2025},
      eprint={2503.19740},
      archivePrefix={arXiv},
}
```

## Model Overview

This Hugging Face repository includes video storyboard classification models, frame classification models, and non-surgical object detection models. The model loader file can be found at [model_loader.py](https://huggingface.co/visurg/Surg3M_curation_models/blob/main/model_loader.py).

<div align="center">
<table style="margin-left: auto; margin-right: auto;">
<!-- table rows elided in this diff view -->
</table>
</div>

The data curation pipeline leading to the clean videos in the LEMON dataset is as follows:

<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d9504a41d31cc626fcecc8/jzw36jlPT-V_I-Vm01OzO.png" />
</div>

## Usage

### Video classification models

These models are employed in step **2** of the data curation pipeline to classify a video storyboard as either surgical or non-surgical:

```python
import torch
import torchvision
# ... (model loading and preprocessing lines elided in this diff view)
outputs = net(img_tensor)
```

### Frame classification models

These models are used in step **3** of the data curation pipeline to classify a frame as either surgical or non-surgical:

```python
import torch
# ... (model loading and preprocessing lines elided in this diff view)
outputs = net(img_tensor)
```

### Non-surgical object detection models

These models are used to obliterate (mask out) non-surgical regions in surgical frames, such as user-interface overlays:

```python
import torch
# ... (model loading and preprocessing lines elided in this diff view)

# Extract features from the image
outputs = net(img_tensor)
```
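Once a detector has produced bounding boxes over non-surgical content, the masking step itself is straightforward. The helper below is a hypothetical sketch, not the repository's code; the `(x1, y1, x2, y2)` pixel box format and the zero-fill choice are assumptions for illustration:

```python
import torch

def obliterate_regions(frame: torch.Tensor, boxes: torch.Tensor) -> torch.Tensor:
    """Return a copy of a (C, H, W) frame with the boxed regions set to black.

    boxes: tensor of shape (N, 4) with rows (x1, y1, x2, y2) in pixels (assumed).
    """
    cleaned = frame.clone()
    for x1, y1, x2, y2 in boxes.tolist():
        cleaned[:, int(y1):int(y2), int(x1):int(x2)] = 0
    return cleaned

frame = torch.ones(3, 100, 100)                 # dummy all-white frame
boxes = torch.tensor([[0.0, 0.0, 30.0, 20.0]])  # one dummy detection
cleaned = obliterate_regions(frame, boxes)
print(float(cleaned.sum()))  # 30000 - 3*20*30 = 28200.0
```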