Update README.md
README.md
CHANGED
|
@@ -9,7 +9,7 @@ license: apache-2.0
|
|
| 9 |
|
| 10 |
We provide the models used in our data curation pipeline in [📚 Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings](TODO) to assist with constructing the Surg-3M dataset (for more details about the Surg-3M dataset and our
|
| 11 |
SurgFM foundation model, please visit our GitHub repository at [🤖 GitHub](https://github.com/visurg-ai/surg-3m)).
|
| 12 |
-
This
|
| 13 |
|
| 14 |
|
| 15 |
<div align="center">
|
|
@@ -37,10 +37,15 @@ This huggingface repository includes video storyboard classification models, fra
|
|
| 37 |
</table>
|
| 38 |
</div>
|
| 39 |
|
| 40 |
Usage
|
| 41 |
--------
|
| 42 |
-
Video classification
|
| 43 |
-
|
| 44 |
```python
|
| 45 |
import torch
|
| 46 |
from PIL import Image
|
|
@@ -120,8 +125,3 @@ Non-surgical object detection model
|
|
| 120 |
# Extract features from the image
|
| 121 |
outputs = net(img_tensor)
|
| 122 |
```
|
| 123 |
-
|
| 124 |
-
The video processing pipeline leading to the clean videos in the Surg-3M dataset is as follows:
|
| 125 |
-
<div align="center">
|
| 126 |
-
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d9504a41d31cc626fcecc8/yj2S0GMJm2C2AYwbr1p6G.png"> </img>
|
| 127 |
-
</div>
|
|
|
|
| 9 |
|
| 10 |
We provide the models used in our data curation pipeline in [📚 Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings](TODO) to assist with constructing the Surg-3M dataset (for more details about the Surg-3M dataset and our
|
| 11 |
SurgFM foundation model, please visit our GitHub repository at [🤖 GitHub](https://github.com/visurg-ai/surg-3m)).
|
| 12 |
+
This Hugging Face repository includes video storyboard classification models, frame classification models, and non-surgical object detection models. The model loader is available in [model_loader.py](https://huggingface.co/visurg/Surg3M_curation_models/blob/main/model_loader.py).
|
| 13 |
|
| 14 |
|
| 15 |
<div align="center">
|
|
|
|
| 37 |
</table>
|
| 38 |
</div>
|
| 39 |
|
| 40 |
+
|
| 41 |
+
The video processing pipeline leading to the clean videos in the Surg-3M dataset is as follows:
|
| 42 |
+
<div align="center">
|
| 43 |
+
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d9504a41d31cc626fcecc8/yj2S0GMJm2C2AYwbr1p6G.png"> </img>
|
| 44 |
+
</div>
|
| 45 |
+
|
| 46 |
Usage
|
| 47 |
--------
|
| 48 |
+
Video classification models are used in step 2 of the data curation pipeline to classify a video storyboard as either surgical or non-surgical.
|
|
|
|
| 49 |
```python
|
| 50 |
import torch
|
| 51 |
from PIL import Image
|
|
|
|
| 125 |
# Extract features from the image
|
| 126 |
outputs = net(img_tensor)
|
| 127 |
```