Update README.md
README.md
CHANGED
|
@@ -38,14 +38,14 @@ This Hugging Face repository includes video storyboard classification models, fr
|
|
| 38 |
</div>
|
| 39 |
|
| 40 |
|
| 41 |
-
The
|
| 42 |
<div align="center">
|
| 43 |
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d9504a41d31cc626fcecc8/yj2S0GMJm2C2AYwbr1p6G.png"> </img>
|
| 44 |
</div>
|
| 45 |
|
| 46 |
Usage
|
| 47 |
--------
|
| 48 |
-
Video classification models are employed in the step 2 of the data curation pipeline to classify a video storyboard as either surgical or non-surgical
|
| 49 |
```python
|
| 50 |
import torch
|
| 51 |
from PIL import Image
|
|
@@ -72,7 +72,7 @@ Video classification models are employed in the step 2 of the data curation pipe
|
|
| 72 |
outputs = net(img_tensor)
|
| 73 |
```
|
| 74 |
|
| 75 |
-
Frame classification
|
| 76 |
|
| 77 |
```python
|
| 78 |
import torch
|
|
@@ -99,7 +99,7 @@ Frame classification model
|
|
| 99 |
outputs = net(img_tensor)
|
| 100 |
```
|
| 101 |
|
| 102 |
-
Non-surgical object detection
|
| 103 |
|
| 104 |
```python
|
| 105 |
import torch
|
|
|
|
| 38 |
</div>
|
| 39 |
|
| 40 |
|
| 41 |
+
The data curation pipeline leading to the clean videos in the Surg-3M dataset is as follows:
|
| 42 |
<div align="center">
|
| 43 |
<img src="https://cdn-uploads.huggingface.co/production/uploads/67d9504a41d31cc626fcecc8/yj2S0GMJm2C2AYwbr1p6G.png"> </img>
|
| 44 |
</div>
|
| 45 |
|
| 46 |
Usage
|
| 47 |
--------
|
| 48 |
+
Video classification models are employed in step 2 of the data curation pipeline to classify a video storyboard as either surgical or non-surgical. Their usage is as follows:
|
| 49 |
```python
|
| 50 |
import torch
|
| 51 |
from PIL import Image
|
|
|
|
| 72 |
outputs = net(img_tensor)
|
| 73 |
```
|
| 74 |
|
| 75 |
+
Frame classification models are used in step 3 of the data curation pipeline to classify a frame as either surgical or non-surgical. Their usage is as follows:
|
| 76 |
|
| 77 |
```python
|
| 78 |
import torch
|
|
|
|
| 99 |
outputs = net(img_tensor)
|
| 100 |
```
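
Both classification snippets end at `outputs = net(img_tensor)`. As a minimal sketch of how such an output might be turned into a surgical / non-surgical decision, the example below assumes that `outputs` holds two raw class scores (logits) for a single image. The label order in `LABELS` and the helper names `softmax` and `decide` are hypothetical, since the source does not say which index corresponds to which class. It is written in plain Python so it runs without a model; with PyTorch, `torch.softmax(outputs, dim=1)` would play the same role.

```python
import math

# Hypothetical label order: the source does not state which index means "surgical".
LABELS = ("non-surgical", "surgical")


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# With a PyTorch model, the list could come from outputs.squeeze(0).tolist(),
# assuming `outputs` has shape (1, 2).
label, prob = decide([-1.2, 2.3])
print(label, round(prob, 3))
```

Keeping the probability alongside the label makes it possible to apply a stricter threshold than a plain argmax when filtering data, if that turns out to be useful.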
|
| 101 |
|
| 102 |
+
Non-surgical object detection models are used to obliterate non-surgical regions in surgical frames (e.g. user interface information). Their usage is as follows:
|
| 103 |
|
| 104 |
```python
|
| 105 |
import torch
|