---
language: []
license: mit
tags:
- pytorch
- image-segmentation
- sam2
- glove
- baseball
- sports-analytics
- computer-vision
- custom-model
library_name: pytorch
datasets:
- custom
metrics:
- dice
- iou
inference: true
widget: []
model-index:
- name: glove_labelling
  results: []
---

# Glove Labelling Model (SAM2 fine-tuned)

This repository contains a fine-tuned [SAM2](https://github.com/facebookresearch/sam2) hierarchical image segmentation model adapted for high-precision baseball glove segmentation.

### What it does

Given a frame from a pitching video, this model outputs per-pixel segmentations for:

- `glove_outline`
- `webbing`
- `thumb`
- `palm_pocket`
- `hand`
- `glove_exterior`

The model was trained on frame sequences from individual pitches, with masks annotated in COCO format.
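
For readers unfamiliar with COCO-style annotations, the sketch below shows how the six classes above might be declared and how one polygon mask could be attached to a frame. The category ids, the `supercategory` value, the file name, and the polygon coordinates are all illustrative assumptions; the actual training annotations are not part of this card.

```python
import json

# Class names as listed in this card. The id assignment (1..6 in list order)
# is an assumption made for illustration only.
CLASS_NAMES = [
    "glove_outline", "webbing", "thumb",
    "palm_pocket", "hand", "glove_exterior",
]


def build_coco_categories(names, first_id=1):
    """Build the COCO `categories` array from an ordered list of class names."""
    return [
        {"id": first_id + i, "name": name, "supercategory": "glove"}  # hypothetical supercategory
        for i, name in enumerate(names)
    ]


coco = {
    "images": [
        # Hypothetical file name; width and height match the 720x1280 input frames.
        {"id": 1, "file_name": "frame_0001.jpg", "width": 1280, "height": 720},
    ],
    "categories": build_coco_categories(CLASS_NAMES),
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 2,  # "webbing" under the assumed id assignment
            # One polygon as a flat [x1, y1, x2, y2, ...] list (made-up rectangle).
            "segmentation": [[600.0, 300.0, 660.0, 300.0, 660.0, 350.0, 600.0, 350.0]],
            "bbox": [600.0, 300.0, 60.0, 50.0],  # [x, y, width, height]
            "area": 3000.0,
            "iscrowd": 0,
        },
    ],
}

# Lookup table from category id to class name.
name_by_id = {c["id"]: c["name"] for c in coco["categories"]}

# The structure is plain JSON, so it round-trips through json.dumps / json.loads.
restored = json.loads(json.dumps(coco))
```

A lookup such as `name_by_id[annotation["category_id"]]` then recovers the class name for each annotation.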

---

### Architecture

- Base Model: `SAM2Hierarchical`
- Framework: PyTorch
- Input: RGB frame as a tensor of shape `[1, 3, 720, 1280]` (batch, channels, height, width)
- Output: segmentation logits across the 6 glove-related classes listed above
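
To make the output format concrete, here is a minimal sketch of how logits with the class dimension at index 1 could be split into one boolean mask per class, and how the IoU and Dice metrics named in the card metadata could be computed for a single mask. It uses NumPy with random data in place of real model output. The channel order (class `i` is the `i`-th name in the list above), the absence of a background channel, and the helper names `logits_to_class_masks` and `iou_and_dice` are assumptions, not part of the released model.

```python
import numpy as np

# Assumption: channel i of the logits corresponds to the i-th class in the
# order the card lists them. The real index-to-name mapping is not documented.
CLASS_NAMES = [
    "glove_outline", "webbing", "thumb",
    "palm_pocket", "hand", "glove_exterior",
]


def logits_to_class_masks(logits):
    """Split logits of shape [1, C, H, W] into one boolean [H, W] mask per class."""
    if logits.ndim != 4 or logits.shape[0] != 1 or logits.shape[1] != len(CLASS_NAMES):
        raise ValueError(f"expected shape [1, {len(CLASS_NAMES)}, H, W], got {logits.shape}")
    labels = logits[0].argmax(axis=0)  # [H, W] array of class indices
    return {name: labels == idx for idx, name in enumerate(CLASS_NAMES)}


def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two boolean masks; both are 1.0 when both masks are empty."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    iou = inter / union if union else 1.0
    dice = 2 * inter / total if total else 1.0
    return float(iou), float(dice)


# Random logits stand in for real model output; a small frame keeps the demo fast.
rng = np.random.default_rng(0)
demo_logits = rng.normal(size=(1, len(CLASS_NAMES), 72, 128))
masks = logits_to_class_masks(demo_logits)

# Each pixel is assigned exactly one class, so the masks partition the frame.
assert sum(int(m.sum()) for m in masks.values()) == 72 * 128
```

With a ground-truth mask for one class, `iou_and_dice(masks["webbing"], gt_webbing)` would give that class's scores for the frame, where `gt_webbing` is a hypothetical boolean array of the same shape.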

---

### Usage

To use the model for inference:

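```python
import torch
from PIL import Image
import torchvision.transforms as T

# Assumes pytorch_model.bin holds the full pickled model object (not a state_dict),
# so the model's class definition must be importable when loading.
# PyTorch >= 2.6 defaults to weights_only=True, which rejects pickled modules,
# hence the explicit False. Only load checkpoints you trust.
model = torch.load("pytorch_model.bin", map_location="cpu", weights_only=False)
model.eval()

# Resize to 720x1280 (height, width) and scale pixel values to [0, 1]
transform = T.Compose([
    T.Resize((720, 1280)),
    T.ToTensor(),
])

img = Image.open("example.jpg").convert("RGB")
x = transform(img).unsqueeze(0)  # add batch dimension: [1, 3, 720, 1280]

with torch.no_grad():
    output = model(x)  # logits with the class dimension at index 1

# Convert logits to class labels
pred_mask = output.argmax(dim=1).squeeze().cpu().numpy()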