OpenGlottal: U-Net (BAGLS-Crop)
This repository contains the pre-trained U-Net weights (og_bagls_unet_crop.pt) for the OpenGlottal toolkit.
Model Description
OpenGlottal is an open-source toolkit for automated glottal area segmentation from high-speed videoendoscopy (HSV). This model is a U-Net pixel-level segmenter trained on the BAGLS dataset using a crop-based approach. In this pipeline, a YOLOv8 glottis localizer defines a tight crop, which this U-Net then segments to extract the glottal area waveform.
The model was presented in the paper "A Detection-Gated Pipeline for Robust Glottal Area Waveform Extraction and Clinical Pathology Assessment".
- Code: https://github.com/hari-krishnan/openglottal
- Paper: arXiv:2603.02087
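The crop-then-segment flow described above can be sketched with NumPy alone. This is an illustration under stated assumptions, not the toolkit's implementation: `crop_to_box`, `area_waveform`, and `threshold_segmenter` are hypothetical names, the thresholding function stands in for the U-Net, and the rule that frames without a detection contribute zero area is an assumed reading of "detection-gated".

```python
import numpy as np

def crop_to_box(frame, box):
    """Return the sub-image inside box = (x0, y0, x1, y1), in pixel coordinates."""
    x0, y0, x1, y1 = box
    return frame[y0:y1, x0:x1]

def area_waveform(frames, boxes, segment):
    """Count segmented pixels per frame; frames without a box yield 0 (assumed gating rule)."""
    areas = []
    for frame, box in zip(frames, boxes):
        if box is None:  # the localizer found no glottis in this frame
            areas.append(0)
            continue
        mask = segment(crop_to_box(frame, box))  # boolean mask of the crop
        areas.append(int(mask.sum()))
    return np.array(areas)

# Stand-in segmenter for illustration only: treats dark pixels as glottis.
def threshold_segmenter(crop, level=50):
    return crop < level

# Three synthetic 8x8 grayscale frames; the first two contain a dark patch.
frames = [np.full((8, 8), 200, dtype=np.uint8) for _ in range(3)]
frames[0][3:5, 2:5] = 10   # 2x3 dark patch
frames[1][3:5, 2:4] = 10   # 2x2 dark patch
boxes = [(1, 2, 6, 6), (1, 2, 6, 6), None]  # no detection in the last frame

waveform = area_waveform(frames, boxes, threshold_segmenter)
print(waveform)
```

In the real pipeline the boxes would come from the YOLOv8 localizer and the mask from this U-Net; the per-frame pixel count is what forms the waveform.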
Installation
To use this model with the official toolkit:
git clone https://github.com/hari-krishnan/openglottal.git
cd openglottal
pip install -e .
How to use
Python API
import torch
from openglottal import TemporalDetector, UNet, extract_features_unet
from huggingface_hub import hf_hub_download
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Download the weights
unet_path = hf_hub_download(repo_id="hari-krishnan-u/og_bagls_unet_crop", filename="og_bagls_unet_crop.pt")
# Initialize and load U-Net
model = UNet(1, 1, (32, 64, 128, 256)).to(device)
model.load_state_dict(torch.load(unet_path, map_location=device))
model.eval()
# Note: The detection-gated pipeline also requires a detector (YOLO)
# detector = TemporalDetector("path/to/yolo_weights.pt")
# features = extract_features_unet("video.avi", detector, model, device)
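For readers who want to see what a single forward pass on one crop might look like, here is a minimal sketch. Several details are assumptions rather than documented behavior of the toolkit: that crops are scaled to [0, 1], that the network emits logits (hence the sigmoid), and that 0.5 is a sensible threshold. `segment_crop` is a hypothetical helper, and a one-layer convolution stands in for the loaded U-Net so that the snippet runs without the weights.

```python
import torch

def segment_crop(model, crop_u8, device, threshold=0.5):
    """Run a 1-channel segmenter on one grayscale crop and return a boolean mask."""
    # Assumed preprocessing: scale uint8 pixels to [0, 1], add batch and channel axes.
    x = (crop_u8.float() / 255.0)[None, None].to(device)
    with torch.no_grad():
        logits = model(x)  # assumed output shape: (1, 1, H, W)
    # Assumes the network returns logits; drop the sigmoid if it returns probabilities.
    return torch.sigmoid(logits)[0, 0] > threshold

torch.manual_seed(0)
# Stand-in network (not the OpenGlottal U-Net) with matching 1-in / 1-out channels.
stand_in = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1).eval()
crop = torch.randint(0, 256, (64, 64), dtype=torch.uint8)

mask = segment_crop(stand_in, crop, torch.device("cpu"))
print(tuple(mask.shape), mask.dtype)
```

With the real model, `stand_in` would be replaced by the `model` loaded above. Whether the toolkit resizes or pads crops before inference is not stated here, so check `extract_features_unet` before relying on a particular crop size.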
CLI
Download the U-Net weights with Python and print the local path:
from huggingface_hub import hf_hub_download
unet_path = hf_hub_download(repo_id="hari-krishnan-u/og_bagls_unet_crop", filename="og_bagls_unet_crop.pt")
print(unet_path)
Set the shell variable unet_path to the printed path, then run displacement (LR mode):
openglottal displacement /path/to/video.avi \
--unet-weights "$unet_path" \
--start 0 --end 500 \
--mode lr \
--lr-position 0.5 \
--output results/
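Because `hf_hub_download` returns the weights path in Python while the command above reads it from a shell variable, one way to bridge the two is to assemble the same command from Python. The sketch below uses only the flags shown above; `build_displacement_cmd` is a hypothetical helper, and the `/tmp/...` weights path is a placeholder for whatever `hf_hub_download` returns.

```python
import shlex

def build_displacement_cmd(video, unet_weights, start=0, end=500,
                           mode="lr", lr_position=0.5, output="results/"):
    """Assemble the argv list for the `openglottal displacement` call shown above."""
    return [
        "openglottal", "displacement", str(video),
        "--unet-weights", str(unet_weights),
        "--start", str(start), "--end", str(end),
        "--mode", mode,
        "--lr-position", str(lr_position),
        "--output", str(output),
    ]

# Placeholder weights path; in practice pass the value returned by hf_hub_download.
cmd = build_displacement_cmd("/path/to/video.avi", "/tmp/og_bagls_unet_crop.pt")
print(shlex.join(cmd))  # shell-safe rendering of the command, for inspection

# To execute it (requires the toolkit to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the cache path contains spaces.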
Citation
If you use this model or the OpenGlottal toolkit, please cite:
@misc{unnikrishnan2026openglottal,
title = {A Detection-Gated Pipeline for Robust Glottal Area
Waveform Extraction and Clinical Pathology Assessment},
author = {Unnikrishnan, Harikrishnan},
year = {2026},
eprint = {2603.02087},
archivePrefix = {arXiv},
primaryClass = {cs.CV},
url = {https://arxiv.org/abs/2603.02087}
}