OpenGlottal: U-Net (BAGLS Full-frame)
This repository contains the pre-trained U-Net weights (og_bagls_unet_full.pt) for glottal area segmentation, as presented in the paper: A Detection-Gated Pipeline for Robust Glottal Area Waveform Extraction and Clinical Pathology Assessment.
This model is part of the OpenGlottal toolkit, an open-source framework for automated glottal area segmentation from high-speed videoendoscopy (HSV).
- Code: GitHub Repository
- Paper: arXiv:2603.02087
Model Description
The model is a U-Net that segments the glottis at pixel level, trained on the BAGLS dataset (N = 55,750 frames). Its masks are used to extract the glottal area waveform (GAW) and left/right (L/R) displacement waveforms along a medial axis, which support clinical assessment of vocal fold function and pathology.
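To make the two waveforms concrete, here is a minimal NumPy sketch of how they could be derived from a stack of binary masks. It is not the toolkit's implementation: the function names, the vertical medial axis at a fixed column (`axis_col`), and the reading of `row_position` as a fractional height are all assumptions made for illustration.

```python
import numpy as np


def glottal_area_waveform(masks):
    """Count foreground (glottis) pixels in each frame of a (T, H, W) stack."""
    masks = np.asarray(masks, dtype=bool)
    return masks.reshape(masks.shape[0], -1).sum(axis=1)


def lr_displacement(masks, axis_col, row_position=0.5):
    """Distance from an assumed vertical medial axis to the left and right
    glottal edges, measured on a single image row per frame.

    axis_col and row_position are hypothetical parameters for this sketch.
    """
    masks = np.asarray(masks, dtype=bool)
    n_frames, height, _ = masks.shape
    row = int(round(row_position * (height - 1)))
    left = np.zeros(n_frames)
    right = np.zeros(n_frames)
    for t in range(n_frames):
        cols = np.flatnonzero(masks[t, row])
        if cols.size == 0:
            continue  # glottis closed on this row: both displacements stay 0
        left[t] = max(axis_col - cols.min(), 0)
        right[t] = max(cols.max() - axis_col, 0)
    return left, right


# Synthetic example: a closed frame, then two progressively wider openings.
masks = np.zeros((3, 9, 9), dtype=bool)
masks[1, 2:7, 3:6] = True  # columns 3..5
masks[2, 2:7, 2:6] = True  # columns 2..5 (wider on the left side)

gaw = glottal_area_waveform(masks)
left, right = lr_displacement(masks, axis_col=4, row_position=0.5)
```

In this toy stack the area grows from frame to frame, and the last frame opens further on the left of the axis than on the right, which is the kind of asymmetry the L/R waveforms are meant to expose.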
How to use
Installation
pip install openglottal
Python API
The following snippet downloads the weights from Hugging Face, loads them into the U-Net, and puts the model in evaluation mode, ready for inference:
import torch
from openglottal import UNet
from huggingface_hub import hf_hub_download
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Download weights from Hugging Face
unet_path = hf_hub_download(repo_id="hari-krishnan-u/og_bagls_unet_full", filename="og_bagls_unet_full.pt")
# Initialize and load model
model = UNet(1, 1, (32, 64, 128, 256)).to(device)
model.load_state_dict(torch.load(unet_path, map_location=device))
model.eval()
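Once the model is loaded, a forward pass on a single frame might look like the sketch below. Several details are assumptions rather than documented behavior: that the network expects a single-channel input scaled to [0, 1], that it returns logits (so a sigmoid is applied), and that 0.5 is a sensible threshold. `segment_frame` is a hypothetical helper, and a small convolution stands in for the U-Net so the example runs without downloading weights.

```python
import torch
from torch import nn


def segment_frame(model, frame, threshold=0.5):
    """Segment one grayscale frame (H, W) with values in 0..255.

    Assumes the model maps (1, 1, H, W) inputs in [0, 1] to logits of the
    same spatial size; adjust if the real network behaves differently.
    """
    device = next(model.parameters()).device
    x = torch.as_tensor(frame, dtype=torch.float32, device=device) / 255.0
    x = x[None, None]  # add batch and channel dimensions
    with torch.no_grad():
        logits = model(x)
    prob = torch.sigmoid(logits)[0, 0]
    return (prob > threshold).cpu()


# Stand-in network (NOT the OpenGlottal U-Net), used only to exercise the helper.
stand_in = nn.Conv2d(1, 1, kernel_size=3, padding=1).eval()
frame = torch.randint(0, 256, (256, 256), dtype=torch.uint8)
mask = segment_frame(stand_in, frame)
```

With the real weights, `model` from the snippet above would be passed in place of `stand_in`, and the resulting boolean masks could be stacked over time for waveform extraction.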
CLI
Run displacement extraction in LR mode:
openglottal displacement /path/to/video.avi \
    --unet-weights "og_bagls_unet_full.pt" \
    --start 0 --end 500 \
    --mode lr \
    --lr-position 0.5 \
    --output results/
Citation
If you use this model or the OpenGlottal toolkit in your research, please cite:
@misc{unnikrishnan2026openglottal,
  title         = {A Detection-Gated Pipeline for Robust Glottal Area Waveform Extraction and Clinical Pathology Assessment},
  author        = {Unnikrishnan, Harikrishnan},
  year          = {2026},
  eprint        = {2603.02087},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/2603.02087}
}