OpenGlottal: U-Net (BAGLS-Crop)

This repository contains the pre-trained U-Net weights (og_bagls_unet_crop.pt) for the OpenGlottal toolkit.

Model Description

OpenGlottal is an open-source toolkit for automated glottal area segmentation from high-speed videoendoscopy (HSV). This model is a U-Net pixel-level segmenter trained on the BAGLS dataset using a crop-based approach. In this pipeline, a YOLOv8 glottis localizer defines a tight crop around the glottis, and this U-Net segments that crop; the glottal area waveform is then extracted from the resulting per-frame masks.
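
As a rough illustration of the last step, the sketch below computes a glottal area waveform as the per-frame count of mask pixels. It is not the toolkit's implementation: the function name `glottal_area_waveform`, the `detected` flags, and the choice to report zero area for frames without a detection are assumptions made here to illustrate what "detection-gated" could mean.

```python
import numpy as np

def glottal_area_waveform(masks, detected):
    """Per-frame glottal area in pixels (hypothetical helper, not the toolkit API).

    masks:    boolean array of shape (frames, height, width)
    detected: one flag per frame, True where the localizer found a glottis
    Assumption: frames without a detection contribute an area of 0.
    """
    masks = np.asarray(masks, dtype=bool)
    detected = np.asarray(detected, dtype=bool)
    # Count the foreground pixels of every frame.
    areas = masks.reshape(masks.shape[0], -1).sum(axis=1)
    # Gate the waveform: keep the area only where a detection exists.
    return np.where(detected, areas, 0)

# Three toy 4x4 frames: closed glottis, a 2-pixel opening, a 6-pixel opening.
masks = np.zeros((3, 4, 4), dtype=bool)
masks[1, 1:3, 2] = True
masks[2, 0:3, 1:3] = True
detected = [True, True, False]  # pretend the detector missed the last frame

gaw = glottal_area_waveform(masks, detected)
print(gaw.tolist())
```

With the toy input, the last frame has a non-empty mask but no detection, so its area is suppressed in the gated waveform.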

The model was presented in the paper "A Detection-Gated Pipeline for Robust Glottal Area Waveform Extraction and Clinical Pathology Assessment".

Installation

To use this model with the official toolkit:

git clone https://github.com/hari-krishnan/openglottal.git
cd openglottal
pip install -e .

How to use

Python API

import torch
from openglottal import TemporalDetector, UNet, extract_features_unet
from huggingface_hub import hf_hub_download

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Download the weights
unet_path = hf_hub_download(repo_id="hari-krishnan-u/og_bagls_unet_crop", filename="og_bagls_unet_crop.pt")

# Initialize and load U-Net
model = UNet(1, 1, (32, 64, 128, 256)).to(device)
model.load_state_dict(torch.load(unet_path, map_location=device))
model.eval()

# Note: The detection-gated pipeline also requires a detector (YOLO)
# detector = TemporalDetector("path/to/yolo_weights.pt")
# features = extract_features_unet("video.avi", detector, model, device)
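
To make the crop-based approach more concrete, here is a minimal NumPy sketch of cropping a frame to a detector box and pasting the segmented crop back into full-frame coordinates. The helpers `crop_to_box` and `paste_mask`, the `(x1, y1, x2, y2)` box layout, and the simple threshold standing in for the U-Net output are all assumptions for illustration; the toolkit may resize or pad crops differently.

```python
import numpy as np

def crop_to_box(frame, box):
    """Cut the detector box (x1, y1, x2, y2) out of a 2-D grayscale frame."""
    x1, y1, x2, y2 = box
    return frame[y1:y2, x1:x2]

def paste_mask(frame_shape, box, crop_mask):
    """Place a crop-sized mask back at the box location in an empty full-frame mask."""
    x1, y1, x2, y2 = box
    full = np.zeros(frame_shape, dtype=bool)
    full[y1:y2, x1:x2] = crop_mask
    return full

# Toy 6x8 frame and a hypothetical detector box.
frame = np.arange(48, dtype=np.uint8).reshape(6, 8)
box = (2, 1, 6, 5)

crop = crop_to_box(frame, box)
# Stand-in for the thresholded U-Net output on the crop (not a real model call).
crop_mask = crop > 20
full_mask = paste_mask(frame.shape, box, crop_mask)

print(crop.shape, full_mask.shape, int(full_mask.sum()))
```

Pixels outside the box are always background in the pasted mask, which is why the segmentation can only be as good as the localizer's box.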

CLI

Download the U-Net weights and store the local path in a shell variable:

unet_path=$(python -c "from huggingface_hub import hf_hub_download; print(hf_hub_download(repo_id='hari-krishnan-u/og_bagls_unet_crop', filename='og_bagls_unet_crop.pt'))")

Run displacement (LR mode):

openglottal displacement /path/to/video.avi \
  --unet-weights "$unet_path" \
  --start 0 --end 500 \
  --mode lr \
  --lr-position 0.5 \
  --output results/

Citation

If you use this model or the OpenGlottal toolkit, please cite:

@misc{unnikrishnan2026openglottal,
  title         = {A Detection-Gated Pipeline for Robust Glottal Area
                   Waveform Extraction and Clinical Pathology Assessment},
  author        = {Unnikrishnan, Harikrishnan},
  year          = {2026},
  eprint        = {2603.02087},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/2603.02087}
}