OpenGlottal

OpenGlottal is an open-source toolkit for automated glottal area segmentation from high-speed videoendoscopy (HSV). Its detection-gated pipeline pairs a YOLOv8 glottis localizer with a U-Net segmenter for robust glottal area waveform extraction.
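
As a rough illustration of what detection gating can mean, the sketch below keeps a segmentation mask only inside a detected bounding box and returns an empty mask when nothing is detected. It shows the general idea under stated assumptions and is not OpenGlottal's implementation: `gate_mask` and the `(x0, y0, x1, y1)` box format are hypothetical.

```python
import numpy as np

def gate_mask(mask, box):
    """Keep mask pixels inside box; zero everything if box is None.

    mask: 2-D boolean array from a segmenter (hypothetical input).
    box:  (x0, y0, x1, y1) in pixels, or None when no glottis is detected.
    """
    gated = np.zeros_like(mask, dtype=bool)
    if box is None:
        return gated  # no detection: report zero glottal area for this frame
    x0, y0, x1, y1 = box
    gated[y0:y1, x0:x1] = mask[y0:y1, x0:x1]
    return gated

# Toy frame: one blob inside the box and one spurious pixel far from it.
mask = np.zeros((8, 8), dtype=bool)
mask[2:4, 2:4] = True   # 4 pixels inside the box
mask[7, 7] = True       # spurious pixel outside the box

area_gated = int(gate_mask(mask, (1, 1, 5, 5)).sum())
area_no_detection = int(gate_mask(mask, None).sum())
print(area_gated, area_no_detection)
```

Gating this way means a spurious segmentation far from the glottis, or in a frame where the glottis is not visible, does not contribute to the area waveform.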

Sample Usage

Python API

The following snippet extracts features from a video using the pre-trained weights:

import torch
from openglottal import TemporalDetector, UNet, extract_features_unet

device = torch.device("cpu")   # or "cuda" / "mps"

# Glottis detector, loaded from the YOLO weights
detector = TemporalDetector("weights/og_girafe_yolo.pt")

# U-Net segmenter: load the pre-trained weights and switch to inference mode
model = UNet(1, 1, (32, 64, 128, 256)).to(device)
model.load_state_dict(torch.load("weights/og_girafe_unet_full.pt", map_location=device))
model.eval()

# Run the pipeline on one video and print the extracted features
features = extract_features_unet("video.avi", detector, model, device)
print(features)
# {'area_mean': 312.4, 'area_std': 98.1, 'open_quotient': 0.61, 'f0': 0.017, ...}
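
To give a feel for the kind of summary shown above, here is a minimal sketch that derives a few statistics from a per-frame glottal area waveform. The definitions are assumptions for illustration only: `summarize_area_waveform` is a hypothetical helper, and the open quotient is simplified to the fraction of frames whose area exceeds a threshold. The toolkit's own definitions may differ.

```python
import numpy as np

def summarize_area_waveform(area, open_threshold=0.0):
    """Summarize a per-frame glottal area waveform (areas in pixels).

    open_quotient here is a simplification: the fraction of frames in
    which the area is above open_threshold.
    """
    area = np.asarray(area, dtype=float)
    return {
        "area_mean": float(area.mean()),
        "area_std": float(area.std()),
        "open_quotient": float((area > open_threshold).mean()),
    }

# Toy waveform covering two identical open/closed cycles.
area = [0, 0, 10, 20, 10, 0, 0, 10, 20, 10]
summary = summarize_area_waveform(area)
print(summary)
```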

CLI

The toolkit also supports a command-line interface:

openglottal run video.avi \
    --yolo-weights weights/og_girafe_yolo.pt \
    --unet-weights weights/og_girafe_unet_full.pt \
    --pipeline unet \
    --output results/
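
To process several videos, the same command can be driven from Python. The sketch below only reuses the flags shown above; `build_command` and `run_batch` are hypothetical helpers, and whether results for several videos can share one output directory is not stated here, so treat the directory layout as an assumption.

```python
import subprocess
from pathlib import Path

def build_command(video, output_dir="results/"):
    """Build the CLI invocation shown above for a single video."""
    return [
        "openglottal", "run", str(video),
        "--yolo-weights", "weights/og_girafe_yolo.pt",
        "--unet-weights", "weights/og_girafe_unet_full.pt",
        "--pipeline", "unet",
        "--output", str(output_dir),
    ]

def run_batch(video_dir, output_dir="results/"):
    """Run the CLI once per .avi file in video_dir, stopping on the first failure."""
    for video in sorted(Path(video_dir).glob("*.avi")):
        subprocess.run(build_command(video, output_dir), check=True)

# Inspect the command without executing it.
cmd = build_command("video.avi")
print(" ".join(cmd))
```

Calling `run_batch("videos/")` would then execute the command for each video in turn; `check=True` raises `subprocess.CalledProcessError` if any run exits with a non-zero status.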

Citation

@misc{unnikrishnan2026openglottal,
  title         = {A Detection-Gated Pipeline for Robust Glottal Area
                   Waveform Extraction and Clinical Pathology Assessment},
  author        = {Unnikrishnan, Harikrishnan},
  year          = {2026},
  eprint        = {2603.02087},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/2603.02087}
}