A Detection-Gated Pipeline for Robust Glottal Area Waveform Extraction and Clinical Pathology Assessment
Paper • 2603.02087 • Published
How to use hari-krishnan-u/openglottal_yolo with ultralytics:
# Couldn't find a valid YOLO version tag.
# Replace XX with the correct version.
from ultralytics import YOLOvXX
model = YOLOvXX.from_pretrained("hari-krishnan-u/openglottal_yolo")
source = 'http://images.cocodataset.org/val2017/000000039769.jpg'
model.predict(source=source, save=True)
OpenGlottal is an open-source toolkit for automated glottal area segmentation from high-speed videoendoscopy (HSV). It uses a detection-gated pipeline that combines a YOLOv8 glottis localizer with a U-Net segmenter to ensure robust glottal area waveform extraction.
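To make the idea of detection gating concrete, here is a minimal sketch using only NumPy. It is not the toolkit's implementation: the function names `gate_mask` and `area_waveform`, the `(x1, y1, x2, y2)` box format, and the choice to report zero area when no glottis is detected are all assumptions made for illustration.

```python
import numpy as np

def gate_mask(mask, box):
    """Keep segmentation pixels only inside the detected glottis box.

    mask: 2-D boolean array produced by a segmenter.
    box:  (x1, y1, x2, y2) in pixels, or None when nothing was detected.
    (Hypothetical helper; the box format is an assumption.)
    """
    gated = np.zeros_like(mask, dtype=bool)
    if box is None:
        # Assumed gating rule: no detection means no glottal area is reported.
        return gated
    x1, y1, x2, y2 = box
    gated[y1:y2, x1:x2] = mask[y1:y2, x1:x2]
    return gated

def area_waveform(masks, boxes):
    """Per-frame glottal area in pixels after gating."""
    return np.array([int(gate_mask(m, b).sum()) for m, b in zip(masks, boxes)])

# Toy frame: a 9-pixel "glottis" plus one spurious pixel far away from it.
frame = np.zeros((8, 8), dtype=bool)
frame[2:5, 3:6] = True
frame[7, 0] = True

# Frame 1 has a detection box around the glottis; frame 2 has no detection.
print(area_waveform([frame, frame], [(2, 1, 7, 6), None]))  # [9 0]
```

In this sketch the gate does two jobs: it discards segmented pixels outside the detected region, and it suppresses the segmenter's output entirely on frames where the localizer finds nothing.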
You can use the following snippet to extract features from a video using the pre-trained weights:
import torch
from openglottal import TemporalDetector, UNet, extract_features_unet
device = torch.device("cpu") # or "cuda" / "mps"
detector = TemporalDetector("weights/og_girafe_yolo.pt")
model = UNet(1, 1, (32, 64, 128, 256)).to(device)
model.load_state_dict(torch.load("weights/og_girafe_unet_full.pt", map_location=device))
model.eval()
features = extract_features_unet("video.avi", detector, model, device)
print(features)
# {'area_mean': 312.4, 'area_std': 98.1, 'open_quotient': 0.61, 'f0': 0.017, ...}
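The keys in the printed dictionary are summary statistics of the glottal area waveform. As a rough illustration of what such statistics could look like, the sketch below computes three of them from a per-frame area sequence. The function `summarize_waveform`, its `open_threshold` parameter, and the frame-fraction definition of open quotient are assumptions, not necessarily the definitions used by `extract_features_unet`.

```python
import numpy as np

def summarize_waveform(area, open_threshold=0.0):
    """Summary statistics for a per-frame glottal area waveform (pixels).

    open_threshold is a hypothetical parameter: frames whose area exceeds
    it count as "open". Open quotient is taken here as the fraction of
    open frames, which is an assumption made for illustration.
    """
    area = np.asarray(area, dtype=float)
    return {
        "area_mean": float(area.mean()),
        "area_std": float(area.std()),
        "open_quotient": float((area > open_threshold).mean()),
    }

# Synthetic waveform: two cycles, each with 3 open frames.
area = [0, 0, 100, 200, 100, 0, 0, 100, 200, 100]
stats = summarize_waveform(area)
print(stats["area_mean"], stats["open_quotient"])
```

For this synthetic input, 6 of the 10 frames are open and the areas sum to 800 pixels, so the mean is 80 and the open quotient is 0.6.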
The toolkit also supports a command-line interface:
openglottal run video.avi \
--yolo-weights weights/og_girafe_yolo.pt \
--unet-weights weights/og_girafe_unet_full.pt \
--pipeline unet \
--output results/
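For processing several recordings, the same command can be assembled from Python. The sketch below only builds argument lists from the flags shown above; the helper names `build_command` and `build_batch` are hypothetical, and writing each video's results to its own subdirectory is an assumption about how `--output` may be used.

```python
from pathlib import Path

def build_command(video, output_dir,
                  yolo_weights="weights/og_girafe_yolo.pt",
                  unet_weights="weights/og_girafe_unet_full.pt"):
    """Return the argument list for one `openglottal run` invocation."""
    return [
        "openglottal", "run", str(video),
        "--yolo-weights", yolo_weights,
        "--unet-weights", unet_weights,
        "--pipeline", "unet",
        "--output", str(output_dir),
    ]

def build_batch(video_dir, output_root):
    """One command per .avi file, each with its own output subdirectory.

    The per-video subdirectory layout is an assumption for illustration.
    """
    videos = sorted(Path(video_dir).glob("*.avi"))
    return [build_command(v, Path(output_root) / v.stem) for v in videos]

print(" ".join(build_command("video.avi", "results/")))
```

Each list can be passed to `subprocess.run(cmd, check=True)` so that a failing video stops the batch with an error instead of being silently skipped.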
Citation
@misc{unnikrishnan2026openglottal,
title = {A Detection-Gated Pipeline for Robust Glottal Area
Waveform Extraction and Clinical Pathology Assessment},
author = {Unnikrishnan, Harikrishnan},
year = {2026},
eprint = {2603.02087},
archivePrefix = {arXiv},
primaryClass = {cs.CV},
url = {https://arxiv.org/abs/2603.02087}
}
Base model
Ultralytics/YOLOv8