PhonoQ 2.0 – Multilingual

This repository hosts the multilingual checkpoint for PhonoQ 2.0, a modernized successor to the original PhonoQ system: https://github.com/TAriasVergara/PhonoQ

PhonoQ 2.0 predicts framewise probability distributions over phonological heads directly from raw speech audio. It is built on a self-supervised speech encoder (e.g., wav2vec 2.0 / HuBERT).

What this model outputs

Given an input audio file, the model produces framewise head probabilities for:

  • Manner (9 classes)
  • Vowel height (3 classes)
  • Vowel backness (3 classes)
  • Place of articulation (5 classes)
  • Voicing (2 classes)

Outputs are aligned to the encoder frame rate and returned as probabilities (not hard labels).
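As a sketch of how these framewise outputs might be consumed downstream, the snippet below builds per-head probability matrices with the class counts listed above and converts them to hard labels and timestamps. The dictionary layout and the 20 ms frame stride are assumptions based on typical wav2vec 2.0 / HuBERT encoders, not something this card specifies.

```python
# Hedged sketch: consuming framewise head probabilities.
# The dict layout and 20 ms frame stride are assumptions, not
# documented properties of PhonoQ 2.0.
import numpy as np

rng = np.random.default_rng(0)
num_frames = 100  # ~2 s of audio at an assumed 20 ms stride

# Class counts per head, taken from the list above
heads = {
    "manner": 9,
    "vowel_height": 3,
    "vowel_backness": 3,
    "place": 5,
    "voicing": 2,
}

# Stand-in for the model output: one (num_frames, num_classes)
# probability matrix per head, rows summing to 1
probs = {}
for name, n_classes in heads.items():
    logits = rng.normal(size=(num_frames, n_classes))
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs[name] = e / e.sum(axis=1, keepdims=True)

# Hard labels per frame, if a downstream task needs them
labels = {name: p.argmax(axis=1) for name, p in probs.items()}

# Frame index -> time in seconds (assumed 20 ms stride)
times = np.arange(num_frames) * 0.02
```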

How to use

This checkpoint is intended to be used with the PhonoQ 2.0 inference code: https://github.com/abnerLing/PhonoQ-2.0

1) Install PhonoQ 2.0 (from GitHub)

Follow the installation instructions in the GitHub repository (PyTorch is required).

2) Download this checkpoint

wget https://huggingface.co/abnerh/phonoq-2.0-multilingual/resolve/main/best.ckpt
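Once downloaded, the checkpoint can be inspected with PyTorch before running the PhonoQ 2.0 inference code. A minimal sketch (the file name follows the wget command above; the keys inside the checkpoint are not documented on this card, so the snippet only prints whatever it finds):

```python
# Hedged sketch: inspect the downloaded checkpoint.
# Assumes PyTorch is installed and best.ckpt is in the working directory.
import os

def load_checkpoint(path="best.ckpt"):
    """Return the checkpoint dict if the file exists, else None."""
    if not os.path.exists(path):
        return None
    import torch  # deferred so the helper is importable without torch
    # map_location="cpu" avoids needing a GPU just to inspect weights
    return torch.load(path, map_location="cpu")

ckpt = load_checkpoint()
if ckpt is None:
    print("best.ckpt not found; run the wget command above first")
else:
    print(sorted(ckpt.keys()))
```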