---
license: mit
tags:
  - audio
  - speech
  - phonology
  - wav2vec2
  - multilingual
  - pytorch-lightning
language:
  - en
  - es
  - de
  - cs
pipeline_tag: audio-classification
---

# PhonoQ 2.0 – Multilingual

This repository hosts the multilingual checkpoint for PhonoQ 2.0, a modernized successor to the original [PhonoQ](https://github.com/TAriasVergara/PhonoQ) system.

PhonoQ 2.0 takes raw speech audio and outputs framewise probability distributions over phonological heads. It is built on a self-supervised speech encoder (e.g., wav2vec 2.0 / HuBERT).

## What this model outputs

Given an input audio file, the model produces framewise head probabilities for:

- Manner (9 classes)
- Vowel height (3 classes)
- Vowel backness (3 classes)
- Place of articulation (5 classes)
- Voicing (2 classes)

Outputs are aligned to the encoder frame rate and returned as probabilities (not hard labels).
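
Concretely, each head is a (num_frames × num_classes) matrix of probabilities. The sketch below is illustrative only: the tensor names, shapes, and the 20 ms frame stride are assumptions based on typical wav2vec 2.0 setups, not this repo's actual API. It shows how such an output could be turned into timestamped hard labels:

```python
import torch

# Illustrative only: `probs` stands in for one head's framewise output,
# shape (num_frames, num_classes); the real tensors come from the
# PhonoQ 2.0 inference code.
num_frames, num_classes = 250, 9                 # e.g. the 9-class manner head
probs = torch.softmax(torch.randn(num_frames, num_classes), dim=-1)

frame_stride_s = 0.02   # assumed ~20 ms stride, typical for wav2vec 2.0 encoders
labels = probs.argmax(dim=-1)                    # hard label per frame
confidences = probs.max(dim=-1).values          # probability of the chosen class
for i, (lab, p) in enumerate(zip(labels.tolist(), confidences.tolist())):
    print(f"{i * frame_stride_s:6.2f}s  class={lab}  p={p:.2f}")
```

Check the actual frame rate of the encoder used by this checkpoint before converting frame indices to times.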

## How to use

This checkpoint is intended to be used with the [PhonoQ 2.0 inference code](https://github.com/abnerLing/PhonoQ-2.0).

### 1) Install PhonoQ 2.0 (from GitHub)

Follow the installation instructions in the GitHub repository (PyTorch is required).

### 2) Download this checkpoint

```bash
wget https://huggingface.co/abnerh/phonoq-2.0-multilingual/resolve/main/best.ckpt
```
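
Once downloaded, you can sanity-check the file with plain PyTorch before wiring it into the inference code. This is a minimal sketch: PyTorch Lightning checkpoints are ordinary Python dicts, but the exact keys depend on the PhonoQ 2.0 training code.

```python
import torch

# Load the checkpoint on CPU just to inspect it. `weights_only=False` is
# needed when a Lightning checkpoint pickles extra objects; only do this
# with files you trust.
ckpt = torch.load("best.ckpt", map_location="cpu", weights_only=False)
print(sorted(ckpt.keys()))               # usually includes 'state_dict'
print(len(ckpt.get("state_dict", {})))   # number of parameter tensors
```

For actual inference (audio loading, the encoder forward pass, and head decoding), use the scripts provided in the PhonoQ 2.0 repository.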