---
license: mit
tags:
- audio
- speech
- phonology
- wav2vec2
- multilingual
- pytorch-lightning
language:
- en
- es
- de
- cs
pipeline_tag: audio-classification
---

# PhonoQ 2.0 – Multilingual

This repository hosts the **multilingual checkpoint** for **PhonoQ 2.0**, a modernized successor to the original PhonoQ system:
https://github.com/TAriasVergara/PhonoQ

PhonoQ 2.0 outputs **framewise probability distributions** over phonological heads from raw speech audio, built on a self-supervised speech encoder (e.g., wav2vec 2.0 / HuBERT).
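As a rough guide to what "framewise" means here: standard wav2vec 2.0-style convolutional encoders consume 16 kHz audio with a total stride of 320 samples (20 ms) and a receptive field of about 400 samples, so they emit roughly 49 frames per second. These constants are an assumption about the encoder, not a guarantee about PhonoQ 2.0 specifically; the sketch below just illustrates the frame-count arithmetic.

```python
# Approximate frame count for a wav2vec 2.0-style convolutional encoder.
# STRIDE and RECEPTIVE_FIELD match the standard wav2vec 2.0 feature
# extractor; the actual PhonoQ 2.0 encoder may differ.

SAMPLE_RATE = 16_000
STRIDE = 320           # hop between successive encoder frames, in samples (20 ms)
RECEPTIVE_FIELD = 400  # samples consumed to produce the first frame

def num_frames(num_samples: int) -> int:
    """Number of encoder frames produced for a waveform of `num_samples` samples."""
    if num_samples < RECEPTIVE_FIELD:
        return 0
    return (num_samples - RECEPTIVE_FIELD) // STRIDE + 1

# One second of 16 kHz audio yields ~49 frames,
# i.e. ~49 head probability distributions per second.
print(num_frames(SAMPLE_RATE))  # → 49
```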
|
|
|
|
|
## What this model outputs

Given an input audio file, the model produces **framewise head probabilities** for:

- **Manner** (9 classes)
- **Vowel height** (3 classes)
- **Vowel backness** (3 classes)
- **Place of articulation** (5 classes)
- **Voicing** (2 classes)

Outputs are aligned to the encoder frame rate and returned as probabilities (not hard labels).
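The head inventory and class counts above imply one `(T, C)` probability matrix per head, where `T` is the number of encoder frames and each row is a distribution over that head's classes. The sketch below illustrates that layout with random logits and a per-head softmax; the dictionary layout and names are an assumption for illustration, not the model's actual API.

```python
import numpy as np

# Class counts per phonological head, as listed above.
HEADS = {
    "manner": 9,
    "vowel_height": 3,
    "vowel_backness": 3,
    "place": 5,
    "voicing": 2,
}

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Stand-in for model output: one logit matrix of shape (T, C) per head.
rng = np.random.default_rng(0)
T = 49  # e.g. ~1 s of audio at the encoder frame rate
logits = {name: rng.normal(size=(T, c)) for name, c in HEADS.items()}

# Framewise probabilities: each row sums to 1 within its head.
probs = {name: softmax(l) for name, l in logits.items()}

for name, p in probs.items():
    assert p.shape == (T, HEADS[name])
    assert np.allclose(p.sum(axis=1), 1.0)
```

Hard labels, if needed, can be recovered per frame with `argmax` over the class axis, but the model itself returns the probabilities.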
|
|
|
|
|
## How to use

This checkpoint is intended to be used with the PhonoQ 2.0 inference code:
https://github.com/abnerLing/PhonoQ-2.0

### 1) Install PhonoQ 2.0 (from GitHub)

Follow the installation instructions in the GitHub repository (PyTorch is required).
|
|
|
|
|
### 2) Download this checkpoint

```bash
wget https://huggingface.co/abnerh/phonoq-2.0-multilingual/resolve/main/best.ckpt
```