---
license: mit
tags:
- audio
- speech
- phonology
- wav2vec2
- multilingual
- pytorch-lightning
language:
- en
- es
- de
- cs
pipeline_tag: audio-classification
---
# PhonoQ 2.0 – Multilingual
This repository hosts the **multilingual checkpoint** for **PhonoQ 2.0**, a modernized successor to the original PhonoQ system:
https://github.com/TAriasVergara/PhonoQ

PhonoQ 2.0 takes raw speech audio and outputs **framewise probability distributions** for each of its phonological feature heads, built on a self-supervised speech encoder (e.g., wav2vec 2.0 / HuBERT).
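
Encoders of this family typically operate on 16 kHz mono audio. Below is a minimal preprocessing sketch, assuming `torchaudio` is available; the PhonoQ 2.0 inference scripts may handle this step internally, so treat it as illustrative rather than required:
```python
import torch
import torchaudio

def load_audio_16k(path: str) -> torch.Tensor:
    """Load an audio file as 16 kHz mono, the input format typically
    expected by wav2vec 2.0 / HuBERT-style encoders."""
    waveform, sample_rate = torchaudio.load(path)  # (channels, samples)
    if waveform.size(0) > 1:
        waveform = waveform.mean(dim=0, keepdim=True)  # downmix to mono
    if sample_rate != 16_000:
        waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)
    return waveform  # shape: (1, samples)
```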
## What this model outputs
Given an input audio file, the model produces **framewise head probabilities** for:
- **Manner** (9 classes)
- **Vowel height** (3 classes)
- **Vowel backness** (3 classes)
- **Place of articulation** (5 classes)
- **Voicing** (2 classes)
Outputs are aligned to the encoder frame rate and returned as probabilities (not hard labels).
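
The exact output container depends on the PhonoQ 2.0 inference code, but conceptually each head yields a `(frames, classes)` probability matrix, with one frame roughly every 20 ms for wav2vec 2.0-style encoders. A hypothetical sketch of collapsing those probabilities into framewise hard labels (the dictionary keys and shapes here are illustrative, not the actual API):
```python
import torch

# Illustrative stand-in for the model output: one (frames, classes)
# probability tensor per phonological head, here for roughly 5 s of audio.
head_probs = {
    "manner":   torch.rand(250, 9).softmax(dim=-1),  # 9 manner classes
    "height":   torch.rand(250, 3).softmax(dim=-1),  # 3 vowel-height classes
    "backness": torch.rand(250, 3).softmax(dim=-1),  # 3 vowel-backness classes
    "place":    torch.rand(250, 5).softmax(dim=-1),  # 5 place-of-articulation classes
    "voicing":  torch.rand(250, 2).softmax(dim=-1),  # 2 voicing classes
}

# Collapse to hard labels only if your application needs them; the model
# itself returns probabilities aligned to the encoder frame rate.
hard_labels = {head: probs.argmax(dim=-1) for head, probs in head_probs.items()}
print({head: labels.shape for head, labels in hard_labels.items()})
```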
## How to use
This checkpoint is intended to be used with the PhonoQ 2.0 inference code:
https://github.com/abnerLing/PhonoQ-2.0
### 1) Install PhonoQ 2.0 (from GitHub)
Follow the installation instructions in the GitHub repository (PyTorch is required).
### 2) Download this checkpoint
```bash
wget https://huggingface.co/abnerh/phonoq-2.0-multilingual/resolve/main/best.ckpt
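# Alternatively, the checkpoint can be fetched with the Hugging Face CLI
# (assumption: requires `pip install huggingface_hub`; the wget command above
# is the documented route):
# huggingface-cli download abnerh/phonoq-2.0-multilingual best.ckpt --local-dir .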