---
license: mit
tags:
- audio
- speech
- phonology
- wav2vec2
- multilingual
- pytorch-lightning
language:
- en
- es
- de
- cs
pipeline_tag: audio-classification
---

# PhonoQ 2.0 – Multilingual

This repository hosts the **multilingual checkpoint** for **PhonoQ 2.0**, a modernized successor to the original PhonoQ system:
https://github.com/TAriasVergara/PhonoQ

PhonoQ 2.0 outputs **framewise probability distributions** over phonological heads from raw speech audio, built on a self-supervised speech encoder (e.g., wav2vec 2.0 / HuBERT).
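As a rough guide to what "framewise" means here: standard wav2vec 2.0-style convolutional encoders consume 16 kHz audio with a total stride of 320 samples (20 ms) and a receptive field of about 400 samples, so they emit roughly 49 frames per second. These constants are an assumption about the encoder, not a guarantee about PhonoQ 2.0 specifically; the sketch below just illustrates the frame-count arithmetic.

```python
# Approximate frame count for a wav2vec 2.0-style convolutional encoder.
# STRIDE and RECEPTIVE_FIELD match the standard wav2vec 2.0 feature
# extractor; the actual PhonoQ 2.0 encoder may differ.

SAMPLE_RATE = 16_000
STRIDE = 320           # hop between successive encoder frames, in samples (20 ms)
RECEPTIVE_FIELD = 400  # samples consumed to produce the first frame

def num_frames(num_samples: int) -> int:
    """Number of encoder frames produced for a waveform of `num_samples` samples."""
    if num_samples < RECEPTIVE_FIELD:
        return 0
    return (num_samples - RECEPTIVE_FIELD) // STRIDE + 1

# One second of 16 kHz audio yields ~49 frames,
# i.e. ~49 head probability distributions per second.
print(num_frames(SAMPLE_RATE))  # → 49
```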
|
|
|
|
|
## What this model outputs

Given an input audio file, the model produces **framewise head probabilities** for:

- **Manner** (9 classes)
- **Vowel height** (3 classes)
- **Vowel backness** (3 classes)
- **Place of articulation** (5 classes)
- **Voicing** (2 classes)

Outputs are aligned to the encoder frame rate and returned as probabilities (not hard labels).
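The head inventory and class counts above imply one `(T, C)` probability matrix per head, where `T` is the number of encoder frames and each row is a distribution over that head's classes. The sketch below illustrates that layout with random logits and a per-head softmax; the dictionary layout and names are an assumption for illustration, not the model's actual API.

```python
import numpy as np

# Class counts per phonological head, as listed above.
HEADS = {
    "manner": 9,
    "vowel_height": 3,
    "vowel_backness": 3,
    "place": 5,
    "voicing": 2,
}

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Stand-in for model output: one logit matrix of shape (T, C) per head.
rng = np.random.default_rng(0)
T = 49  # e.g. ~1 s of audio at the encoder frame rate
logits = {name: rng.normal(size=(T, c)) for name, c in HEADS.items()}

# Framewise probabilities: each row sums to 1 within its head.
probs = {name: softmax(l) for name, l in logits.items()}

for name, p in probs.items():
    assert p.shape == (T, HEADS[name])
    assert np.allclose(p.sum(axis=1), 1.0)
```

Hard labels, if needed, can be recovered per frame with `argmax` over the class axis, but the model itself returns the probabilities.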
|
|
|
|
|
## How to use

This checkpoint is intended to be used with the PhonoQ 2.0 inference code:
https://github.com/abnerLing/PhonoQ-2.0

### 1) Install PhonoQ 2.0 (from GitHub)

Follow the installation instructions in the GitHub repository (PyTorch is required).
|
|
|
|
|
### 2) Download this checkpoint

```bash
wget https://huggingface.co/abnerh/phonoq-2.0-multilingual/resolve/main/best.ckpt
```