---
license: apache-2.0
datasets:
- mozilla-foundation/common_voice_10_0
base_model:
- facebook/wav2vec2-xls-r-300m
tags:
- pytorch
- phoneme-recognition
pipeline_tag: automatic-speech-recognition
---

Model Information
=================

Allophant is a multilingual phoneme recognizer trained on spoken sentences in 34 languages, capable of generalizing zero-shot to unseen phoneme inventories.

The model is based on [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) and was trained on a subset of the [Common Voice Corpus 10.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_10_0) transcribed with [eSpeak NG](https://github.com/espeak-ng/espeak-ng).
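IPA transcriptions of the kind used for training can be produced with the eSpeak NG command-line tool. As a minimal sketch (assuming the `espeak-ng` binary is installed and on your PATH; the `phonemize` helper name is our own, not part of any package):

```python
import subprocess

def phonemize(text, language="en-us"):
    """Return an IPA transcription of `text` via the eSpeak NG CLI.

    Assumes `espeak-ng` is installed and on PATH. `-q` suppresses audio
    output, `--ipa` prints IPA phonemes, and `-v` selects the voice/language.
    """
    result = subprocess.run(
        ["espeak-ng", "-q", "--ipa", "-v", language, text],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Note that the exact phoneme set depends on the eSpeak NG version and voice, so transcriptions may differ slightly from those used during training.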

| Model Name | UCLA Phonetic Corpus (PER) | UCLA Phonetic Corpus (AER) | Common Voice (PER) | Common Voice (AER) |
| ---------------- | ---------: | ---------: | -------: | -------: |
| [Multitask](https://huggingface.co/kgnlp/allophant) | **45.62%** | 19.44% | **34.34%** | **8.36%** |
| [Hierarchical](https://huggingface.co/kgnlp/allophant-hierarchical) | 46.09% | **19.18%** | 34.35% | 8.56% |
| **Multitask Shared** | 46.05% | 19.52% | 41.20% | 8.88% |
| [Baseline Shared](https://huggingface.co/kgnlp/allophant-baseline-shared) | 48.25% | - | 45.35% | - |
| [Baseline](https://huggingface.co/kgnlp/allophant-baseline) | 57.01% | - | 46.95% | - |

PER denotes the phoneme error rate and AER the attribute error rate on each evaluation corpus. Note that our baseline models were trained without phonetic feature classifiers and therefore only support phoneme recognition.
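Error rates like the ones above are standard Levenshtein-based metrics. As a rough illustration (a minimal sketch, not the evaluation code behind these results), a phoneme error rate divides the edit distance between predicted and reference phoneme sequences by the reference length:

```python
def edit_distance(ref, hyp):
    # Classic dynamic-programming Levenshtein distance over token sequences,
    # using a single rolling row of size len(hyp) + 1.
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev_diag, dp[0] = dp[0], i
        for j, h in enumerate(hyp, start=1):
            prev_diag, dp[j] = dp[j], min(
                dp[j] + 1,             # deletion
                dp[j - 1] + 1,         # insertion
                prev_diag + (r != h),  # substitution (free if tokens match)
            )
    return dp[-1]

def phoneme_error_rate(ref, hyp):
    # PER = (substitutions + insertions + deletions) / reference length
    return edit_distance(ref, hyp) / len(ref)

# Hypothetical example: one substitution over four reference phonemes
print(phoneme_error_rate(["h", "ə", "l", "oʊ"], ["h", "ɛ", "l", "oʊ"]))  # 0.25
```

Treating phonemes as whole tokens (rather than characters) matters here, since multi-character symbols like "oʊ" count as a single unit.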

Citation
========

```bibtex
@inproceedings{glocker2023allophant,
  title={Allophant: Cross-lingual Phoneme Recognition with Articulatory Attributes},
  author={Glocker, Kevin and Herygers, Aaricia and Georges, Munir},
  year={2023},
  booktitle={{Proc. Interspeech 2023}},
  month={8}}
```