Update README.md
README.md
CHANGED
@@ -13,6 +13,10 @@ tags:
 - International Phonetic Alphabet
 - CTC
 - multilingual
+datasets:
+- pklumpp/CommonPhoneDataset
+base_model:
+- facebook/wav2vec2-large-xlsr-53
 ---
 # Model Card for Wav2Vec2 Large with Common Phone
 
@@ -42,7 +46,7 @@ The model uses 16 kHz audio to predict the most probable sequence of uttered IPA
 
 ### Model Description
 
-This model was created to analyze pathological speech signals. It was optimized with Common Phone, a multilingual corpus for robust acoustic modelling. It comprises more than 11.000 speakers which were carefully selected from Mozilla's Common Voice dataset.
+This model was created to analyze pathological speech signals. It was optimized with [Common Phone](https://huggingface.co/datasets/pklumpp/CommonPhoneDataset), a multilingual corpus for robust acoustic modelling. It comprises more than 11.000 speakers which were carefully selected from Mozilla's Common Voice dataset.
 Results in terms of phone error rate (PER) in percent:
 
 | Language | Test PER |
@@ -60,7 +64,7 @@ Results in terms of phone error rate (PER) in percent:
 - **Languages:** Multilingual (English, French, German, Italian, Russian, Spanish)
 - **License:** [Creative Commons Zero 1.0 (CC0)](https://creativecommons.org/publicdomain/zero/1.0/deed.en)
 - **Finetuned from model:** [Wav2Vec2 XLSR-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53)
-- **Finetuning dataset:** [Common Phone](https://
+- **Finetuning dataset:** [Common Phone](https://huggingface.co/datasets/pklumpp/CommonPhoneDataset) as published in [**Common Phone: A Multilingual Dataset for Robust Acoustic Modelling**](http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.81.pdf)
 
 ### Model Sources [optional]
 
@@ -71,6 +75,4 @@ Results in terms of phone error rate (PER) in percent:
 
 ## Contact
 
-[Philipp Klumpp](mailto:philipp-klumpp@live.de)
-
-
+[Philipp Klumpp](mailto:philipp-klumpp@live.de)