pklumpp committed on
Commit 205121b · verified · 1 Parent(s): 58edfcf

Update README.md

  1. README.md +7 -5
README.md CHANGED
@@ -13,6 +13,10 @@ tags:
  - International Phonetic Alphabet
  - CTC
  - multilingual
+ datasets:
+ - pklumpp/CommonPhoneDataset
+ base_model:
+ - facebook/wav2vec2-large-xlsr-53
  ---
  # Model Card for Wav2Vec2 Large with Common Phone
 
@@ -42,7 +46,7 @@ The model uses 16 kHz audio to predict the most probable sequence of uttered IPA
 
  ### Model Description
 
- This model was created to analyze pathological speech signals. It was optimized with Common Phone, a multilingual corpus for robust acoustic modelling. It comprises more than 11.000 speakers which were carefully selected from Mozilla's Common Voice dataset.
+ This model was created to analyze pathological speech signals. It was optimized with [Common Phone](https://huggingface.co/datasets/pklumpp/CommonPhoneDataset), a multilingual corpus for robust acoustic modelling. It comprises more than 11.000 speakers which were carefully selected from Mozilla's Common Voice dataset.
  Results in terms of phone error rate (PER) in percent:
 
  | Language | Test PER |
@@ -60,7 +64,7 @@ Results in terms of phone error rate (PER) in percent:
  - **Languages:** Multilingual (English, French, German, Italian, Russian, Spanish)
  - **License:** [Creative Commons Zero 1.0 (CC0)](https://creativecommons.org/publicdomain/zero/1.0/deed.en)
  - **Finetuned from model:** [Wav2Vec2 XLSR-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53)
- - **Finetuning dataset:** [Common Phone](https://zenodo.org/records/5846137) as published in [**Common Phone: A Multilingual Dataset for Robust Acoustic Modelling**](http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.81.pdf)
+ - **Finetuning dataset:** [Common Phone](https://huggingface.co/datasets/pklumpp/CommonPhoneDataset) as published in [**Common Phone: A Multilingual Dataset for Robust Acoustic Modelling**](http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.81.pdf)
 
  ### Model Sources [optional]
 
@@ -71,6 +75,4 @@ Results in terms of phone error rate (PER) in percent:
 
  ## Contact
 
- [Philipp Klumpp](mailto:philipp-klumpp@live.de)
-
-
+ [Philipp Klumpp](mailto:philipp-klumpp@live.de)