added updates

Files changed (5)

README.md +75 -0
VALID_rundi_run_audio_data.csv +6 -0
afrospeech-wav2vec-run_METRICS_VALID.json +1 -0
afrospeech-wav2vec-run_confusion_matrix_VALID.png +0 -0
digits-bar-plot-for-afrospeech-wav2vec-run.png +0 -0

README.md ADDED

	@@ -0,0 +1,75 @@

+---
+license: apache-2.0
+tags:
+- afro-digits-speech
+datasets:
+- crowd-speech-africa
+metrics:
+- accuracy
+model-index:
+- name: afrospeech-wav2vec-run
+  results:
+  - task:
+      name: Audio Classification
+      type: audio-classification
+    dataset:
+      name: Afro Speech
+      type: chrisjay/crowd-speech-africa
+      args: no
+    metrics:
+      - name: Validation Accuracy
+        type: accuracy
+        value: 0.8
+---
+# afrospeech-wav2vec-run
+This model is a fine-tuned version of [facebook/wav2vec2-base](https://huggingface.co/facebook/wav2vec2-base) on the [crowd-speech-africa](https://huggingface.co/datasets/chrisjay/crowd-speech-africa) dataset, which was crowd-sourced through the [afro-speech Space](https://huggingface.co/spaces/chrisjay/afro-speech). It achieves the following results on the [validation set](VALID_rundi_run_audio_data.csv):
+- F1: 0.8
+- Accuracy: 0.8
+The confusion matrix below breaks the model's performance down by digit, making the per-digit precision and recall visible.
+![confusion matrix](afrospeech-wav2vec-run_confusion_matrix_VALID.png)
+## Training and evaluation data
+The model was trained on mixed audio data in Rundi (`run`).
+- Size of training set: 16
+- Size of validation set: 5
+Below is the distribution of the dataset (training and validation):
+![digits-bar-plot-for-afrospeech](digits-bar-plot-for-afrospeech-wav2vec-run.png)
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 64
+- eval_batch_size: 64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- num_epochs: 150
+### Training results
+| Training Loss | Epoch | Validation Accuracy |
+|:-------------:|:-----:|:-------------------:|
+| 0.00183       | 1     | 0.6                 |
+| 0.0003991     | 50    | 0.8                 |
+| 0.0002174     | 100   | 0.6                 |
+| 0.0043911     | 150   | 0.4                 |
+### Framework versions
+- Transformers 4.21.3
+- PyTorch 1.12.0
+- Datasets 1.14.0
+- Tokenizers 0.12.1
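
With 16 training clips and a `train_batch_size` of 64, the whole training set fits in a single batch. The sketch below works out the resulting number of optimizer steps. It assumes no gradient accumulation and that a partial final batch is kept; the README states neither, so treat the result as an estimate.

```python
import math

# Values taken from the README hyperparameters and dataset sizes.
train_size = 16
train_batch_size = 64
num_epochs = 150

# Assumption: no gradient accumulation and partial batches are not dropped.
steps_per_epoch = math.ceil(train_size / train_batch_size)
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, total_steps)
```

Under those assumptions every epoch is one full-batch update, so the epoch column in the training results table would also be the step count.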

VALID_rundi_run_audio_data.csv ADDED

	@@ -0,0 +1,6 @@

+audio_path,transcript,lang,lang_code,gender,age,country,accent
+AUDIO_HOMEPATH/data/p34rQlhMwuOkmqhltKEd8lYW2xYPN3do/audio.wav,4,rundi,run,Male,29.0,France,
+AUDIO_HOMEPATH/data/tkyZJyB9IS2PJ8HNgqOybUSdnrTl48up/audio.wav,5,rundi,run,Male,29.0,France,
+AUDIO_HOMEPATH/data/tF3MBOD3L6tHfmRYoO8u1kdwgUlaMPZr/audio.wav,9,rundi,run,Male,29.0,France,
+AUDIO_HOMEPATH/data/Qlqz7bCCQUJuZYiTDgLxFhElL4AncAH7/audio.wav,7,rundi,run,Male,29.0,France,
+AUDIO_HOMEPATH/data/EsElOIcf8DSiOTn1uoihsYiVKjDWXwKH/audio.wav,8,rundi,run,Male,25.0,France,
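
A quick way to see what this validation file covers is to tally its columns. The sketch below uses only the standard library and embeds the five rows from the diff as a string, so it runs without the audio files; reading from the actual file path would work the same way.

```python
import csv
import io
from collections import Counter

# Rows copied verbatim from VALID_rundi_run_audio_data.csv in this commit.
VALID_CSV = """audio_path,transcript,lang,lang_code,gender,age,country,accent
AUDIO_HOMEPATH/data/p34rQlhMwuOkmqhltKEd8lYW2xYPN3do/audio.wav,4,rundi,run,Male,29.0,France,
AUDIO_HOMEPATH/data/tkyZJyB9IS2PJ8HNgqOybUSdnrTl48up/audio.wav,5,rundi,run,Male,29.0,France,
AUDIO_HOMEPATH/data/tF3MBOD3L6tHfmRYoO8u1kdwgUlaMPZr/audio.wav,9,rundi,run,Male,29.0,France,
AUDIO_HOMEPATH/data/Qlqz7bCCQUJuZYiTDgLxFhElL4AncAH7/audio.wav,7,rundi,run,Male,29.0,France,
AUDIO_HOMEPATH/data/EsElOIcf8DSiOTn1uoihsYiVKjDWXwKH/audio.wav,8,rundi,run,Male,25.0,France,
"""

rows = list(csv.DictReader(io.StringIO(VALID_CSV)))

# The transcript column holds the spoken digit as a string.
digit_counts = Counter(row["transcript"] for row in rows)
genders = Counter(row["gender"] for row in rows)
ages = sorted({float(row["age"]) for row in rows})

print(len(rows))
print(sorted(digit_counts))
print(dict(genders))
print(ages)
```

The tally shows five clips, one each for the digits 4, 5, 7, 8 and 9, all from male speakers, so the reported validation metrics say nothing about the digits that do not appear in this file.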

afrospeech-wav2vec-run_METRICS_VALID.json ADDED

	@@ -0,0 +1 @@


+{"acc": 0.8, "f1": 0.8}
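
Because the validation set has only five clips, each accuracy value corresponds to a whole number of correct predictions. The sketch below converts the reported accuracy, and the per-checkpoint accuracies from the training results table in README.md, into counts. It assumes accuracy is plain correct/total over the five validation clips, which the README implies but does not state.

```python
import json

# Contents of afrospeech-wav2vec-run_METRICS_VALID.json from this commit.
metrics = json.loads('{"acc": 0.8, "f1": 0.8}')
valid_size = 5  # "Size of validation set" from README.md

# Assumption: accuracy = correct / total over the validation clips.
n_correct = round(metrics["acc"] * valid_size)
n_wrong = valid_size - n_correct

# One clip changes accuracy by this much.
granularity = 1 / valid_size

# Validation accuracies at epochs 1, 50, 100, 150 (training results table).
checkpoint_acc = [0.6, 0.8, 0.6, 0.4]
checkpoint_correct = [round(acc * valid_size) for acc in checkpoint_acc]

print(n_correct, n_wrong, granularity)
print(checkpoint_correct)
```

Read this way, the swing from 0.8 at epoch 50 to 0.4 at epoch 150 amounts to two clips, and the reported 0.8 matches the epoch-50 row rather than the final epoch.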

afrospeech-wav2vec-run_confusion_matrix_VALID.png ADDED

digits-bar-plot-for-afrospeech-wav2vec-run.png ADDED