Update README.md
README.md
CHANGED
@@ -17,41 +17,44 @@ ConFit is a pioneering organisation dedicated to advancing the fields of speech
 
 Audio classification:
 
-| Dataset | Classes | Task |
-| :---: | :---: | :---: | :---: | :---: |
-| WMMS | 32 | Multi-class | 1697 | 10.42 |
-| MSWC (English) | 271 | Multi-class | 33726 | 0.99 |
-| MSWC (Spanish) | 146 | Multi-class | 11759 | 0.99 |
-| MSWC (Indian) | 14 | Multi-class | 739 | 0.99 |
-| ESC50 | 50 | Multi-class | 2000 | 5.00 |
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
-|
+| Dataset | Split Method | Classes | Task | # Clips | Average Duration | Sampling Rate |
+| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+| WMMS | TT | 32 | Multi-class | 1697 | 10.42 | 16000 |
+| MSWC (English) | TVT | 271 | Multi-class | 33726 | 0.99 | 16000 |
+| MSWC (Spanish) | TVT | 146 | Multi-class | 11759 | 0.99 | 16000 |
+| MSWC (Indian) | TVT | 14 | Multi-class | 739 | 0.99 | 16000 |
+| ESC50 | 5-fold | 50 | Multi-class | 2000 | 5.00 | 44100 |
+| UrbanSound8K | | 10 | Multi-class | | | |
+| AudioSet | | 527 | Multi-label | | | |
+| MagnaTagATune | | | Multi-label | | | |
+| Medley-solos-DB | | 8 | Multi-class | | | 44100 |
+| Pianos | TVT | 8 | Multi-class | 668 | 4.86 | 16000 |
+| FSD-Kaggle-2019 (curated) | TT | 80 | Multi-label | 9451 | 8.93 | 44100 |
+| GTZAN | TVT | 10 | Multi-class | 930 | 30.02 | 22050 |
+| Nsynth (instrument) | TVT | 11 | Multi-class | 305979 | 4.00 | 16000 |
+| Nsynth (pitch) | TVT | 112 | Multi-class | 305979 | 4.00 | 16000 |
+| CREMA-D | TVT | 6 | Multi-class | 7442 | 2.54 | 16000 |
+| IEMOCAP | 5-fold | 4 | Multi-class | 5531 | 4.52 | 16000 |
+| EmoDB | TT | 7 | Multi-class | 535 | 2.77 | 16000 |
+| EMOVO | 6-fold | 7 | Multi-class | 588 | 3.12 | 48000 |
+| IRMAS | TT | 11 | Multi-label | 9579 | 7.16 | 44100 |
+| RAVDESS | 5-fold | 8 | Multi-class | 2880 | 3.70 | 48000 |
+| TIMIT | TVT | 630 | Multi-class | 6300 | 3.07 | 16000 |
+| LibriSpeech | TT | 2484 | Multi-class | 21933 | 3.75 | 16000 |
 
 Automated audio captioning:
 
-| Dataset |
-| :---: | :---: | :---: |
-| Music4All | | |
+| Dataset | # Clips | Average Duration | Sampling Rate |
+| :---: | :---: | :---: | :---: |
+| Music4All | | | |
 
 Music, speech, and noise:
 
-| Dataset |
-| :---: | :---: | :---: |
-| MUSAN | | |
-| RIR-Noise | | |
-| ARCA23K | | |
+| Dataset | # Clips | Average Duration | Sampling Rate |
+| :---: | :---: | :---: | :---: |
+| MUSAN | 2016 | 195.16 | 16000 |
+| RIR-Noise | 61260 | 1.54 | 16000 |
+| ARCA23K | | | |
 
 ## Contact Us
 
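The `# Clips` and `Average Duration` columns in the tables above can be combined to estimate how much audio each dataset holds. The sketch below is illustrative only: it assumes `Average Duration` is given in seconds (the tables do not state a unit), and `DATASETS` and `total_hours` are hypothetical names that are not part of the README.

```python
# Rough size estimate per dataset: clips * mean clip length.
# Assumption: "Average Duration" is in seconds (the README gives no unit).

# (number of clips, average duration) pairs copied from the tables above.
DATASETS = {
    "WMMS": (1697, 10.42),
    "ESC50": (2000, 5.00),
    "GTZAN": (930, 30.02),
    "MUSAN": (2016, 195.16),
}


def total_hours(num_clips: int, avg_duration_s: float) -> float:
    """Approximate total audio in hours from a clip count and mean clip length."""
    return num_clips * avg_duration_s / 3600.0


if __name__ == "__main__":
    for name, (clips, avg) in DATASETS.items():
        print(f"{name}: {total_hours(clips, avg):.2f} h")
```

Under the seconds assumption, the MUSAN row works out to roughly 109 hours, which appears consistent with the size usually quoted for that corpus. Rows with empty cells are left out because the product is undefined without both values.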