hynt committed · verified
Commit d7a1e56 · 1 parent: 20b99da

Update README.md

Files changed (1): README.md (+22 −3)
README.md CHANGED
@@ -1,3 +1,15 @@
  # Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition

  **Efficient Conformer [Paper](https://arxiv.org/abs/2109.01163)**
@@ -21,6 +33,13 @@ Install [ctcdecode](https://github.com/parlance/ctcdecode)
  ## Prepare dataset and training pipeline

  Steps:

  - Prepare a dataset folder that includes the data domains you want to train on, for example: ASRDataset/VLSP2020, ASRDataset/VLSP2021. Inside each VLSP2020 folder, there should be corresponding .wav and .txt files.
@@ -47,9 +66,9 @@ tensorboard --logdir callback_path
  ## LibriSpeech Performance

- | Model | Size | Type | Params (M) | gigaspeech_test/vlsp2023_test_pb/vlsp2023_test_pr greedy WER (%) | gigaspeech_test/vlsp2023_test_pb/vlsp2023_test_pr n-gram WER (%) | GPUs |
- | :-------------------: |:--------: |:-----:|:----------:|:------:|:------:|:------:|
- | Efficient Conformer | Small | CTC | 13.4 | 19.61 / 23.06 / 23.17 | 17.86 / 21.11 / 21.42 | 1 x RTX 3090 |

  In the competition organized by VLSP, I used the Efficient Conformer Large architecture with approximately 127 million parameters. You can find the detailed results in the technical report below:
  https://www.overleaf.com/read/nhqjtcpktjyc#3b472e
 
+ ---
+ tags:
+ - speech-to-text
+ - vietnamese
+ - ai-model
+ - deep-learning
+ license: apache-2.0
+ library_name: pytorch
+ model_name: EfficientConformerVietnamese
+ language: vi
+ ---
+
13
  # Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition

  **Efficient Conformer [Paper](https://arxiv.org/abs/2109.01163)**

  ## Prepare dataset and training pipeline

+ Datasets used to train this mini version:
+ - Vivos
+ - Vietbud_500
+ - VLSP2020, VLSP2021, VLSP2022
+ - VietMed_labeled
+ - Google Fleurs
+
  Steps:

  - Prepare a dataset folder that includes the data domains you want to train on, for example: ASRDataset/VLSP2020, ASRDataset/VLSP2021. Inside each VLSP2020 folder, there should be corresponding .wav and .txt files.
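The folder layout described in that step can be sketched with a small helper. This is a minimal illustration, not part of the repo: `collect_pairs` and the pairing-by-filename convention are assumptions based on the README's "corresponding .wav and .txt files" description.

```python
# Minimal sketch (assumed layout): pair each .wav with its same-name .txt
# transcript inside domain folders such as ASRDataset/VLSP2020, ASRDataset/VLSP2021.
from pathlib import Path

def collect_pairs(root):
    """Return (wav_path, transcript) pairs for every domain folder under root."""
    pairs = []
    for wav in sorted(Path(root).rglob("*.wav")):
        txt = wav.with_suffix(".txt")  # transcript shares the basename
        if txt.exists():  # skip utterances without a matching transcript
            pairs.append((str(wav), txt.read_text(encoding="utf-8").strip()))
    return pairs
```

A manifest built this way can then be split per domain (VLSP2020, VLSP2021, ...) before being fed to the training pipeline.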
 
  ## LibriSpeech Performance

+ | Model / Test Dataset | Gigaspeech_test WER (%) (greedy / n-gram beam search) | VLSP2023_pb_test WER (%) (greedy / n-gram beam search) | VLSP2023_pr_test WER (%) (greedy / n-gram beam search) |
+ | :---: | :---: | :---: | :---: |
+ | Efficient-Conformer-Small-CTC | 19.61 / 17.86 | 23.06 / 21.11 | 23.17 / 21.42 |

  In the competition organized by VLSP, I used the Efficient Conformer Large architecture with approximately 127 million parameters. You can find the detailed results in the technical report below:
  https://www.overleaf.com/read/nhqjtcpktjyc#3b472e