Update README.md
README.md
CHANGED
|
@@ -1,3 +1,91 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
language:
|
| 4 |
+
- uk
|
| 5 |
+
pipeline_tag: automatic-speech-recognition
|
| 6 |
+
---
|
| 7 |
+
|
| 8 |
+
# Flashlight for Ukrainian
|
| 9 |
+
|
| 10 |
+
## Community
|
| 11 |
+
|
| 12 |
+
- Discord: https://bit.ly/discord-uds
|
| 13 |
+
- Speech Recognition: https://t.me/speech_recognition_uk
|
| 14 |
+
- Speech Synthesis: https://t.me/speech_synthesis_uk
|
| 15 |
+
|
| 16 |
+
See other Ukrainian models: https://github.com/egorsmkv/speech-recognition-uk
|
| 17 |
+
|
| 18 |
+
## Overview
|
| 19 |
+
|
| 20 |
+
This repository contains an acoustic model for Ukrainian trained with the Flashlight framework: https://github.com/flashlight/flashlight/tree/main/flashlight/app/asr
|
| 21 |
+
|
| 22 |
+
- Architecture: Conformer (30M parameters)
|
| 23 |
+
- Training data: Common Voice 10 and Voice of America
|
| 24 |
+
- Epochs trained: 410
|
| 25 |
+
- Training time: about a week on an RTX A4000
|
| 26 |
+
|
| 27 |
+
## Quality
|
| 28 |
+
|
| 29 |
+
- WER: 9.0777% (i.e., a word accuracy of about 90.92%)
|
| 30 |
+
- TER: 1.9839%
|
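The accuracy figure quoted next to the WER can be reproduced with a one-line calculation. This is a minimal sketch that assumes "quality" simply means 100% minus WER; the variable names are illustrative.

```bash
# Assumption: word accuracy = 100% - WER.
wer=9.0777
accuracy=$(LC_ALL=C awk -v wer="$wer" 'BEGIN { printf "%.2f", 100 - wer }')
echo "word accuracy: ${accuracy}%"
```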
| 31 |
+
|
| 32 |
+
## Download
|
| 33 |
+
|
| 34 |
+
All files are here: https://github.com/egorsmkv/flashlight-ukrainian/releases/tag/v1.0
|
| 35 |
+
|
| 36 |
+
## How to test?
|
| 37 |
+
|
| 38 |
+
### Run a container with Flashlight on CPU
|
| 39 |
+
|
| 40 |
+
```bash
|
| 41 |
+
docker-compose up
|
| 42 |
+
|
| 43 |
+
# and in another terminal
|
| 44 |
+
docker exec -it flashlight_cpu bash
|
| 45 |
+
```
|
| 46 |
+
|
| 47 |
+
### Run
|
| 48 |
+
|
| 49 |
+
With the acoustic model (AM) only:
|
| 50 |
+
|
| 51 |
+
```bash
|
| 52 |
+
/root/flashlight/build/bin/asr/fl_asr_test --am /models/uk_am.bin --datadir '' --emission_dir '' --uselexicon false \
|
| 53 |
+
--test /data/rows.lst --tokens /models/tokens.txt --lexicon /models/lexicon.txt --show
|
| 54 |
+
```
|
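The `--test` flag above points at a list file. As far as the Flashlight ASR documentation describes it, each line holds a sample ID, an audio path, the duration in milliseconds, and the transcript, separated by whitespace. The sketch below prints one such line; the helper name `make_lst_line` and the sample values are hypothetical, and the field order should be checked against the Flashlight docs before use.

```bash
# Hypothetical helper: print one list-file line as
#   <sample id> <audio path> <duration in ms> <transcript>
# The field order is an assumption based on the Flashlight ASR docs.
make_lst_line() {
  printf '%s %s %s %s\n' "$1" "$2" "$3" "$4"
}

# Example with made-up values; redirect the output into your own .lst file.
make_lst_line utt_0001 /data/audio/utt_0001.wav 3250 "привіт світ"
```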
| 55 |
+
|
| 56 |
+
With a language model (LM):
|
| 57 |
+
|
| 58 |
+
```bash
|
| 59 |
+
/root/flashlight/build/bin/asr/fl_asr_decode \
|
| 60 |
+
--am=/models/uk_am.bin \
|
| 61 |
+
--test=/data/labels_absolute.lst \
|
| 62 |
+
--maxload=3477 \
|
| 63 |
+
--nthread_decoder=2 \
|
| 64 |
+
--show \
|
| 65 |
+
--showletters \
|
| 66 |
+
--lexicon=/models/lexicon.txt \
|
| 67 |
+
--uselexicon=false \
|
| 68 |
+
--lm=/models/lm_4gram_500k.binary \
|
| 69 |
+
--lmtype=kenlm \
|
| 70 |
+
--decodertype=wrd \
|
| 71 |
+
--beamsize=200 \
|
| 72 |
+
--beamsizetoken=200 \
|
| 73 |
+
--beamthreshold=20 \
|
| 74 |
+
--lmweight=0.75 \
|
| 75 |
+
--wordscore=0 \
|
| 76 |
+
--eosscore=0 \
|
| 77 |
+
--silscore=0 \
|
| 78 |
+
--unkscore=0 \
|
| 79 |
+
--smearing=max
|
| 80 |
+
```
|
| 81 |
+
|
| 82 |
+
- **labels_absolute.lst** is from https://github.com/egorsmkv/cv10-uk-testset-clean
|
| 83 |
+
- **lm_4gram_500k.binary** is from https://huggingface.co/Yehor/kenlm-ukrainian/tree/main/news/lm-4gram-500k
|
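The long decoder command line can also be kept in a flags file, one flag per line, in the same way the training command in this README takes `--flagsfile`. This is a sketch: the file name `decode.flags` is hypothetical, and it assumes `fl_asr_decode` accepts `--flagsfile` as other gflags-based tools do.

```bash
# Write the decoder flags from the command above, one per line.
# The file name decode.flags is a hypothetical choice.
flags_file="${TMPDIR:-/tmp}/decode.flags"
cat > "$flags_file" <<'EOF'
--am=/models/uk_am.bin
--test=/data/labels_absolute.lst
--maxload=3477
--nthread_decoder=2
--show
--showletters
--lexicon=/models/lexicon.txt
--uselexicon=false
--lm=/models/lm_4gram_500k.binary
--lmtype=kenlm
--decodertype=wrd
--beamsize=200
--beamsizetoken=200
--beamthreshold=20
--lmweight=0.75
--wordscore=0
--eosscore=0
--silscore=0
--unkscore=0
--smearing=max
EOF

# Inside the container, the decoder could then be started with
# (assumption: --flagsfile is honoured by fl_asr_decode):
#   /root/flashlight/build/bin/asr/fl_asr_decode --flagsfile="$flags_file"
```

Keeping the flags in a file makes it easier to compare runs that differ only in values such as `--lmweight` or `--beamsize`.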
| 84 |
+
|
| 85 |
+
## How to fine-tune on your own data?
|
| 86 |
+
|
| 87 |
+
```bash
|
| 88 |
+
/root/flashlight/build/bin/asr/fl_asr_train continue /models/ --flagsfile /models/train.flags
|
| 89 |
+
```
|
| 90 |
+
|
| 91 |
+
`/models/` must contain the model's `.bin` files.
|
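Before continuing training, it may help to confirm that the directory really holds `.bin` files. The check below is a minimal sketch; the helper name `has_bin_files` is hypothetical and not part of Flashlight.

```bash
# Hypothetical helper: succeed if the given directory holds at least one .bin file.
has_bin_files() {
  for f in "$1"/*.bin; do
    # If the glob matched nothing, $f is the literal pattern and -e fails.
    [ -e "$f" ] && return 0
  done
  return 1
}

# Example: check the models directory before running fl_asr_train.
if has_bin_files /models; then
  echo "found .bin files in /models"
else
  echo "no .bin files in /models"
fi
```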