As our wav2vec2 models were trained with Fairseq, they can be used with the various tools Fairseq provides to fine-tune the model for ASR with CTC. The full procedure has been nicely summarized in [this blogpost](https://huggingface.co/blog/fine-tune-wav2vec2-english).

Please note that, due to the nature of CTC, speech-to-text results are not expected to be state-of-the-art. Moreover, further features may appear depending on the involvement of Fairseq and Hugging Face on this front.

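The CTC caveat above comes from how CTC reads out a transcript: each frame is predicted independently and the sequence is then collapsed by a fixed rule, with no learned decoder smoothing the output. A minimal, dependency-free sketch of greedy CTC decoding (the character vocabulary and `_` blank symbol here are illustrative, not tied to any particular model):

```python
# Greedy CTC decoding sketch: per-frame argmax symbols are collapsed by
# (1) merging consecutive repeats, then (2) dropping blank tokens.
# "_" stands in for the CTC blank (an assumption for this example).

BLANK = "_"

def ctc_greedy_decode(frame_symbols):
    """Collapse repeats, then drop blanks, as the CTC rule prescribes."""
    out = []
    prev = None
    for s in frame_symbols:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return "".join(out)

# Per-frame argmax symbols for a short utterance; note the blank between
# the two "l" groups is what allows a doubled letter to survive collapsing.
print(ctc_greedy_decode(list("__hh_e_ll_lo__")))  # hello
```

Because every frame is decoded in isolation, adding a language model or an attention decoder on top is usually what closes the gap to state-of-the-art results.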
## Integrating with SpeechBrain for ASR, Speaker Recognition, Source Separation ...

Pretrained wav2vec2 models have recently gained in popularity. At the same time, the [SpeechBrain toolkit](https://speechbrain.github.io) came out, proposing a new and simpler way of working with state-of-the-art speech and deep-learning technologies.

While it is currently in beta, SpeechBrain offers two different ways of nicely integrating wav2vec2 models trained with Fairseq, i.e. our LeBenchmark models:

1. Extract wav2vec2 features on the fly (with a frozen wav2vec2 encoder) to combine with any speech-related architecture, e.g. E2E ASR with CTC, attention and language models; speaker recognition or verification; source separation ...
2. *Experimental:* to benefit fully from wav2vec2, the best solution remains to fine-tune the model while training your downstream task. SpeechBrain makes this very simple: only a flag needs to be turned on. Thus, our wav2vec2 models can be fine-tuned while training your favorite ASR pipeline or speaker recognizer.

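The frozen-versus-fine-tuned distinction in the two points above can be pictured with a small dependency-free sketch. The class and method names below are purely illustrative (this is not SpeechBrain's actual API): the idea is that a single flag decides whether gradient updates reach the pretrained encoder or leave it as a fixed feature extractor.

```python
# Conceptual sketch of a "freeze" flag on a pretrained encoder.
# Names and the toy maths are illustrative assumptions, not a real toolkit API.

class PretrainedEncoder:
    def __init__(self, weights, freeze=True):
        self.weights = list(weights)
        self.freeze = freeze  # True: feature extractor only; False: fine-tune

    def forward(self, x):
        # Toy "feature extraction" standing in for the wav2vec2 forward pass
        return [w * x for w in self.weights]

    def apply_gradients(self, grads, lr=0.5):
        # A frozen encoder silently skips its update; the downstream task
        # (ASR head, speaker classifier, ...) still trains as usual.
        if self.freeze:
            return
        self.weights = [w - lr * g for w, g in zip(self.weights, grads)]

enc = PretrainedEncoder([1.0, 2.0], freeze=True)
enc.apply_gradients([0.5, 0.5])
print(enc.weights)  # [1.0, 2.0]  (unchanged while frozen)

enc.freeze = False  # the "flag" from point 2: now the encoder fine-tunes too
enc.apply_gradients([0.5, 0.5])
print(enc.weights)  # [0.75, 1.75]
```

In practice the flag gates whether the encoder's parameters are passed to the optimizer, but the effect is the same: flip one switch and the pretrained front-end starts adapting to your task.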
**If interested, simply follow this [tutorial](https://colab.research.google.com/drive/17Hu1pxqhfMisjkSgmM2CnZxfqDyn2hSY?usp=sharing)**
## Referencing LeBenchmark
```