Automatic Speech Recognition
Transformers
Safetensors
DiCoW
speech
whisper
multilingual
speaker-diarization
meeting-transcription
BUT-FIT
custom_code
Instructions to use BUT-FIT/DiCoW_v3_2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use BUT-FIT/DiCoW_v3_2 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="BUT-FIT/DiCoW_v3_2", trust_remote_code=True)

# Load model directly
from transformers import AutoModelForSpeechSeq2Seq

model = AutoModelForSpeechSeq2Seq.from_pretrained("BUT-FIT/DiCoW_v3_2", trust_remote_code=True, dtype="auto")
- Notebooks
- Google Colab
- Kaggle
Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ tags:
 - DiCoW
 - BUT-FIT
 pipeline_tag: automatic-speech-recognition
-license:
+license: cc-by-4.0
 datasets:
 - microsoft/NOTSOFAR
 - edinburghcstr/ami
@@ -20,6 +20,8 @@ datasets:
 
 This repository hosts the **DiCoW\_v3.2** model developed by [BUT Speech@FIT](https://github.com/BUTSpeechFIT), tailored for **multi-talker automatic speech recognition (MT-ASR)**.
 
+This model is available under the terms of CC BY 4.0. It incorporates an MIT-licensed base model and CC BY 4.0 licensed training data.
+
 ## 🔧 Key Improvements over DiCoW v1
 
 * **FDDT (Frame-Level Diarization Dependent Transformation)** before positional embeddings
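The first hunk fills in a previously empty `license:` field in the README metadata. As a rough illustration of how such a change could be checked, the sketch below reads top-level `key: value` pairs from a model card's front matter using only the standard library. The helper name `read_front_matter` and the `before`/`after` excerpts are hypothetical: the excerpts are reconstructed from the diff context, with the `---` delimiters assumed, and are not the full README. A real check would more likely use a proper YAML parser.

```python
def read_front_matter(text: str) -> dict:
    """Parse top-level `key: value` pairs from a README's YAML front matter.

    Deliberately simplified: list items and nested values are skipped,
    so list-valued keys such as `tags` map to an empty string.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front matter block
        if line.startswith(("-", " ")) or ":" not in line:
            continue  # list item, nested value, or blank line
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


# Hypothetical excerpts of the metadata before and after the commit,
# reconstructed from the diff context (not the complete README).
before = (
    "---\n"
    "tags:\n"
    "- DiCoW\n"
    "- BUT-FIT\n"
    "pipeline_tag: automatic-speech-recognition\n"
    "license:\n"
    "datasets:\n"
    "- microsoft/NOTSOFAR\n"
    "- edinburghcstr/ami\n"
    "---\n"
)
after = before.replace("license:\n", "license: cc-by-4.0\n")

print(repr(read_front_matter(before)["license"]))
print(repr(read_front_matter(after)["license"]))
```

Comparing the two parsed values shows the field going from empty to `cc-by-4.0`, which is the only metadata change in this commit.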