Automatic Speech Recognition
Transformers
Safetensors
DiCoW
speech
whisper
multilingual
speaker-diarization
meeting-transcription
target-speaker-asr
BUT-FIT
custom_code
How to use bohatey/DiCoW_v3_2_SF with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="bohatey/DiCoW_v3_2_SF", trust_remote_code=True)

# Load model directly
from transformers import AutoModelForSpeechSeq2Seq

model = AutoModelForSpeechSeq2Seq.from_pretrained("bohatey/DiCoW_v3_2_SF", trust_remote_code=True, dtype="auto")
```
```python
import torch
from transformers import WhisperTimeStampLogitsProcessor


class WhisperTimeStampLogitsProcessorCustom(WhisperTimeStampLogitsProcessor):
    """Timestamp logits processor that keeps EOS available at the first decoding step."""

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        scores_processed = super().__call__(input_ids, scores)
        # Allow an early exit on silence: at the first generated position,
        # restore the original EOS score that the parent processor masked out.
        if input_ids.shape[1] == self.begin_index:
            scores_processed[:, self.eos_token_id] = scores[:, self.eos_token_id]
        return scores_processed
```
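The effect of the override can be illustrated with a dependency-free sketch. It assumes that, at the first decoding step, the parent processor leaves only timestamp tokens unmasked. `mask_first_step`, `restore_eos`, the six-token vocabulary, and the score values are all hypothetical stand-ins chosen for illustration; they are not part of the library or of this repository.

```python
import math

NEG_INF = -math.inf


def mask_first_step(scores, timestamp_begin):
    """Simplified stand-in for the first-step rule: only timestamp tokens stay allowed."""
    return [s if i >= timestamp_begin else NEG_INF for i, s in enumerate(scores)]


def restore_eos(original, processed, eos_token_id):
    """Copy the unmasked EOS score back, mirroring the override above."""
    patched = list(processed)
    patched[eos_token_id] = original[eos_token_id]
    return patched


def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical vocabulary: ids 0-2 are text, id 3 is EOS, ids 4-5 are timestamps.
EOS_ID, TIMESTAMP_BEGIN = 3, 4
scores = [0.1, 0.2, 0.3, 5.0, 1.0, 0.5]  # EOS dominates, as it might on silence

masked = mask_first_step(scores, TIMESTAMP_BEGIN)
patched = restore_eos(scores, masked, EOS_ID)

first_token_masked = argmax(masked)    # a timestamp token is forced
first_token_patched = argmax(patched)  # EOS can win again
```

With the mask alone, greedy decoding must open with a timestamp even when the model strongly prefers to stop; once the EOS score is restored, it can end the sequence immediately.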