falche/WhisperWithJPDiarization


NekoMikoReimu committed on Sep 16, 2024

Commit be9bc8b (verified) · 1 parent: afc4029

Update README.md

Files changed (1)

README.md +13 -3

@@ -1,3 +1,13 @@
----
-license: cc-by-nc-4.0
----

+---
+license: cc-by-nc-4.0
+language:
+- ja
+- en
+---
+A diarization pipeline for Whisper large-v2 that uses a custom-tuned segmentation model and custom filtering on the audio (low-pass filter, equalizer, etc.) for improved performance.
+Accepts a video file or an mp3/wav audio file as input.
+Performance is considerably better than default Whisper for most tasks involving Japanese content, with the exception of singing/karaoke.
+Requires ffmpeg, pyannote, and facebookresearch's Demucs model. PyTorch is also strongly encouraged.
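
The README names the audio filtering steps and the dependencies but does not show how they are invoked. As a rough illustration only, the sketch below builds an ffmpeg command that drops the video stream and applies a low-pass filter and an equalizer, and checks whether the listed dependencies appear to be installed. The function names, the cutoff frequency, the equalizer settings, the mono 16 kHz output format, and the Python module names passed to the dependency check are all assumptions made for this example; they are not taken from the repository.

```python
import importlib.util
import shutil


def build_ffmpeg_command(src, dst, lowpass_hz=7500, eq_center_hz=1000, eq_gain_db=3.0):
    """Return an ffmpeg argv list (hypothetical helper, not from the repo).

    The filter values are placeholder assumptions, not the pipeline's
    actual tuning.
    """
    filters = ",".join([
        # Attenuate content above lowpass_hz.
        f"lowpass=f={lowpass_hz}",
        # Peaking equalizer: center frequency, width given as a Q factor, gain in dB.
        f"equalizer=f={eq_center_hz}:t=q:w=1:g={eq_gain_db}",
    ])
    return [
        "ffmpeg", "-y",   # overwrite the output file if it exists
        "-i", src,        # input: video or mp3/wav file
        "-vn",            # ignore any video stream
        "-af", filters,   # audio filter chain
        "-ac", "1",       # mono (assumed output format)
        "-ar", "16000",   # 16 kHz sample rate (assumed output format)
        dst,
    ]


def missing_dependencies(modules=("torch", "pyannote.audio", "demucs")):
    """List dependencies that cannot be found (module names are assumptions)."""
    missing = []
    if shutil.which("ffmpeg") is None:
        missing.append("ffmpeg")
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is absent or a spec is malformed.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print(missing_dependencies())
    print(" ".join(build_ffmpeg_command("input.mp4", "filtered.wav")))
```

The command is returned as a list so it can be passed directly to `subprocess.run(cmd, check=True)` without shell quoting issues. Source separation, diarization, and transcription would run on the filtered file afterwards and are not shown here.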