Question about fine-tuning

#27

by jz703 - opened Oct 18, 2023

Oct 18, 2023 • edited Oct 18, 2023

Hi, I'd like to fine-tune Whisper on a dataset where the target output is the phonemic transcription of the words (e.g. "examination and testimony" should become "ɪɡzæmənˈeɪʃən ˈænd tˈɛstɪmˌoʊni"). Is this possible with Whisper if I build the vocab carefully? And is this more like a transcription task or a translation task? Thank you!
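For concreteness, here is a rough sketch of what "building the vocab" could look like, assuming a simple character-level vocabulary over the IPA symbols found in the target strings. The function names and the `<pad>`/`<unk>` special tokens are hypothetical and for illustration only; this is plain Python, not Whisper's actual tokenizer API.

```python
def build_phoneme_vocab(transcripts, specials=("<pad>", "<unk>")):
    """Map each distinct character in the phoneme transcripts to an integer id."""
    # Collect every character (IPA symbols, stress marks, spaces) that occurs.
    symbols = sorted({ch for text in transcripts for ch in text})
    tokens = list(specials) + symbols
    return {tok: idx for idx, tok in enumerate(tokens)}


def encode(text, vocab):
    """Turn a phoneme string into ids; unseen characters map to <unk>."""
    unk = vocab["<unk>"]
    return [vocab.get(ch, unk) for ch in text]


def decode(ids, vocab):
    """Invert encode() for ids that correspond to known symbols."""
    inverse = {idx: tok for tok, idx in vocab.items()}
    return "".join(inverse[i] for i in ids)


# Hypothetical one-example "dataset", taken from the post above.
vocab = build_phoneme_vocab(["ɪɡzæmənˈeɪʃən ˈænd tˈɛstɪmˌoʊni"])
ids = encode("ˈænd", vocab)
print(decode(ids, vocab))
```

A character-level scheme like this treats stress marks such as `ˈ` and `ˌ` as separate tokens; whether that or a multi-character phoneme inventory fits better is part of what I am unsure about.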
