Japanese/Korean Support & Recommended Models for Subtitle Transcription with Timestamps

by nemozxy123 - opened Apr 15

Thank you to the OpenMOSS-Team for your work and contributions to the open-source community. I have two questions regarding this series of models:

1. The model card states that the model supports only Chinese and English, but the GitHub repository includes Japanese and Korean sample audio files. I tried transcribing a Japanese audio clip, and it worked fine. Did the model card omit this information?
2. For subtitle transcription tasks, which model do you recommend: 4B or 8B, Instruct or Thinking? Are there any officially recommended prompts?
Thanks again for your great work!

kiiic

OpenMOSS org Apr 15

Thank you very much for your interest in MOSS-Audio and for your kind words about our work.

Yes, the current model card is incomplete in that regard. MOSS-Audio does support Japanese, Korean, and some other languages in addition to Chinese and English. That said, its strongest performance is still in Chinese and English.

For ASR or subtitle transcription tasks, we generally recommend using the Instruct models rather than the Thinking models.

Thank you again for your support.

kiiic changed discussion status to closed Apr 15

kiiic changed discussion status to open Apr 15
