Kocom SenseVoice Sohee (ASR)

Model Summary

This is a SenseVoiceSmall model fine-tuned for Korean ASR on synthetic speech. The training texts come from internal intent and assistant dialogue datasets and were rendered to audio with Qwen3-TTS CustomVoice using a single synthetic speaker (Sohee).

Base Model

  • iic/SenseVoiceSmall

Training Data

Text sources:

  • test/data/intent_classify/community_train/output_community_dataset.json
  • test/data/intent_classify/run_20260121_141638/output_vui_dataset.json

Audio:

  • Generated via Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice
  • Speaker: Sohee
  • Language: Korean
  • 16 kHz, mono, PCM_16
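
If you want to confirm that your own audio matches the format listed above before running inference, a minimal sketch using Python's standard `wave` module could look like this. The function name `check_wav_format` is hypothetical and is not part of this model's tooling.

```python
import wave


def check_wav_format(path, rate=16000, channels=1, sample_width_bytes=2):
    """Return True if a PCM WAV file is 16 kHz, mono, 16-bit by default.

    The standard `wave` module only reads PCM WAV files; other encodings
    (for example float WAV) raise wave.Error when opened.
    """
    with wave.open(path, "rb") as wf:
        return (
            wf.getframerate() == rate
            and wf.getnchannels() == channels
            and wf.getsampwidth() == sample_width_bytes
        )
```

Files that fail the check would need resampling or downmixing first, since the model was trained only on audio in this format.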

Counts (from manifest):

  • Total items: 8076
  • Roles: user 5068, assistant 3008

Split:

  • Train/val = 95/5 random split (seed=42)
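
The exact splitting script is not included in this card. As an illustration only, a seeded 95/5 random split can be sketched as below. The helper `split_train_val` and its rounding rule are assumptions, so the resulting split sizes may differ from those used in training.

```python
import random


def split_train_val(items, val_ratio=0.05, seed=42):
    """Shuffle items with a fixed seed and hold out val_ratio of them."""
    items = list(items)
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    rng.shuffle(items)
    n_val = round(len(items) * val_ratio)  # assumed rounding rule
    return items[n_val:], items[:n_val]


# Stand-in item IDs; a real run would pass the manifest entries instead.
train, val = split_train_val(range(8076))
print(len(train), len(val))
```

Fixing the seed makes the split reproducible, so later fine-tuning runs can be compared on the same validation items.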

Training Procedure

  • Fine-tuning with FunASR train_ds.py
  • Max epochs: 50
  • Batch type: token, batch_size=6000
  • Learning rate: 2e-4
  • DeepSpeed: disabled
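
The actual launch command is not given here. As a rough sketch, the settings above could be expressed as Hydra-style `++key=value` overrides of the kind FunASR training scripts accept. The key names below are modeled on the fine-tuning script in the SenseVoice repository and are assumptions; verify them against your FunASR version before use.

```python
# Hyperparameters reported in this card, mapped to assumed config keys.
hparams = {
    "train_conf.max_epoch": 50,
    "dataset_conf.batch_type": "token",
    "dataset_conf.batch_size": 6000,
    "optim_conf.lr": 2e-4,
    "train_conf.use_deepspeed": "false",
}


def to_overrides(params):
    """Format a dict as a list of '++key=value' command-line overrides."""
    return [f"++{key}={value}" for key, value in params.items()]


# Print one override per line, ready to append to a train_ds.py invocation.
print(" \\\n".join(to_overrides(hparams)))
```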

Evaluation

No public benchmark results are reported. Evaluate on your own test set to validate quality for your domain.
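
For Korean, character error rate (CER) is a common metric for this kind of evaluation. The sketch below is one simple way to compute it on your own reference/hypothesis pairs. It is not an official evaluation script for this model. Removing spaces before scoring is a design choice made here so that spacing differences are not counted as errors.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs, hyps):
    """Corpus-level CER: total character edits / total reference characters."""
    total_edits = 0
    total_chars = 0
    for ref, hyp in zip(refs, hyps):
        ref_chars = ref.replace(" ", "")
        hyp_chars = hyp.replace(" ", "")
        total_edits += edit_distance(ref_chars, hyp_chars)
        total_chars += len(ref_chars)
    return total_edits / total_chars if total_chars else 0.0
```

Pass the model's post-processed transcripts as `hyps` and your ground-truth texts as `refs`. Scoring real recordings as well as TTS audio would show how far the model generalizes beyond the synthetic voice.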

Intended Use

Korean ASR for audio in the synthetic Sohee voice, covering commands and assistant responses in the style of the training texts. Best suited for controlled or synthetic audio similar to the training data.

Limitations

  • Trained on synthetic speech from a single voice; generalization to real-world speech, accents, noise, or other speakers is limited.
  • Domain coverage is restricted to the intent texts used during training.

Usage (Local)

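from funasr import AutoModel
from funasr.utils.postprocess_utils import rich_transcription_postprocess

# Load the fine-tuned checkpoint from a local directory.
# Replace both paths with the locations on your machine.
model = AutoModel(
    model="/data/sapie/tax/kocom/SenseVoice/exp_kocom_sohee",
    trust_remote_code=True,
    remote_code="/data/sapie/tax/kocom/SenseVoice/model.py",
    device="cuda:0",  # use "cpu" if no GPU is available
)

# Note: batch_size_s, merge_vad, and merge_length_s are VAD-related options;
# since no vad_model is configured above, they likely have no effect here.
res = model.generate(
    input="/path/to/audio.wav",
    cache={},
    language="ko",
    use_itn=False,  # no inverse text normalization
    batch_size_s=60,
    merge_vad=True,
    merge_length_s=15,
)
# Strip SenseVoice's special tags from the raw output before printing.
print(rich_transcription_postprocess(res[0]["text"]))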

License

Not specified. Please add your intended license before publishing.

Contact

Add contact or maintainer information here.
