---
license: other
license_name: glm-4-voice
license_link: https://github.com/THUDM/GLM-4-Voice/blob/main/MODEL_LICENSE
tags:
  - speech-to-speech
  - audio
  - emotion
  - kimi-audio
  - glm-4-voice
---

glm-4-voice-decoder-emo-ft

Built with glm-4.

Fine-tuned GLM-4-Voice decoder weights for emotion-preserving Chinese ↔ English speech-to-speech translation (S2ST). They are meant to be used together with the Kimi-Audio Emotion-Aware S2ST training and inference pipeline.

Files

| File | Size | Role |
|---|---|---|
| `epoch500_emoft.pt` | ~425 MB | Fine-tuned flow checkpoint (emotion-preserving) |
| `hift.pt` | ~79 MB | HiFT vocoder checkpoint |

Usage

```shell
git clone https://github.com/<YOUR_GH_USER>/kimi-audio-release
cd kimi-audio-release
./scripts/download_weights.sh
# the two files will be placed under glm_4_voice_decoder/
```
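
After the download finishes, it can be worth confirming that both checkpoints actually landed where the pipeline expects them. The following is a minimal sketch, not part of the repository: `check_weights` is a hypothetical helper, and it assumes the file names from the Files table and the `glm_4_voice_decoder/` directory mentioned above. It only checks that the files exist and does not verify their size or contents.

```shell
# Hypothetical helper (not shipped with the repo): report whether both
# checkpoint files are present in the given directory.
check_weights() {
  dir="${1:-glm_4_voice_decoder}"   # assumed default target directory
  missing=0
  for f in epoch500_emoft.pt hift.pt; do
    if [ -f "$dir/$f" ]; then
      echo "found: $dir/$f"
    else
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  # Exit status 0 if both files exist, 1 otherwise.
  return "$missing"
}

# Run from the repository root after download_weights.sh has finished.
check_weights glm_4_voice_decoder || echo "download incomplete" >&2
```

A non-zero exit status from `check_weights` means at least one file is missing, in which case re-running the download script is the first thing to try.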
