Text-to-Speech
KimiAudio
Safetensors
English
Chinese
audio
audio-language-model
speech-recognition
audio-understanding
audio-generation
chat
custom_code
Ways to use moonshotai/Kimi-Audio-7B-Instruct with libraries, inference providers, notebooks, and local apps:
- Libraries
- KimiAudio
How to use moonshotai/Kimi-Audio-7B-Instruct with KimiAudio:
# Example usage for KimiAudio
# pip install git+https://github.com/MoonshotAI/Kimi-Audio.git
import soundfile as sf  # required by sf.write below

from kimia_infer.api.kimia import KimiAudio

# Load the model together with the detokenizer so audio output can be generated
model = KimiAudio(model_path="moonshotai/Kimi-Audio-7B-Instruct", load_detokenizer=True)

sampling_params = {
    "audio_temperature": 0.8,
    "audio_top_k": 10,
    "text_temperature": 0.0,
    "text_top_k": 5,
}

# For ASR: a text instruction followed by the audio file to transcribe
asr_audio = "asr_example.wav"
messages_asr = [
    {"role": "user", "message_type": "text", "content": "Please transcribe the following audio:"},
    {"role": "user", "message_type": "audio", "content": asr_audio},
]
_, text = model.generate(messages_asr, **sampling_params, output_type="text")
print(text)

# For Q&A: the audio file alone is the user turn; request both audio and text output
qa_audio = "qa_example.wav"
messages_conv = [{"role": "user", "message_type": "audio", "content": qa_audio}]
wav, text = model.generate(messages_conv, **sampling_params, output_type="both")
# Flatten the waveform tensor and write it at a 24000 Hz sample rate
sf.write("output_audio.wav", wav.cpu().view(-1).numpy(), 24000)
print(text)
- Notebooks
- Google Colab
- Kaggle
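The KimiAudio example above builds its message lists by hand: an optional text instruction followed by an audio entry, both with the "user" role. A minimal sketch of a helper that produces the same structure is shown here. The name `build_messages` and its parameters are hypothetical and not part of the `kimia_infer` package; only the dictionary keys (`role`, `message_type`, `content`) come from the example.

```python
from typing import Dict, List, Optional

Message = Dict[str, str]


def build_messages(audio_path: str, instruction: Optional[str] = None) -> List[Message]:
    """Hypothetical helper: build a user turn in the message format of the example above.

    If an instruction is given, it is added as a text message before the audio message.
    """
    messages: List[Message] = []
    if instruction:
        messages.append({"role": "user", "message_type": "text", "content": instruction})
    messages.append({"role": "user", "message_type": "audio", "content": audio_path})
    return messages


# Same structure as messages_asr in the example: text instruction, then audio
asr_messages = build_messages("asr_example.wav", "Please transcribe the following audio:")

# Same structure as messages_conv in the example: audio only
qa_messages = build_messages("qa_example.wav")
```

The resulting lists could then be passed to `model.generate(...)` in the same way as `messages_asr` and `messages_conv` above.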
- Free studio vocal data for Kimi Audio vocal pipeline benchmarking (#21, opened 28 days ago by MachineAI87)
- Add Kimi-Audio EOS and pad token ids (#20, opened 3 months ago by tunglinwood; 2 comments)
- Kaggle code needs update (#19, opened 10 months ago by elijahross)
- Fix incorrect unk_id assignment (#16, opened 12 months ago by codecho)
- Request: DOI (#14, opened about 1 year ago by huseyinyolcu)
- supported languages? (#12, opened about 1 year ago by nononameneeded2001; 👍 1; 1 comment)
- About the weight files of the Whisper Encoder (#11, opened about 1 year ago by codecho; 1 comment)
- how can I fine tune this for farsi? (#10, opened about 1 year ago by uncleMehrzad)
- Cannot Run Model in Hugging Face Spaces: AutoProcessor/Processor Not Found (#9, opened about 1 year ago by ranagame)
- Will the Russian language be supported? (#8, opened about 1 year ago by fduches2)
- A video on how to set up this in a Colab notebook (#7, opened about 1 year ago by ritheshSree; 1 comment)
- Vocoder Architecture? (#6, opened about 1 year ago by yukiarimo)
- Base model? (#4, opened about 1 year ago by deltanym; 1 comment)
- Issue with long audio (~1 min) output, or prompt instruct following (#2, opened about 1 year ago by JosephusCheung; 👀 1; 2 comments)
- Update correct task tag (#1, opened about 1 year ago by reach-vb; 1 comment)