metadata
library_name: mlx-audio-plus
base_model:
  - ResembleAI/chatterbox
tags:
  - mlx
  - multilingual
  - tts
  - text-to-speech
  - japanese
language:
  - ar
  - da
  - de
  - el
  - en
  - es
  - fi
  - fr
  - he
  - hi
  - it
  - ja
  - ko
  - ms
  - nl
  - 'no'
  - pl
  - pt
  - ru
  - sv
  - sw
  - tr
  - zh
pipeline_tag: text-to-speech

๐Ÿšจใƒขใƒ‡ใƒซใฎๅฎŒๅ…จใชๅ‹•ไฝœ็ขบ่ชใŒใพใ ใงใใฆใ„ใพใ›ใ‚“๏ผ (๐ŸšจWe're still working on fully testing the model!)

YUGOROU/Chatterbox-Multilingual-MLX-4bit

Chatterbox Multilingual TTS converted to MLX format for Apple Silicon devices.

๐ŸŒ Supported Languages (23 languages)

Arabic, Danish, German, Greek, English, Spanish, Finnish, French, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Dutch, Norwegian, Polish, Portuguese, Russian, Swedish, Swahili, Turkish, Chinese
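
The language codes in the card metadata line up one-to-one with the names listed above. As a minimal sketch, the pairing can be kept in a dictionary and used to check a code before requesting synthesis. The names `SUPPORTED_LANGUAGES` and `is_supported` are illustrative assumptions and are not part of mlx-audio-plus.

```python
# Language codes from the card metadata, paired with the names listed above.
# SUPPORTED_LANGUAGES and is_supported are hypothetical helpers, not library API.
SUPPORTED_LANGUAGES = {
    "ar": "Arabic", "da": "Danish", "de": "German", "el": "Greek",
    "en": "English", "es": "Spanish", "fi": "Finnish", "fr": "French",
    "he": "Hebrew", "hi": "Hindi", "it": "Italian", "ja": "Japanese",
    "ko": "Korean", "ms": "Malay", "nl": "Dutch", "no": "Norwegian",
    "pl": "Polish", "pt": "Portuguese", "ru": "Russian", "sv": "Swedish",
    "sw": "Swahili", "tr": "Turkish", "zh": "Chinese",
}


def is_supported(code: str) -> bool:
    """Return True if the two-letter code is one of the listed languages."""
    return code.strip().lower() in SUPPORTED_LANGUAGES
```

For example, `is_supported("ja")` is true, while a code that is not in the list, such as `"xx"`, is rejected.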

📥 Installation

pip install -U mlx-audio-plus

🚀 Usage

Command Line

mlx_audio.tts.generate \
    --model YUGOROU/Chatterbox-Multilingual-MLX-4bit \
    --text "こんにちは、元気ですか?" \
    --ref_audio reference.wav

Python

from mlx_audio.tts.generate import generate_audio

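# Synthesize the Japanese sample text, using reference.wav as the voice reference.
generate_audio(
    text="こんにちは、元気ですか?",
    model="YUGOROU/Chatterbox-Multilingual-MLX-4bit",
    ref_audio="reference.wav",  # reference recording of the target voice
    file_prefix="output",       # prefix for the generated audio file name
)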
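
Because the call above depends on a reference recording, it can help to validate the inputs before loading the model. The following is a rough sketch, assuming `generate_audio` accepts the keyword arguments exactly as shown above; `build_generate_kwargs` and `synthesize` are hypothetical helper names, not part of mlx-audio-plus.

```python
from pathlib import Path

MODEL_ID = "YUGOROU/Chatterbox-Multilingual-MLX-4bit"


def build_generate_kwargs(text, ref_audio, file_prefix="output", model=MODEL_ID):
    """Validate the inputs and return keyword arguments for generate_audio."""
    if not text.strip():
        raise ValueError("text must not be empty")
    ref_path = Path(ref_audio)
    if not ref_path.is_file():
        raise FileNotFoundError(f"reference audio not found: {ref_path}")
    return {
        "text": text,
        "model": model,
        "ref_audio": str(ref_path),
        "file_prefix": file_prefix,
    }


def synthesize(text, ref_audio, **options):
    """Hypothetical wrapper: validate first, then hand off to generate_audio."""
    # Imported lazily so the validation helper works without mlx-audio-plus installed.
    from mlx_audio.tts.generate import generate_audio

    return generate_audio(**build_generate_kwargs(text, ref_audio, **options))
```

With this in place, a missing `reference.wav` fails fast with a `FileNotFoundError` instead of surfacing after the model weights have been loaded.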

📊 Model Details

  • Base Model: ResembleAI/chatterbox
  • Tokenizer: 2454 tokens (Multilingual)
  • Quantization: 4-bit
  • Framework: MLX (Apple Silicon optimized)
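
The quantization level is reflected in the repository name. As a minimal sketch of that naming convention, assuming names contain `4bit` or `8bit` and that anything else is an fp16 export, the label can be derived as follows; `quantization_label` is a hypothetical helper name.

```python
def quantization_label(model_name: str) -> str:
    """Infer the quantization label from a repository name (assumed convention)."""
    if "4bit" in model_name:
        return "4-bit"
    if "8bit" in model_name:
        return "8-bit"
    # No bit-width marker in the name: treat it as an fp16 export.
    return "fp16"
```

For this repository, `quantization_label("YUGOROU/Chatterbox-Multilingual-MLX-4bit")` yields `"4-bit"`.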

🔗 Related