---
license: mit
base_model: ResembleAI/chatterbox-turbo-ONNX
tags:
- text-to-speech
- tts
- onnx
- webgpu
- transformers.js
---

# Chatterbox Turbo - WebGPU Compatible

This is a WebGPU-compatible version of [ResembleAI/chatterbox-turbo-ONNX](https://huggingface.co/ResembleAI/chatterbox-turbo-ONNX).

## Changes from Original

The original model contains `int64` Cast operations and tensors that WebGPU cannot execute. This version converts all `int64` operations to `int32`, enabling direct WebGPU inference.

### Modifications Made:

- **conditional_decoder**: 521 Cast nodes inserted (376 Shape/Range ops)
- **speech_encoder**: 350 Cast nodes inserted (243 Shape/Range ops)
- **language_model**: 3 Cast nodes inserted
- **embed_tokens**: 1 Cast node inserted

## Usage with Transformers.js

```javascript
import { AutoModel, AutoProcessor } from '@huggingface/transformers';

const model = await AutoModel.from_pretrained('spacekaren/chatterbox-turbo-webgpu', {
  device: 'webgpu',
  dtype: 'q4f16',
});

const processor = await AutoProcessor.from_pretrained('spacekaren/chatterbox-turbo-webgpu');
```

## Model Size

- **Total**: ~539 MB (q4f16 quantization)
- Same architecture as original, just int64→int32 conversion

## License

MIT (same as original)

## Credits

- Original model: [ResembleAI/chatterbox-turbo-ONNX](https://huggingface.co/ResembleAI/chatterbox-turbo-ONNX)
- Conversion script: [local.core/scripts/convert_int64_to_int32.py](https://github.com/anthropics/lama)
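
## Checking for WebGPU (Sketch)

Because the usage example in this README requests `device: 'webgpu'`, it may help to confirm that the runtime actually exposes WebGPU before loading the model. The snippet below is a minimal sketch, not part of this repository or of Transformers.js: `pickDevice` is a hypothetical helper name, and the `'wasm'` fallback is an assumption about what you would want when no GPU adapter is available. It relies only on the standard `navigator.gpu.requestAdapter()` call.

```javascript
// Hypothetical helper (not provided by Transformers.js or this repo).
// Returns 'webgpu' when a WebGPU adapter can be obtained, otherwise 'wasm'.
// The 'wasm' fallback is an assumption; pick whatever backend suits your app.
async function pickDevice(nav = globalThis.navigator) {
  // `navigator.gpu` is undefined in runtimes without WebGPU support.
  if (!nav || !nav.gpu) return 'wasm';
  try {
    // requestAdapter() resolves to null when no suitable adapter exists.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? 'webgpu' : 'wasm';
  } catch {
    // Treat any failure while probing as "WebGPU unavailable".
    return 'wasm';
  }
}
```

The result could then be passed as the `device` option, for example `{ device: await pickDevice(), dtype: 'q4f16' }`. Whether the `q4f16` weights run acceptably on a non-WebGPU backend has not been verified here, so treat the fallback path as untested.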