---
license: mit
base_model: ResembleAI/chatterbox-turbo-ONNX
tags:
- text-to-speech
- tts
- onnx
- webgpu
- transformers.js
---
# Chatterbox Turbo - WebGPU Compatible
This is a WebGPU-compatible version of [ResembleAI/chatterbox-turbo-ONNX](https://huggingface.co/ResembleAI/chatterbox-turbo-ONNX).
## Changes from Original
The original model contains `int64` Cast operations and tensors, which WebGPU cannot execute.
This version converts all `int64` operations to `int32`, so the model can run directly on WebGPU.
### Modifications Made:
- **conditional_decoder**: 521 Cast nodes inserted (376 Shape/Range ops)
- **speech_encoder**: 350 Cast nodes inserted (243 Shape/Range ops)
- **language_model**: 3 Cast nodes inserted
- **embed_tokens**: 1 Cast node inserted
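
To make the `int64` to `int32` narrowing concrete, here is a minimal JavaScript sketch of the same idea applied to plain typed arrays. It is illustrative only and is not the conversion script: `toInt32Array` is a hypothetical helper and the sample values are made up. It shows why narrowing is only safe when every value fits in 32 bits.

```javascript
// Illustrative only: narrow 64-bit integer data to 32-bit, refusing values
// that would not survive the conversion. `toInt32Array` is a hypothetical
// helper, not part of Transformers.js or the conversion script.
const INT32_MIN = -(2n ** 31n);
const INT32_MAX = 2n ** 31n - 1n;

function toInt32Array(values) {
  const out = new Int32Array(values.length);
  for (let i = 0; i < values.length; i++) {
    const v = values[i];
    // Reject anything outside the signed 32-bit range instead of wrapping.
    if (v < INT32_MIN || v > INT32_MAX) {
      throw new RangeError(`value ${v} at index ${i} does not fit in int32`);
    }
    out[i] = Number(v);
  }
  return out;
}

// Made-up sample data standing in for small integer values such as indices.
const sampleIds = BigInt64Array.from([0n, 1024n, 50256n]);
const narrowed = toInt32Array(sampleIds);
```

Shape values and token indices are typically far below the 32-bit limit, which is the assumption that makes this kind of narrowing reasonable.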
## Usage with Transformers.js
```javascript
import { AutoModel, AutoProcessor } from '@huggingface/transformers';

// Load the q4f16-quantized weights and run them on the WebGPU backend.
const model = await AutoModel.from_pretrained('spacekaren/chatterbox-turbo-webgpu', {
  device: 'webgpu',
  dtype: 'q4f16',
});

// The processor is loaded from the same repository.
const processor = await AutoProcessor.from_pretrained('spacekaren/chatterbox-turbo-webgpu');
```
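
Not every environment exposes WebGPU, so it may help to choose the `device` value at runtime. The sketch below is one possible approach. `pickDevice` is a hypothetical helper, and falling back to `'wasm'` assumes this model and dtype also run on that backend, which this card does not confirm.

```javascript
// Hypothetical helper: choose a Transformers.js `device` value based on
// whether the environment exposes the WebGPU entry point (`navigator.gpu`).
function pickDevice(nav) {
  const hasWebGPU = Boolean(nav && nav.gpu);
  // Assumption: 'wasm' is an acceptable fallback for this model.
  return hasWebGPU ? 'webgpu' : 'wasm';
}

// In a browser, pass the global `navigator`; elsewhere it may be undefined.
const device = pickDevice(typeof navigator !== 'undefined' ? navigator : undefined);
console.log(`Selected device: ${device}`);
```

The resulting `device` string can be passed in place of the hard-coded `'webgpu'` in the loading options. Note that the presence of `navigator.gpu` does not guarantee a usable adapter, so loading can still fail on some machines.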
## Model Size
- **Total**: ~539 MB (q4f16 quantization)
- Same architecture as the original; only the int64→int32 conversion differs
## License
MIT (same as original)
## Credits
- Original model: [ResembleAI/chatterbox-turbo-ONNX](https://huggingface.co/ResembleAI/chatterbox-turbo-ONNX)
- Conversion script: [local.core/scripts/convert_int64_to_int32.py](https://github.com/anthropics/lama)