---
license: mit
base_model: ResembleAI/chatterbox-turbo-ONNX
tags:
  - text-to-speech
  - tts
  - onnx
  - webgpu
  - transformers.js
---

# Chatterbox Turbo - WebGPU Compatible

This is a WebGPU-compatible version of [ResembleAI/chatterbox-turbo-ONNX](https://huggingface.co/ResembleAI/chatterbox-turbo-ONNX).

## Changes from Original

The original model contains `int64` Cast operations and tensors, which WebGPU cannot execute.
This version converts all `int64` operations to `int32` so the model can run directly on WebGPU.

### Modifications Made
- **conditional_decoder**: 521 Cast nodes inserted (376 Shape/Range ops)
- **speech_encoder**: 350 Cast nodes inserted (243 Shape/Range ops)
- **language_model**: 3 Cast nodes inserted
- **embed_tokens**: 1 Cast node inserted
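
The int64→int32 narrowing described above is only lossless when every value fits in 32 bits. The sketch below illustrates that principle on plain typed arrays. `narrowInt64ToInt32` and the sample token IDs are hypothetical and not part of this repository or of the conversion script, and this card does not state whether callers need to narrow input tensors themselves.

```javascript
// Hypothetical helper: narrow int64 data (BigInt64Array) to int32 (Int32Array).
// Throws instead of silently wrapping when a value does not fit in 32 bits.
function narrowInt64ToInt32(values) {
  const INT32_MIN = -(2n ** 31n);
  const INT32_MAX = 2n ** 31n - 1n;
  const out = new Int32Array(values.length);
  for (let i = 0; i < values.length; i++) {
    const v = values[i];
    if (v < INT32_MIN || v > INT32_MAX) {
      throw new RangeError(`value ${v} at index ${i} does not fit in int32`);
    }
    out[i] = Number(v); // safe: v is within the int32 range
  }
  return out;
}

// Illustrative values only; real token IDs come from the processor.
const tokenIds = BigInt64Array.from([0n, 42n, 50256n]);
const narrowed = narrowInt64ToInt32(tokenIds);
```

Token IDs and tensor shapes are typically far below the int32 limit, which is why this kind of narrowing is usually safe for models like this one.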

## Usage with Transformers.js

```javascript
import { AutoModel, AutoProcessor } from '@huggingface/transformers';

const model = await AutoModel.from_pretrained('spacekaren/chatterbox-turbo-webgpu', {
  device: 'webgpu',
  dtype: 'q4f16',
});

const processor = await AutoProcessor.from_pretrained('spacekaren/chatterbox-turbo-webgpu');
```
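
Not every browser exposes WebGPU, so it can help to check before loading. The sketch below picks a device string based on whether `navigator.gpu` exists. `pickDevice` is a hypothetical helper, and falling back to `'wasm'` is an assumption: this card does not confirm that the `q4f16` weights run on the WASM backend.

```javascript
// Hypothetical helper: choose a Transformers.js device string.
// Takes a navigator-like object so it can also be called outside a browser.
function pickDevice(nav) {
  return nav && 'gpu' in nav && nav.gpu ? 'webgpu' : 'wasm';
}

// In a browser this reads the real navigator; elsewhere it falls back to 'wasm'.
const device = pickDevice(typeof navigator !== 'undefined' ? navigator : undefined);
```

The result can be passed as the `device` option of `from_pretrained` in place of the hard-coded `'webgpu'`.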

## Model Size

- **Total**: ~539 MB (q4f16 quantization)
- Same architecture as the original; the only change is the int64→int32 conversion

## License

MIT (same as original)

## Credits

- Original model: [ResembleAI/chatterbox-turbo-ONNX](https://huggingface.co/ResembleAI/chatterbox-turbo-ONNX)
- Conversion script: [local.core/scripts/convert_int64_to_int32.py](https://github.com/anthropics/lama)