license: mit
base_model: ResembleAI/chatterbox-turbo-ONNX
tags:
  - text-to-speech
  - tts
  - onnx
  - webgpu
  - transformers.js

Chatterbox Turbo - WebGPU Compatible

This is a WebGPU-compatible version of ResembleAI/chatterbox-turbo-ONNX.

Changes from Original

The original model contains int64 tensors and Cast operations that WebGPU cannot execute (WGSL, WebGPU's shading language, has no 64-bit integer type). This version inserts Cast nodes that convert those int64 values to int32, so the model can run directly on WebGPU.
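Narrowing int64 to int32 is only lossless when every value fits in 32 bits. The following is a minimal sketch of that constraint on the JavaScript side, not the conversion applied to the model files; `narrowInt64ToInt32` is a hypothetical helper name introduced for illustration.

```javascript
// Hypothetical helper (not part of Transformers.js): narrow int64 values,
// held in a BigInt64Array, to an Int32Array, refusing values that would overflow.
const INT32_MIN = -(2n ** 31n);
const INT32_MAX = 2n ** 31n - 1n;

function narrowInt64ToInt32(values) {
  const out = new Int32Array(values.length);
  for (let i = 0; i < values.length; i++) {
    const v = values[i];
    if (v < INT32_MIN || v > INT32_MAX) {
      throw new RangeError(`value ${v} at index ${i} does not fit in int32`);
    }
    out[i] = Number(v); // safe: v is within the int32 range
  }
  return out;
}

// Example: small ids such as token indices fit comfortably in int32.
const ids = BigInt64Array.from([0n, 101n, 50256n]);
console.log(narrowInt64ToInt32(ids));
```

Shape dimensions and token ids are normally far below the int32 limit, which is what makes this kind of narrowing practical; the range check guards the cases where that assumption does not hold.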

Modifications Made:

conditional_decoder: 521 Cast nodes inserted (376 Shape/Range ops)
speech_encoder: 350 Cast nodes inserted (243 Shape/Range ops)
language_model: 3 Cast nodes inserted
embed_tokens: 1 Cast node inserted
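
The conversion script itself is not reproduced here, so the following is only an illustrative sketch of what "inserting a Cast node after a Shape/Range op" could look like. It assumes a hypothetical simplified graph (an array of plain objects in topological order) rather than a real ONNX protobuf, and the function and field names are invented for the example.

```javascript
// ONNX TensorProto data type code for INT32 (INT64 is 7).
const ONNX_INT32 = 6;
// Ops assumed, for this sketch, to be the int64 producers worth casting.
const INT64_PRODUCERS = new Set(['Shape', 'Range']);

// `nodes` is a hypothetical simplified graph: [{ name, opType, inputs, outputs }],
// listed in topological order (as ONNX graphs are required to be).
function insertInt32Casts(nodes) {
  const renamed = new Map(); // original output name -> name of its int32 copy
  const result = [];
  let inserted = 0;

  for (const node of nodes) {
    // Point this node's inputs at the int32 copy where one already exists.
    const inputs = node.inputs.map((name) => renamed.get(name) ?? name);
    result.push({ ...node, inputs });

    if (INT64_PRODUCERS.has(node.opType)) {
      for (const output of node.outputs) {
        const castOutput = `${output}_int32`;
        result.push({
          name: `${output}_cast`,
          opType: 'Cast',
          inputs: [output],
          outputs: [castOutput],
          attributes: { to: ONNX_INT32 },
        });
        renamed.set(output, castOutput);
        inserted++;
      }
    }
  }
  return { nodes: result, inserted };
}

// Example: a Shape op feeding a Gather op.
const exampleGraph = [
  { name: 'shape_0', opType: 'Shape', inputs: ['x'], outputs: ['x_shape'] },
  { name: 'gather_0', opType: 'Gather', inputs: ['x_shape', 'idx'], outputs: ['dim'] },
];
const converted = insertInt32Casts(exampleGraph);
console.log(converted.inserted, converted.nodes.map((n) => n.opType));
```

A real conversion has more to handle than this sketch does: graph outputs and initializers are not rewritten here, and some ONNX operators require int64 inputs by specification (the shape input of Reshape, for example), so a blanket rewrite can produce an invalid model.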

Usage with Transformers.js

import { AutoModel, AutoProcessor } from '@huggingface/transformers';

// Load the model on the WebGPU backend with the q4f16 weights.
const model = await AutoModel.from_pretrained('spacekaren/chatterbox-turbo-webgpu', {
  device: 'webgpu',
  dtype: 'q4f16',
});

// Load the matching processor from the same repository.
const processor = await AutoProcessor.from_pretrained('spacekaren/chatterbox-turbo-webgpu');
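
The snippet above assumes WebGPU is available. As a hedged sketch, a page could check for it first and fall back to the `wasm` device otherwise; `pickDevice` is a hypothetical helper, and whether the q4f16 weights run acceptably on the fallback is not covered by this model card.

```javascript
// Hypothetical helper: choose a Transformers.js device based on WebGPU availability.
// `nav` defaults to the global navigator when one exists; it is a parameter so
// the function can also be exercised outside a browser.
async function pickDevice(nav = globalThis.navigator) {
  if (!nav || !nav.gpu) return 'wasm'; // WebGPU API is not exposed at all
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? 'webgpu' : 'wasm'; // a null adapter means no usable GPU
  } catch {
    return 'wasm'; // adapter request failed
  }
}

// Usage sketch (not executed here):
//   const device = await pickDevice();
//   const model = await AutoModel.from_pretrained('spacekaren/chatterbox-turbo-webgpu', {
//     device,
//     dtype: 'q4f16',
//   });
```

Checking `requestAdapter()` rather than only `navigator.gpu` matters because some browsers expose the API object even when no suitable adapter can be obtained.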

Model Size

Total: ~539 MB (q4f16 quantization)
Same architecture as the original; only the int64→int32 conversion differs

License

MIT (same as original)

Credits

Original model: ResembleAI/chatterbox-turbo-ONNX
Conversion script: local.core/scripts/convert_int64_to_int32.py