Help exporting my fine-tuned Chatterbox Multilingual to ONNX (new language)

#9
by arshambz - opened

Hi everyone,
I fine-tuned Chatterbox Multilingual (PyTorch) on my own dataset to add support for a new language (fa / Persian), and the fine-tuned model works correctly in PyTorch inference.

Now I want to export my fine-tuned model to ONNX, ideally in the same format/layout as this repo, which provides multiple ONNX graphs (e.g. speech_encoder.onnx, embed_tokens.onnx, conditional_decoder.onnx, and several language_model*.onnx variants like fp16/q4).

Goal

Export the fine-tuned model to ONNX for fast inference (onnxruntime).

Match this repo’s structure (multiple ONNX components + optional fp16/quantized variants).

If possible, keep dynamic shapes where supported.

What I tried

optimum-cli export onnx (generic export)

torch.onnx.export (manual export)

…but with neither approach am I sure how to reproduce the split-graph pipeline used here (separate language_model graphs + conditional_decoder + embed_tokens, plus the fp16/q4 variants).

Questions

What is the recommended export pipeline to reproduce this repo’s ONNX artifacts (speech_encoder, embed_tokens, conditional_decoder, language_model*)?

For a new language, do I need to update tokenizer.json / tokenizer_config.json / generation_config.json as well, and is there any special handling for language tokens?

Which parts typically change with language fine-tuning — do I need to re-export all ONNX components or only the language model pieces?

Any tips for validating correctness (comparing PyTorch vs ONNX outputs), and known gotchas (unsupported ops, dynamic axes, etc.)?
