ONNX

A fork of voicekit-team/T-one with ONNX and CUDA support.


Official model for our INTERSPEECH 2026 paper "A Data-Centric Framework for Addressing Phonetic and Prosodic Challenges in Russian Speech Generative Models" (arXiv:2507.13563). Part of the Balalaika Russian speech data-processing pipeline — code: https://github.com/lab260ru/balalaika. If you use this resource, please cite it.

This fork fixes extremely slow inference of the model on some devices and adds the ability to run inference directly on the GPU.

!pip install git+https://github.com/NikiPshg/T-one-cuda-onnx.git

Usage example

from tone import StreamingCTCPipeline, read_audio, read_example_audio


audio = read_example_audio()  # or read_audio("your_audio.flac")
# device_id: index of the graphics card to use; if no graphics card is found, the CPU is used
pipeline = StreamingCTCPipeline.from_hugging_face(device_id=0)
print(pipeline.forward_offline(audio))  # offline recognition using ONNX with CUDA
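
The CPU fallback mentioned in the comment above can be illustrated with a minimal sketch. It assumes the model runs through ONNX Runtime and that the fallback is a matter of choosing execution providers; this is an illustration, not the code this fork actually uses. The helper name `select_providers` is hypothetical. The provider names and the `device_id` provider option are standard ONNX Runtime identifiers.

```python
from typing import Any, Dict, List, Sequence, Tuple, Union

Provider = Union[str, Tuple[str, Dict[str, Any]]]


def select_providers(available: Sequence[str], device_id: int = 0) -> List[Provider]:
    """Hypothetical helper: prefer CUDA on the given device, otherwise CPU only."""
    if "CUDAExecutionProvider" in available:
        # Keep the CPU provider as a fallback for operators CUDA cannot run.
        return [
            ("CUDAExecutionProvider", {"device_id": device_id}),
            "CPUExecutionProvider",
        ]
    return ["CPUExecutionProvider"]


try:
    import onnxruntime as ort

    available = ort.get_available_providers()
except ImportError:
    # Assumption for this sketch: without ONNX Runtime, report CPU only.
    available = ["CPUExecutionProvider"]

providers = select_providers(available, device_id=0)
print(providers)
```

A list built this way could be passed as the `providers` argument of `onnxruntime.InferenceSession` when loading a model manually.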

Contact

Citation

If you use this resource, please cite our INTERSPEECH 2026 paper:

@inproceedings{borodin2026balalaika,
  title     = {A Data-Centric Framework for Addressing Phonetic and Prosodic Challenges in Russian Speech Generative Models},
  author    = {Borodin, Kirill and Vasiliev, Nikita and Kudryavtsev, Vasiliy and Maslov, Maxim and Gorodnichev, Mikhail and Rogov, Oleg and Mkrtchian, Grach},
  booktitle = {Proc. INTERSPEECH 2026},
  year      = {2026},
  note      = {arXiv:2507.13563},
  url       = {https://arxiv.org/abs/2507.13563}
}