A fork of voicekit-team/T-one with ONNX and CUDA support.

Official model for our INTERSPEECH 2026 paper "A Data-Centric Framework for Addressing Phonetic and Prosodic Challenges in Russian Speech Generative Models" (arXiv:2507.13563). Part of the Balalaika Russian speech data-processing pipeline — code: https://github.com/lab260ru/balalaika. If you use this resource, please cite it.

This fork addresses extremely slow inference on some devices and adds the ability to run inference directly on the GPU.

!pip install git+https://github.com/NikiPshg/T-one-cuda-onnx.git

Usage example

from tone import StreamingCTCPipeline, read_audio, read_example_audio


audio = read_example_audio() # or read_audio("your_audio.flac")
# device_id: index of the GPU to use; if no graphics card is found, the CPU is used
pipeline = StreamingCTCPipeline.from_hugging_face(device_id=0)
print(pipeline.forward_offline(audio))  # offline recognition using onnx cuda
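
Because the pipeline falls back to the CPU when no graphics card is found, it can help to check beforehand which backend is likely to be used. The snippet below is a minimal sketch, not part of this package: it assumes the fork runs on ONNX Runtime and selects its `CUDAExecutionProvider`, and the helper name `cuda_provider_available` is hypothetical.

```python
def cuda_provider_available() -> bool:
    """Return True if ONNX Runtime is installed and lists the CUDA provider."""
    try:
        # Assumption: the pipeline uses ONNX Runtime under the hood.
        import onnxruntime as ort
    except ImportError:
        return False
    # get_available_providers() lists the execution providers in this build.
    return "CUDAExecutionProvider" in ort.get_available_providers()


backend = "CUDA" if cuda_provider_available() else "CPU"
print(f"Expected inference backend: {backend}")
```

If this reports CPU on a machine with a GPU, the installed ONNX Runtime build probably lacks CUDA support or the CUDA libraries are not visible to it.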

Contact

Email: kborodin.research@gmail.com
Telegram: @korallll_ai

Citation

If you use this resource, please cite our INTERSPEECH 2026 paper:

@inproceedings{borodin2026balalaika,
  title     = {A Data-Centric Framework for Addressing Phonetic and Prosodic Challenges in Russian Speech Generative Models},
  author    = {Borodin, Kirill and Vasiliev, Nikita and Kudryavtsev, Vasiliy and Maslov, Maxim and Gorodnichev, Mikhail and Rogov, Oleg and Mkrtchian, Grach},
  booktitle = {Proc. INTERSPEECH 2026},
  year      = {2026},
  note      = {arXiv:2507.13563},
  url       = {https://arxiv.org/abs/2507.13563}
}
