---
license: apache-2.0
language:
- en
base_model:
- hexgrad/Kokoro-82M
pipeline_tag: text-to-speech
---

## Introduction

This repository hosts the [Kokoro](https://huggingface.co/hexgrad/Kokoro-82M) model for the [React Native ExecuTorch](https://www.npmjs.com/package/react-native-executorch) library.
The model is split into four parts, each exported in `.pte` format for the XNNPACK backend and ready for use in the ExecuTorch runtime.

Currently, the models are exported with static input shapes for 32, 64, and 128 input tokens, exposed through the methods
`forward_32`, `forward_64`, and `forward_128`, respectively.

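
Because the input shapes are static, a caller has to pick one of the three exported methods for each input. The sketch below shows one way this could be done: choose the smallest size that fits, then pad up to it. The helper names `select_method` and `pad_tokens` are hypothetical, and both the padding token id (`PAD_ID = 0`) and the idea of padding at all are assumptions; the actual pipeline may handle short or long inputs differently.

```python
from typing import List

# Token counts the models were exported for (taken from the text above).
STATIC_SIZES = (32, 64, 128)

# Assumption: 0 is used as the padding token id. Verify against the export script.
PAD_ID = 0


def select_method(num_tokens: int) -> str:
    """Return the name of the smallest exported method that fits num_tokens."""
    for size in STATIC_SIZES:
        if num_tokens <= size:
            return f"forward_{size}"
    raise ValueError(
        f"{num_tokens} tokens exceed the largest exported size ({STATIC_SIZES[-1]})"
    )


def pad_tokens(tokens: List[int]) -> List[int]:
    """Right-pad tokens with PAD_ID up to the size of the selected method."""
    method_name = select_method(len(tokens))
    size = int(method_name.rsplit("_", 1)[1])
    return tokens + [PAD_ID] * (size - len(tokens))
```

Inputs longer than 128 tokens raise an error here; splitting them into smaller chunks before synthesis would be one way to handle that case.
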
If you'd like to run these models in your own ExecuTorch runtime, refer to the
[official documentation](https://pytorch.org/executorch/stable/index.html) for setup instructions.

## Compatibility

These models were exported with ExecuTorch v1.0.0, and no forward compatibility is guaranteed.
Older versions of the runtime may not work with these files.

The models are intended for use within the React Native ExecuTorch package. To use them outside the package,
make sure your runtime is compatible with the ExecuTorch version used to export the `.pte` files, and follow the
[example script](https://github.com/NorbertKlockiewicz/kokoro-export/blob/main/demo/inference_example.py) to run the models.

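
When running the models from Python, the version constraint described in this section can be checked before loading any `.pte` file. The following is a minimal sketch, not part of the linked example script: the helper names are hypothetical, it assumes the runtime is installed as the `executorch` Python package, and it compares only the numeric release components, so pre-release or local suffixes are ignored.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# ExecuTorch version used for the export (taken from the text above).
EXPORT_VERSION = "1.0.0"


def parse_version(version: str) -> Tuple[int, ...]:
    """Extract up to three leading numeric components, e.g. '1.0.0+cpu' -> (1, 0, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    # Pad short versions such as '1.0' so tuple comparison behaves as expected.
    return tuple((parts + [0, 0, 0])[:3])


def runtime_is_new_enough(installed: str, exported: str = EXPORT_VERSION) -> bool:
    """Return True if the installed runtime is not older than the export version."""
    return parse_version(installed) >= parse_version(exported)


def installed_version(package: str = "executorch") -> Optional[str]:
    """Return the installed version of the package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

A passing check only rules out runtimes older than the export version; it does not guarantee that a newer runtime will load the files.
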
## Repository Structure

The repository contains three main directories:
- `phonemizer` - data files required by the [Phonemis](https://github.com/IgorSwat/Phonemis) package, which handles the input preprocessing step of the React Native ExecuTorch Kokoro pipeline.
- `voices` - a collection of pre-computed speaker embeddings used by the Kokoro model to synthesize speech with specific vocal characteristics.
- `xnnpack` - exported, XNNPACK-optimized Kokoro runtime modules.

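
After downloading the repository, a quick check that all three directories are present can catch an incomplete download early. This is a minimal sketch; `missing_directories` is a hypothetical helper, and it checks only the top-level directory names listed above, not the files inside them.

```python
from pathlib import Path
from typing import List

# Top-level directories listed in the repository structure above.
EXPECTED_DIRS = ("phonemizer", "voices", "xnnpack")


def missing_directories(root: str) -> List[str]:
    """Return the expected top-level directories that are absent under root."""
    base = Path(root)
    return [name for name in EXPECTED_DIRS if not (base / name).is_dir()]
```

An empty list means all three directories were found under the given path.
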