---
title: Kokoclone
emoji: 💻
colorFrom: blue
colorTo: pink
sdk: gradio
sdk_version: 6.8.0
app_file: app.py
pinned: false
python_version: 3.12.12
license: apache-2.0
short_description: Kokoro, But It Clones Voices Now
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# KokoClone

[![Hugging Face Space](https://img.shields.io/badge/🤗%20Hugging%20Face-Live%20Demo-blue)](https://huggingface.co/spaces/PatnaikAshish/kokoclone)
[![Hugging Face Models](https://img.shields.io/badge/🤗%20Models-Repository-orange)](https://huggingface.co/PatnaikAshish/kokoclone)
![Python](https://img.shields.io/badge/Python-3.10+-3776AB.svg?logo=python&logoColor=white)
[![License](https://img.shields.io/badge/License-Apache_2.0-green.svg)](https://opensource.org/licenses/Apache-2.0)



##  What is KokoClone?

**KokoClone** is a fast, multilingual voice cloning system suitable for real-time use. It is built on top of **Kokoro-ONNX**, an open-source neural TTS engine optimized for fast inference.

It allows you to:

* Type text in multiple languages
* Provide a short 3–10 second reference audio clip
* Instantly generate speech in that same voice


Just text → voice → cloned output.
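
Since the reference clip is expected to be roughly 3–10 seconds long, it can help to check a file before passing it in. The following is a minimal sketch using only the standard library; the helper names `clip_duration_seconds` and `is_good_reference` are hypothetical and not part of KokoClone, and the sketch assumes the reference is an uncompressed PCM WAV file.

```python
import wave

MIN_SECONDS = 3.0   # lower bound suggested in this README
MAX_SECONDS = 10.0  # upper bound suggested in this README


def clip_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_good_reference(path: str) -> bool:
    """True if the clip length falls inside the suggested 3 to 10 second window."""
    return MIN_SECONDS <= clip_duration_seconds(path) <= MAX_SECONDS
```

A call such as `is_good_reference("reference.wav")` could be run before cloning; whether KokoClone itself rejects clips outside this range is not stated here.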


##  Why Kokoro?

KokoClone is powered by **Kokoro-ONNX**, a highly optimized neural TTS engine designed for:

* Extremely fast inference
* Natural prosody and expressive speech
* Lightweight ONNX runtime compatibility
* Real-time deployment on CPU
* Even faster performance with GPU

Unlike many heavy TTS systems, Kokoro is lightweight and responsive, which makes KokoClone suitable for real-time applications, voice assistants, demos, and interactive tools.


##  Features

###  Multilingual Speech Generation

Generate native speech in:

* English (`en`)
* Hindi (`hi`)
* French (`fr`)
* Japanese (`ja`)
* Chinese (`zh`)
* Italian (`it`)
* Portuguese (`pt`)
* Spanish (`es`)


###  Zero-Shot Voice Cloning

Upload a short voice sample and KokoClone transfers its vocal characteristics to the generated speech.


###  Real-Time Friendly

Built on Kokoro’s efficient ONNX runtime pipeline, KokoClone runs smoothly on:

* Standard laptops (CPU)
* Workstations (GPU)


###  Automatic Model Handling

On first run, required model files are downloaded automatically and placed in the correct directories.
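
The general "download only if missing" pattern behind this kind of behavior can be sketched as follows. This is an illustration under assumptions, not KokoClone's actual code: `ensure_file` and the `fetch` callback are hypothetical, and the real file names and download sources are not specified in this README.

```python
from pathlib import Path
from typing import Callable


def ensure_file(path: Path, fetch: Callable[[Path], None]) -> bool:
    """Make sure `path` exists, calling `fetch(path)` only when it is missing.

    Returns True if a download was triggered, False if the file was already there.
    """
    if path.exists():
        return False
    # Create the target directory (for example model/ or voice/) on demand.
    path.parent.mkdir(parents=True, exist_ok=True)
    fetch(path)
    return True
```

In practice `fetch` might wrap something like `urllib.request.urlretrieve(url, path)`. Keeping it as a parameter keeps the existence check separate from the network call, so the logic can be exercised offline.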


###  Built-in Web Interface

Includes a clean and responsive Gradio UI for quick testing and demos.



##  Live Demo

Try it instantly without installing anything:

👉 **[KokoClone on Hugging Face Spaces](https://huggingface.co/spaces/PatnaikAshish/kokoclone)**



##  Installation

Recommended: Use `conda` for a clean environment.

###  Clone the Repository

```bash
git clone https://github.com/Ashish-Patnaik/kokoclone.git
cd kokoclone
```

###  Create Environment

```bash
conda create -n kokoclone python=3.12.12 -y
conda activate kokoclone
```



##  Install Dependencies

###  CPU Installation (Recommended for most users)

```bash
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt
```

###  GPU Installation (NVIDIA users)

```bash
pip install -r requirements.txt
# Quote the extra so shells such as zsh do not treat the brackets as a glob pattern
pip install "kokoro-onnx[gpu]"
```
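
After installing, one way to check whether a GPU-capable runtime is visible is to ask ONNX Runtime which execution providers it offers. This is a sketch that assumes the GPU extra installs a CUDA-enabled ONNX Runtime build; the helper names are hypothetical, and the function returns an empty list if `onnxruntime` is not importable.

```python
import importlib.util


def available_providers() -> list[str]:
    """List ONNX Runtime execution providers, or [] if onnxruntime is not installed."""
    if importlib.util.find_spec("onnxruntime") is None:
        return []
    import onnxruntime

    return list(onnxruntime.get_available_providers())


def gpu_available() -> bool:
    """True if the CUDA execution provider is reported by ONNX Runtime."""
    return "CUDAExecutionProvider" in available_providers()


if __name__ == "__main__":
    print("Providers:", available_providers())
    print("CUDA available:", gpu_available())
```

If only `CPUExecutionProvider` is listed, inference will most likely run on the CPU.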



##  Usage

KokoClone can be used in three ways:



###  Web Interface

Launch the Gradio app:

```bash
python app.py
```

Then open the local URL that Gradio prints in the terminal (typically `http://127.0.0.1:7860`) and use the interface to:

* Enter text
* Select language
* Upload a reference voice
* Generate cloned speech



###  Command Line

```bash
python cli.py --text "Hello from KokoClone" --lang en --ref reference.wav --out output.wav
```
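
To generate several clips in one go, the command above can be driven from a small script. The sketch below only uses the four flags shown above (`--text`, `--lang`, `--ref`, `--out`); the helper names `build_command` and `run_batch` are hypothetical, and `dry_run=True` is the default so nothing is executed until you opt in.

```python
import subprocess
import sys


def build_command(text: str, lang: str, ref: str, out: str) -> list[str]:
    """Build the argument list for one cli.py invocation."""
    return [
        sys.executable, "cli.py",
        "--text", text,
        "--lang", lang,
        "--ref", ref,
        "--out", out,
    ]


def run_batch(jobs: list[tuple[str, str, str]], ref: str, dry_run: bool = True) -> list[list[str]]:
    """Build (and optionally run) one command per (text, lang, out) job."""
    commands = [build_command(text, lang, ref, out) for text, lang, out in jobs]
    if not dry_run:
        for cmd in commands:
            # check=True stops the batch on the first failing invocation.
            subprocess.run(cmd, check=True)
    return commands
```

Passing the arguments as a list avoids shell quoting problems with text that contains spaces or quotes.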



###  Python API

```python
from core.cloner import KokoClone

cloner = KokoClone()

cloner.generate(
    text="This voice is cloned using KokoClone.",
    lang="en",
    reference_audio="reference.wav",
    output_path="output.wav"
)
```
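
The same call can be wrapped in a loop to produce several clips from one reference voice. This sketch assumes only the `generate(text=..., lang=..., reference_audio=..., output_path=...)` call shown above; `clone_many` and the output naming scheme are hypothetical. The cloner object is passed in so the helper does not import KokoClone itself.

```python
from pathlib import Path


def clone_many(cloner, lines, reference_audio, out_dir="outputs"):
    """Generate one WAV per (lang, text) pair and return the output paths.

    `cloner` is any object exposing the generate(...) method shown above.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, (lang, text) in enumerate(lines, start=1):
        target = out / f"{index:02d}_{lang}.wav"
        cloner.generate(
            text=text,
            lang=lang,
            reference_audio=reference_audio,
            output_path=str(target),
        )
        paths.append(target)
    return paths
```

For example, `clone_many(KokoClone(), [("en", "Hello"), ("fr", "Bonjour")], "reference.wav")` would request `outputs/01_en.wav` and `outputs/02_fr.wav`.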



##  Project Structure

```
app.py              → Gradio Web Interface
cli.py              → Command-line tool
core/cloner.py      → Core inference engine
inference.py        → Example usage script
model/              → Downloaded TTS model weights
voice/              → Voice embeddings
```



##  Use Cases

* Voice assistant prototypes
* Real-time TTS demos
* Multilingual narration tools
* Content creation
* Research experiments
* Interactive AI applications



##  Acknowledgments

This project builds upon:

* **Kokoro-ONNX**: fast and efficient neural speech synthesis
* **Kanade Tokenizer**: voice conversion architecture


##  License

Licensed under the Apache 2.0 License.