Update README.md
#8
by
Jingyi321
- opened
README.md
CHANGED
|
@@ -27,6 +27,11 @@ tags:
|
|
| 27 |
|
| 28 |
|
| 29 |
*Latest News* 🔥
|
| 30 |
- [2025/07] We support **Python inference** on **macOS** and **Windows** with usage of the prebuilt-lib!
|
| 31 |
- [2025/06] We **finally** released and **open-sourced** the **ONNX** model and the corresponding **preprocessing code**! Now you can deploy **TEN VAD** on **any platform** and **any hardware architecture**!
|
| 32 |
- [2025/06] We are excited to announce the release of **WASM+JS** for Web WASM Support.
|
|
@@ -39,10 +44,12 @@ tags:
|
|
| 39 |
- [Introduction](#introduction)
|
| 40 |
- [Key Features](#key-features)
|
| 41 |
- [High-Performance](#1-high-performance)
|
| 42 |
- [Agent-Friendly](#2-agent-friendly)
|
| 43 |
- [Lightweight](#3-lightweight)
|
| 44 |
- [Multiple Programming Languages and Platforms](#4-multiple-programming-languages-and-platforms)
|
| 45 |
- [Supported Sampling Rate and Hop Size](#5-supproted-sampling-rate-and-hop-size)
|
| 46 |
- [Installation](#installation)
|
| 47 |
- [Quick Start](#quick-start)
|
| 48 |
- [Python Usage](#python-usage)
|
|
@@ -108,7 +115,11 @@ The precision-recall curves comparing the performance of WebRTC VAD (pitch-based
|
|
| 108 |
<img src="./examples/images/PR_Curves_testset.png" width="800">
|
| 109 |
</div>
|
| 110 |
|
| 111 |
-
Note that the default threshold of 0.5 is used to generate binary speech indicators (0 for non-speech signal, 1 for speech signal). This threshold needs to be tuned according to your domain-specific task.
|
| 112 |
|
| 113 |
```
|
| 114 |
cd ./examples
|
|
@@ -202,6 +213,12 @@ TEN VAD provides cross-platform C compatibility across five operating systems (L
|
|
| 202 |
### **5. Supproted sampling rate and hop size:**
|
| 203 |
TEN VAD operates on 16kHz audio input with configurable hop sizes (optimized frame configurations: 160/256 samples=10/16ms). Other sampling rates must be resampled to 16kHz.
|
| 204 |
|
| 205 |
## **Installation**
|
| 206 |
```
|
| 207 |
git clone https://huggingface.co/TEN-framework/ten-vad
|
|
@@ -538,7 +555,7 @@ Most questions can be answered by using DeepWiki, it is fast, intutive to use an
|
|
| 538 |
|
| 539 |
## License
|
| 540 |
|
| 541 |
-
This project is licensed
|
| 542 |
|
| 543 |
|
| 544 |
|
|
|
|
| 27 |
|
| 28 |
|
| 29 |
*Latest News* 🔥
|
| 30 |
+
- [2025/11] **WASM** build guide and browser test demo are now available in `lib/Web` and `examples`.
|
| 31 |
+
- [2025/11] We now support **Python** inference with the **ONNX model** on **Linux** and **macOS**, thanks to Guy Nicholson!
|
| 32 |
+
- [2025/11] We now support **Golang** on **Linux**, **macOS** and **Windows** using the prebuilt libs, thanks to hylarucoder!
|
| 33 |
+
- [2025/11] We now support **Java** on **Linux**, **macOS**, **Windows** and **Android** using the prebuilt libs, thanks to ZhangYang!
|
| 34 |
+
- [2025/07] Exciting news! **TEN VAD** is now integrated into **k2-fsa/sherpa-onnx**, thanks to the fantastic work by Fangjun Kuang! You can now achieve more precise speech segment extraction and enjoy an enhanced ASR experience! Refer to the [documentation](https://k2-fsa.github.io/sherpa/onnx/vad/ten-vad.html) and give it a try!
|
| 35 |
- [2025/07] We support **Python inference** on **macOS** and **Windows** with usage of the prebuilt-lib!
|
| 36 |
- [2025/06] We **finally** released and **open-sourced** the **ONNX** model and the corresponding **preprocessing code**! Now you can deploy **TEN VAD** on **any platform** and **any hardware architecture**!
|
| 37 |
- [2025/06] We are excited to announce the release of **WASM+JS** for Web WASM Support.
|
|
|
|
| 44 |
- [Introduction](#introduction)
|
| 45 |
- [Key Features](#key-features)
|
| 46 |
- [High-Performance](#1-high-performance)
|
| 47 |
+
- [Performance Comparison](#11-performance-comparison)
|
| 48 |
- [Agent-Friendly](#2-agent-friendly)
|
| 49 |
- [Lightweight](#3-lightweight)
|
| 50 |
- [Multiple Programming Languages and Platforms](#4-multiple-programming-languages-and-platforms)
|
| 51 |
- [Supported Sampling Rate and Hop Size](#5-supproted-sampling-rate-and-hop-size)
|
| 52 |
+
- [Developers Testimonial](#developers-testimonial)
|
| 53 |
- [Installation](#installation)
|
| 54 |
- [Quick Start](#quick-start)
|
| 55 |
- [Python Usage](#python-usage)
|
|
|
|
| 115 |
<img src="./examples/images/PR_Curves_testset.png" width="800">
|
| 116 |
</div>
|
| 117 |
|
| 118 |
+
Note that the default threshold of 0.5 is used to generate binary speech indicators (0 for non-speech signal, 1 for speech signal). This threshold needs to be tuned according to your domain-specific task.
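The thresholding step described above can be sketched in a few lines of plain Python. This is only an illustration, not the TEN VAD API: the helper name `to_speech_flags`, the sample probabilities, and the choice of `>=` (rather than `>`) at the boundary are all assumptions.

```python
def to_speech_flags(probabilities, threshold=0.5):
    """Map per-frame speech probabilities to binary flags.

    1 marks a speech frame, 0 a non-speech frame. The function name and the
    inclusive comparison (>=) are assumptions made for this sketch.
    """
    return [1 if p >= threshold else 0 for p in probabilities]


# Hypothetical per-frame probabilities, made up for illustration.
probs = [0.10, 0.48, 0.52, 0.91, 0.30]

# Default threshold of 0.5.
print(to_speech_flags(probs))                 # [0, 0, 1, 1, 0]

# A stricter threshold flags fewer frames as speech.
print(to_speech_flags(probs, threshold=0.6))  # [0, 0, 0, 1, 0]
```

Raising the threshold trades recall for precision, which is why it is worth tuning on data from your own domain.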
|
| 119 |
+
|
| 120 |
+
#### **1.1 Performance Comparison**
|
| 121 |
+
|
| 122 |
+
Developers can reproduce the performance comparison PR curves for **TEN VAD** and **Silero VAD** on the open-source testset (as shown in the figure above) by running the following script on Linux x64 with a single line of code. The output figure will be saved in the same directory as the script.
|
| 123 |
|
| 124 |
```
|
| 125 |
cd ./examples
|
|
|
|
| 213 |
### **5. Supproted sampling rate and hop size:**
|
| 214 |
TEN VAD operates on 16kHz audio input with configurable hop sizes (optimized frame configurations: 160/256 samples=10/16ms). Other sampling rates must be resampled to 16kHz.
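The relation between hop size and frame duration quoted above follows from `duration = hop_size / sample_rate`. The sketch below checks that arithmetic and shows one way to cut a 16kHz signal into hops. The helper names `hop_duration_ms` and `split_into_hops` are hypothetical and not part of TEN VAD, and dropping the trailing partial hop is an assumption of this example.

```python
SAMPLE_RATE_HZ = 16000  # input rate stated in the README


def hop_duration_ms(hop_size, sample_rate=SAMPLE_RATE_HZ):
    """Duration of one hop in milliseconds."""
    return 1000.0 * hop_size / sample_rate


def split_into_hops(samples, hop_size):
    """Cut a sample sequence into consecutive, non-overlapping hops.

    A trailing partial hop is dropped (an assumption of this sketch).
    """
    n_hops = len(samples) // hop_size
    return [samples[i * hop_size:(i + 1) * hop_size] for i in range(n_hops)]


print(hop_duration_ms(160))  # 10.0
print(hop_duration_ms(256))  # 16.0

# One second of silence at 16kHz, as a stand-in for real audio.
one_second = [0] * SAMPLE_RATE_HZ
print(len(split_into_hops(one_second, 160)))  # 100
print(len(split_into_hops(one_second, 256)))  # 62
```

Audio recorded at another rate would need to be resampled to 16kHz before being split this way.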
|
| 215 |
|
| 216 |
+
## **Developers Testimonial**
|
| 217 |
+
> "We selected TEN VAD because it provides faster and more accurate sentence-end detection in Japanese compared to other VADs, while still being lightweight and fast enough for live use." - LiveCap, Hakase shojo.
|
| 218 |
+
|
| 219 |
+
> "TEN VAD's overall performance is better than Silero VAD. Its high accuracy and low resource consumption helped us improve efficiency and significantly reduce costs." - Rustpbx.
|
| 220 |
+
|
| 221 |
+
|
| 222 |
## **Installation**
|
| 223 |
```
|
| 224 |
git clone https://huggingface.co/TEN-framework/ten-vad
|
|
|
|
| 555 |
|
| 556 |
## License
|
| 557 |
|
| 558 |
+
This project is licensed under Apache 2.0 with additional conditions. Refer to the "LICENSE" file in the root directory for detailed information. Note that `pitch_est.cc` contains modified code derived from [LPCNet](https://github.com/xiph/LPCNet), which is [BSD-2-Clause](https://spdx.org/licenses/BSD-2-Clause.html) and [BSD-3-Clause](https://spdx.org/licenses/BSD-3-Clause.html) licensed; refer to the NOTICES file in the root directory for detailed information.
|
| 559 |
|
| 560 |
|
| 561 |
|