marcus-daily committed e6cfb72 (1 parent: 27ab809)
Add initial model card

README.md CHANGED
@@ -1,3 +1,34 @@
----
-
-
+---
+pipeline_tag: voice-activity-detection
+license: bsd-2-clause
+tags:
+- speech-processing
+- semantic-vad
+- multilingual
+datasets:
+- pipecat-ai/smart-turn-data-v3-train
+- pipecat-ai/smart-turn-data-v3-test
+---
+
+# Smart Turn v3
+
+**Smart Turn v3** is an open-source semantic Voice Activity Detection (VAD) model that determines whether a speaker has finished their turn by analysing the raw waveform rather than a transcript.
+
+## Links
+
+* [Blog post: Smart Turn v3](https://www.daily.co/blog/)
+* [GitHub repo](https://github.com/pipecat-ai/smart-turn) with training and inference code
+* [Datasets](https://github.com/pipecat-ai/datasets)
+
+
+## Model architecture
+
+* Backbone: Whisper Tiny encoder
+* Head: shallow linear classifier
+* Params: 8 M (int8)
+* Checkpoint: 8 MB ONNX
+
+
+## How to use
+
+See the blog post and GitHub repo for details on using the model, either standalone or with Pipecat.
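
The parameter count and checkpoint size listed under "Model architecture" are consistent with each other: int8 quantization stores one byte per weight, so roughly 8 M parameters come to roughly 8 MB. The sketch below works through that arithmetic. `estimate_checkpoint_mb` is a hypothetical helper written for illustration, not part of the Smart Turn codebase, and it assumes every weight is stored at the stated precision while ignoring ONNX graph metadata.

```python
# Hypothetical helper: back-of-the-envelope weight storage from a parameter count.
# Assumes all weights share one precision and ignores ONNX graph overhead.
BYTES_PER_PARAM = {"int8": 1, "float16": 2, "float32": 4}


def estimate_checkpoint_mb(num_params: int, dtype: str = "int8") -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


if __name__ == "__main__":
    params = 8_000_000  # "8 M" from the model card
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: ~{estimate_checkpoint_mb(params, dtype):.0f} MB")
```

Under these assumptions the same weights would occupy about four times as much space in float32, which is why the int8 figure lines up with the 8 MB ONNX file.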