pipecat-ai/smart-turn-v3

marcus-daily committed on Sep 11

Commit e6cfb72 · 1 Parent(s): 27ab809

Add initial model card

Files changed (1)

README.md +34 -3

@@ -1,3 +1,34 @@
----
-license: bsd-2-clause
----

+---
+pipeline_tag: voice-activity-detection
+license: bsd-2-clause
+tags:
+  - speech-processing
+  - semantic-vad
+  - multilingual
+datasets:
+  - pipecat-ai/smart-turn-data-v3-train
+  - pipecat-ai/smart-turn-data-v3-test
+---
+# Smart Turn v3
+**Smart Turn v3** is an open‑source semantic Voice Activity Detection (VAD) model that tells you whether a speaker has finished their turn by analysing the raw waveform, not the transcript.
+## Links
+* [Blog post: Smart Turn v3](https://www.daily.co/blog/)
+* [GitHub repo](https://github.com/pipecat-ai/smart-turn) with training and inference code
+* [Datasets](https://github.com/pipecat-ai/datasets) used for training and testing
+## Model architecture
+* Backbone : Whisper Tiny encoder
+* Head     : shallow linear classifier
+* Params   : 8 M (int8)
+* Checkpoint: 8 MB ONNX
+## How to use
+Please see the blog post and GitHub repo for more information on using the model, either standalone or with Pipecat.
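
A quick sanity check on the "Model architecture" figures in this card: 8 M parameters stored as int8 should come to roughly 8 MB of weights, which matches the stated checkpoint size. The sketch below only does that arithmetic. The helper name `checkpoint_size_mb` is hypothetical, and it assumes every parameter is quantized and ignores ONNX graph metadata, so treat the result as an estimate rather than the exact file size.

```python
def checkpoint_size_mb(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes).

    Hypothetical helper for illustration; assumes all parameters use the
    same bit width and ignores any non-weight data in the checkpoint.
    """
    return num_params * bits_per_param / 8 / 1_000_000


# "8 M" parameters, as listed in the model card.
params = 8_000_000

int8_mb = checkpoint_size_mb(params, 8)    # quantized weights, 1 byte each
fp32_mb = checkpoint_size_mb(params, 32)   # same weights at 4 bytes each, for comparison

print(f"int8: {int8_mb:.1f} MB")
print(f"fp32: {fp32_mb:.1f} MB")
```

Under these assumptions the int8 estimate agrees with the 8 MB ONNX checkpoint listed in the card, and the same weights in 32-bit floats would take about four times as much space.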