---
title: README
emoji: 🎙️
colorFrom: purple
colorTo: blue
sdk: static
pinned: false
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/6925c2ee090bdc6b345198ba/NKvn4WwF9my5CrmuIYR8p.png
---

## Real-Time Voice AI Models

Krisp builds ultra-low-latency AI models for **voice AI agents**, **contact centers**, and **real-time communication**, powering **80B+ monthly audio minutes** across **200M+ devices**. Trusted by Discord, Twilio, RingCentral, Vonage, and more.

---

### VIVA SDK: Voice Intelligence for Voice AI Agents

Server-side models that sit in front of your voice activity detection (VAD) or speech-to-text (STT) pipeline, optimized for on-server CPU deployment.

| Model | Description |
|---|---|
| **Voice Isolation** | Removes background noise, secondary voices, and cross-talk from agent audio streams. Language- and accent-independent. |
| **Turn Prediction** | Predicts when a speaker is likely to finish their turn, enabling natural conversation flow without awkward pauses or premature interruptions. Audio-only; no transcription required. |
| **Voice Activity Detection** | Accurate real-time speech and silence detection for clean voice pipelines. Reduces false triggers and improves system responsiveness. |

### RTC SDK: Speech Enhancement for Human-to-Human Calls

Client-side and server-side models for noise cancellation, accent conversion, and voice translation.

| Model | Description |
|---|---|
| **Noise Cancellation** | Inbound and outbound noise removal, background voice cancellation, and de-reverberation. |
| **Accent Conversion** | Real-time accent transformation for contact center agents. Improves customer satisfaction (CSAT) and average handle time (AHT). |
| **Voice Translation** | Bidirectional real-time voice-to-voice translation across multiple languages. |

---

### Open Datasets & Benchmarks

- [`turn-taking-test-v1`](https://huggingface.co/datasets/Krisp-AI/turn-taking-test-v1): 4 hours of annotated conversational audio for turn prediction benchmarking (976 shift and 1,754 hold cases, 30 speakers)

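
When benchmarking against `turn-taking-test-v1`, it may help to keep the class imbalance in mind. The sketch below uses only the case counts quoted above and assumes "shift" and "hold" are the only two labels. The function names are illustrative and not part of any Krisp SDK or of the dataset's tooling. It shows that a trivial model that always predicts "hold" would score roughly 64% plain accuracy, which is why a balanced metric is a reasonable companion number.

```python
# Case counts as quoted in the dataset description (assumed to be the only two labels).
SHIFT_CASES = 976
HOLD_CASES = 1754


def class_balance(shift: int, hold: int) -> dict:
    """Summarize label balance and the accuracy of an always-majority baseline."""
    total = shift + hold
    return {
        "total": total,
        "shift_fraction": shift / total,
        "hold_fraction": hold / total,
        # Accuracy of a trivial classifier that always predicts the larger class.
        "majority_baseline_accuracy": max(shift, hold) / total,
    }


def balanced_accuracy(shift_correct: int, shift_total: int,
                      hold_correct: int, hold_total: int) -> float:
    """Mean of per-class recall, so both classes count equally."""
    return 0.5 * (shift_correct / shift_total + hold_correct / hold_total)


stats = class_balance(SHIFT_CASES, HOLD_CASES)
print(f"total cases: {stats['total']}")
print(f"shift fraction: {stats['shift_fraction']:.1%}")
print(f"hold fraction: {stats['hold_fraction']:.1%}")
print(f"always-'hold' accuracy: {stats['majority_baseline_accuracy']:.1%}")

# The same always-'hold' baseline gets every hold right and every shift wrong.
always_hold = balanced_accuracy(0, SHIFT_CASES, HOLD_CASES, HOLD_CASES)
print(f"always-'hold' balanced accuracy: {always_hold:.1%}")
```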
---

### Integrations

Works with **Pipecat**, **LiveKit**, **Vapi**, and **Daily**. Available as compiled C, JS/WASM, Python, Node.js, Go, and Rust bindings.

---

**[Website](https://krisp.ai)** · **[Developer SDK](https://krisp.ai/developers/)** · **[Documentation](https://sdk-docs.krisp.ai/)** · **[Engineering Blog](https://krisp.ai/blog/category/company/engineering/)** · **[LinkedIn](https://www.linkedin.com/company/krisp/)** · **[Twitter](https://twitter.com/krispHQ)**