---
title: README
emoji: 🎙️
colorFrom: purple
colorTo: blue
sdk: static
pinned: false
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/6925c2ee090bdc6b345198ba/NKvn4WwF9my5CrmuIYR8p.png
---

![Krisp AI](https://cdn-uploads.huggingface.co/production/uploads/6925c2ee090bdc6b345198ba/NKvn4WwF9my5CrmuIYR8p.png)

## Real-Time Voice AI Models

Krisp builds ultra-low-latency AI models for **voice AI agents**, **contact centers**, and **real-time communication**, powering **80B+ monthly audio minutes** across **200M+ devices**. Trusted by Discord, Twilio, RingCentral, Vonage, and more.

---

### VIVA SDK — Voice Intelligence for Voice AI Agents

Server-side models that sit in front of your VAD or STT pipeline, optimized for on-server CPU deployment.

| Model | Description |
|---|---|
| **Voice Isolation** | Removes background noise, secondary voices, and cross-talk from agent audio streams. Language and accent independent. |
| **Turn Prediction** | Predicts when a speaker is likely to finish their turn, enabling natural conversation flow without awkward pauses or premature interruptions. Audio-only, no transcription required. |
| **Voice Activity Detection** | Accurate real-time speech and silence detection for clean voice pipelines. Reduces false triggers and improves system responsiveness. |
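
As a rough illustration of that placement, the sketch below chains per-frame stages so that an isolation step runs before VAD, with STT assumed to consume whatever survives. Every name in it (`Frame`, `run_pipeline`, `toy_isolation`, `toy_vad`) is hypothetical and stands in for real components; it is not the VIVA SDK API.

```python
from typing import Callable, Iterable, List, Optional

Frame = List[float]  # one block of mono PCM samples (hypothetical representation)
Stage = Callable[[Frame], Optional[Frame]]  # a stage returns None to drop the frame


def run_pipeline(frames: Iterable[Frame], stages: List[Stage]) -> List[Frame]:
    """Pass each frame through the stages in order; stop early if a stage drops it."""
    kept: List[Frame] = []
    for frame in frames:
        current: Optional[Frame] = frame
        for stage in stages:
            current = stage(current)
            if current is None:
                break
        if current is not None:
            kept.append(current)
    return kept


# Toy stand-ins, NOT real models: a gate that zeroes quiet samples,
# followed by an energy-threshold VAD.
def toy_isolation(frame: Frame) -> Frame:
    return [s if abs(s) >= 0.1 else 0.0 for s in frame]


def toy_vad(frame: Frame) -> Optional[Frame]:
    energy = sum(s * s for s in frame) / len(frame)
    return frame if energy > 0.01 else None


frames = [[0.05, -0.04, 0.03], [0.5, -0.6, 0.4]]
# Isolation runs first, so the VAD only sees cleaned audio.
speech_frames = run_pipeline(frames, [toy_isolation, toy_vad])
```

In this toy run the first frame is gated to silence and dropped by the VAD, so only the second frame would reach a downstream STT stage.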

### RTC SDK — Speech Enhancement for Human-to-Human Calls

Client-side and server-side models for noise cancellation, accent conversion, and voice translation.

| Model | Description |
|---|---|
| **Noise Cancellation** | Inbound and outbound noise removal, background voice cancellation, and de-reverberation. |
| **Accent Conversion** | Real-time accent transformation for contact center agents — improves CSAT and AHT. |
| **Voice Translation** | Bidirectional real-time voice-to-voice translation across multiple languages. |

---

### Open Datasets & Benchmarks

- [`turn-taking-test-v1`](https://huggingface.co/datasets/Krisp-AI/turn-taking-test-v1) — 4 hours of annotated conversational audio for turn prediction benchmarking (976 shift + 1,754 hold cases, 30 speakers)
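
With 976 shift and 1,754 hold cases, the set is imbalanced, so plain accuracy can flatter a predictor that always answers "hold". The sketch below shows one way to score against that, using balanced accuracy. The label strings `"shift"` and `"hold"` and the helper `balanced_accuracy` are illustrative assumptions, not the dataset's actual schema or an official metric.

```python
from typing import Sequence

N_SHIFT, N_HOLD = 976, 1754  # case counts stated for turn-taking-test-v1


def balanced_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Mean of per-class recall; assumes both classes occur in y_true."""
    recalls = []
    for label in ("shift", "hold"):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Baseline that always predicts "hold", using the stated class counts.
y_true = ["shift"] * N_SHIFT + ["hold"] * N_HOLD
y_pred = ["hold"] * len(y_true)

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
baseline_balanced = balanced_accuracy(y_true, y_pred)
```

Here the always-hold baseline reaches roughly 64% plain accuracy (1,754 of 2,730 cases) but only 50% balanced accuracy, which is why a class-aware metric may be the more informative number to report.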

---

### Integrations

Works with **Pipecat** · **LiveKit** · **Vapi** · **Daily**. Available as compiled C, JS/WASM, Python, Node.js, Go, and Rust bindings.

---

**[Website](https://krisp.ai)** · **[Developer SDK](https://krisp.ai/developers/)** · **[Documentation](https://sdk-docs.krisp.ai/)** · **[Engineering Blog](https://krisp.ai/blog/category/company/engineering/)** · **[LinkedIn](https://www.linkedin.com/company/krisp/)** · **[Twitter](https://twitter.com/krispHQ)**