language: en
license: mit
tags:
- speaker-diarization
- audio
- fluidaudio
- ios
library_name: fluidaudio
ZantOS FluidAudio Diarization Model
Production speaker diarization model for the ZantOS iOS app, built on FluidAudio.
Model Details
- ID: zant-os/zant-diarize-fluidaudio
- Version: 0.6.1
- Backend: FluidAudio + CoreML
- Input: 16kHz mono WAV
- Output: Speaker segments with timestamps and IDs
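Because the model expects 16 kHz mono WAV input, it can help to check a file's header before handing it to the engine. The following is a minimal sketch, not part of FluidAudio or ZantOS: `WAVFormat`, `readLE`, `parseWAVFormat`, and `isSupportedInput` are hypothetical helper names, and the parser assumes a plain RIFF/WAVE file with a standard `fmt ` chunk.

```swift
import Foundation

// Hypothetical helper type; not part of FluidAudio.
struct WAVFormat: Equatable {
    let channels: Int
    let sampleRate: Int
    let bitsPerSample: Int
}

/// Reads a little-endian unsigned integer of `count` bytes starting at `offset`.
func readLE(_ bytes: [UInt8], at offset: Int, count: Int) -> Int? {
    guard offset >= 0, offset + count <= bytes.count else { return nil }
    var value = 0
    for i in 0..<count {
        value |= Int(bytes[offset + i]) << (8 * i)
    }
    return value
}

/// Walks the RIFF chunks until it finds "fmt " and returns its key fields.
func parseWAVFormat(_ data: Data) -> WAVFormat? {
    let bytes = [UInt8](data)
    guard bytes.count >= 12,
          Array(bytes[0..<4]) == Array("RIFF".utf8),
          Array(bytes[8..<12]) == Array("WAVE".utf8) else { return nil }

    var offset = 12
    while let size = readLE(bytes, at: offset + 4, count: 4) {
        let id = Array(bytes[offset..<offset + 4])
        if id == Array("fmt ".utf8) {
            guard size >= 16,
                  let channels = readLE(bytes, at: offset + 10, count: 2),
                  let rate = readLE(bytes, at: offset + 12, count: 4),
                  let bits = readLE(bytes, at: offset + 22, count: 2) else { return nil }
            return WAVFormat(channels: channels, sampleRate: rate, bitsPerSample: bits)
        }
        offset += 8 + size + (size % 2)  // chunk bodies are padded to an even length
    }
    return nil
}

/// True when the file matches the input format listed above (16 kHz, mono).
func isSupportedInput(_ format: WAVFormat) -> Bool {
    format.channels == 1 && format.sampleRate == 16_000
}
```

A caller could load the file with `Data(contentsOf:)`, pass it to `parseWAVFormat`, and resample or downmix when `isSupportedInput` returns `false`.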
Performance
| Dataset | DER | Speedup | Device |
|---|---|---|---|
| AMI (test) | TBD | 50x | iPhone 15 Pro |
| zant-echo-golden v1.0.0 | TBD | 50x | iPhone 15 Pro |
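For readers unfamiliar with the table's columns, here is a small sketch of how the two metrics are commonly computed. DER is the standard ratio of missed speech, false alarm, and speaker confusion time to total reference speech time. The interpretation of "Speedup" as audio duration divided by processing time is an assumption, since the card does not define it. The type and function names are hypothetical, and the numbers in the usage note are illustrative only, not measurements of this model.

```swift
import Foundation

/// Error durations in seconds, as reported by a scoring tool (hypothetical container).
struct DiarizationErrors {
    var missedSpeech: Double
    var falseAlarm: Double
    var speakerConfusion: Double
    var totalReferenceSpeech: Double
}

/// DER = (missed + false alarm + confusion) / total reference speech.
/// Returns nil when there is no reference speech to normalize by.
func diarizationErrorRate(_ e: DiarizationErrors) -> Double? {
    guard e.totalReferenceSpeech > 0 else { return nil }
    return (e.missedSpeech + e.falseAlarm + e.speakerConfusion) / e.totalReferenceSpeech
}

/// Assumed meaning of "Speedup": audio duration divided by processing time.
func realTimeSpeedup(audioSeconds: Double, processingSeconds: Double) -> Double? {
    guard processingSeconds > 0 else { return nil }
    return audioSeconds / processingSeconds
}
```

Under that assumption, a 50x speedup would correspond to processing a 600-second recording in 12 seconds.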
Usage
This model is consumed via the ZantAMI v0.1 contract.
Swift Integration
import FluidAudio
// Models are downloaded automatically via Swift Package Manager
let engine = ProductionDiarizationEngine.shared
await engine.initialize()
// `url` is the file URL of a 16 kHz mono WAV recording (see Input above)
let result = try await engine.diarize(audioURL: url)
// Example result: [SpeakerSegment(start: 0.0, end: 5.3, speaker: "You", confidence: 0.85)]
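Once segments come back, a typical next step is to aggregate them per speaker. The sketch below is illustrative: the `SpeakerSegment` struct is a hypothetical stand-in that mirrors only the fields shown in the example result (`start`, `end`, `speaker`, `confidence`), and `talkTime` is not part of FluidAudio or the ZantAMI contract.

```swift
import Foundation

// Hypothetical stand-in mirroring the fields in the example result;
// the real type returned by the engine may differ.
struct SpeakerSegment {
    let start: Double       // seconds
    let end: Double         // seconds
    let speaker: String
    let confidence: Double
}

/// Sums speaking time per speaker, skipping segments below `minConfidence`
/// and segments with a non-positive duration.
func talkTime(in segments: [SpeakerSegment], minConfidence: Double = 0.0) -> [String: Double] {
    var totals: [String: Double] = [:]
    for s in segments where s.confidence >= minConfidence && s.end > s.start {
        totals[s.speaker, default: 0] += s.end - s.start
    }
    return totals
}
```

Filtering by confidence before aggregation keeps low-certainty segments from inflating a speaker's total; the right threshold depends on the application.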
Model Architecture
FluidAudio uses:
- Segmentation: Voice Activity Detection
- Embedding: WeSpeaker model (256-dimensional)
- Clustering: Hierarchical clustering with cosine similarity
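To make the clustering stage concrete, here is a toy sketch of hierarchical (agglomerative) clustering driven by cosine similarity. It is not FluidAudio's implementation: average linkage, the `threshold` stopping rule, and the function names are assumptions chosen for illustration, and real speaker embeddings would be 256-dimensional rather than the short vectors used for testing.

```swift
import Foundation

/// Cosine similarity between two equal-length vectors; 0 if either is all zeros.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "vectors must have equal length")
    var dot = 0.0, normA = 0.0, normB = 0.0
    for i in a.indices {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    let denom = (normA * normB).squareRoot()
    return denom > 0 ? dot / denom : 0
}

/// Average-linkage agglomerative clustering: repeatedly merge the two most
/// similar clusters until no pair reaches `threshold`.
/// Returns one cluster label per input embedding.
func clusterEmbeddings(_ embeddings: [[Double]], threshold: Double) -> [Int] {
    // Start with every embedding in its own cluster (stored as index lists).
    var clusters: [[Int]] = embeddings.indices.map { [$0] }

    // Mean pairwise cosine similarity between the members of two clusters.
    func linkage(_ x: [Int], _ y: [Int]) -> Double {
        var sum = 0.0
        for i in x { for j in y { sum += cosineSimilarity(embeddings[i], embeddings[j]) } }
        return sum / Double(x.count * y.count)
    }

    while clusters.count > 1 {
        var best = (i: 0, j: 1, sim: -Double.infinity)
        for i in 0..<clusters.count {
            for j in (i + 1)..<clusters.count {
                let sim = linkage(clusters[i], clusters[j])
                if sim > best.sim { best = (i, j, sim) }
            }
        }
        guard best.sim >= threshold else { break }
        clusters[best.i] += clusters[best.j]
        clusters.remove(at: best.j)
    }

    var labels = [Int](repeating: 0, count: embeddings.count)
    for (label, members) in clusters.enumerated() {
        for m in members { labels[m] = label }
    }
    return labels
}
```

Each resulting label corresponds to one inferred speaker. This naive version recomputes all pairwise linkages on every merge, which is fine for a handful of segments but would need caching for long recordings.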
License
MIT License