language: en
license: mit
tags:
- speaker-diarization
- audio
- fluidaudio
- ios
library_name: fluidaudio

ZantOS FluidAudio Diarization Model

Production speaker diarization model for the ZantOS iOS app, built on FluidAudio.

Model Details

  • ID: zant-os/zant-diarize-fluidaudio
  • Version: 0.6.1
  • Backend: FluidAudio + CoreML
  • Input: 16 kHz mono WAV
  • Output: Speaker segments with timestamps and IDs

Performance

| Dataset | DER | Speedup | Device |
|---|---|---|---|
| AMI (test) | TBD | 50x | iPhone 15 Pro |
| zant-echo-golden v1.0.0 | TBD | 50x | iPhone 15 Pro |
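
For reference, DER (diarization error rate) is conventionally defined as the fraction of reference speech time that is attributed incorrectly. The scoring setup that will be used for the figures above (collar size, handling of overlapped speech) is not stated here, so the formula below is only the standard definition, not a description of this model's evaluation protocol.

```latex
\mathrm{DER} =
  \frac{T_{\text{false alarm}} + T_{\text{missed}} + T_{\text{confusion}}}
       {T_{\text{reference speech}}}
```

Lower is better. A DER of 0.10 means that 10% of the reference speech time was missed, falsely detected, or assigned to the wrong speaker.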

Usage

This model is consumed via the ZantAMI v0.1 contract.

Swift Integration

import FluidAudio

// Models are downloaded automatically via Swift Package Manager
let engine = ProductionDiarizationEngine.shared
await engine.initialize()

// `url` is a file URL pointing to a 16 kHz mono WAV recording
let result = try await engine.diarize(audioURL: url)
// Returns, for example: [SpeakerSegment(start: 0.0, end: 5.3, speaker: "You", confidence: 0.85)]
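
The returned segments can be post-processed directly. The following is a minimal sketch, assuming a `SpeakerSegment` shape inferred from the sample output above. The struct definition, the `talkTime` helper, and the `minConfidence` default are hypothetical illustrations, not part of FluidAudio or the ZantOS codebase.

```swift
import Foundation

// Hypothetical mirror of the segment type shown in the sample output;
// the real type may have different fields or names.
struct SpeakerSegment {
    let start: Double       // segment start, in seconds
    let end: Double         // segment end, in seconds
    let speaker: String     // speaker label
    let confidence: Double  // 0.0 ... 1.0
}

// Sums speaking time per speaker, skipping segments below `minConfidence`
// (an assumed cutoff chosen for illustration).
func talkTime(in segments: [SpeakerSegment],
              minConfidence: Double = 0.5) -> [String: Double] {
    var totals: [String: Double] = [:]
    for segment in segments where segment.confidence >= minConfidence {
        totals[segment.speaker, default: 0] += segment.end - segment.start
    }
    return totals
}

// Made-up segments for demonstration only.
let segments = [
    SpeakerSegment(start: 0.0, end: 5.0, speaker: "You", confidence: 0.85),
    SpeakerSegment(start: 5.0, end: 9.0, speaker: "Speaker 2", confidence: 0.9),
    SpeakerSegment(start: 9.0, end: 12.0, speaker: "You", confidence: 0.3),
]
let totals = talkTime(in: segments)
print(totals)
```

Here the third segment falls below the cutoff and is ignored, so "You" is credited with 5 seconds and "Speaker 2" with 4 seconds.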

Model Architecture

FluidAudio uses:
- Segmentation: Voice Activity Detection
- Embedding: WeSpeaker model (256-dimensional)
- Clustering: Hierarchical clustering with cosine similarity
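
The clustering stage listed above can be sketched as follows. This is an illustrative average-linkage agglomerative clustering over cosine similarity, not FluidAudio's actual implementation: the linkage rule, the `threshold` value, and the function names are assumptions, and the toy embeddings are 3-dimensional rather than 256-dimensional to keep the example readable.

```swift
import Foundation

// Cosine similarity between two equal-length embedding vectors.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "embeddings must have the same dimension")
    var dot = 0.0, normA = 0.0, normB = 0.0
    for i in 0..<a.count {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    let denom = (normA * normB).squareRoot()
    return denom > 0 ? dot / denom : 0
}

// Average-linkage agglomerative clustering: repeatedly merge the two most
// similar clusters until no pair reaches `threshold` (an assumed value).
// Returns clusters as lists of indices into `embeddings`.
func clusterEmbeddings(_ embeddings: [[Double]], threshold: Double = 0.7) -> [[Int]] {
    var clusters = embeddings.indices.map { [$0] }
    while clusters.count > 1 {
        var best: (i: Int, j: Int, score: Double)? = nil
        for i in 0..<clusters.count {
            for j in (i + 1)..<clusters.count {
                // Mean pairwise similarity between members of the two clusters.
                var sum = 0.0
                for a in clusters[i] {
                    for b in clusters[j] {
                        sum += cosineSimilarity(embeddings[a], embeddings[b])
                    }
                }
                let score = sum / Double(clusters[i].count * clusters[j].count)
                if score > (best?.score ?? -Double.infinity) {
                    best = (i: i, j: j, score: score)
                }
            }
        }
        // Stop when even the closest pair is not similar enough to merge.
        guard let merge = best, merge.score >= threshold else { break }
        clusters[merge.i] += clusters[merge.j]
        clusters.remove(at: merge.j)  // safe: merge.j is always greater than merge.i
    }
    return clusters
}

// Toy embeddings: two pairs of vectors pointing in nearly the same direction.
let embeddings: [[Double]] = [
    [1.0, 0.0, 0.0], [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0], [0.0, 0.9, 0.1],
]
let clusters = clusterEmbeddings(embeddings)
print(clusters)
```

Each resulting cluster corresponds to one hypothesized speaker. With the toy input, the first two vectors end up in one cluster and the last two in another.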

License

MIT License