mlx-community/mimi-encoder-mlx

The encoder half of Kyutai's Mimi neural audio codec, converted to MLX format for native inference on Apple Silicon and consumed by the xocialize/mimi-encoder-mlx-swift Swift port. Refer to the original kyutai/mimi model card for full details.

Model

  • Family: Mimi neural audio codec (Kyutai / Moshi — Défossez et al., arXiv:2410.00037)
  • This artifact: the encoder only (SEANet conv encoder → causal transformer → stride-2 downsample → split RVQ)
  • Input: 24000 Hz, mono
  • Output: [16, T] codebook-index grid at 12.5 Hz (1 semantic + 15 acoustic codebooks)
  • Precision: fp32 (145 tensors)
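
The input and output rates above fix how many audio samples each code frame covers: 24000 / 12.5 = 1920 samples per frame. The sketch below works through that arithmetic in plain Swift. `estimatedFrameCount` is a hypothetical helper, not part of the Swift port, and it assumes a trailing partial frame is padded up to a full frame; the actual encoder may round differently.

```swift
// Rates taken from the model card: 24000 Hz mono input, 12.5 Hz code frames.
let sampleRate = 24_000.0
let frameRate = 12.5

// Each code frame spans 24000 / 12.5 = 1920 input samples.
let samplesPerFrame = Int(sampleRate / frameRate)

// Hypothetical helper (not part of the Swift port): estimates T for a clip,
// ASSUMING a trailing partial frame is padded rather than dropped.
func estimatedFrameCount(sampleCount: Int) -> Int {
    precondition(sampleCount >= 0, "sampleCount must be non-negative")
    return (sampleCount + samplesPerFrame - 1) / samplesPerFrame
}

let oneSecond = estimatedFrameCount(sampleCount: 24_000)    // 12.5 frames, rounded up
let tenSeconds = estimatedFrameCount(sampleCount: 240_000)  // exactly 125 frames
print(samplesPerFrame, oneSecond, tenSeconds)
```

Under the padding assumption, ten seconds of audio would yield a [16, 125] grid.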

Files

  • encoder.safetensors — the MLX encoder weights (fp32), extracted/converted from kyutai/mimi.

Usage (Swift / MLX)

import MimiCodecEncoder

let encoder = MimiEncoder(config: .qwen3TTS12Hz)
try encoder.loadWeights(from: encoderWeightsURL)   // encoder.safetensors
let codes = encoder.encode(audio: audioArray)      // [16, T]
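
Downstream code often treats the semantic codebook separately from the acoustic ones. The following is a minimal sketch of that split, assuming the [16, T] grid has already been copied into a nested Swift array and that row 0 holds the semantic codebook; the model card does not state the row order, so verify it against the Swift port. `MimiCodes` and `splitCodebooks` are hypothetical names used only for illustration.

```swift
// Hypothetical container for a [16, T] grid split into its two groups.
struct MimiCodes {
    let semantic: [Int32]    // 1 row of T indices
    let acoustic: [[Int32]]  // 15 rows of T indices each
}

// Hypothetical helper: validates the grid shape and splits it.
// ASSUMPTION: row 0 is the semantic codebook, rows 1...15 are acoustic.
func splitCodebooks(_ grid: [[Int32]]) -> MimiCodes? {
    guard grid.count == 16,
          let frames = grid.first?.count,
          grid.allSatisfy({ $0.count == frames }) else {
        return nil  // wrong number of codebooks or ragged rows
    }
    return MimiCodes(semantic: grid[0], acoustic: Array(grid[1...]))
}

// Dummy grid with T = 4, where every entry equals its row index.
let dummyGrid: [[Int32]] = (0..<16).map { row in
    [Int32](repeating: Int32(row), count: 4)
}
let split = splitCodebooks(dummyGrid)
print(split?.acoustic.count ?? -1)
```

Returning an optional keeps a malformed grid from being indexed out of bounds; a throwing variant would work equally well.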

Source

  • Base model: kyutai/mimi
  • Paper: Défossez et al., arXiv:2410.00037

License

CC-BY-4.0 (Kyutai) — permissive, attribution required. This is a derivative (encoder-only, format-converted) of kyutai/mimi; attribution to Kyutai is retained.
