MMS-LID-256 CoreML

CoreML conversion of facebook/mms-lid-256 for native inference on Apple Silicon (macOS 14+ / iOS 17+).

Identifies 256 spoken languages from a raw audio waveform. No Python required at runtime.

Model Details

| Property | Value |
| --- | --- |
| Source | facebook/mms-lid-256 |
| Architecture | Wav2Vec2 for Sequence Classification |
| Languages | 256 |
| Input | Raw waveform [1, N], 16 kHz float32 |
| Output | Logits [1, 256] |
| Size | 1.8 GB |
| Precision | FP16 |
| Min deployment | macOS 14 / iOS 17 |
| Compute units | CPU + GPU (ANE causes 13x slowdown) |
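The input row above asks for a 16 kHz float32 waveform, and the usage comment further down caps the input at 480000 samples (30 s). As a rough sketch of what preparing such an array might look like when the audio starts out as 16-bit PCM: `prepareWaveform` is a hypothetical helper (not part of this package), and scaling to [-1, 1) is an assumption about what the converted model expects, so check it against the reference pipeline before relying on it.

```swift
import Foundation

/// Sample rate the model expects, per the table above.
let sampleRate = 16_000
/// 30 s cap in samples (30 * 16000 = 480000).
let maxSamples = 30 * sampleRate

/// Hypothetical helper: converts 16-bit mono PCM to Float32 and
/// truncates anything past the 30 s cap.
/// Assumption: the model wants samples scaled to [-1, 1).
func prepareWaveform(_ pcm16: [Int16]) -> [Float] {
    let capped = pcm16.prefix(maxSamples)
    return capped.map { Float($0) / 32768.0 }
}
```

Audio recorded at another sample rate would need resampling to 16 kHz first, which this sketch does not cover.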

Benchmark Results

Tested on Apple Silicon (M1, Metal GPU, .cpuAndGPU):

| Audio | Predicted | Confidence | Inference Time |
| --- | --- | --- | --- |
| Russian (10s) | rus | 89.1% | ~0.25s |
| English (30s) | eng | n/a | ~4.0s |

⚠️ Do NOT set computeUnits = .all: the ANE causes a 13x slowdown (3.3s vs 0.25s for 10s audio) due to data transfer overhead between ANE and GPU.

Usage (Swift)

import CoreML

let compiledURL = try MLModel.compileModel(at: URL(fileURLWithPath: "MmsLid256.mlpackage"))
let config = MLModelConfiguration()
config.computeUnits = .cpuAndGPU  // .all causes a 13x slowdown; do NOT use the ANE
let model = try MLModel(contentsOf: compiledURL, configuration: config)

// pcm: [Float], 16kHz mono audio samples (max 480000 = 30s)
let inputArray = try MLMultiArray(shape: [1, NSNumber(value: pcm.count)], dataType: .float32)
for (i, sample) in pcm.enumerated() {
    inputArray[[0, NSNumber(value: i)]] = NSNumber(value: sample)
}

let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_values": MLFeatureValue(multiArray: inputArray),
])
let output = try model.prediction(from: input)
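The snippet above stops at the raw prediction. The model emits logits of shape [1, 256], so turning them into a ranked list of languages with confidence values typically means a softmax followed by a sort. The following is a minimal sketch of that step on a plain `[Float]`; `softmax` and `topK` are illustrative helpers, and it is an assumption that the confidence figures in the benchmark table are softmax probabilities.

```swift
import Foundation

/// Numerically stable softmax over raw logits.
func softmax(_ logits: [Float]) -> [Float] {
    guard let maxLogit = logits.max() else { return [] }
    // Subtracting the maximum keeps exp() from overflowing.
    let exps = logits.map { exp($0 - maxLogit) }
    let sum = exps.reduce(0, +)
    return exps.map { $0 / sum }
}

/// Returns the `k` most probable (index, probability) pairs, best first.
func topK(_ logits: [Float], k: Int) -> [(index: Int, probability: Float)] {
    let probs = softmax(logits)
    return probs.enumerated()
        .sorted { $0.element > $1.element }
        .prefix(k)
        .map { (index: $0.offset, probability: $0.element) }
}
```

To feed this from CoreML, the logits would first be copied out of the output `MLMultiArray`, for example with `(0..<array.count).map { array[$0].floatValue }`. The output feature name is not stated here; it can be looked up via `model.modelDescription.outputDescriptionsByName`.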

Files

  • MmsLid256.mlpackage/: CoreML model
  • mms_lid_256_labels.json: language label mapping (index → ISO 639-3 code)
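The label file maps class indices to ISO 639-3 codes, but its exact JSON layout is not documented here. The sketch below assumes a flat object with stringified indices as keys, such as `{"0": "...", "1": "..."}` (the shape Hugging Face configs use for `id2label`); `parseLabels` is a hypothetical helper and would need adjusting if the file is structured differently.

```swift
import Foundation

/// Hypothetical helper: parses a label map of the ASSUMED form
/// {"0": "<code>", "1": "<code>", ...} into a dictionary keyed by class index.
/// Entries whose keys are not integers are skipped.
func parseLabels(_ data: Data) throws -> [Int: String] {
    let raw = try JSONDecoder().decode([String: String].self, from: data)
    var labels: [Int: String] = [:]
    for (key, code) in raw {
        if let index = Int(key) {
            labels[index] = code
        }
    }
    return labels
}
```

With the file on disk, this could be called as `try parseLabels(Data(contentsOf: URL(fileURLWithPath: "mms_lid_256_labels.json")))`, and the index of the largest logit then looked up in the result.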

Conversion

Converted via torch.jit.trace → coremltools 9.0. See conversion script.

Full Inference Code

Complete Swift CLI with audio loading, inference, and result formatting: github.com/beshkenadze/lid-bench

License

CC-BY-NC 4.0 (same as the original model)
