MiA Emb 4B CoreML

MiA-Emb-4B converted to CoreML for Apple Silicon (M1/M2/M3/M4).

Model Details

  • Format: CoreML ML Program (.mlpackage)
  • Precision: FP16
  • Input: input_ids (1, 512), attention_mask (1, 512)
  • Output: embeddings (1, 3584)
  • Target: macOS 14+ / iOS 17+ / Apple Silicon
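Because the model takes fixed-shape (1, 512) inputs, shorter token sequences must be padded and longer ones truncated, with the attention mask marking the real tokens. A minimal NumPy sketch (the helper name `pad_to_fixed` is illustrative; token ids would come from the base model's tokenizer):

```python
import numpy as np

MAX_LEN = 512  # fixed sequence length baked into the CoreML model

def pad_to_fixed(token_ids):
    """Pad/truncate a list of token ids to the model's fixed (1, 512) shape."""
    ids = np.zeros((1, MAX_LEN), dtype=np.int32)   # 0 assumed to be the pad id
    mask = np.zeros((1, MAX_LEN), dtype=np.int32)
    n = min(len(token_ids), MAX_LEN)
    ids[0, :n] = token_ids[:n]
    mask[0, :n] = 1  # attend only to real tokens
    return ids, mask
```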

Usage

import coremltools as ct
import numpy as np

# Load model (first load compiles for the device and is slow; cached after)
model = ct.models.MLModel("coreml_fp16.mlpackage")

# Prepare inputs -- replace these dummy arrays with real token ids from the
# base model's tokenizer, padded/truncated to length 512
input_ids = np.zeros((1, 512), dtype=np.int32)
attention_mask = np.ones((1, 512), dtype=np.int32)

# Run inference
output = model.predict({"input_ids": input_ids, "attention_mask": attention_mask})
embeddings = output["embeddings"]  # Shape: (1, 3584)
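Embedding outputs are typically compared by cosine similarity. A minimal sketch for scoring two (1, 3584) outputs with NumPy (the card does not state whether embeddings come out L2-normalized, so this normalizes explicitly):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between row vectors in a and b."""
    # Normalize each row to unit length, then take dot products.
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T
```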

Performance

Benchmarked on Apple M4:

  • Inference: ~80-100ms per embedding
  • Load time: ~13s (first load, cached after)

Conversion

Converted using coremltools 8.1 with custom op handlers for:

  • new_ones (GitHub issue #2040)
  • Bitwise ops (and, or) with int→bool casting

License

Same license as base model: MindscapeRAG/MiA-Emb-4B
