# bge-m3 CoreML

BGE-M3 converted to CoreML for Apple Silicon (M1/M2/M3/M4).
## Model Details

- Format: CoreML ML Program (`.mlpackage`)
- Precision: FP16
- Input: `input_ids` (1, 512), `attention_mask` (1, 512)
- Output: `embeddings` (1, 1024)
- Target: macOS 14+ / iOS 17+ / Apple Silicon
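Since the model takes fixed (1, 512) inputs, shorter token sequences must be padded and longer ones truncated. A minimal sketch of that preprocessing in NumPy, assuming a pad token id of 1 (the XLM-RoBERTa convention used by the BGE-M3 tokenizer — verify against your tokenizer):

```python
import numpy as np

def pad_to_length(token_ids, max_len=512, pad_id=1):
    # pad_id=1 is an assumption (XLM-RoBERTa-style padding); check your tokenizer.
    ids = list(token_ids)[:max_len]                      # truncate to max_len
    attn = [1] * len(ids) + [0] * (max_len - len(ids))   # 1 = real token, 0 = padding
    ids = ids + [pad_id] * (max_len - len(ids))          # pad ids to max_len
    return (
        np.asarray([ids], dtype=np.int32),               # shape (1, 512)
        np.asarray([attn], dtype=np.int32),              # shape (1, 512)
    )

input_ids, attention_mask = pad_to_length([5, 6, 7])
```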
## Usage

```python
import coremltools as ct
import numpy as np

# Load model
model = ct.models.MLModel("coreml_fp16.mlpackage")

# Prepare inputs (use your tokenizer)
input_ids = np.zeros((1, 512), dtype=np.int32)
attention_mask = np.ones((1, 512), dtype=np.int32)

# Run inference
output = model.predict({"input_ids": input_ids, "attention_mask": attention_mask})
embeddings = output["embeddings"]  # Shape: (1, 1024)
```
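The 1024-dimensional embeddings are typically compared with cosine similarity. A self-contained sketch using dummy vectors in place of real model output:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # L2-normalize both vectors, then take the dot product
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.dot(a, b))

# Dummy 1024-dim vectors standing in for two embeddings from the model;
# e2 is a small perturbation of e1, so their similarity should be near 1.
e1 = np.random.default_rng(0).normal(size=1024)
e2 = e1 + 0.01 * np.random.default_rng(1).normal(size=1024)
sim = cosine_similarity(e1, e2)
```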
## Performance

Benchmarked on Apple M4:

- Inference: ~80-100 ms per embedding
- Load time: ~13 s (first load; cached afterwards)
## Conversion

Converted using coremltools 8.1 with custom op handlers for:

- `new_ones` (GitHub issue #2040)
- Bitwise ops (`and`, `or`) with int→bool casting
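For context on the int→bool casting: CoreML's logical ops expect boolean operands, while attention masks in the traced graph are integer tensors. The idea, illustrated here in plain NumPy (this is not the converter code itself):

```python
import numpy as np

# Integer masks as they appear in the traced model
mask_a = np.array([[1, 1, 0, 0]], dtype=np.int32)
mask_b = np.array([[1, 0, 1, 0]], dtype=np.int32)

# Cast int -> bool, apply the logical op, then cast back to int
combined = np.logical_and(mask_a.astype(bool), mask_b.astype(bool)).astype(np.int32)
```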
## License

Same license as the base model: BAAI/bge-m3