---
license: apache-2.0
tags:
- coreml
- apple-silicon
- embeddings
- sentence-transformers
library_name: coremltools
base_model: MindscapeRAG/MiA-Emb-4B
pipeline_tag: feature-extraction
---

# MiA Emb 4B CoreML

MiA-Emb-4B converted to CoreML for Apple Silicon (M1/M2/M3/M4).

## Model Details

- **Format**: CoreML ML Program (`.mlpackage`)
- **Precision**: FP16
- **Input**: `input_ids` (1, 512), `attention_mask` (1, 512)
- **Output**: `embeddings` (1, 3584)
- **Target**: macOS 14+ / iOS 17+ / Apple Silicon

## Usage

```python
import coremltools as ct
import numpy as np

# Load the model
model = ct.models.MLModel("coreml_fp16.mlpackage")

# Prepare inputs (use your tokenizer)
input_ids = np.zeros((1, 512), dtype=np.int32)
attention_mask = np.ones((1, 512), dtype=np.int32)

# Run inference
output = model.predict({"input_ids": input_ids, "attention_mask": attention_mask})
embeddings = output["embeddings"]  # Shape: (1, 3584)
```

## Performance

Benchmarked on Apple M4:

- **Inference**: ~80-100 ms per embedding
- **Load time**: ~13 s (first load; cached afterward)

## Conversion

Converted with coremltools 8.1, using custom op handlers for:

- `new_ones` (GitHub issue #2040)
- Bitwise ops (`and`, `or`) with int→bool casting

## License

Same license as the base model: MindscapeRAG/MiA-Emb-4B.
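
## Comparing Embeddings

Embedding vectors from this model are typically compared with cosine similarity. A minimal NumPy sketch (the 3584-dim shape matches the model's output; the random vectors here are placeholders for real `model.predict` results):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Normalize both vectors to unit length, then take the dot product.
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.dot(a, b))

# Illustrative 3584-dim vectors; in practice use output["embeddings"][0].
rng = np.random.default_rng(0)
emb_a = rng.standard_normal(3584)
emb_b = rng.standard_normal(3584)

# Self-similarity is 1.0 (up to floating-point error).
print(cosine_similarity(emb_a, emb_a))
print(cosine_similarity(emb_a, emb_b))
```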