license: mit
language:
  - multilingual
  - en
  - ja
  - zh
  - ko
  - de
  - fr
  - es
  - it
  - pt
  - ru
library_name: coreml
tags:
  - sentence-transformers
  - embeddings
  - coreml
  - ios
  - macos
  - multilingual
  - e5
  - semantic-search
base_model: intfloat/multilingual-e5-small
pipeline_tag: sentence-similarity

Multilingual E5 Small - CoreML

This is a CoreML conversion of intfloat/multilingual-e5-small for iOS/macOS deployment.

Model Description

Multilingual E5 Small is a multilingual sentence embedding model optimized for semantic search and retrieval tasks. This CoreML version enables on-device inference on Apple platforms.

Key Features

  • Multilingual: Supports 100+ languages including English, Japanese, Chinese, Korean, German, French, Spanish, and more
  • Search-optimized: Designed specifically for retrieval tasks
  • Cross-lingual: Can match queries in one language to documents in another
  • On-device: Runs locally on iPhone/iPad/Mac without internet

Model Details

| Property | Value |
|---|---|
| Base Model | intfloat/multilingual-e5-small |
| Embedding Dimensions | 384 |
| Max Sequence Length | 256 (configurable up to 512) |
| Model Size | ~224 MB |
| Precision | Float16 |
| Minimum iOS | 17.0 |
| Minimum macOS | 14.0 |

Usage

Input Format

E5 models use prefixes to distinguish between queries and documents:

  • Query: "query: your search query here"
  • Document: "passage: your document text here"
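
The prefix rule is easy to get wrong when queries and documents go through the same code path. A minimal sketch of two helpers that apply it follows, written in Python for brevity (the same logic ports directly to Swift). The function names `as_query` and `as_passage` are illustrative and not part of this repository.

```python
QUERY_PREFIX = "query: "
PASSAGE_PREFIX = "passage: "


def as_query(text: str) -> str:
    """Prepend the E5 query prefix unless it is already present."""
    text = text.strip()
    return text if text.startswith(QUERY_PREFIX) else QUERY_PREFIX + text


def as_passage(text: str) -> str:
    """Prepend the E5 passage prefix unless it is already present."""
    text = text.strip()
    return text if text.startswith(PASSAGE_PREFIX) else PASSAGE_PREFIX + text
```

Apply `as_query` to search input and `as_passage` to indexed text before tokenization, so that both sides of a comparison are embedded with the prefix they were trained with.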

Swift Example

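```swift
import CoreML

// Load model. MLModel(contentsOf:) expects a compiled model (.mlmodelc);
// Xcode compiles a bundled .mlpackage automatically at build time.
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine
let model = try MLModel(contentsOf: modelURL, configuration: config)

// Prepare inputs (after tokenization): Int32 arrays of shape [1, 256],
// matching the fixed input shape used in the conversion.
let inputIds = try MLMultiArray(shape: [1, 256], dataType: .int32)
let attentionMask = try MLMultiArray(shape: [1, 256], dataType: .int32)
// Fill inputIds with token IDs and attentionMask with 1 (token) or 0 (padding).

// Run inference
let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputIds),
    "attention_mask": MLFeatureValue(multiArray: attentionMask)
])
let output = try model.prediction(from: input)
let embeddings = output.featureValue(for: "embeddings")?.multiArrayValue
```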
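
Once query and passage embeddings are available, semantic search reduces to comparing vectors. The sketch below ranks passages by cosine similarity, which is the usual metric for E5 embeddings. This card does not state the metric, so treat that choice as an assumption. It is written in plain Python to show the arithmetic, and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, p))
              for i, p in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

On device, the same computation can be done over the 384 values read from the `embeddings` output.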

Tokenizer

Use the included tokenizer.json with swift-transformers:

```swift
import Tokenizers

let tokenizer = try await AutoTokenizer.from(modelFolder: tokenizerURL)
let encoded = tokenizer.encode(text: "query: your text")
```
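
The model takes fixed-length inputs (256 tokens in the conversion settings listed in this card), so the token IDs from the tokenizer have to be padded or truncated before they are copied into the input arrays. The following is a minimal Python sketch of that step. The pad token ID of 1 is an assumption based on the XLM-RoBERTa vocabulary that the base model uses, and should be checked against the included tokenizer.json.

```python
MAX_LEN = 256      # fixed sequence length from the conversion settings
PAD_TOKEN_ID = 1   # assumption: <pad> ID in the XLM-RoBERTa vocabulary


def pad_or_truncate(token_ids, max_len=MAX_LEN, pad_id=PAD_TOKEN_ID):
    """Return (input_ids, attention_mask), each exactly max_len long."""
    ids = list(token_ids[:max_len])        # drop tokens beyond max_len
    mask = [1] * len(ids)                  # 1 marks a real token
    padding = max_len - len(ids)
    return ids + [pad_id] * padding, mask + [0] * padding
```

Note that this naive truncation also cuts off any end-of-sequence token on over-long inputs; a production version may want to keep it in the final position.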

Performance

Tested on Apple devices with the Neural Engine:

| Device | Inference Time |
|---|---|
| iPhone 15 Pro | ~15 ms |
| iPhone 13 | ~25 ms |
| M1 Mac | ~10 ms |

Accuracy Comparison

Tested with 10 mixed Japanese/English technical queries:

| Model | Accuracy | Avg Score |
|---|---|---|
| Apple NLEmbedding | 20% | 0.558 |
| This Model (E5) | 100% | 0.860 |

Files

  • MultilingualE5Small.mlpackage/ - CoreML model package
  • tokenizer.json - Tokenizer vocabulary and configuration
  • tokenizer_config.json - Tokenizer settings

Conversion

Converted using coremltools with FP16 precision:

```python
import coremltools as ct
import numpy as np

# traced_model: the PyTorch model traced beforehand (e.g. with torch.jit.trace)
mlmodel = ct.convert(
    traced_model,
    inputs=[
        # Fixed input shape: batch of 1, sequence length 256
        ct.TensorType(name="input_ids", shape=(1, 256), dtype=np.int32),
        ct.TensorType(name="attention_mask", shape=(1, 256), dtype=np.int32),
    ],
    outputs=[
        ct.TensorType(name="embeddings", dtype=np.float16),
    ],
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.iOS17,
    compute_precision=ct.precision.FLOAT16,
)
```
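
The converted model exposes a single `embeddings` output, which suggests that pooling happens inside the traced graph. The upstream E5 models pool token vectors with an attention-masked mean. The sketch below shows that step in plain Python, on the assumption (not confirmed by this card) that the traced wrapper does the same. The function name is illustrative.

```python
def masked_mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose attention_mask entry is 1."""
    dim = len(token_vectors[0])
    totals = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:                      # skip padding positions
            count += 1
            for j, value in enumerate(vec):
                totals[j] += value
    if count == 0:
        return totals                 # all padding: return the zero vector
    return [t / count for t in totals]
```

Masking matters here because the inputs are padded to a fixed length; averaging over padding positions would pull short texts toward the same vector.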

License

MIT License (same as the base model)

Citation

@article{wang2024multilingual,
  title={Multilingual E5 Text Embeddings: A Technical Report},
  author={Wang, Liang and Yang, Nan and Huang, Xiaolong and Yang, Linjun and Majumder, Rangan and Wei, Furu},
  journal={arXiv preprint arXiv:2402.05672},
  year={2024}
}

Acknowledgments