---
license: mit
language:
- multilingual
- en
- ja
- zh
- ko
- de
- fr
- es
- it
- pt
- ru
library_name: coreml
tags:
- sentence-transformers
- embeddings
- coreml
- ios
- macos
- multilingual
- e5
- semantic-search
base_model: intfloat/multilingual-e5-small
pipeline_tag: sentence-similarity
---
# Multilingual E5 Small - CoreML
This is a CoreML conversion of [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) for iOS/macOS deployment.
## Model Description
Multilingual E5 Small is a multilingual sentence embedding model optimized for semantic search and retrieval tasks. This CoreML version enables on-device inference on Apple platforms.
### Key Features
- **Multilingual**: Supports 100+ languages including English, Japanese, Chinese, Korean, German, French, Spanish, and more
- **Search-optimized**: Designed specifically for retrieval tasks
- **Cross-lingual**: Can match queries in one language to documents in another
- **On-device**: Runs locally on iPhone/iPad/Mac without internet
## Model Details
| Property | Value |
|----------|-------|
| Base Model | intfloat/multilingual-e5-small |
| Embedding Dimensions | 384 |
| Max Sequence Length | 256 (configurable up to 512) |
| Model Size | ~224 MB |
| Precision | Float16 |
| Minimum iOS | 17.0 |
| Minimum macOS | 14.0 |
## Usage
### Input Format
E5 models use prefixes to distinguish between queries and documents:
- **Query**: `"query: your search query here"`
- **Document**: `"passage: your document text here"`
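E5 was trained with these prefixes, so omitting them degrades retrieval quality. As a minimal illustration (the helper names are my own, not part of this repo), the prefixes can be applied with a small pair of helpers before tokenization:

```python
def format_query(text: str) -> str:
    """Prefix a search query as E5 expects."""
    return f"query: {text}"

def format_passage(text: str) -> str:
    """Prefix a document/passage as E5 expects."""
    return f"passage: {text}"

print(format_query("best hiking trails near Kyoto"))
# query: best hiking trails near Kyoto
print(format_passage("Kyoto is surrounded by forested mountains."))
# passage: Kyoto is surrounded by forested mountains.
```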
### Swift Example
```swift
import CoreML

// Load the model, preferring the Neural Engine
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine
let model = try MLModel(contentsOf: modelURL, configuration: config)

// Prepare inputs: token IDs and attention mask from the tokenizer,
// each an MLMultiArray of shape (1, 256) holding Int32 values
let inputIds = try MLMultiArray(shape: [1, 256], dataType: .int32)
let attentionMask = try MLMultiArray(shape: [1, 256], dataType: .int32)
// ... fill both arrays from the tokenizer output ...

// Run inference
let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputIds),
    "attention_mask": MLFeatureValue(multiArray: attentionMask)
])
let output = try model.prediction(from: input)
let embeddings = output.featureValue(for: "embeddings")?.multiArrayValue
```
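Embeddings from this model are typically compared with cosine similarity. A minimal, framework-free sketch of that comparison (shown in Python for brevity; the same arithmetic applies to the `MLMultiArray` values on device):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Parallel vectors score 1.0; orthogonal vectors score 0.0
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```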
### Tokenizer
Use the included `tokenizer.json` with [swift-transformers](https://github.com/huggingface/swift-transformers):
```swift
import Tokenizers
let tokenizer = try await AutoTokenizer.from(modelFolder: tokenizerURL)
let encoded = tokenizer.encode(text: "query: your text")
```
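Because the CoreML model was converted with a fixed (1, 256) input shape, token IDs must be truncated or padded to exactly 256 before building the `MLMultiArray`s. A language-agnostic sketch of that step (pure Python for illustration; the `pad_id` default is an assumption — read the real pad token ID from `tokenizer_config.json`):

```python
def pad_to_fixed_length(ids, length=256, pad_id=0):
    """Truncate or pad token IDs to a fixed length and build the attention mask.

    pad_id=0 is illustrative; use the pad token ID from the tokenizer config.
    """
    ids = ids[:length]                                 # truncate if too long
    mask = [1] * len(ids) + [0] * (length - len(ids))  # 1 = real token, 0 = padding
    ids = ids + [pad_id] * (length - len(ids))         # pad if too short
    return ids, mask

input_ids, attention_mask = pad_to_fixed_length([101, 2023, 102], length=8)
print(input_ids)       # [101, 2023, 102, 0, 0, 0, 0, 0]
print(attention_mask)  # [1, 1, 1, 0, 0, 0, 0, 0]
```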
## Performance
Approximate per-inference latency with the Neural Engine enabled:
| Device | Inference Time |
|--------|----------------|
| iPhone 15 Pro | ~15ms |
| iPhone 13 | ~25ms |
| M1 Mac | ~10ms |
## Accuracy Comparison
Evaluated on a small set of 10 mixed Japanese/English technical queries:
| Model | Accuracy | Avg Score |
|-------|----------|-----------|
| Apple NLEmbedding | 20% | 0.558 |
| **This Model (E5)** | **100%** | **0.860** |
## Files
- `MultilingualE5Small.mlpackage/` - CoreML model package
- `tokenizer.json` - Tokenizer vocabulary and configuration
- `tokenizer_config.json` - Tokenizer settings
## Conversion
Converted using coremltools with FP16 precision:
```python
import numpy as np
import coremltools as ct

# traced_model: the base PyTorch model, traced with torch.jit.trace
# using fixed-shape (1, 256) int32 example inputs
mlmodel = ct.convert(
    traced_model,
    inputs=[
        ct.TensorType(name="input_ids", shape=(1, 256), dtype=np.int32),
        ct.TensorType(name="attention_mask", shape=(1, 256), dtype=np.int32),
    ],
    outputs=[
        ct.TensorType(name="embeddings", dtype=np.float16),
    ],
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.iOS17,
    compute_precision=ct.precision.FLOAT16,
)
```
## License
MIT License (same as the base model)
## Citation
```bibtex
@article{wang2024multilingual,
title={Multilingual E5 Text Embeddings: A Technical Report},
author={Wang, Liang and Yang, Nan and Huang, Xiaolong and Yang, Linjun and Majumder, Rangan and Wei, Furu},
journal={arXiv preprint arXiv:2402.05672},
year={2024}
}
```
## Acknowledgments
- Original model by [intfloat](https://huggingface.co/intfloat)
- CoreML conversion for [ReadMD](https://github.com/user/readmd) iOS app