---
license: mit
language:
  - multilingual
  - en
  - ja
  - zh
  - ko
  - de
  - fr
  - es
  - it
  - pt
  - ru
library_name: coreml
tags:
  - sentence-transformers
  - embeddings
  - coreml
  - ios
  - macos
  - multilingual
  - e5
  - semantic-search
base_model: intfloat/multilingual-e5-small
pipeline_tag: sentence-similarity
---

# Multilingual E5 Small - CoreML

This is a CoreML conversion of [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) for iOS/macOS deployment.

## Model Description

Multilingual E5 Small is a multilingual sentence embedding model optimized for semantic search and retrieval tasks. This CoreML version enables on-device inference on Apple platforms.

### Key Features

- **Multilingual**: Supports 100+ languages including English, Japanese, Chinese, Korean, German, French, Spanish, and more
- **Search-optimized**: Designed specifically for retrieval tasks
- **Cross-lingual**: Can match queries in one language to documents in another
- **On-device**: Runs locally on iPhone/iPad/Mac without internet

## Model Details

| Property | Value |
|----------|-------|
| Base Model | intfloat/multilingual-e5-small |
| Embedding Dimensions | 384 |
| Max Sequence Length | 256 (configurable up to 512) |
| Model Size | ~224 MB |
| Precision | Float16 |
| Minimum iOS | 17.0 |
| Minimum macOS | 14.0 |

## Usage

### Input Format

E5 models use prefixes to distinguish between queries and documents:

- **Query**: `"query: your search query here"`
- **Document**: `"passage: your document text here"`
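
A minimal sketch of how the prefixes might be applied before tokenization. The names `E5InputKind` and `e5Input` are hypothetical helpers introduced for illustration; they are not part of this model or of any library.

```swift
/// The two input roles that E5 models distinguish (hypothetical helper type).
enum E5InputKind: String {
    case query = "query: "
    case passage = "passage: "
}

/// Prepends the E5 prefix for the given role to the raw text.
func e5Input(_ text: String, as kind: E5InputKind) -> String {
    return kind.rawValue + text
}

let queryText = e5Input("how do I render markdown?", as: .query)
let passageText = e5Input("Markdown is rendered by the preview pane.", as: .passage)
```

Keeping the prefix in one place makes it harder to embed a search query with the `passage: ` prefix by mistake, or the other way round.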

### Swift Example

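```swift
import CoreML

// Load the model. `modelURL` must point to the compiled model (.mlmodelc);
// Xcode compiles a bundled .mlpackage automatically.
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine
let model = try MLModel(contentsOf: modelURL, configuration: config)

// Prepare fixed-size inputs of shape [1, 256].
// `tokenIds` is the [Int] returned by the tokenizer; longer inputs are truncated.
let maxLength = 256
let padTokenId = 1  // assumption: XLM-RoBERTa <pad> ID; verify against tokenizer.json
let shape: [NSNumber] = [1, NSNumber(value: maxLength)]
let inputIds = try MLMultiArray(shape: shape, dataType: .int32)
let attentionMask = try MLMultiArray(shape: shape, dataType: .int32)
for i in 0..<maxLength {
    let isToken = i < tokenIds.count
    inputIds[i] = NSNumber(value: isToken ? tokenIds[i] : padTokenId)
    attentionMask[i] = NSNumber(value: isToken ? 1 : 0)
}

// Run inference
let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputIds),
    "attention_mask": MLFeatureValue(multiArray: attentionMask)
])
let output = try model.prediction(from: input)
let embeddings = output.featureValue(for: "embeddings")?.multiArrayValue
```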
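
For semantic search, the query embedding is typically compared against each passage embedding by cosine similarity. The following is a minimal sketch in plain Swift, assuming each embedding has already been copied into a `[Float]` and that the model's `embeddings` output is one pooled vector per input. `cosineSimilarity` and `rankPassages` are hypothetical helper names, not part of this package.

```swift
/// Cosine similarity between two equal-length embedding vectors.
func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "Embeddings must have the same dimension")
    var dot: Float = 0
    var normA: Float = 0
    var normB: Float = 0
    for i in 0..<a.count {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    let denominator = (normA * normB).squareRoot()
    // Guard against zero vectors to avoid dividing by zero.
    return denominator > 0 ? dot / denominator : 0
}

/// Returns passage indices ordered from most to least similar to the query.
func rankPassages(query: [Float], passages: [[Float]]) -> [Int] {
    let scores = passages.map { cosineSimilarity(query, $0) }
    return scores.indices.sorted { scores[$0] > scores[$1] }
}
```

An `MLMultiArray` can be copied into a `[Float]` with `(0..<array.count).map { array[$0].floatValue }`. Cosine similarity is used here because it gives the same ranking whether or not the vectors are already L2-normalized.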

### Tokenizer

Use the included `tokenizer.json` with [swift-transformers](https://github.com/huggingface/swift-transformers):

```swift
import Tokenizers

let tokenizer = try await AutoTokenizer.from(modelFolder: tokenizerURL)
let encoded = tokenizer.encode(text: "query: your text")
```

## Performance

Tested on iOS and macOS with the Neural Engine:

| Device | Inference Time |
|--------|----------------|
| iPhone 15 Pro | ~15ms |
| iPhone 13 | ~25ms |
| M1 Mac | ~10ms |

## Accuracy Comparison

Tested with 10 mixed Japanese/English technical queries:

| Model | Accuracy | Avg Score |
|-------|----------|-----------|
| Apple NLEmbedding | 20% | 0.558 |
| **This Model (E5)** | **100%** | **0.860** |

## Files

- `MultilingualE5Small.mlpackage/` - CoreML model package
- `tokenizer.json` - Tokenizer vocabulary and configuration
- `tokenizer_config.json` - Tokenizer settings

## Conversion

Converted using coremltools with FP16 precision:

```python
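import coremltools as ct
import numpy as np

# `traced_model` is assumed to be the PyTorch model after torch.jit.trace (not shown here)
mlmodel = ct.convert(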
    traced_model,
    inputs=[
        ct.TensorType(name="input_ids", shape=(1, 256), dtype=np.int32),
        ct.TensorType(name="attention_mask", shape=(1, 256), dtype=np.int32),
    ],
    outputs=[
        ct.TensorType(name="embeddings", dtype=np.float16),
    ],
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.iOS17,
    compute_precision=ct.precision.FLOAT16,
)
```

## License

MIT License (same as the base model)

## Citation

```bibtex
@article{wang2024multilingual,
  title={Multilingual E5 Text Embeddings: A Technical Report},
  author={Wang, Liang and Yang, Nan and Huang, Xiaolong and Yang, Linjun and Majumder, Rangan and Wei, Furu},
  journal={arXiv preprint arXiv:2402.05672},
  year={2024}
}
```

## Acknowledgments

- Original model by [intfloat](https://huggingface.co/intfloat)
- CoreML conversion for [ReadMD](https://github.com/user/readmd) iOS app