---
license: mit
language:
- multilingual
- en
- ja
- zh
- ko
- de
- fr
- es
- it
- pt
- ru
library_name: coreml
tags:
- sentence-transformers
- embeddings
- coreml
- ios
- macos
- multilingual
- e5
- semantic-search
base_model: intfloat/multilingual-e5-small
pipeline_tag: sentence-similarity
---
# Multilingual E5 Small - CoreML
This is a CoreML conversion of [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) for iOS/macOS deployment.
## Model Description
Multilingual E5 Small is a multilingual sentence embedding model optimized for semantic search and retrieval tasks. This CoreML version enables on-device inference on Apple platforms.
### Key Features
- **Multilingual**: Supports 100+ languages including English, Japanese, Chinese, Korean, German, French, Spanish, and more
- **Search-optimized**: Designed specifically for retrieval tasks
- **Cross-lingual**: Can match queries in one language to documents in another
- **On-device**: Runs locally on iPhone/iPad/Mac without internet
## Model Details
| Property | Value |
|----------|-------|
| Base Model | intfloat/multilingual-e5-small |
| Embedding Dimensions | 384 |
| Max Sequence Length | 256 (configurable up to 512) |
| Model Size | ~224 MB |
| Precision | Float16 |
| Minimum iOS | 17.0 |
| Minimum macOS | 14.0 |
## Usage
### Input Format
E5 models use prefixes to distinguish between queries and documents:
- **Query**: `"query: your search query here"`
- **Document**: `"passage: your document text here"`
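Applying the prefix is a one-line transformation; a minimal Python sketch (the helper name is illustrative, not part of any library):

```python
def add_e5_prefix(text: str, is_query: bool) -> str:
    """Prepend the task prefix that E5 models expect on every input."""
    prefix = "query: " if is_query else "passage: "
    return prefix + text

print(add_e5_prefix("best ramen in Tokyo", is_query=True))
# query: best ramen in Tokyo
```

Omitting the prefix degrades retrieval quality, since the model was trained with it on every example.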
### Swift Example
```swift
import CoreML

// Load the model, preferring the Neural Engine
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine
let model = try MLModel(contentsOf: modelURL, configuration: config)

// Prepare inputs produced by the tokenizer (both shaped [1, 256], Int32)
let inputIds: MLMultiArray = /* tokenized input ids */
let attentionMask: MLMultiArray = /* attention mask */

// Run inference
let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputIds),
    "attention_mask": MLFeatureValue(multiArray: attentionMask)
])
let output = try model.prediction(from: input)

// 384-dimensional sentence embedding
let embeddings = output.featureValue(for: "embeddings")?.multiArrayValue
```
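To rank documents against a query, compare embeddings by cosine similarity. A Python/NumPy sketch of the comparison step (E5 embeddings are typically L2-normalized first, so the similarity reduces to a dot product):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.dot(a, b))

# Toy 3-dim vectors; real embeddings from this model are 384-dim
q = np.array([1.0, 0.0, 1.0])
d = np.array([1.0, 0.0, 1.0])
print(cosine_similarity(q, d))  # 1.0
```

The same computation can be done on-device with Accelerate/vDSP; the NumPy version is shown only for clarity.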
### Tokenizer
Use the included `tokenizer.json` with [swift-transformers](https://github.com/huggingface/swift-transformers):
```swift
import Tokenizers
let tokenizer = try await AutoTokenizer.from(modelFolder: tokenizerURL)
let encoded = tokenizer.encode(text: "query: your text")
```
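Because the model was converted with a fixed `(1, 256)` input shape, token sequences must be truncated or right-padded to exactly 256 positions, with a matching attention mask. A Python sketch of that step (pad token id `1` follows the XLM-RoBERTa convention this tokenizer inherits; confirm against `tokenizer_config.json`):

```python
def pad_to_length(ids, max_len=256, pad_id=1):
    """Truncate or right-pad token ids and build the matching attention mask."""
    ids = ids[:max_len]
    mask = [1] * len(ids) + [0] * (max_len - len(ids))
    ids = ids + [pad_id] * (max_len - len(ids))
    return ids, mask

ids, mask = pad_to_length([0, 41, 2], max_len=6)
print(ids)   # [0, 41, 2, 1, 1, 1]
print(mask)  # [1, 1, 1, 0, 0, 0]
```

The padded ids and mask are what get copied into the two `MLMultiArray` inputs.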
## Performance
Approximate single-inference latency with the Neural Engine enabled (`.cpuAndNeuralEngine`):
| Device | Inference Time |
|--------|----------------|
| iPhone 15 Pro | ~15ms |
| iPhone 13 | ~25ms |
| M1 Mac | ~10ms |
## Accuracy Comparison
Informal benchmark with 10 mixed Japanese/English technical queries:
| Model | Accuracy | Avg Score |
|-------|----------|-----------|
| Apple NLEmbedding | 20% | 0.558 |
| **This Model (E5)** | **100%** | **0.860** |
## Files
- `MultilingualE5Small.mlpackage/` - CoreML model package
- `tokenizer.json` - Tokenizer vocabulary and configuration
- `tokenizer_config.json` - Tokenizer settings
## Conversion
Converted using coremltools with FP16 precision:
```python
import numpy as np
import coremltools as ct

# traced_model: a torch.jit.trace of the PyTorch model (tracing step not shown)
mlmodel = ct.convert(
    traced_model,
    inputs=[
        ct.TensorType(name="input_ids", shape=(1, 256), dtype=np.int32),
        ct.TensorType(name="attention_mask", shape=(1, 256), dtype=np.int32),
    ],
    outputs=[
        ct.TensorType(name="embeddings", dtype=np.float16),
    ],
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.iOS17,
    compute_precision=ct.precision.FLOAT16,
)
```
## License
MIT License (same as the base model)
## Citation
```bibtex
@article{wang2024multilingual,
  title={Multilingual E5 Text Embeddings: A Technical Report},
  author={Wang, Liang and Yang, Nan and Huang, Xiaolong and Yang, Linjun and Majumder, Rangan and Wei, Furu},
  journal={arXiv preprint arXiv:2402.05672},
  year={2024}
}
```
## Acknowledgments
- Original model by [intfloat](https://huggingface.co/intfloat)
- CoreML conversion for [ReadMD](https://github.com/user/readmd) iOS app