---
license: mit
language:
- multilingual
- en
- ja
- zh
- ko
- de
- fr
- es
- it
- pt
- ru
library_name: coreml
tags:
- sentence-transformers
- embeddings
- coreml
- ios
- macos
- multilingual
- e5
- semantic-search
base_model: intfloat/multilingual-e5-small
pipeline_tag: sentence-similarity
---

# Multilingual E5 Small - CoreML

This is a CoreML conversion of [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) for iOS/macOS deployment.

## Model Description

Multilingual E5 Small is a multilingual sentence embedding model optimized for semantic search and retrieval tasks. This CoreML version enables on-device inference on Apple platforms.

### Key Features

- **Multilingual**: Supports 100+ languages including English, Japanese, Chinese, Korean, German, French, Spanish, and more
- **Search-optimized**: Designed specifically for retrieval tasks
- **Cross-lingual**: Can match queries in one language to documents in another
- **On-device**: Runs locally on iPhone/iPad/Mac without internet

## Model Details

| Property | Value |
|----------|-------|
| Base Model | intfloat/multilingual-e5-small |
| Embedding Dimensions | 384 |
| Max Sequence Length | 256 (configurable up to 512) |
| Model Size | ~224 MB |
| Precision | Float16 |
| Minimum iOS | 17.0 |
| Minimum macOS | 14.0 |

## Usage

### Input Format

E5 models use prefixes to distinguish between queries and documents:

- **Query**: `"query: your search query here"`
- **Document**: `"passage: your document text here"`

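The prefix convention above can be wrapped in two small helpers so that strings are built the same way every time before tokenization. This is a minimal Python sketch: `format_query` and `format_passage` are hypothetical names, not part of any library, and the same logic ports directly to Swift.

```python
QUERY_PREFIX = "query: "
PASSAGE_PREFIX = "passage: "


def format_query(text: str) -> str:
    """Prepend the E5 query prefix after stripping surrounding whitespace."""
    return QUERY_PREFIX + text.strip()


def format_passage(text: str) -> str:
    """Prepend the E5 passage prefix after stripping surrounding whitespace."""
    return PASSAGE_PREFIX + text.strip()


print(format_query("how do I render markdown?"))
print(format_passage("Markdown is a lightweight markup language."))
```

Keeping the prefixes in one place helps avoid indexing documents with one prefix and searching with another.
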
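### Swift Example

```swift
import CoreML

// Load the model. MLModel(contentsOf:) expects a compiled model (.mlmodelc);
// Xcode compiles the .mlpackage automatically when it is added to a target.
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine
let model = try MLModel(contentsOf: modelURL, configuration: config)

// Prepare inputs: pad or truncate the token IDs to the model's fixed length (256).
// Assumption: pad token ID 1 (XLM-RoBERTa convention); confirm in tokenizer.json.
let maxLength = 256
let padTokenId = 1
let tokenIds: [Int] = encoded  // tokenizer output (see Tokenizer below)
let inputIds = try MLMultiArray(shape: [1, NSNumber(value: maxLength)], dataType: .int32)
let attentionMask = try MLMultiArray(shape: [1, NSNumber(value: maxLength)], dataType: .int32)
for i in 0..<maxLength {
    let isToken = i < tokenIds.count
    inputIds[i] = NSNumber(value: isToken ? tokenIds[i] : padTokenId)
    attentionMask[i] = NSNumber(value: isToken ? 1 : 0)
}

// Run inference
let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputIds),
    "attention_mask": MLFeatureValue(multiArray: attentionMask)
])
let output = try model.prediction(from: input)
let embeddings = output.featureValue(for: "embeddings")?.multiArrayValue
```
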
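Once query and passage embeddings are available, retrieval usually comes down to ranking passages by cosine similarity to the query. The sketch below illustrates that step in plain Python, assuming the `embeddings` output is a single 384-dimensional vector per input (this card does not state the output shape). `cosine_similarity` and `rank_passages` are illustrative names, and the toy vectors stand in for real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors stand in for real 384-dimensional embeddings.
query = [1.0, 0.0, 0.0]
passages = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]
ranking = rank_passages(query, passages)
print(ranking[0][0])  # index of the best-matching passage
```

If the vectors are already L2-normalized, the cosine reduces to a dot product, which is cheaper when scanning many passages.
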
### Tokenizer

Use the included `tokenizer.json` with [swift-transformers](https://github.com/huggingface/swift-transformers):

```swift
import Tokenizers

let tokenizer = try await AutoTokenizer.from(modelFolder: tokenizerURL)
let encoded = tokenizer.encode(text: "query: your text")
```

## Performance

Tested on iOS and macOS devices with the Neural Engine:

| Device | Inference Time |
|--------|----------------|
| iPhone 15 Pro | ~15ms |
| iPhone 13 | ~25ms |
| M1 Mac | ~10ms |

## Accuracy Comparison

Tested with 10 mixed Japanese/English technical queries:

| Model | Accuracy | Avg Score |
|-------|----------|-----------|
| Apple NLEmbedding | 20% | 0.558 |
| **This Model (E5)** | **100%** | **0.860** |

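This card does not say how the two columns were computed. One plausible reading, shown here purely as an assumption, is that accuracy is the fraction of queries whose top-ranked document is the expected one, and the average score is the mean similarity of each query's top-ranked document. The `evaluate` function and the numbers below are made up for illustration and are not the data behind the table.

```python
def evaluate(score_matrix, expected):
    """Compute top-1 accuracy and the mean top-1 similarity score.

    score_matrix[q][d] is the similarity of query q to document d;
    expected[q] is the index of the document that should rank first.
    """
    hits = 0
    top_scores = []
    for scores, target in zip(score_matrix, expected):
        best = max(range(len(scores)), key=lambda d: scores[d])
        hits += int(best == target)
        top_scores.append(scores[best])
    n = len(expected)
    return hits / n, sum(top_scores) / n


# Made-up similarities for two queries over three documents.
scores = [[0.91, 0.40, 0.35], [0.30, 0.82, 0.88]]
accuracy, avg_top = evaluate(scores, expected=[0, 1])
print(accuracy, round(avg_top, 3))
```
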
## Files

- `MultilingualE5Small.mlpackage/` - CoreML model package
- `tokenizer.json` - Tokenizer vocabulary and configuration
- `tokenizer_config.json` - Tokenizer settings

## Conversion

Converted using coremltools with FP16 precision:

```python
import coremltools as ct
import numpy as np

# traced_model: the TorchScript-traced PyTorch model (tracing step not shown)
mlmodel = ct.convert(
    traced_model,
    inputs=[
        ct.TensorType(name="input_ids", shape=(1, 256), dtype=np.int32),
        ct.TensorType(name="attention_mask", shape=(1, 256), dtype=np.int32),
    ],
    outputs=[
        ct.TensorType(name="embeddings", dtype=np.float16),
    ],
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.iOS17,
    compute_precision=ct.precision.FLOAT16,
)
```

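The base model's reference usage builds a sentence vector by averaging the token vectors where the attention mask is 1, then L2-normalizing the result. Whether the traced graph converted above already includes that step is not stated here. If the `embeddings` output turned out to be per-token rather than pooled, post-processing along the following lines would be needed. This is a plain-Python sketch with illustrative names (`mean_pool`, `l2_normalize`) and toy 2-dimensional vectors.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask value is 1 (padding is ignored)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for j in range(dim):
                total[j] += vec[j]
    if count == 0:
        return total
    return [v / count for v in total]


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return vec if norm == 0.0 else [v / norm for v in vec]


# Two real tokens followed by one padding position.
tokens = [[3.0, 0.0], [1.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vec = l2_normalize(mean_pool(tokens, mask))
print(sentence_vec)
```
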
## License

MIT License (same as the base model)

## Citation

```bibtex
@article{wang2024multilingual,
  title={Multilingual E5 Text Embeddings: A Technical Report},
  author={Wang, Liang and Yang, Nan and Huang, Xiaolong and Yang, Linjun and Majumder, Rangan and Wei, Furu},
  journal={arXiv preprint arXiv:2402.05672},
  year={2024}
}
```

## Acknowledgments

- Original model by [intfloat](https://huggingface.co/intfloat)
- CoreML conversion for [ReadMD](https://github.com/user/readmd) iOS app