| --- |
| license: other |
| license_name: lfm1.0 |
| license_link: https://huggingface.co/LiquidAI/LFM2.5-Embedding-350M/blob/main/LICENSE |
| --- |
| |
| # Introduction |
|
|
| This repository hosts the [LFM2.5-Embedding-350M](https://huggingface.co/LiquidAI/LFM2.5-Embedding-350M) model for the [React Native ExecuTorch](https://www.npmjs.com/package/react-native-executorch) library. It includes the model exported for both the **XNNPACK** (Android / generic CPU) and **MLX** (Apple GPU) delegates, ready for use in the **ExecuTorch** runtime. |
|
|
| If you'd like to run this model in your own ExecuTorch runtime, refer to the [official documentation](https://pytorch.org/executorch/stable/index.html) for setup instructions. |
|
|
| ## Compatibility |
|
|
| If you intend to use this model outside of React Native ExecuTorch, make sure your runtime is compatible with the **ExecuTorch** version used to export the `.pte` files. If you use React Native ExecuTorch, the constants exported by the library guarantee compatibility with the runtime it uses internally. |
|
|
| The **MLX** variant requires a physical Apple Silicon device (it does not run on the iOS simulator). The **XNNPACK** variant runs on the CPU and has no such restriction. |
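
The rule above can be sketched as a small helper. This is only an illustration: `DeviceInfo`, `pickVariant`, and both flags are hypothetical names that are not part of React Native ExecuTorch, and the caller is assumed to know whether the app runs on Apple Silicon and whether it runs in a simulator.

```typescript
// Hypothetical description of the target device; the caller fills it in.
interface DeviceInfo {
  isAppleSilicon: boolean; // assumption: detected by the app, not by this helper
  isSimulator: boolean;    // true when running in the iOS simulator
}

type Variant = "xnnpack" | "mlx";

// MLX needs a physical Apple Silicon device; otherwise fall back to XNNPACK.
function pickVariant(device: DeviceInfo): Variant {
  return device.isAppleSilicon && !device.isSimulator ? "mlx" : "xnnpack";
}
```

Falling back to XNNPACK keeps a single code path working on Android and in the iOS simulator during development.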
|
|
| ## Repository Structure |
|
|
| - `xnnpack/` — `.pte` file partitioned for the XNNPACK delegate. |
| - `mlx/` — `.pte` file partitioned for the MLX delegate (Apple Silicon only). |
| - `tokenizer.json` — HuggingFace fast-tokenizer dump. Wire this to `tokenizerSource`. |
| - `config.json`, `tokenizer_config.json` — upstream model/tokenizer configs, kept for reference and for non-RNE consumers. |
|
|
| Pass the path of the `.pte` file for your chosen variant as `modelSource`. The same `tokenizer.json` is used as `tokenizerSource` for all variants. |
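
As a rough sketch of that wiring, the helper below assembles the two sources from a repository base URL. Only the `modelSource` and `tokenizerSource` names and the folder layout come from this README. `buildSources`, `EmbeddingSources`, `repoBaseUrl`, and `pteFileName` are hypothetical, the `.pte` file name must be looked up in the variant folder, and whether the library accepts this exact object shape should be checked against its documentation.

```typescript
// Hypothetical container for the two sources named in this README.
interface EmbeddingSources {
  modelSource: string;
  tokenizerSource: string;
}

// Builds source URLs following the repository layout described above.
// repoBaseUrl and pteFileName are placeholders supplied by the caller.
function buildSources(
  repoBaseUrl: string,
  variant: "xnnpack" | "mlx",
  pteFileName: string,
): EmbeddingSources {
  const base = repoBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    modelSource: `${base}/${variant}/${pteFileName}`,
    tokenizerSource: `${base}/tokenizer.json`, // shared by all variants
  };
}
```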
|
|
| ## Model details |
|
|
| - Architecture: LFM2.5-350M bidirectional backbone (hybrid conv + attention, hidden size 1024) followed by CLS pooling and L2 normalization. Both steps are baked into the exported graph, so the runner consumes `(input_ids, attention_mask)` and receives the final unit-norm embedding directly. |
| - Output dimension: **1024**. |
| - Similarity metric: **cosine** (embeddings are L2-normalized, so the dot product of two embeddings equals their cosine similarity). |
| - Prompts: the model is trained with asymmetric `query: ` / `document: ` text prefixes. Prepend `query: ` to search queries and `document: ` to indexed passages for best retrieval quality. |
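
The prefix and similarity rules above can be illustrated with a few pure helpers. This is a minimal sketch under stated assumptions: `asQuery`, `asDocument`, `dot`, and `rankBySimilarity` are hypothetical names, and the embeddings are assumed to be the unit-norm vectors returned by the model, so a plain dot product is used as the cosine score.

```typescript
// Prepend the asymmetric prefixes the model was trained with.
function asQuery(text: string): string {
  return `query: ${text}`;
}

function asDocument(text: string): string {
  return `document: ${text}`;
}

// Dot product; equals cosine similarity only for unit-norm inputs.
function dot(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Returns document indices ordered from most to least similar to the query.
function rankBySimilarity(
  queryEmbedding: readonly number[],
  docEmbeddings: readonly (readonly number[])[],
): number[] {
  return docEmbeddings
    .map((doc, index) => ({ index, score: dot(queryEmbedding, doc) }))
    .sort((x, y) => y.score - x.score)
    .map((entry) => entry.index);
}
```

In a retrieval setup, passages would be embedded once from `asDocument(passage)` and stored, while each search embeds `asQuery(userInput)` and ranks the stored vectors against it.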
|
|