---
license: other
license_name: lfm1.0
license_link: https://huggingface.co/LiquidAI/LFM2.5-Embedding-350M/blob/main/LICENSE
---

# Introduction

This repository hosts the [LFM2.5-Embedding-350M](https://huggingface.co/LiquidAI/LFM2.5-Embedding-350M) model for the [React Native ExecuTorch](https://www.npmjs.com/package/react-native-executorch) library. It includes the model exported for both the **XNNPACK** (Android / generic CPU) and **MLX** (Apple GPU) delegates, ready for use in the **ExecuTorch** runtime.

If you'd like to run these models in your own ExecuTorch runtime, refer to the [official documentation](https://pytorch.org/executorch/stable/index.html) for setup instructions.

## Compatibility

If you intend to use this model outside of React Native ExecuTorch, make sure your runtime is compatible with the **ExecuTorch** version used to export the `.pte` files. If you work with React Native ExecuTorch, the constants exported by the library guarantee compatibility with the runtime it uses behind the scenes.

The **MLX** variant requires a physical Apple Silicon device (it does not run on the iOS simulator). The **XNNPACK** variant runs on the CPU and has no such restriction.

## Repository Structure

- `xnnpack/`: `.pte` file partitioned for the XNNPACK delegate.
- `mlx/`: `.pte` file partitioned for the MLX delegate (Apple Silicon only).
- `tokenizer.json`: Hugging Face fast-tokenizer dump. Wire this to `tokenizerSource`.
- `config.json`, `tokenizer_config.json`: upstream model and tokenizer configs, kept for reference and for consumers outside React Native ExecuTorch.

The `.pte` path goes to `modelSource`; `tokenizer.json` is shared across all variants.

## Model details

- Architecture: LFM2.5-350M bidirectional backbone (hybrid conv + attention, hidden size 1024) + CLS pooling + L2 normalization. The exported graph bakes in CLS pooling and L2 normalization, so the runner consumes `(input_ids, attention_mask)` and returns the final unit-norm embedding directly.
- Output dimension: **1024**.
- Similarity metric: **cosine** (embeddings are L2-normalized, so a dot product equals cosine).
- Prompts: the model is trained with asymmetric `query: ` / `document: ` text prefixes. Prepend `query: ` to search queries and `document: ` to indexed passages for best retrieval quality.
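
The prompt prefixes and the dot-product scoring described above can be sketched in plain TypeScript. This is a minimal illustration, not part of the React Native ExecuTorch API: the helper names (`asQuery`, `asDocument`, `dot`, `rankBySimilarity`) are hypothetical, and the sketch assumes you have already obtained the embeddings as numeric arrays from whatever runtime you use.

```typescript
// Hypothetical helpers: only the `query: ` / `document: ` prefixes and the
// unit-norm property of the embeddings come from the model details above.

// Prefix a search query before embedding it.
function asQuery(text: string): string {
  return `query: ${text}`;
}

// Prefix a passage before embedding it for the index.
function asDocument(text: string): string {
  return `document: ${text}`;
}

// For unit-norm vectors, the dot product equals cosine similarity,
// so no extra normalization step is needed here.
function dot(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Return document indices ordered from most to least similar to the query.
function rankBySimilarity(
  query: ArrayLike<number>,
  docs: ArrayLike<number>[],
): number[] {
  const scores = docs.map((d) => dot(query, d));
  return scores.map((_, i) => i).sort((i, j) => scores[j] - scores[i]);
}
```

In use, you would embed `asQuery(userInput)` once per search, embed `asDocument(passage)` once per indexed passage, and pass the resulting 1024-dimensional vectors to `rankBySimilarity`.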