nklockiewicz's picture
Update README.md
688034c verified
|
Raw
History Blame Contribute Delete
2.27 kB
---
license: other
license_name: lfm1.0
license_link: https://huggingface.co/LiquidAI/LFM2.5-Embedding-350M/blob/main/LICENSE
---
# Introduction
This repository hosts the [LFM2.5-Embedding-350M](https://huggingface.co/LiquidAI/LFM2.5-Embedding-350M) model for the [React Native ExecuTorch](https://www.npmjs.com/package/react-native-executorch) library. It includes the model exported for both the **XNNPACK** (Android / generic CPU) and **MLX** (Apple GPU) delegates, ready for use in the **ExecuTorch** runtime.
If you'd like to run these models in your own ExecuTorch runtime, refer to the [official documentation](https://pytorch.org/executorch/stable/index.html) for setup instructions.
## Compatibility
If you intend to use this model outside of React Native ExecuTorch, make sure your runtime is compatible with the **ExecuTorch** version used to export the `.pte` files. If you work with React Native ExecuTorch, the constants from the library will guarantee compatibility with the runtime used behind the scenes.
The **MLX** variant requires a physical Apple Silicon device (it does not run on the iOS simulator). The **XNNPACK** variant runs everywhere.
## Repository Structure
- `xnnpack/` — `.pte` file partitioned for the XNNPACK delegate.
- `mlx/` — `.pte` file partitioned for the MLX delegate (Apple Silicon only).
- `tokenizer.json` — HuggingFace fast-tokenizer dump. Wire this to `tokenizerSource`.
- `config.json`, `tokenizer_config.json` — upstream model/tokenizer configs, kept for reference and for non-RNE consumers.
The `.pte` path goes to `modelSource`; `tokenizer.json` is shared across all variants.
## Model details
- Architecture: LFM2.5-350M bidirectional backbone (hybrid conv + attention, hidden size 1024) + CLS pooling + L2 normalize. The exported graph bakes in CLS pooling and L2 normalization, so the runner consumes `(input_ids, attention_mask)` and receives the final unit-norm embedding directly.
- Output dimension: **1024**.
- Similarity metric: **cosine** (embeddings are L2-normalized, so a dot product equals cosine).
- Prompts: the model is trained with asymmetric `query: ` / `document: ` text prefixes. Prepend `query: ` to search queries and `document: ` to indexed passages for best retrieval quality.