Introduction

This repository hosts the LFM2.5-Embedding-350M model for the React Native ExecuTorch library. It includes the model exported for both the XNNPACK (Android / generic CPU) and MLX (Apple GPU) delegates, ready for use in the ExecuTorch runtime.

If you'd like to run these models in your own ExecuTorch runtime, refer to the official documentation for setup instructions.

Compatibility

If you intend to use this model outside of React Native ExecuTorch, make sure your runtime is compatible with the ExecuTorch version used to export the .pte files. If you work with React Native ExecuTorch, the constants from the library will guarantee compatibility with the runtime used behind the scenes.

The MLX variant requires a physical Apple Silicon device; it does not run on the iOS simulator. The XNNPACK variant runs on the CPU and has no such restriction.

Repository Structure

  • xnnpack/ - .pte file partitioned for the XNNPACK delegate.
  • mlx/ - .pte file partitioned for the MLX delegate (Apple Silicon only).
  • tokenizer.json - HuggingFace fast-tokenizer dump. Wire this to tokenizerSource.
  • config.json, tokenizer_config.json - upstream model/tokenizer configs, kept for reference and for non-RNE consumers.

The .pte path goes to modelSource; tokenizer.json is shared across all variants.
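
The wiring described above can be sketched as a small selection helper. This is only an illustration: `RepoFiles`, `pickSources`, and the `isAppleSilicon` / `isSimulator` flags are hypothetical names, not part of React Native ExecuTorch, and the file locations are supplied by the caller because the exact .pte file names are not listed here. Preferring MLX whenever it is usable is also an assumption, not a recommendation from the library.

```typescript
// Hypothetical helper: these names are illustrative and not a library API.
type Delegate = "xnnpack" | "mlx";

interface RepoFiles {
  // Location (path or URL) of the .pte file inside each delegate folder.
  pte: Record<Delegate, string>;
  // Location of tokenizer.json, which is the same for every delegate.
  tokenizer: string;
}

interface EmbeddingSources {
  modelSource: string;
  tokenizerSource: string;
}

interface DeviceInfo {
  isAppleSilicon: boolean; // assumed to be detected by the caller
  isSimulator: boolean;    // MLX does not run on the iOS simulator
}

function pickSources(files: RepoFiles, device: DeviceInfo): EmbeddingSources {
  // Assumption: use MLX only on a physical Apple Silicon device,
  // and fall back to XNNPACK in every other case.
  const delegate: Delegate =
    device.isAppleSilicon && !device.isSimulator ? "mlx" : "xnnpack";

  return {
    modelSource: files.pte[delegate],
    tokenizerSource: files.tokenizer,
  };
}
```

The returned object carries the two values named above, modelSource and tokenizerSource, so it can be handed to whatever loading call your runtime exposes.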

Model Details

  • Architecture: LFM2.5-350M bidirectional backbone (hybrid conv + attention, hidden size 1024) + CLS pooling + L2 normalize. The exported graph bakes in CLS pooling and L2 normalization, so the runner consumes (input_ids, attention_mask) and receives the final unit-norm embedding directly.
  • Output dimension: 1024.
  • Similarity metric: cosine (embeddings are L2-normalized, so a dot product equals cosine).
  • Prompts: the model is trained with asymmetric query: / document: text prefixes. Prepend query: to search queries and document: to indexed passages for best retrieval quality.
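
The prompt prefixes and the similarity metric can be illustrated with a minimal sketch. It assumes the embeddings have already been produced by the model as plain number arrays; `withPrefix`, `dot`, and `rankBySimilarity` are hypothetical helper names, and the single space after the colon in the prefix is an assumption about the expected format. Because the vectors are unit-norm, the dot product computed here is the cosine similarity.

```typescript
// Hypothetical helpers: none of these names come from React Native ExecuTorch.
type PromptKind = "query" | "document";

// Prepend the asymmetric prefix before tokenization.
// Assumption: the prefix is followed by a single space.
function withPrefix(kind: PromptKind, text: string): string {
  return `${kind}: ${text}`;
}

// Dot product of two embeddings; equals cosine similarity for unit-norm vectors.
function dot(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Return document indices ordered from most to least similar to the query.
function rankBySimilarity(
  query: readonly number[],
  docs: readonly (readonly number[])[],
): number[] {
  return docs
    .map((doc, index) => ({ index, score: dot(query, doc) }))
    .sort((x, y) => y.score - x.score)
    .map((entry) => entry.index);
}
```

In a retrieval flow, passages would be embedded once from `withPrefix("document", passage)` and stored, while each search string is embedded from `withPrefix("query", text)` and ranked against the stored vectors.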