software-mansion/react-native-executorch-lfm2.5-colbert-350m

	---
	license: other
	license_name: lfm1.0
	license_link: https://huggingface.co/LiquidAI/LFM2.5-ColBERT-350M/blob/main/LICENSE
	---

	# Introduction

	This repository hosts the [LFM2.5-ColBERT-350M](https://huggingface.co/LiquidAI/LFM2.5-ColBERT-350M) late-interaction retrieval model for the [React Native ExecuTorch](https://www.npmjs.com/package/react-native-executorch) library, exported for the XNNPACK (Android / generic CPU) and MLX (Apple GPU) delegates.

	Unlike a standard sentence embedder (one vector per text), ColBERT is a
	multi-vector / late-interaction model: it produces one 128-dimensional vector
	per token (`[numTokens, 128]`). Relevance is computed with MaxSim: for each
	query token, take the maximum dot product over all document tokens, then sum
	these maxima over the query tokens. Use it when you want stronger retrieval
	quality than single-vector embeddings provide, e.g. for RAG or search.
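
As a rough illustration of the MaxSim scoring described above, here is a minimal TypeScript sketch. It is not part of the library: the `TokenEmbeddings` type and the `maxSim` helper are hypothetical names, and the nested-array layout is an assumption about how you hold the per-token output in your own code.

```typescript
// Hypothetical representation: one row per token (128 values per row for this model).
type TokenEmbeddings = number[][];

// Dot product of two vectors; a missing entry in `b` is treated as 0.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// MaxSim: for each query token take the best-matching document token, then sum.
function maxSim(query: TokenEmbeddings, doc: TokenEmbeddings): number {
  if (doc.length === 0) return 0; // nothing to match against
  let score = 0;
  for (const q of query) {
    let best = -Infinity;
    for (const d of doc) best = Math.max(best, dot(q, d));
    score += best;
  }
  return score;
}

// Toy 2-dimensional vectors to keep the example readable.
const query: TokenEmbeddings = [[1, 0], [0, 1]];
const docA: TokenEmbeddings = [[1, 0], [0, 1], [0.5, 0.5]];
const docB: TokenEmbeddings = [[0.5, 0.5]];

console.log(maxSim(query, docA)); // 2
console.log(maxSim(query, docB)); // 1
```

To rank candidates, you would compute this score for the query against each document and sort in descending order.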

	## Compatibility

	The MLX variant requires a physical Apple Silicon device (it does not run
	on the iOS simulator). The XNNPACK variant runs on CPU and has no such
	restriction. Make sure your runtime matches the ExecuTorch version used to
	export these `.pte` files; if you load the model through React Native
	ExecuTorch, the library constants guarantee this.
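
One way to encode the rule above in app code is a small selection helper. This is only a sketch: `pickVariant` is a hypothetical function, the variant names mirror the `xnnpack/` and `mlx/` folders in this repository, and how you obtain `os` (for example from React Native's `Platform.OS`) and detect a physical device is left to your app. It also assumes `"ios"` is the only Apple target you ship to.

```typescript
type Variant = "mlx" | "xnnpack";

// Pick MLX only on a physical iOS device; fall back to XNNPACK everywhere else
// (Android, the iOS simulator, any other platform).
function pickVariant(os: string, isPhysicalDevice: boolean): Variant {
  return os === "ios" && isPhysicalDevice ? "mlx" : "xnnpack";
}

console.log(pickVariant("ios", true));     // mlx
console.log(pickVariant("ios", false));    // xnnpack (simulator)
console.log(pickVariant("android", true)); // xnnpack
```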

	## Using it (late interaction)

	The model is a per-token embedder; scoring is the consumer's concern:

	1. Prepend the role marker the model was trained with: `"[Q] "` for queries,
	   `"[D] "` for documents.
	2. Run `forward` to get the per-token `[S, 128]` matrix for each text, where
	   `S` is the number of tokens in that text.
	3. Score query↔document with MaxSim, optionally excluding the document
	   skiplist token ids (punctuation) so they don't contribute. The skiplist
	   for this model (from its `config_sentence_transformers.json`) tokenizes to
	   the inclusive id ranges `[510..524, 535..541, 568..573, 600..603]` (32 ids).
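
The preparation steps above (role marker and skiplist filtering) can be sketched as follows. All helper names here are hypothetical and not part of React Native ExecuTorch. The sketch also assumes that you have the document's token ids available and that `tokenIds[i]` is the token that produced row `i` of the `[S, 128]` output.

```typescript
type Role = "query" | "document";

// Step 1: prepend the role marker ("[Q] " for queries, "[D] " for documents).
function withRoleMarker(text: string, role: Role): string {
  return (role === "query" ? "[Q] " : "[D] ") + text;
}

// Expand inclusive [start, end] ranges into a set of token ids.
function expandRanges(ranges: Array<[number, number]>): Set<number> {
  const ids = new Set<number>();
  for (const [start, end] of ranges) {
    for (let id = start; id <= end; id++) ids.add(id);
  }
  return ids;
}

// Skiplist ranges as listed in this README for this model.
const DOCUMENT_SKIPLIST = expandRanges([
  [510, 524], [535, 541], [568, 573], [600, 603],
]);

// Step 3 (optional part): drop document rows whose token id is on the skiplist.
// Assumption: tokenIds[i] corresponds to embeddings[i].
function dropSkiplistRows(
  tokenIds: number[],
  embeddings: number[][],
  skiplist: Set<number>,
): number[][] {
  if (tokenIds.length !== embeddings.length) {
    throw new Error("tokenIds and embeddings must have the same length");
  }
  return embeddings.filter((_, i) => {
    const id = tokenIds[i];
    return id === undefined || !skiplist.has(id);
  });
}

// Toy example: ids 512 and 600 are on the skiplist, so their rows are removed.
const exampleIds = [1, 512, 42, 600];
const exampleRows = [[1, 0], [0, 1], [1, 1], [0, 0]];
const kept = dropSkiplistRows(exampleIds, exampleRows, DOCUMENT_SKIPLIST);

console.log(withRoleMarker("what is late interaction?", "query")); // [Q] what is late interaction?
console.log(DOCUMENT_SKIPLIST.size); // 32
console.log(kept.length); // 2
```

The filtered rows are what a MaxSim scorer would then iterate over on the document side. Query tokens are left untouched, since the skiplist applies to documents only.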

	## Repository Structure

	- `xnnpack/`, `mlx/`: the partitioned `.pte` files plus a per-backend `config.json`.
	- `tokenizer.json`: pass this as `tokenizerSource`.
	- `config.json`, `tokenizer_config.json`: reference metadata.