---
license: other
license_name: lfm1.0
license_link: https://huggingface.co/LiquidAI/LFM2.5-ColBERT-350M/blob/main/LICENSE
---
# Introduction
This repository hosts the [LFM2.5-ColBERT-350M](https://huggingface.co/LiquidAI/LFM2.5-ColBERT-350M) late-interaction retrieval model for the [React Native ExecuTorch](https://www.npmjs.com/package/react-native-executorch) library, exported for the **XNNPACK** (Android / generic CPU) and **MLX** (Apple GPU) delegates.
Unlike a standard sentence embedder (one vector per text), ColBERT is a
**multi-vector / late-interaction** model: it produces **one vector per token**
(`[numTokens, 128]`). Relevance is computed with **MaxSim** (for each query
token, the max dot product over document tokens, summed). Use it when you want
stronger retrieval quality than single-vector embeddings — e.g. RAG / search.
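
Written out, the MaxSim score described above looks as follows. The symbols are introduced here only for illustration: $\mathbf{q}_i$ is the vector for the $i$-th query token, $\mathbf{d}_j$ the vector for the $j$-th document token, and $|Q|$, $|D|$ are the token counts.

$$
\mathrm{MaxSim}(Q, D) = \sum_{i=1}^{|Q|} \max_{1 \le j \le |D|} \mathbf{q}_i \cdot \mathbf{d}_j
$$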
## Compatibility
The **MLX** variant requires a physical Apple Silicon device (it does not run
on the iOS simulator). The **XNNPACK** variant runs on the CPU and has no such
restriction. Make sure your runtime matches the ExecuTorch version used to
export these `.pte` files; with React Native ExecuTorch the library constants
guarantee this.
## Usage (late interaction)
The model only produces per-token embeddings; scoring is left to the caller:
1. Prepend the role marker the model was trained with: `"[Q] "` for queries,
`"[D] "` for documents.
2. Run `forward` to get the per-token `[numTokens, 128]` matrix for each text.
3. Score query↔document with **MaxSim**, optionally excluding the document
**skiplist** token ids (punctuation) so they don't contribute. The skiplist
for this model (from its `config_sentence_transformers.json`) tokenizes to:
`[510..524, 535..541, 568..573, 600..603]` (32 ids).
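
The steps above can be sketched in plain TypeScript. This is a minimal illustration, not part of the React Native ExecuTorch API: every helper name here (`withRole`, `buildSkiplist`, `maxSim`) is hypothetical, and it assumes you have already turned the model output into a `number[][]` matrix (one row per token) and that `docTokenIds[j]` is the token id for row `j` of the document matrix.

```typescript
// Hypothetical helpers for illustration; not a React Native ExecuTorch API.
type TokenMatrix = number[][]; // one row per token, 128 values per row

// Step 1: role markers from this README.
const QUERY_PREFIX = "[Q] ";
const DOCUMENT_PREFIX = "[D] ";

function withRole(text: string, role: "query" | "document"): string {
  return (role === "query" ? QUERY_PREFIX : DOCUMENT_PREFIX) + text;
}

// Inclusive token id ranges copied from the skiplist listed above.
const SKIPLIST_RANGES: Array<[number, number]> = [
  [510, 524],
  [535, 541],
  [568, 573],
  [600, 603],
];

function buildSkiplist(ranges: Array<[number, number]>): Set<number> {
  const ids = new Set<number>();
  for (const [start, end] of ranges) {
    for (let id = start; id <= end; id++) ids.add(id);
  }
  return ids;
}

function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Step 3: for each query token, take the max dot product over the document
// tokens, then sum those maxima. Document rows whose token id is in the
// skiplist are ignored when both docTokenIds and skiplist are provided.
function maxSim(
  query: TokenMatrix,
  doc: TokenMatrix,
  docTokenIds?: number[],
  skiplist?: Set<number>,
): number {
  let score = 0;
  for (const q of query) {
    let best = -Infinity;
    for (let j = 0; j < doc.length; j++) {
      if (skiplist && docTokenIds && skiplist.has(docTokenIds[j])) continue;
      const s = dot(q, doc[j]);
      if (s > best) best = s;
    }
    // If every document token was skipped, this query token adds nothing.
    if (best !== -Infinity) score += best;
  }
  return score;
}
```

To rank documents for a query, compute `maxSim` once per document and sort by descending score. Passing the skiplist is optional; without it every document token takes part in the maximum.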
## Repository Structure
- `xnnpack/`, `mlx/` — the partitioned `.pte` files + per-backend `config.json`.
- `tokenizer.json` — wire to `tokenizerSource`.
- `config.json`, `tokenizer_config.json` — reference metadata.