---
language:
- en
tags:
- two-tower
- dual-encoder
- semantic-search
- document-retrieval
- information-retrieval
license: mit
datasets:
- ms_marco
---

# mlx7-two-tower-retrieval

This is a Two-Tower (Dual Encoder) model for document retrieval.

## Model Description

The Two-Tower model maps queries and documents to dense vector representations in the same semantic space, allowing for efficient similarity-based retrieval.

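At query time, a dual encoder scores each document by the similarity between the two tower outputs. Assuming cosine similarity as the scoring function (a common choice for dual encoders; this card does not specify it):

$$
\mathrm{score}(q, d) = \cos\big(f_q(q),\, f_d(d)\big) = \frac{f_q(q) \cdot f_d(d)}{\lVert f_q(q) \rVert\,\lVert f_d(d) \rVert}
$$

where $f_q$ and $f_d$ denote the query and document towers. Because documents can be encoded once and indexed offline, only the query tower needs to run at search time.
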
### Architecture

- **Tokenizer**: Character-level tokenization
- **Embedding**: Lookup embeddings with 64-dimensional vectors
- **Encoder**: Mean pooling with a 128-dimensional hidden layer (see the sketch below)

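The following is a minimal PyTorch sketch of one tower consistent with the dimensions above; the class name `CharTower`, the byte-level vocabulary size, and the padding convention are illustrative assumptions rather than the released implementation:

```python
import torch
import torch.nn as nn


class CharTower(nn.Module):
    """One tower: character embeddings -> masked mean pooling -> 128-dim output."""

    def __init__(self, vocab_size: int = 256, embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.proj = nn.Linear(embed_dim, hidden_dim)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) character ids, with 0 used as padding.
        emb = self.embedding(token_ids)                 # (batch, seq_len, 64)
        mask = (token_ids != 0).unsqueeze(-1).float()   # exclude padding from the mean
        pooled = (emb * mask).sum(1) / mask.sum(1).clamp(min=1.0)
        return self.proj(pooled)                        # (batch, 128)
```

In a two-tower setup, separate instances (or a shared one, depending on the design) encode queries and documents.
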
## Intended Use

This model is designed for semantic search applications where traditional keyword matching is insufficient. It can be used to:

- Encode documents and queries into dense vector representations
- Retrieve relevant documents for a given query using vector similarity (see the usage sketch below)
- Build semantic search engines

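A usage sketch of similarity-based retrieval, reusing the hypothetical `CharTower` from the architecture section; the byte-level `encode` helper and the untrained towers here are illustrative only:

```python
import torch
import torch.nn.functional as F

query_tower, doc_tower = CharTower(), CharTower()  # in practice, load trained weights

def encode(texts, tower, max_len=64):
    # Naive character-level tokenization via UTF-8 bytes, truncated/padded to 64.
    ids = [list(t.encode("utf-8"))[:max_len] for t in texts]
    ids = [seq + [0] * (max_len - len(seq)) for seq in ids]
    return tower(torch.tensor(ids))

docs = ["the cat sat on the mat", "stock markets fell on Monday"]
doc_vecs = F.normalize(encode(docs, doc_tower), dim=-1)  # index these offline
query_vec = F.normalize(encode(["where did the cat sit"], query_tower), dim=-1)

# Cosine similarity reduces to a dot product after L2 normalization.
scores = query_vec @ doc_vecs.T
top = scores.argmax(dim=-1).item()
print(docs[top])
```
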
## Limitations

- Limited context window (maximum sequence length of 64 tokens)
- English-language focused, reflecting its English-only training data
- No contextual understanding beyond simple semantic similarity

## Training

- **Dataset**: MS MARCO passage retrieval dataset
- **Training Method**: Contrastive learning with triplet loss (see the sketch below)
- **Hardware**: NVIDIA GPU

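A sketch of one contrastive training step with triplet loss, reusing the hypothetical towers and `encode` helper above; the margin, optimizer, and learning rate are assumptions, not documented hyperparameters:

```python
import torch
import torch.nn.functional as F

MARGIN = 0.2  # assumed; not specified in this card

def triplet_loss(q, pos, neg):
    # Push the query at least MARGIN closer (in cosine similarity)
    # to the relevant passage than to the irrelevant one.
    pos_sim = F.cosine_similarity(q, pos, dim=-1)
    neg_sim = F.cosine_similarity(q, neg, dim=-1)
    return F.relu(MARGIN - pos_sim + neg_sim).mean()

optimizer = torch.optim.Adam(
    list(query_tower.parameters()) + list(doc_tower.parameters()), lr=1e-3
)

# One illustrative (query, positive, negative) triplet.
q = encode(["where did the cat sit"], query_tower)
pos = encode(["the cat sat on the mat"], doc_tower)
neg = encode(["stock markets fell on Monday"], doc_tower)

optimizer.zero_grad()
triplet_loss(q, pos, neg).backward()
optimizer.step()
```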