Add model card
model_card.md (added, +47 -0)
@@ -0,0 +1,47 @@
---
language:
- en
tags:
- two-tower
- dual-encoder
- semantic-search
- document-retrieval
- information-retrieval
license: mit
datasets:
- ms_marco
---

# mlx7-two-tower-retrieval

This is a Two-Tower (Dual Encoder) model for document retrieval.

## Model Description

The model encodes queries and documents into dense vectors in a shared embedding space, so relevant documents can be retrieved efficiently by vector similarity.

### Architecture

- **Tokenizer**: Character-level tokenization
- **Embedding**: Lookup embeddings with 64-dimensional vectors
- **Encoder**: Mean pooling with a 128-dimensional hidden layer

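
The components listed above can be sketched in plain Python to show how the shapes fit together. This is an illustration under stated assumptions, not the released implementation: the ASCII vocabulary, the `Tower` class, the `tanh` activation, the pooling-then-hidden-layer order, and the use of two separately initialised towers are all hypothetical, and random weights stand in for trained parameters.

```python
import math
import random

MAX_LEN = 64      # maximum sequence length stated in this card
EMBED_DIM = 64    # embedding size stated in this card
HIDDEN_DIM = 128  # hidden-layer size stated in this card
VOCAB_SIZE = 128  # assumption: ASCII code points; the real vocabulary is not documented


def tokenize(text):
    """Character-level tokenization: one id per character, truncated to MAX_LEN."""
    # Assumption: non-ASCII characters share id 0 as an "unknown" token.
    return [ord(c) if ord(c) < VOCAB_SIZE else 0 for c in text[:MAX_LEN]]


class Tower:
    """One encoder tower: embedding lookup -> mean pooling -> hidden layer."""

    def __init__(self, seed):
        rng = random.Random(seed)
        # Randomly initialised weights stand in for trained parameters.
        self.embedding = [[rng.gauss(0.0, 0.1) for _ in range(EMBED_DIM)]
                          for _ in range(VOCAB_SIZE)]
        self.weights = [[rng.gauss(0.0, 0.1) for _ in range(EMBED_DIM)]
                        for _ in range(HIDDEN_DIM)]
        self.bias = [0.0] * HIDDEN_DIM

    def encode(self, text):
        ids = tokenize(text)
        if not ids:
            return [0.0] * HIDDEN_DIM
        # Mean pooling: average the token embeddings dimension by dimension.
        pooled = [sum(self.embedding[i][d] for i in ids) / len(ids)
                  for d in range(EMBED_DIM)]
        # Hidden layer; the tanh activation is an assumption.
        return [math.tanh(sum(w * x for w, x in zip(row, pooled)) + b)
                for row, b in zip(self.weights, self.bias)]


query_tower = Tower(seed=0)
doc_tower = Tower(seed=1)
query_vec = query_tower.encode("what is a two-tower model?")
doc_vec = doc_tower.encode("A two-tower model encodes queries and documents separately.")
print(len(query_vec), len(doc_vec))  # 128 128
```

Both towers emit vectors of the same size, which is what lets a query vector be compared directly against document vectors.
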
## Intended Use

This model is designed for semantic search applications where traditional keyword matching is insufficient. It can be used to:

- Encode documents and queries into dense vector representations
- Retrieve relevant documents for a given query using vector similarity
- Build semantic search engines

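
The retrieval step described above can be sketched as a ranking over precomputed document vectors. The card does not say which similarity measure the model uses, so cosine similarity is an assumption here, and the `retrieve` helper and the toy 3-dimensional vectors are hypothetical stand-ins for real encoder outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def retrieve(query_vec, doc_vecs, top_k=2):
    """Return (doc_id, score) pairs for the top_k most similar documents."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for document-tower outputs.
doc_vecs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.0],
}
results = retrieve([1.0, 0.1, 0.0], doc_vecs, top_k=2)
print([doc_id for doc_id, _ in results])  # ['doc_a', 'doc_c']
```

Because document vectors do not depend on the query, they can be computed once and stored; only the query has to be encoded at search time. For large collections, an approximate nearest-neighbour index would typically replace the exhaustive scan shown here.
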
## Limitations

- Limited context window: inputs are capped at 64 tokens, which with character-level tokenization means 64 characters
- English-language focused
- No contextual understanding beyond simple semantic similarity

## Training

- **Dataset**: MS MARCO passage retrieval dataset
- **Training Method**: Contrastive learning with triplet loss
- **Hardware**: NVIDIA GPU
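
For readers unfamiliar with triplet loss, the sketch below shows its usual form: for a query, a relevant (positive) document, and an irrelevant (negative) one, the loss is `max(0, d(q, pos) - d(q, neg) + margin)`. The card does not state the distance function or the margin, so the Euclidean distance, the margin values, and the toy vectors here are assumptions for illustration only.

```python
import math


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the positive is closer to the anchor than the negative by at least margin."""
    return max(0.0,
               euclidean_distance(anchor, positive)
               - euclidean_distance(anchor, negative)
               + margin)


# Toy 2-dimensional vectors standing in for encoded query and passages.
query = [1.0, 0.0]
relevant = [0.9, 0.1]
irrelevant = [0.0, 1.0]

# The relevant passage is already much closer, so the loss is zero at margin 1.0.
print(triplet_loss(query, relevant, irrelevant, margin=1.0))  # 0.0

# Swapping the passages violates the margin and yields a positive loss.
print(triplet_loss(query, irrelevant, relevant, margin=1.0) > 0.0)  # True
```

Minimising this quantity over many (query, positive, negative) triples pushes relevant passages toward their queries and irrelevant ones away, which is what makes similarity ranking meaningful at retrieval time.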