---
language: en
license: mit
datasets:
- microsoft/ms_marco
tags:
- dual-encoder
- two-tower
- neural-ir
- information-retrieval
---
# MS MARCO Dual Encoder Model
This repository contains a dual encoder (two-tower) model trained on the Microsoft MS MARCO dataset for information retrieval.
## Model Details
- **Architecture**: Two-Tower (Dual Encoder)
- **Embedding Dimension**: 128
- **Training Strategy**: Triplet loss with margin 0.2
- **Vocabulary Size**: 50,001
- **Dataset Size**: 5,000
- **Parameters**:
  - Query Tower: 16,512
  - Document Tower: 16,512
  - Total: 33,024
- **Training Device**: cuda
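The per-tower count of 16,512 equals 128 × 128 + 128, which is what a single fully connected layer mapping 128 inputs to 128 outputs (with bias) would have. The actual tower definitions are not shown in this card, so the following is only a consistency check under that assumption, using a plain `nn.Linear` as a hypothetical stand-in for one tower:

```python
import torch.nn as nn

def count_parameters(module: nn.Module) -> int:
    """Return the number of trainable parameters in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

# Hypothetical stand-in for one tower: a single 128 -> 128 linear layer.
# This is an assumption; the real QryTower/DocTower may be structured differently.
embedding_dim = 128
tower = nn.Linear(embedding_dim, embedding_dim)

per_tower = count_parameters(tower)  # 128 * 128 weights + 128 biases
total = 2 * per_tower                # query tower + document tower
print(per_tower, total)              # 16512 33024
```

Whatever the real layout is, a 50,001 × 128 embedding table would be far larger than 33,024 parameters, so the listed counts presumably cover the towers only.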
## Usage
```python
import torch
from model import QryTower, DocTower  # model.py must be importable

# Build the towers and load the trained weights
embedding_dim = 128
qry_model = QryTower(embedding_dim)
doc_model = DocTower(embedding_dim)
qry_model.load_state_dict(torch.load("qry_tower.pth", map_location="cpu"))
doc_model.load_state_dict(torch.load("doc_tower.pth", map_location="cpu"))
qry_model.eval()
doc_model.eval()

# preprocessed_query and preprocessed_document are input tensors that you
# must prepare the same way as during training (not shown here)
with torch.no_grad():
    query_embedding = qry_model(preprocessed_query)
    document_embedding = doc_model(preprocessed_document)

# Cosine similarity along the embedding dimension
similarity = torch.cosine_similarity(query_embedding, document_embedding, dim=-1)
```
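For retrieval, a query is usually compared against many documents at once rather than a single one. A minimal sketch of ranking by cosine similarity follows. The `rank_documents` helper is hypothetical and not part of this repository, and random tensors stand in for real tower outputs:

```python
import torch
import torch.nn.functional as F

def rank_documents(query_emb: torch.Tensor, doc_embs: torch.Tensor, k: int = 3):
    """Return (scores, indices) of the k documents most similar to the query.

    query_emb has shape (dim,); doc_embs has shape (num_docs, dim).
    """
    # Broadcast the single query against every document row.
    scores = F.cosine_similarity(query_emb.unsqueeze(0), doc_embs, dim=-1)
    k = min(k, doc_embs.shape[0])
    return torch.topk(scores, k)  # sorted from most to least similar

torch.manual_seed(0)
query_emb = torch.randn(128)     # placeholder for a query tower output
doc_embs = torch.randn(10, 128)  # placeholder for document tower outputs
doc_embs[4] = 2.0 * query_emb    # same direction as the query, so similarity is 1

top_scores, top_indices = rank_documents(query_emb, doc_embs)
print(top_indices.tolist(), top_scores.tolist())
```

Because cosine similarity ignores vector length, document 4 ranks first even though it is twice as long as the query embedding.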
## Training
The model was trained for 5 epochs with a batch size of 32 and a learning rate of 0.001.
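A single training step under these settings might look like the sketch below. Several pieces are assumptions not stated in this card: the cosine-similarity form of the triplet loss, the Adam optimizer, the `nn.Linear` stand-ins for the towers, and the random tensors standing in for preprocessed query, positive, and negative inputs. Only the margin (0.2), batch size (32), and learning rate (0.001) come from the card.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def triplet_loss(q, pos, neg, margin: float = 0.2):
    """Hinge loss pushing sim(q, pos) above sim(q, neg) by at least `margin`."""
    pos_sim = F.cosine_similarity(q, pos, dim=-1)
    neg_sim = F.cosine_similarity(q, neg, dim=-1)
    return torch.clamp(margin - pos_sim + neg_sim, min=0.0).mean()

torch.manual_seed(0)
embedding_dim, batch_size = 128, 32

# Hypothetical stand-ins for QryTower / DocTower (assumption).
qry_model = nn.Linear(embedding_dim, embedding_dim)
doc_model = nn.Linear(embedding_dim, embedding_dim)

# Assumed optimizer; only the learning rate comes from the card.
params = list(qry_model.parameters()) + list(doc_model.parameters())
optimizer = torch.optim.Adam(params, lr=0.001)

# Random placeholders for one batch of (query, positive doc, negative doc).
query = torch.randn(batch_size, embedding_dim)
pos_doc = torch.randn(batch_size, embedding_dim)
neg_doc = torch.randn(batch_size, embedding_dim)

optimizer.zero_grad()
loss = triplet_loss(qry_model(query), doc_model(pos_doc), doc_model(neg_doc))
loss.backward()
optimizer.step()
print(f"triplet loss: {loss.item():.4f}")
```

The loss is zero for any triplet where the positive document already beats the negative by at least the margin, so only the harder triplets contribute gradients.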
## License
MIT