Sentence Similarity
sentence-transformers
PyTorch
Safetensors
Transformers
Polish
xlm-roberta
feature-extraction
information-retrieval
text-embeddings-inference
Instructions to use sdadas/mmlw-retrieval-e5-base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers

How to use sdadas/mmlw-retrieval-e5-base with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sdadas/mmlw-retrieval-e5-base")

sentences = [
    "query: Jak dożyć 100 lat?",
    "passage: Trzeba zdrowo się odżywiać i uprawiać sport.",
    "passage: Trzeba pić alkohol, imprezować i jeździć szybkimi autami.",
    "passage: Gdy trwała kampania politycy zapewniali, że rozprawią się z zakazem niedzielnego handlu."
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
```

- Transformers

How to use sdadas/mmlw-retrieval-e5-base with Transformers:

```python
# Load model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("sdadas/mmlw-retrieval-e5-base")
model = AutoModel.from_pretrained("sdadas/mmlw-retrieval-e5-base")
```

- Notebooks
- Google Colab
- Kaggle
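The sentence-transformers snippet above produces a 4×4 similarity matrix, but it stops short of picking an answer. The sketch below shows one way a query vector could be compared against passage vectors and the passages ranked. It assumes cosine similarity as the scoring function (the usual default of `model.similarity` unless a model is configured otherwise) and uses small made-up vectors in place of real embeddings, so it runs without downloading the model. The helper names `cosine` and `rank_passages` are hypothetical, and `best_answer` is borrowed from the README line visible in the diff hunk header.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted from most to least similar, plus raw scores."""
    scores = [cosine(query_vec, p) for p in passage_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy stand-ins for model.encode(...) output (assumption: real embeddings
# have many more dimensions; these values are invented for illustration).
query = [1.0, 0.0, 1.0]
passages = [
    [0.9, 0.1, 0.8],   # points in nearly the same direction as the query
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [-1.0, 0.0, 0.2],  # points mostly away from the query
]

order, scores = rank_passages(query, passages)
best_answer = order[0]
print(best_answer)
```

With real embeddings, the query text would carry the `query: ` prefix and each candidate the `passage: ` prefix, as in the example sentences above.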
Update README.md
README.md
CHANGED
@@ -59,4 +59,17 @@ print(answers[best_answer])
 The model achieves **NDCG@10** of **56.09** on the Polish Information Retrieval Benchmark. See [PIRB Leaderboard](https://huggingface.co/spaces/sdadas/pirb) for detailed results.
 
 ## Acknowledgements
-This model was trained with the A100 GPU cluster support delivered by the Gdansk University of Technology within the TASK center initiative.
+This model was trained with the A100 GPU cluster support delivered by the Gdansk University of Technology within the TASK center initiative.
+
+## Citation
+
+```bibtex
+@article{dadas2024pirb,
+title={{PIRB}: A Comprehensive Benchmark of Polish Dense and Hybrid Text Retrieval Methods},
+author={Sławomir Dadas and Michał Perełkiewicz and Rafał Poświata},
+year={2024},
+eprint={2402.13350},
+archivePrefix={arXiv},
+primaryClass={cs.CL}
+}
+```
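The README in this diff reports an NDCG@10 score without defining the metric. As a rough illustration, the sketch below computes NDCG@k for a single query using the common linear-gain formulation. This is an assumption: the exact evaluation code behind PIRB is not shown here and may differ in detail. The function names `dcg_at_k` and `ndcg_at_k` are hypothetical. Benchmark figures such as 56.09 are typically this per-query value averaged over queries and datasets and multiplied by 100, which is also an assumption about reporting convention.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k results.

    `relevances` lists the graded relevance of each retrieved document,
    in ranked order (position 0 is the top result).
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG normalised by the DCG of the ideal ordering.

    Assumption: `relevances` covers every judged document for the query,
    so sorting it yields the ideal ranking.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant documents at all
    return dcg_at_k(relevances, k) / ideal


# The only relevant passage is ranked second instead of first.
print(ndcg_at_k([0, 1, 0, 0]))
```

A perfect ranking scores 1.0, and the score drops as relevant passages slip down the list, which is why the metric suits retrieval models like this one.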