Sentence Similarity
sentence-transformers
PyTorch
Safetensors
Transformers
German
bert
feature-extraction
information retrieval
ir
documents retrieval
passage retrieval
beir
benchmark
qrel
sts
semantic search
text-embeddings-inference
- Libraries
- sentence-transformers
How to use PM-AI/bi-encoder_msmarco_bert-base_german with sentence-transformers:

    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("PM-AI/bi-encoder_msmarco_bert-base_german")

    sentences = [
        "Das ist eine glückliche Person",
        "Das ist ein glücklicher Hund",
        "Das ist eine sehr glückliche Person",
        "Heute ist ein sonniger Tag"
    ]
    embeddings = model.encode(sentences)

    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # [4, 4]

- Transformers
How to use PM-AI/bi-encoder_msmarco_bert-base_german with Transformers:

    # Load model directly
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("PM-AI/bi-encoder_msmarco_bert-base_german")
    model = AutoModel.from_pretrained("PM-AI/bi-encoder_msmarco_bert-base_german")
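The sentence-transformers snippet above compares every sentence with every other one and prints a 4 x 4 result. As a rough illustration of what such a pairwise comparison computes, here is a minimal pure-Python sketch. It assumes cosine similarity as the metric (the library lets you configure this, so treat it as an assumption), and the function names and toy vectors are hypothetical stand-ins, not part of the model or the library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(embeddings):
    """Pairwise similarities: entry [i][j] compares embeddings[i] with embeddings[j]."""
    return [[cosine_similarity(u, v) for v in embeddings] for u in embeddings]


# Hypothetical 3-dimensional toy vectors standing in for real sentence embeddings,
# which would come from model.encode(...) and have far more dimensions.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
]

matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print([round(value, 3) for value in row])
```

With n input vectors the result is an n x n matrix, symmetric, with ones on the diagonal for non-zero vectors. For semantic search, one would typically compare a single query embedding against many passage embeddings and sort by score.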
Update README.md
README.md
CHANGED
@@ -36,8 +36,6 @@ Details are presented below.
 
 The model can be easily used with [Sentence Transformer](https://github.com/UKPLab/sentence-transformers) library.
 
-tl;dr ... [go to evaluation results first](#evaluation)
-
 ## Training Data
 The model is based on training with samples from **[MSMARCO Passage Ranking](https://microsoft.github.io/msmarco/#ranking)** dataset.
 It contains about 500.000 questions and 8.8 million passages.