update README
README.md
CHANGED
@@ -3330,7 +3330,7 @@ model = AutoModel.from_pretrained('Salesforce/SFR-Embedding-Mistral')
 # get the embeddings
 max_length = 4096
 input_texts = queries + passages
-batch_dict = tokenizer(input_texts, max_length=max_length
+batch_dict = tokenizer(input_texts, max_length=max_length, padding=True, truncation=True, return_tensors="pt")
 outputs = model(**batch_dict)
 embeddings = last_token_pool(outputs.last_hidden_state, batch_dict['attention_mask'])
 
@@ -3369,7 +3369,9 @@ print(scores.tolist())
 # [[86.71537780761719, 36.645721435546875], [35.00497055053711, 82.07388305664062]]
 ```
 
-
+### MTEB Benchmark Evaluation
+Check out [unilm/e5](https://github.com/microsoft/unilm/tree/master/e5) to reproduce evaluation results on the [BEIR](https://arxiv.org/abs/2104.08663) and [MTEB](https://arxiv.org/abs/2210.07316) benchmark.
+
 
 SFR-Embedding Team (∗indicates lead contributors).
 * Rui Meng*
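The fix above completes the tokenizer call, adding `padding=True` so batched sequences share one length. Once sequences are right-padded, the `last_token_pool` helper that the README's snippet calls must select the final non-padding position of each sequence rather than simply taking position -1. A minimal pure-Python sketch of that pooling idea (toy nested lists stand in for model tensors; the actual helper operates on PyTorch tensors and also handles left padding):

```python
def last_token_pool(last_hidden_states, attention_mask):
    """Pick the hidden state of the last non-padding token per sequence."""
    pooled = []
    for hidden, mask in zip(last_hidden_states, attention_mask):
        last = sum(mask) - 1  # index of the final non-padding token
        pooled.append(hidden[last])
    return pooled

# two sequences of length 4; the second is right-padded after 2 tokens
hidden = [
    [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]],
    [[4.0, 4.1], [5.0, 5.1], [6.0, 6.1], [7.0, 7.1]],
]
mask = [[1, 1, 1, 1], [1, 1, 0, 0]]
embeddings = last_token_pool(hidden, mask)
# → [[3.0, 3.1], [5.0, 5.1]]
```

Without the attention-mask lookup, the padded second sequence would be pooled from a padding position, which is why the tokenizer call and the pooling function have to agree on the padding scheme.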