Spaces: flax-sentence-embeddings/sentence-embeddings

Trent committed on Jul 26, 2021

Commit 113ad6b (1 parent: 6a49bc1)

Contributions

Files changed (1)

app.py

@@ -18,10 +18,15 @@ Hi! This is the demo for the [flax sentence embeddings](https://huggingface.co/f
 We trained three general-purpose flax-sentence-embeddings models: a **distilroberta base**, a **mpnet base** and a **minilm-l6**.
 The models were trained on a dataset comprising of [1 Billion+ training corpus](https://huggingface.co/flax-sentence-embeddings/all_datasets_v4_MiniLM-L6#training-data) with the v3 setup.
-In addition, we trained [20 models](https://huggingface.co/flax-sentence-embeddings) focused on general-purpose, QuestionAnswering and Code search.
+In addition, we trained [20 models](https://huggingface.co/flax-sentence-embeddings) focused on general-purpose, QuestionAnswering and Code search and achieved SOTA on multiple benchmarks.
 We also uploaded [8 datasets](https://huggingface.co/flax-sentence-embeddings) specialized for Question Answering, Sentence-Similiarity and Gender Evaluation.
 You can view our models and datasets [here](https://huggingface.co/flax-sentence-embeddings).
+| Model     | [FullEvaluation](https://docs.google.com/spreadsheets/d/1vXJrIg38cEaKjOG5y4I4PQwAQFUmCkohbViJ9zj_Emg/edit#gid=1809754143) Average     | 20Newsgroups Clustering | StackOverflow DupQuestions | Twitter SemEval2015    |
+|-----------|---------------------------------------|-------|-------|-------|
+| paraphrase-mpnet-base-v2 (previous SOTA)  | 67.97 | 47.79 | 49.03 | 72.36 |
+| all_datasets_v3_roberta-large (400k steps) | **70.22** | 50.12 | 52.18 | 75.28 |
 ## Contributions
 - 20 performant Sentence Embedding models that can be utilized for Sentence Simliarity / Asymmetric QA / Search & Clustering.