mohalisad ghazal-zamani committed on
Commit
bd8f88c
·
verified ·
1 Parent(s): 72c52cc

Update README.md (#3)

- Update README.md (3fa21d66f42938a051b30ce9a4a15f79512d91f2)


Co-authored-by: Ghazal Zamaninejad <ghazal-zamani@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +10 -10
README.md CHANGED

@@ -6,7 +6,7 @@ base_model:
 library_name: sentence-transformers
 ---
 
-# SentenceTransformer
+# TookaSBERT-Large2
 
 
 This model is a Sentence Transformers model trained for semantic textual similarity and embedding tasks. It maps sentences and paragraphs to a dense vector space, where semantically similar texts are close together.
@@ -28,7 +28,7 @@ Then you can load this model and run inference.
 from sentence_transformers import SentenceTransformer
 
 # Download from the 🤗 Hub
-model = SentenceTransformer("PartAI/Tooka-SBERT")
+model = SentenceTransformer("PartAI/TookaSBERT-Large2")
 # Run inference
 sentences = [
     'درنا از پرندگان مهاجر با پاهای بلند و گردن دراز است.',
@@ -73,14 +73,14 @@ For *Retrieval* and *Reranking* tasks, we follow the same asymmetric structure,
 - `"متن: "` to documents
 
 
-| Model | Pair-Classification-Avg | Classification-Avg | Retrieval-Avg | Reranking-Avg | Overall-Avg |
-|--------------------------------------------------------------------------------|-------------------------|--------------------|---------------|---------------|-------------|
-| [multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base) | 70.76 | 69.71 | 63.90 | 76.01 | 69.33 |
-| [multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) | 72.55 | 72.18 | **65.36** | **78.52** | **71.44** |
-| [jina-embeddings-v3](https://huggingface.co/jinaai/jina-embeddings-v3) | 71.88 | **79.27** | 65.18 | 64.62 | 71.37 |
-| tooka-sbert-large-v1 | **81.52** | 71.54 | 45.61 | 60.44 | 62.54 |
-| tooka-sbert-base-v2 | 75.69 | 72.16 | 61.24 | 73.40 | 69.49 |
-| tooka-sbert-large-v2 | 80.24 | 74.73 | 59.80 | 73.44 | 70.54 |
+| Model | #Params | Pair-Classification-Avg | Classification-Avg | Retrieval-Avg | Reranking-Avg | Tasks-Avg |
+|--------------------------------------------------------------------------------|:-------:|-------------------------|--------------------|---------------|---------------|-----------|
+| [multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base) | 278M | 70.76 | 69.71 | 63.90 | 76.01 | 70.09 |
+| [multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) | 560M | 72.55 | 72.18 | **65.36** | **78.52** | **72.15** |
+| [jina-embeddings-v3](https://huggingface.co/jinaai/jina-embeddings-v3) | 572M | 71.88 | **79.27** | 65.18 | 64.62 | 70.24 |
+| tooka-sbert-large-v1 | 353M | **81.52** | 71.54 | 45.61 | 60.44 | 64.78 |
+| tooka-sbert-base-v2 | 123M | 75.69 | 72.16 | 61.24 | 73.40 | 70.62 |
+| tooka-sbert-large-v2 | 353M | 80.24 | 74.73 | 59.80 | 73.44 | 72.05 |
 
 
 ### Task-Specific Datasets in PTEB
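The README text in this diff says the model maps sentences and paragraphs into a dense vector space where semantically similar texts are close together. As a minimal, self-contained sketch of how that closeness is typically scored, here is plain-Python cosine similarity over toy vectors; the three "embeddings" below are illustrative stand-ins, not real `model.encode` outputs:

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors: 1.0 means
    identical direction, near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings": two bird-related sentences and one unrelated one.
emb_crane = [0.9, 0.2, 0.1]
emb_heron = [0.8, 0.3, 0.0]
emb_engine = [0.1, 0.0, 0.9]

print(cosine_similarity(emb_crane, emb_heron))   # high: close in the vector space
print(cosine_similarity(emb_crane, emb_engine))  # low: far apart
```

In practice the vectors come from `model.encode(sentences)`, and Sentence Transformers provides `model.similarity(...)` for the same pairwise comparison in bulk.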