| | --- |
| | inference: false |
| | language: |
| | - ar |
| | library_name: sentence-transformers |
| | tags: |
| | - mteb |
| | - sentence-transformers |
| | - sentence-similarity |
| | - feature-extraction |
| | - generated_from_trainer |
| | - dataset_size:557850 |
| | - loss:MatryoshkaLoss |
| | - loss:MultipleNegativesRankingLoss |
| | base_model: sentence-transformers/LaBSE |
| | datasets: |
| | - Omartificial-Intelligence-Space/Arabic-NLi-Triplet |
| | metrics: |
| | - pearson_cosine |
| | - spearman_cosine |
| | - pearson_manhattan |
| | - spearman_manhattan |
| | - pearson_euclidean |
| | - spearman_euclidean |
| | - pearson_dot |
| | - spearman_dot |
| | - pearson_max |
| | - spearman_max |
| | widget: |
| | - source_sentence: ذكر متوازن بعناية يقف على قدم واحدة بالقرب من منطقة شاطئ المحيط النظيفة |
| | sentences: |
| | - رجل يقدم عرضاً |
| | - هناك رجل بالخارج قرب الشاطئ |
| | - رجل يجلس على أريكه |
| | - source_sentence: رجل يقفز إلى سريره القذر |
| | sentences: |
| | - السرير قذر. |
| | - رجل يضحك أثناء غسيل الملابس |
| | - الرجل على القمر |
| | - source_sentence: الفتيات بالخارج |
| | sentences: |
| | - امرأة تلف الخيط إلى كرات بجانب كومة من الكرات |
| | - فتيان يركبان في جولة متعة |
| | - >- |
| | ثلاث فتيات يقفون سوية في غرفة واحدة تستمع وواحدة تكتب على الحائط والثالثة |
| | تتحدث إليهن |
| | - source_sentence: الرجل يرتدي قميصاً أزرق. |
| | sentences: |
| | - >- |
| | رجل يرتدي قميصاً أزرق يميل إلى الجدار بجانب الطريق مع شاحنة زرقاء وسيارة |
| | حمراء مع الماء في الخلفية. |
| | - كتاب القصص مفتوح |
| | - رجل يرتدي قميص أسود يعزف على الجيتار. |
| | - source_sentence: يجلس شاب ذو شعر أشقر على الحائط يقرأ جريدة بينما تمر امرأة وفتاة شابة. |
| | sentences: |
| | - ذكر شاب ينظر إلى جريدة بينما تمر إمرأتان بجانبه |
| | - رجل يستلقي على وجهه على مقعد في الحديقة. |
| | - الشاب نائم بينما الأم تقود ابنتها إلى الحديقة |
| | pipeline_tag: sentence-similarity |
| | model-index: |
| | - name: Omartificial-Intelligence-Space/Arabic-labse-Matryoshka |
| | results: |
| | - dataset: |
| | config: ar |
| | name: MTEB MintakaRetrieval (ar) |
| | revision: efa78cc2f74bbcd21eff2261f9e13aebe40b814e |
| | split: test |
| | type: mintaka/mmteb-mintaka |
| | metrics: |
| | - type: main_score |
| | value: 14.585 |
| | - type: map_at_1 |
| | value: 8.352 |
| | - type: map_at_3 |
| | value: 10.917 |
| | - type: map_at_5 |
| | value: 11.634 |
| | - type: map_at_10 |
| | value: 12.254 |
| | - type: ndcg_at_1 |
| | value: 8.352 |
| | - type: ndcg_at_3 |
| | value: 11.794 |
| | - type: ndcg_at_5 |
| | value: 13.085 |
| | - type: ndcg_at_10 |
| | value: 14.585 |
| | - type: recall_at_1 |
| | value: 8.352 |
| | - type: recall_at_3 |
| | value: 14.344 |
| | - type: recall_at_5 |
| | value: 17.476 |
| | - type: recall_at_10 |
| | value: 22.106 |
| | - type: precision_at_1 |
| | value: 8.352 |
| | - type: precision_at_3 |
| | value: 4.781 |
| | - type: precision_at_5 |
| | value: 3.495 |
| | - type: precision_at_10 |
| | value: 2.211 |
| | - type: mrr_at_1 |
| | value: 8.3522 |
| | - type: mrr_at_3 |
| | value: 10.9169 |
| | - type: mrr_at_5 |
| | value: 11.6341 |
| | - type: mrr_at_10 |
| | value: 12.2543 |
| | task: |
| | type: Retrieval |
| | - dataset: |
| | config: ar |
| | name: MTEB MIRACLRetrievalHardNegatives (ar) |
| | revision: 95c8db7d4a6e9c1d8a60601afd63d553ae20a2eb |
| | split: dev |
| | type: miracl/mmteb-miracl-hardnegatives |
| | metrics: |
| | - type: main_score |
| | value: 18.836 |
| | - type: map_at_1 |
| | value: 6.646 |
| | - type: map_at_3 |
| | value: 10.692 |
| | - type: map_at_5 |
| | value: 11.969 |
| | - type: map_at_10 |
| | value: 13.446 |
| | - type: ndcg_at_1 |
| | value: 10.5 |
| | - type: ndcg_at_3 |
| | value: 13.645 |
| | - type: ndcg_at_5 |
| | value: 15.504 |
| | - type: ndcg_at_10 |
| | value: 18.836 |
| | - type: recall_at_1 |
| | value: 6.646 |
| | - type: recall_at_3 |
| | value: 15.361 |
| | - type: recall_at_5 |
| | value: 19.925 |
| | - type: recall_at_10 |
| | value: 28.6 |
| | - type: precision_at_1 |
| | value: 10.5 |
| | - type: precision_at_3 |
| | value: 8.533 |
| | - type: precision_at_5 |
| | value: 6.9 |
| | - type: precision_at_10 |
| | value: 5.21 |
| | - type: mrr_at_1 |
| | value: 10.5 |
| | - type: mrr_at_3 |
| | value: 16.25 |
| | - type: mrr_at_5 |
| | value: 17.68 |
| | - type: mrr_at_10 |
| | value: 19.1759 |
| | task: |
| | type: Retrieval |
| | - dataset: |
| | config: ar |
| | name: MTEB MLQARetrieval (ar) |
| | revision: 397ed406c1a7902140303e7faf60fff35b58d285 |
| | split: validation |
| | type: mlqa/mmteb-mlqa |
| | metrics: |
| | - type: main_score |
| | value: 61.582 |
| | - type: map_at_1 |
| | value: 47.195 |
| | - type: map_at_3 |
| | value: 54.03 |
| | - type: map_at_5 |
| | value: 55.77 |
| | - type: map_at_10 |
| | value: 56.649 |
| | - type: ndcg_at_1 |
| | value: 47.195 |
| | - type: ndcg_at_3 |
| | value: 56.295 |
| | - type: ndcg_at_5 |
| | value: 59.417 |
| | - type: ndcg_at_10 |
| | value: 61.582 |
| | - type: recall_at_1 |
| | value: 47.195 |
| | - type: recall_at_3 |
| | value: 62.863 |
| | - type: recall_at_5 |
| | value: 70.406 |
| | - type: recall_at_10 |
| | value: 77.176 |
| | - type: precision_at_1 |
| | value: 47.195 |
| | - type: precision_at_3 |
| | value: 20.954 |
| | - type: precision_at_5 |
| | value: 14.081 |
| | - type: precision_at_10 |
| | value: 7.718 |
| | - type: mrr_at_1 |
| | value: 47.1954 |
| | - type: mrr_at_3 |
| | value: 54.0297 |
| | - type: mrr_at_5 |
| | value: 55.7705 |
| | - type: mrr_at_10 |
| | value: 56.6492 |
| | task: |
| | type: Retrieval |
| | - dataset: |
| | config: default |
| | name: MTEB SadeemQuestionRetrieval (ar) |
| | revision: 3cb0752b182e5d5d740df547748b06663c8e0bd9 |
| | split: test |
| | type: sadeem/mmteb-sadeem |
| | metrics: |
| | - type: main_score |
| | value: 57.653 |
| | - type: map_at_1 |
| | value: 25.084 |
| | - type: map_at_3 |
| | value: 46.338 |
| | - type: map_at_5 |
| | value: 47.556 |
| | - type: map_at_10 |
| | value: 48.207 |
| | - type: ndcg_at_1 |
| | value: 25.084 |
| | - type: ndcg_at_3 |
| | value: 53.91 |
| | - type: ndcg_at_5 |
| | value: 56.102 |
| | - type: ndcg_at_10 |
| | value: 57.653 |
| | - type: recall_at_1 |
| | value: 25.084 |
| | - type: recall_at_3 |
| | value: 76.017 |
| | - type: recall_at_5 |
| | value: 81.331 |
| | - type: recall_at_10 |
| | value: 86.07 |
| | - type: precision_at_1 |
| | value: 25.084 |
| | - type: precision_at_3 |
| | value: 25.339 |
| | - type: precision_at_5 |
| | value: 16.266 |
| | - type: precision_at_10 |
| | value: 8.607 |
| | - type: mrr_at_1 |
| | value: 23.1211 |
| | - type: mrr_at_3 |
| | value: 44.9657 |
| | - type: mrr_at_5 |
| | value: 46.3037 |
| | - type: mrr_at_10 |
| | value: 46.8749 |
| | task: |
| | type: Retrieval |
| | - dataset: |
| | config: default |
| | name: MTEB BIOSSES (default) |
| | revision: d3fb88f8f02e40887cd149695127462bbcf29b4a |
| | split: test |
| | type: mteb/biosses-sts |
| | metrics: |
| | - type: cosine_pearson |
| | value: 76.46793440999714 |
| | - type: cosine_spearman |
| | value: 76.66439745271298 |
| | - type: euclidean_pearson |
| | value: 76.52075972347127 |
| | - type: euclidean_spearman |
| | value: 76.66439745271298 |
| | - type: main_score |
| | value: 76.66439745271298 |
| | - type: manhattan_pearson |
| | value: 76.68001857069733 |
| | - type: manhattan_spearman |
| | value: 76.73066402288269 |
| | task: |
| | type: STS |
| | - dataset: |
| | config: default |
| | name: MTEB SICK-R (default) |
| | revision: 20a6d6f312dd54037fe07a32d58e5e168867909d |
| | split: test |
| | type: mteb/sickr-sts |
| | metrics: |
| | - type: cosine_pearson |
| | value: 79.67657890693198 |
| | - type: cosine_spearman |
| | value: 77.03286420274621 |
| | - type: euclidean_pearson |
| | value: 78.1960735272073 |
| | - type: euclidean_spearman |
| | value: 77.032855497919 |
| | - type: main_score |
| | value: 77.03286420274621 |
| | - type: manhattan_pearson |
| | value: 78.25627275994229 |
| | - type: manhattan_spearman |
| | value: 77.00430810589081 |
| | task: |
| | type: STS |
| | - dataset: |
| | config: default |
| | name: MTEB STS12 (default) |
| | revision: a0d554a64d88156834ff5ae9920b964011b16384 |
| | split: test |
| | type: mteb/sts12-sts |
| | metrics: |
| | - type: cosine_pearson |
| | value: 83.94288954523996 |
| | - type: cosine_spearman |
| | value: 79.21432176112556 |
| | - type: euclidean_pearson |
| | value: 81.21333251943913 |
| | - type: euclidean_spearman |
| | value: 79.2152067330468 |
| | - type: main_score |
| | value: 79.21432176112556 |
| | - type: manhattan_pearson |
| | value: 81.16910737482634 |
| | - type: manhattan_spearman |
| | value: 79.08756466301445 |
| | task: |
| | type: STS |
| | - dataset: |
| | config: default |
| | name: MTEB STS13 (default) |
| | revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca |
| | split: test |
| | type: mteb/sts13-sts |
| | metrics: |
| | - type: cosine_pearson |
| | value: 77.48393909963059 |
| | - type: cosine_spearman |
| | value: 79.54963868861196 |
| | - type: euclidean_pearson |
| | value: 79.28416002197451 |
| | - type: euclidean_spearman |
| | value: 79.54963861790114 |
| | - type: main_score |
| | value: 79.54963868861196 |
| | - type: manhattan_pearson |
| | value: 79.18653917582513 |
| | - type: manhattan_spearman |
| | value: 79.46713533414295 |
| | task: |
| | type: STS |
| | - dataset: |
| | config: default |
| | name: MTEB STS14 (default) |
| | revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 |
| | split: test |
| | type: mteb/sts14-sts |
| | metrics: |
| | - type: cosine_pearson |
| | value: 78.51596313692846 |
| | - type: cosine_spearman |
| | value: 78.84601702652395 |
| | - type: euclidean_pearson |
| | value: 78.55199809961427 |
| | - type: euclidean_spearman |
| | value: 78.84603362286225 |
| | - type: main_score |
| | value: 78.84601702652395 |
| | - type: manhattan_pearson |
| | value: 78.52780170677605 |
| | - type: manhattan_spearman |
| | value: 78.77744294039178 |
| | task: |
| | type: STS |
| | - dataset: |
| | config: default |
| | name: MTEB STS15 (default) |
| | revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 |
| | split: test |
| | type: mteb/sts15-sts |
| | metrics: |
| | - type: cosine_pearson |
| | value: 84.53393478889929 |
| | - type: cosine_spearman |
| | value: 85.60821849381648 |
| | - type: euclidean_pearson |
| | value: 85.32813923250558 |
| | - type: euclidean_spearman |
| | value: 85.6081835456016 |
| | - type: main_score |
| | value: 85.60821849381648 |
| | - type: manhattan_pearson |
| | value: 85.32782097916476 |
| | - type: manhattan_spearman |
| | value: 85.58098670898562 |
| | task: |
| | type: STS |
| | - dataset: |
| | config: default |
| | name: MTEB STS16 (default) |
| | revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 |
| | split: test |
| | type: mteb/sts16-sts |
| | metrics: |
| | - type: cosine_pearson |
| | value: 77.00196998325856 |
| | - type: cosine_spearman |
| | value: 79.930951699069 |
| | - type: euclidean_pearson |
| | value: 79.43196738390897 |
| | - type: euclidean_spearman |
| | value: 79.93095112410258 |
| | - type: main_score |
| | value: 79.930951699069 |
| | - type: manhattan_pearson |
| | value: 79.33744358111427 |
| | - type: manhattan_spearman |
| | value: 79.82939266539601 |
| | task: |
| | type: STS |
| | - dataset: |
| | config: ar-ar |
| | name: MTEB STS17 (ar-ar) |
| | revision: faeb762787bd10488a50c8b5be4a3b82e411949c |
| | split: test |
| | type: mteb/sts17-crosslingual-sts |
| | metrics: |
| | - type: cosine_pearson |
| | value: 81.60289529424327 |
| | - type: cosine_spearman |
| | value: 82.46806381979653 |
| | - type: euclidean_pearson |
| | value: 81.32235058296072 |
| | - type: euclidean_spearman |
| | value: 82.46676890643914 |
| | - type: main_score |
| | value: 82.46806381979653 |
| | - type: manhattan_pearson |
| | value: 81.43885277175312 |
| | - type: manhattan_spearman |
| | value: 82.38955952718666 |
| | task: |
| | type: STS |
| | - dataset: |
| | config: ar |
| | name: MTEB STS22 (ar) |
| | revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3 |
| | split: test |
| | type: mteb/sts22-crosslingual-sts |
| | metrics: |
| | - type: cosine_pearson |
| | value: 49.58293768761314 |
| | - type: cosine_spearman |
| | value: 57.261888789832874 |
| | - type: euclidean_pearson |
| | value: 53.36549109538782 |
| | - type: euclidean_spearman |
| | value: 57.261888789832874 |
| | - type: main_score |
| | value: 57.261888789832874 |
| | - type: manhattan_pearson |
| | value: 53.06640323833928 |
| | - type: manhattan_spearman |
| | value: 57.05837935512948 |
| | task: |
| | type: STS |
| | - dataset: |
| | config: default |
| | name: MTEB STSBenchmark (default) |
| | revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 |
| | split: test |
| | type: mteb/stsbenchmark-sts |
| | metrics: |
| | - type: cosine_pearson |
| | value: 81.43997935928729 |
| | - type: cosine_spearman |
| | value: 82.04996129795596 |
| | - type: euclidean_pearson |
| | value: 82.01917866996972 |
| | - type: euclidean_spearman |
| | value: 82.04996129795596 |
| | - type: main_score |
| | value: 82.04996129795596 |
| | - type: manhattan_pearson |
| | value: 82.03487112040936 |
| | - type: manhattan_spearman |
| | value: 82.03774605775651 |
| | task: |
| | type: STS |
| | - dataset: |
| | config: default |
| | name: MTEB SummEval (default) |
| | revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c |
| | split: test |
| | type: mteb/summeval |
| | metrics: |
| | - type: cosine_pearson |
| | value: 32.113475997147674 |
| | - type: cosine_spearman |
| | value: 32.17194233764879 |
| | - type: dot_pearson |
| | value: 32.113469728827255 |
| | - type: dot_spearman |
| | value: 32.174771315355386 |
| | - type: main_score |
| | value: 32.17194233764879 |
| | - type: pearson |
| | value: 32.113475997147674 |
| | - type: spearman |
| | value: 32.17194233764879 |
| | task: |
| | type: Summarization |
| | - name: SentenceTransformer based on sentence-transformers/LaBSE |
| | results: |
| | - task: |
| | type: semantic-similarity |
| | name: Semantic Similarity |
| | dataset: |
| | name: sts test 768 |
| | type: sts-test-768 |
| | metrics: |
| | - type: pearson_cosine |
| | value: 0.7269177710249681 |
| | name: Pearson Cosine |
| | - type: spearman_cosine |
| | value: 0.7225258779395222 |
| | name: Spearman Cosine |
| | - type: pearson_manhattan |
| | value: 0.7259261785622463 |
| | name: Pearson Manhattan |
| | - type: spearman_manhattan |
| | value: 0.7210463582530393 |
| | name: Spearman Manhattan |
| | - type: pearson_euclidean |
| | value: 0.7259567884235211 |
| | name: Pearson Euclidean |
| | - type: spearman_euclidean |
| | value: 0.722525823788783 |
| | name: Spearman Euclidean |
| | - type: pearson_dot |
| | value: 0.7269177712136122 |
| | name: Pearson Dot |
| | - type: spearman_dot |
| | value: 0.7225258771129475 |
| | name: Spearman Dot |
| | - type: pearson_max |
| | value: 0.7269177712136122 |
| | name: Pearson Max |
| | - type: spearman_max |
| | value: 0.7225258779395222 |
| | name: Spearman Max |
| | - type: pearson_cosine |
| | value: 0.8143867576376295 |
| | name: Pearson Cosine |
| | - type: spearman_cosine |
| | value: 0.8205044914629483 |
| | name: Spearman Cosine |
| | - type: pearson_manhattan |
| | value: 0.8203365887013151 |
| | name: Pearson Manhattan |
| | - type: spearman_manhattan |
| | value: 0.8203816698535976 |
| | name: Spearman Manhattan |
| | - type: pearson_euclidean |
| | value: 0.8201809453496319 |
| | name: Pearson Euclidean |
| | - type: spearman_euclidean |
| | value: 0.8205044914629483 |
| | name: Spearman Euclidean |
| | - type: pearson_dot |
| | value: 0.8143867541070537 |
| | name: Pearson Dot |
| | - type: spearman_dot |
| | value: 0.8205044914629483 |
| | name: Spearman Dot |
| | - type: pearson_max |
| | value: 0.8203365887013151 |
| | name: Pearson Max |
| | - type: spearman_max |
| | value: 0.8205044914629483 |
| | name: Spearman Max |
| | - task: |
| | type: semantic-similarity |
| | name: Semantic Similarity |
| | dataset: |
| | name: sts test 512 |
| | type: sts-test-512 |
| | metrics: |
| | - type: pearson_cosine |
| | value: 0.7268389724271859 |
| | name: Pearson Cosine |
| | - type: spearman_cosine |
| | value: 0.7224359411000278 |
| | name: Spearman Cosine |
| | - type: pearson_manhattan |
| | value: 0.7241418669615103 |
| | name: Pearson Manhattan |
| | - type: spearman_manhattan |
| | value: 0.7195408311833029 |
| | name: Spearman Manhattan |
| | - type: pearson_euclidean |
| | value: 0.7248184919191593 |
| | name: Pearson Euclidean |
| | - type: spearman_euclidean |
| | value: 0.7212936866178097 |
| | name: Spearman Euclidean |
| | - type: pearson_dot |
| | value: 0.7252522928016701 |
| | name: Pearson Dot |
| | - type: spearman_dot |
| | value: 0.7205040482865328 |
| | name: Spearman Dot |
| | - type: pearson_max |
| | value: 0.7268389724271859 |
| | name: Pearson Max |
| | - type: spearman_max |
| | value: 0.7224359411000278 |
| | name: Spearman Max |
| | - type: pearson_cosine |
| | value: 0.8143448965624136 |
| | name: Pearson Cosine |
| | - type: spearman_cosine |
| | value: 0.8211700903453509 |
| | name: Spearman Cosine |
| | - type: pearson_manhattan |
| | value: 0.8217448619823571 |
| | name: Pearson Manhattan |
| | - type: spearman_manhattan |
| | value: 0.8216016599665544 |
| | name: Spearman Manhattan |
| | - type: pearson_euclidean |
| | value: 0.8216413349390971 |
| | name: Pearson Euclidean |
| | - type: spearman_euclidean |
| | value: 0.82188122418776 |
| | name: Spearman Euclidean |
| | - type: pearson_dot |
| | value: 0.8097020064483653 |
| | name: Pearson Dot |
| | - type: spearman_dot |
| | value: 0.8147306090545295 |
| | name: Spearman Dot |
| | - type: pearson_max |
| | value: 0.8217448619823571 |
| | name: Pearson Max |
| | - type: spearman_max |
| | value: 0.82188122418776 |
| | name: Spearman Max |
| | - task: |
| | type: semantic-similarity |
| | name: Semantic Similarity |
| | dataset: |
| | name: sts test 256 |
| | type: sts-test-256 |
| | metrics: |
| | - type: pearson_cosine |
| | value: 0.7283468617741852 |
| | name: Pearson Cosine |
| | - type: spearman_cosine |
| | value: 0.7264294106954872 |
| | name: Spearman Cosine |
| | - type: pearson_manhattan |
| | value: 0.7227711798003426 |
| | name: Pearson Manhattan |
| | - type: spearman_manhattan |
| | value: 0.718067982079232 |
| | name: Spearman Manhattan |
| | - type: pearson_euclidean |
| | value: 0.7251492361775083 |
| | name: Pearson Euclidean |
| | - type: spearman_euclidean |
| | value: 0.7215068115809131 |
| | name: Spearman Euclidean |
| | - type: pearson_dot |
| | value: 0.7243396991648858 |
| | name: Pearson Dot |
| | - type: spearman_dot |
| | value: 0.7221390873398206 |
| | name: Spearman Dot |
| | - type: pearson_max |
| | value: 0.7283468617741852 |
| | name: Pearson Max |
| | - type: spearman_max |
| | value: 0.7264294106954872 |
| | name: Spearman Max |
| | - type: pearson_cosine |
| | value: 0.8075613785257986 |
| | name: Pearson Cosine |
| | - type: spearman_cosine |
| | value: 0.8159258089804861 |
| | name: Spearman Cosine |
| | - type: pearson_manhattan |
| | value: 0.8208711370091426 |
| | name: Pearson Manhattan |
| | - type: spearman_manhattan |
| | value: 0.8196747601014518 |
| | name: Spearman Manhattan |
| | - type: pearson_euclidean |
| | value: 0.8210210137439432 |
| | name: Pearson Euclidean |
| | - type: spearman_euclidean |
| | value: 0.8203004500356083 |
| | name: Spearman Euclidean |
| | - type: pearson_dot |
| | value: 0.7870611647231145 |
| | name: Pearson Dot |
| | - type: spearman_dot |
| | value: 0.7874848213991118 |
| | name: Spearman Dot |
| | - type: pearson_max |
| | value: 0.8210210137439432 |
| | name: Pearson Max |
| | - type: spearman_max |
| | value: 0.8203004500356083 |
| | name: Spearman Max |
| | - task: |
| | type: semantic-similarity |
| | name: Semantic Similarity |
| | dataset: |
| | name: sts test 128 |
| | type: sts-test-128 |
| | metrics: |
| | - type: pearson_cosine |
| | value: 0.7102082520621849 |
| | name: Pearson Cosine |
| | - type: spearman_cosine |
| | value: 0.7103917869311991 |
| | name: Spearman Cosine |
| | - type: pearson_manhattan |
| | value: 0.7134729607181519 |
| | name: Pearson Manhattan |
| | - type: spearman_manhattan |
| | value: 0.708895102058259 |
| | name: Spearman Manhattan |
| | - type: pearson_euclidean |
| | value: 0.7171545288118942 |
| | name: Pearson Euclidean |
| | - type: spearman_euclidean |
| | value: 0.7130380237150746 |
| | name: Spearman Euclidean |
| | - type: pearson_dot |
| | value: 0.6777774738547628 |
| | name: Pearson Dot |
| | - type: spearman_dot |
| | value: 0.6746474823963989 |
| | name: Spearman Dot |
| | - type: pearson_max |
| | value: 0.7171545288118942 |
| | name: Pearson Max |
| | - type: spearman_max |
| | value: 0.7130380237150746 |
| | name: Spearman Max |
| | - type: pearson_cosine |
| | value: 0.8024378358145556 |
| | name: Pearson Cosine |
| | - type: spearman_cosine |
| | value: 0.8117561815472325 |
| | name: Spearman Cosine |
| | - type: pearson_manhattan |
| | value: 0.818920309459774 |
| | name: Pearson Manhattan |
| | - type: spearman_manhattan |
| | value: 0.8180515365910205 |
| | name: Spearman Manhattan |
| | - type: pearson_euclidean |
| | value: 0.8198346073356603 |
| | name: Pearson Euclidean |
| | - type: spearman_euclidean |
| | value: 0.8185162896024369 |
| | name: Spearman Euclidean |
| | - type: pearson_dot |
| | value: 0.7513270537478935 |
| | name: Pearson Dot |
| | - type: spearman_dot |
| | value: 0.7427542871546953 |
| | name: Spearman Dot |
| | - type: pearson_max |
| | value: 0.8198346073356603 |
| | name: Pearson Max |
| | - type: spearman_max |
| | value: 0.8185162896024369 |
| | name: Spearman Max |
| | - task: |
| | type: semantic-similarity |
| | name: Semantic Similarity |
| | dataset: |
| | name: sts test 64 |
| | type: sts-test-64 |
| | metrics: |
| | - type: pearson_cosine |
| | value: 0.6930745722517785 |
| | name: Pearson Cosine |
| | - type: spearman_cosine |
| | value: 0.6982194042238953 |
| | name: Spearman Cosine |
| | - type: pearson_manhattan |
| | value: 0.6971382079778946 |
| | name: Pearson Manhattan |
| | - type: spearman_manhattan |
| | value: 0.6942362764367931 |
| | name: Spearman Manhattan |
| | - type: pearson_euclidean |
| | value: 0.7012627015062325 |
| | name: Pearson Euclidean |
| | - type: spearman_euclidean |
| | value: 0.6986972295835788 |
| | name: Spearman Euclidean |
| | - type: pearson_dot |
| | value: 0.6376735798940838 |
| | name: Pearson Dot |
| | - type: spearman_dot |
| | value: 0.6344835722310429 |
| | name: Spearman Dot |
| | - type: pearson_max |
| | value: 0.7012627015062325 |
| | name: Pearson Max |
| | - type: spearman_max |
| | value: 0.6986972295835788 |
| | name: Spearman Max |
| | - type: pearson_cosine |
| | value: 0.7855080652087961 |
| | name: Pearson Cosine |
| | - type: spearman_cosine |
| | value: 0.7948979371698327 |
| | name: Spearman Cosine |
| | - type: pearson_manhattan |
| | value: 0.8060407473462375 |
| | name: Pearson Manhattan |
| | - type: spearman_manhattan |
| | value: 0.8041199691999044 |
| | name: Spearman Manhattan |
| | - type: pearson_euclidean |
| | value: 0.8088262858195556 |
| | name: Pearson Euclidean |
| | - type: spearman_euclidean |
| | value: 0.8060483394849104 |
| | name: Spearman Euclidean |
| | - type: pearson_dot |
| | value: 0.677754045289596 |
| | name: Pearson Dot |
| | - type: spearman_dot |
| | value: 0.6616232873061395 |
| | name: Spearman Dot |
| | - type: pearson_max |
| | value: 0.8088262858195556 |
| | name: Pearson Max |
| | - type: spearman_max |
| | value: 0.8060483394849104 |
| | name: Spearman Max |
| | license: apache-2.0 |
| | --- |
# SentenceTransformer based on sentence-transformers/LaBSE

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [sentence-transformers/LaBSE](https://huggingface.co/sentence-transformers/LaBSE) on the Omartificial-Intelligence-Space/Arabic-NLi-Triplet dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/LaBSE](https://huggingface.co/sentence-transformers/LaBSE) <!-- at revision e34fab64a3011d2176c99545a93d5cbddc9a91b7 -->
- **Maximum Sequence Length:** 256 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - Omartificial-Intelligence-Space/Arabic-NLi-Triplet
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
|
### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
|
### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 768, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
  (3): Normalize()
)
```
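To make the module stack above concrete, here is a minimal pure-Python sketch of what modules (1) to (3) do to the transformer output: keep the first (`[CLS]`) token vector, pass it through a dense layer with a tanh activation, and L2-normalize the result. The toy dimension (4 instead of 768), the identity weights, and the helper names `cls_pool`, `dense_tanh`, and `l2_normalize` are illustrative assumptions, not the library's internals.

```python
import math

def cls_pool(token_embeddings):
    """Pooling with pooling_mode_cls_token=True: keep only the first token's vector."""
    return token_embeddings[0]

def dense_tanh(vector, weights, bias):
    """Dense layer with Tanh activation: tanh(W @ x + b), W given as a list of rows."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, vector)) + b)
        for row, b in zip(weights, bias)
    ]

def l2_normalize(vector):
    """Normalize(): scale the vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy transformer output: 3 tokens, 4 dimensions each (the real model uses 768).
token_embeddings = [
    [0.5, -1.0, 0.25, 2.0],   # [CLS] token
    [0.1, 0.2, 0.3, 0.4],
    [-0.3, 0.0, 0.9, -0.7],
]
# Hypothetical dense parameters (identity matrix, zero bias), chosen for readability.
weights = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
bias = [0.0] * 4

sentence_embedding = l2_normalize(
    dense_tanh(cls_pool(token_embeddings), weights, bias)
)
embedding_norm = math.sqrt(sum(x * x for x in sentence_embedding))
print(len(sentence_embedding), round(embedding_norm, 6))
```

Because the final module normalizes every embedding to unit length, the dot product of two full-size embeddings equals their cosine similarity.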
|
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Omartificial-Intelligence-Space/Arabic-labse-Matryoshka")
# Run inference
sentences = [
    # "A young man with blond hair sits on a wall reading a newspaper while a woman and a young girl pass by."
    'يجلس شاب ذو شعر أشقر على الحائط يقرأ جريدة بينما تمر امرأة وفتاة شابة.',
    # "A young male looks at a newspaper while two women pass by him"
    'ذكر شاب ينظر إلى جريدة بينما تمر إمرأتان بجانبه',
    # "The young man is asleep while the mother leads her daughter to the park"
    'الشاب نائم بينما الأم تقود ابنتها إلى الحديقة',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
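The model tags list `MatryoshkaLoss`, and the evaluation sections report scores at 768, 512, 256, 128, and 64 dimensions, so a common pattern is to keep only the leading dimensions of each embedding. The sketch below uses made-up 8-dimensional vectors in place of `model.encode` output, and `truncate_and_normalize` is a hypothetical helper. It illustrates one point only: a truncated vector is no longer unit length, so it needs re-normalizing before a dot product can stand in for cosine similarity.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values, then rescale to unit length."""
    return l2_normalize(embedding[:dim])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Made-up unit-length "embeddings" (8 dimensions instead of 768).
emb_a = l2_normalize([0.9, 0.1, 0.4, -0.2, 0.3, 0.0, -0.5, 0.2])
emb_b = l2_normalize([0.8, 0.2, 0.5, -0.1, -0.4, 0.3, 0.6, -0.1])

# Full-size unit vectors: the dot product is the cosine similarity.
full_cosine = dot(emb_a, emb_b)
# Truncated but not re-normalized: both norms are below 1, so this is not a cosine.
raw_truncated_dot = dot(emb_a[:4], emb_b[:4])
# Truncated and re-normalized: a proper cosine similarity in 4 dimensions.
truncated_cosine = dot(truncate_and_normalize(emb_a, 4),
                       truncate_and_normalize(emb_b, 4))

print(round(full_cosine, 4), round(raw_truncated_dot, 4), round(truncated_cosine, 4))
```

This may be why the dot-product rows in the evaluation tables match the cosine rows at 768 dimensions (0.7269 for both `pearson_dot` and `pearson_cosine`) but fall behind at smaller sizes (0.6778 versus 0.7102 at 128). Recent Sentence Transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`; check that your installed version supports it before relying on it.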
|
|
## Evaluation

### Metrics

#### Semantic Similarity
* Dataset: `sts-test-768`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.7269     |
| **spearman_cosine** | **0.7225** |
| pearson_manhattan   | 0.7259     |
| spearman_manhattan  | 0.721      |
| pearson_euclidean   | 0.726      |
| spearman_euclidean  | 0.7225     |
| pearson_dot         | 0.7269     |
| spearman_dot        | 0.7225     |
| pearson_max         | 0.7269     |
| spearman_max        | 0.7225     |
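
The metric names follow a pattern: a correlation (Pearson or Spearman) between gold similarity labels and the model's scores under one scoring function (cosine, Manhattan, Euclidean, or dot product). Below is a minimal pure-Python sketch of the cosine case with made-up scores. It is an illustration of the two correlations, not the `EmbeddingSimilarityEvaluator` implementation, and the `ranks` helper assumes there are no tied values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """Rank positions (1 = smallest). Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Made-up gold labels (0 to 5 scale) and model cosine similarities for five pairs.
gold_scores = [0.0, 1.2, 2.5, 3.8, 5.0]
cosine_scores = [0.10, 0.35, 0.30, 0.80, 0.95]

pearson_cosine = pearson(gold_scores, cosine_scores)
spearman_cosine = spearman(gold_scores, cosine_scores)
print(round(pearson_cosine, 4), round(spearman_cosine, 4))
```

Spearman only looks at the ordering of the pairs, which is why it is often preferred for comparing embedding models. The `pearson_max` and `spearman_max` rows appear to be the maximum over the four scoring functions, which matches the values reported in the tables.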
#### Semantic Similarity
* Dataset: `sts-test-512`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.7268     |
| **spearman_cosine** | **0.7224** |
| pearson_manhattan   | 0.7241     |
| spearman_manhattan  | 0.7195     |
| pearson_euclidean   | 0.7248     |
| spearman_euclidean  | 0.7213     |
| pearson_dot         | 0.7253     |
| spearman_dot        | 0.7205     |
| pearson_max         | 0.7268     |
| spearman_max        | 0.7224     |

#### Semantic Similarity
* Dataset: `sts-test-256`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.7283     |
| **spearman_cosine** | **0.7264** |
| pearson_manhattan   | 0.7228     |
| spearman_manhattan  | 0.7181     |
| pearson_euclidean   | 0.7251     |
| spearman_euclidean  | 0.7215     |
| pearson_dot         | 0.7243     |
| spearman_dot        | 0.7221     |
| pearson_max         | 0.7283     |
| spearman_max        | 0.7264     |

#### Semantic Similarity
* Dataset: `sts-test-128`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.7102     |
| **spearman_cosine** | **0.7104** |
| pearson_manhattan   | 0.7135     |
| spearman_manhattan  | 0.7089     |
| pearson_euclidean   | 0.7172     |
| spearman_euclidean  | 0.713      |
| pearson_dot         | 0.6778     |
| spearman_dot        | 0.6746     |
| pearson_max         | 0.7172     |
| spearman_max        | 0.713      |

#### Semantic Similarity
* Dataset: `sts-test-64`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.6931     |
| **spearman_cosine** | **0.6982** |
| pearson_manhattan   | 0.6971     |
| spearman_manhattan  | 0.6942     |
| pearson_euclidean   | 0.7013     |
| spearman_euclidean  | 0.6987     |
| pearson_dot         | 0.6377     |
| spearman_dot        | 0.6345     |
| pearson_max         | 0.7013     |
| spearman_max        | 0.6987     |
| | #### Semantic Similarity |
| | * Dataset: `sts-test-768` |
| | * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator) |
| | |
| | | Metric | Value | |
| | |:--------------------|:-----------| |
| | | pearson_cosine | 0.8144 | |
| | | **spearman_cosine** | **0.8205** | |
| | | pearson_manhattan | 0.8203 | |
| | | spearman_manhattan | 0.8204 | |
| | | pearson_euclidean | 0.8202 | |
| | | spearman_euclidean | 0.8205 | |
| | | pearson_dot | 0.8144 | |
| | | spearman_dot | 0.8205 | |
| | | pearson_max | 0.8203 | |
| | | spearman_max | 0.8205 | |
| | |
| | #### Semantic Similarity |
| | * Dataset: `sts-test-512` |
| | * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator) |
| | |
| | | Metric | Value | |
| | |:--------------------|:-----------| |
| | | pearson_cosine | 0.8143 | |
| | | **spearman_cosine** | **0.8212** | |
| | | pearson_manhattan | 0.8217 | |
| | | spearman_manhattan | 0.8216 | |
| | | pearson_euclidean | 0.8216 | |
| | | spearman_euclidean | 0.8219 | |
| | | pearson_dot | 0.8097 | |
| | | spearman_dot | 0.8147 | |
| | | pearson_max | 0.8217 | |
| | | spearman_max | 0.8219 | |
| | |
| | #### Semantic Similarity |
| | * Dataset: `sts-test-256` (after fine-tuning; step 8717 in the training logs) |
| | * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator) |
| | |
| | | Metric | Value | |
| | |:--------------------|:-----------| |
| | | pearson_cosine | 0.8076 | |
| | | **spearman_cosine** | **0.8159** | |
| | | pearson_manhattan | 0.8209 | |
| | | spearman_manhattan | 0.8197 | |
| | | pearson_euclidean | 0.821 | |
| | | spearman_euclidean | 0.8203 | |
| | | pearson_dot | 0.7871 | |
| | | spearman_dot | 0.7875 | |
| | | pearson_max | 0.821 | |
| | | spearman_max | 0.8203 | |
| | |
| | #### Semantic Similarity |
| | * Dataset: `sts-test-128` (after fine-tuning; step 8717 in the training logs) |
| | * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator) |
| | |
| | | Metric | Value | |
| | |:--------------------|:-----------| |
| | | pearson_cosine | 0.8024 | |
| | | **spearman_cosine** | **0.8118** | |
| | | pearson_manhattan | 0.8189 | |
| | | spearman_manhattan | 0.8181 | |
| | | pearson_euclidean | 0.8198 | |
| | | spearman_euclidean | 0.8185 | |
| | | pearson_dot | 0.7513 | |
| | | spearman_dot | 0.7428 | |
| | | pearson_max | 0.8198 | |
| | | spearman_max | 0.8185 | |
| | |
| | #### Semantic Similarity |
| | * Dataset: `sts-test-64` (after fine-tuning; step 8717 in the training logs) |
| | * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator) |
| | |
| | | Metric | Value | |
| | |:--------------------|:-----------| |
| | | pearson_cosine | 0.7855 | |
| | | **spearman_cosine** | **0.7949** | |
| | | pearson_manhattan | 0.806 | |
| | | spearman_manhattan | 0.8041 | |
| | | pearson_euclidean | 0.8088 | |
| | | spearman_euclidean | 0.806 | |
| | | pearson_dot | 0.6778 | |
| | | spearman_dot | 0.6616 | |
| | | pearson_max | 0.8088 | |
| | | spearman_max | 0.806 | |
| | |
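The `spearman_cosine` figures in the tables above are rank correlations between the cosine similarity of each sentence pair's embeddings and the human-annotated similarity score. The sketch below illustrates that computation in plain Python. The vectors and gold scores are made-up toy values, and the helper names (`cosine`, `average_ranks`, `spearman`) are illustrative only; this is not the code of `EmbeddingSimilarityEvaluator`.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy stand-ins for (embedding_1, embedding_2) pairs and their gold scores.
pairs = [
    ([1.0, 0.0, 0.2], [0.9, 0.1, 0.3]),
    ([0.0, 1.0, 0.0], [0.2, 0.8, 0.1]),
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),
    ([0.5, 0.5, 0.0], [0.5, 0.0, 0.7]),
]
gold_scores = [4.8, 4.0, 0.5, 1.0]

cos_scores = [cosine(u, v) for u, v in pairs]
spearman_cosine = spearman(cos_scores, gold_scores)
print(f"spearman_cosine = {spearman_cosine:.4f}")
```

Because Spearman correlation depends only on ranks, it rewards a model for ordering pairs correctly even when the raw cosine values are not on the same scale as the gold scores.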
| | <!-- |
| | ## Bias, Risks and Limitations |
| | |
| | *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.* |
| | --> |
| | |
| | <!-- |
| | ### Recommendations |
| | |
| | *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.* |
| | --> |
| | |
| | ## Training Details |
| | |
| | ### Training Dataset |
| | |
| | #### Omartificial-Intelligence-Space/Arabic-NLi-Triplet |
| | |
| | * Dataset: Omartificial-Intelligence-Space/Arabic-NLi-Triplet |
| | * Size: 557,850 training samples |
| | * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code> |
| | * Approximate statistics based on the first 1000 samples: |
| | | | anchor | positive | negative | |
| | |:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------| |
| | | type | string | string | string | |
| | | details | <ul><li>min: 4 tokens</li><li>mean: 9.99 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 12.44 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 13.82 tokens</li><li>max: 49 tokens</li></ul> | |
| | * Samples: |
| | | anchor | positive | negative | |
| | |:------------------------------------------------------------|:--------------------------------------------|:------------------------------------| |
| | | <code>شخص على حصان يقفز فوق طائرة معطلة</code> | <code>شخص في الهواء الطلق، على حصان.</code> | <code>شخص في مطعم، يطلب عجة.</code> | |
| | | <code>أطفال يبتسمون و يلوحون للكاميرا</code> | <code>هناك أطفال حاضرون</code> | <code>الاطفال يتجهمون</code> | |
| | | <code>صبي يقفز على لوح التزلج في منتصف الجسر الأحمر.</code> | <code>الفتى يقوم بخدعة التزلج</code> | <code>الصبي يتزلج على الرصيف</code> | |
| | * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters: |
| | ```json |
| | { |
| | "loss": "MultipleNegativesRankingLoss", |
| | "matryoshka_dims": [ |
| | 768, |
| | 512, |
| | 256, |
| | 128, |
| | 64 |
| | ], |
| | "matryoshka_weights": [ |
| | 1, |
| | 1, |
| | 1, |
| | 1, |
| | 1 |
| | ], |
| | "n_dims_per_step": -1 |
| | } |
| | ``` |
| | |
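The configuration above applies `MultipleNegativesRankingLoss` to several prefixes of the same embedding and sums the results with equal weights. The following is a simplified, pure-Python sketch of that idea, not the library's implementation. It uses toy 8-dimensional vectors with prefix sizes `[8, 4, 2]` as stand-ins for the real `[768, 512, 256, 128, 64]`, and it assumes a similarity scale of 20 for the ranking loss. The function names are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mnr_loss(anchors, candidates, scale=20.0):
    """In-batch ranking loss: candidate i is the match for anchor i,
    every other candidate in the batch acts as a negative.
    `scale` is an assumed temperature-like multiplier."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum_exp - logits[i]  # cross-entropy with label i
    return total / len(anchors)

def matryoshka_loss(anchors, positives, negatives, dims, weights):
    """Weighted sum of the ranking loss over truncated embedding prefixes."""
    candidates = positives + negatives
    total = 0.0
    for dim, weight in zip(dims, weights):
        truncated_anchors = [v[:dim] for v in anchors]
        truncated_candidates = [v[:dim] for v in candidates]
        total += weight * mnr_loss(truncated_anchors, truncated_candidates)
    return total

# Toy (anchor, positive, negative) triplets; values are made up.
anchors = [
    [1.0, 0.0, 0.5, 0.0, 0.2, 0.0, 0.1, 0.0],
    [0.0, 1.0, 0.0, 0.5, 0.0, 0.2, 0.0, 0.1],
]
positives = [
    [0.9, 0.1, 0.4, 0.0, 0.3, 0.0, 0.1, 0.0],
    [0.1, 0.9, 0.0, 0.4, 0.0, 0.3, 0.0, 0.1],
]
negatives = [
    [-0.8, -0.2, 0.1, 0.3, -0.5, 0.2, 0.0, 0.1],
    [-0.3, -0.9, 0.2, -0.1, 0.1, -0.4, 0.3, 0.0],
]
dims = [8, 4, 2]
weights = [1, 1, 1]

loss = matryoshka_loss(anchors, positives, negatives, dims, weights)
print(f"toy Matryoshka loss: {loss:.6f}")
```

Training every prefix against the same objective is what lets the leading dimensions of an embedding remain useful on their own, which is consistent with the modest drop in `spearman_cosine` between the 768- and 64-dimension evaluations reported in this card.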
| | ### Evaluation Dataset |
| | |
| | #### Omartificial-Intelligence-Space/Arabic-NLi-Triplet |
| | |
| | * Dataset: Omartificial-Intelligence-Space/Arabic-NLi-Triplet |
| | * Size: 6,584 evaluation samples |
| | * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code> |
| | * Approximate statistics based on the first 1000 samples: |
| | | | anchor | positive | negative | |
| | |:--------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------| |
| | | type | string | string | string | |
| | | details | <ul><li>min: 4 tokens</li><li>mean: 19.71 tokens</li><li>max: 100 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 9.37 tokens</li><li>max: 38 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 10.49 tokens</li><li>max: 34 tokens</li></ul> | |
| | * Samples: |
| | | anchor | positive | negative | |
| | |:-----------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------|:---------------------------------------------------| |
| | | <code>امرأتان يتعانقان بينما يحملان حزمة</code> | <code>إمرأتان يحملان حزمة</code> | <code>الرجال يتشاجرون خارج مطعم</code> | |
| | | <code>طفلين صغيرين يرتديان قميصاً أزرق، أحدهما يرتدي الرقم 9 والآخر يرتدي الرقم 2 يقفان على خطوات خشبية في الحمام ويغسلان أيديهما في المغسلة.</code> | <code>طفلين يرتديان قميصاً مرقماً يغسلون أيديهم</code> | <code>طفلين يرتديان سترة يذهبان إلى المدرسة</code> | |
| | | <code>رجل يبيع الدونات لعميل خلال معرض عالمي أقيم في مدينة أنجليس</code> | <code>رجل يبيع الدونات لعميل</code> | <code>امرأة تشرب قهوتها في مقهى صغير</code> | |
| | * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters: |
| | ```json |
| | { |
| | "loss": "MultipleNegativesRankingLoss", |
| | "matryoshka_dims": [ |
| | 768, |
| | 512, |
| | 256, |
| | 128, |
| | 64 |
| | ], |
| | "matryoshka_weights": [ |
| | 1, |
| | 1, |
| | 1, |
| | 1, |
| | 1 |
| | ], |
| | "n_dims_per_step": -1 |
| | } |
| | ``` |
| | |
| | ### Training Hyperparameters |
| | #### Non-Default Hyperparameters |
| | |
| | - `per_device_train_batch_size`: 64 |
| | - `per_device_eval_batch_size`: 64 |
| | - `num_train_epochs`: 1 |
| | - `warmup_ratio`: 0.1 |
| | - `fp16`: True |
| | - `batch_sampler`: no_duplicates |
| | |
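These settings determine the length of the run. The short calculation below relates the batch size, epoch count, and warmup ratio to step counts. It assumes a single training device, that the final partial batch is kept, and that warmup steps are the ceiling of `warmup_ratio` times the total steps; the variable names are illustrative.

```python
import math

train_samples = 557_850   # training set size reported in this card
batch_size = 64           # per_device_train_batch_size (one device assumed)
num_train_epochs = 1
warmup_ratio = 0.1

# The last partial batch is kept, so round up.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * num_train_epochs

# Assumed rule: warmup lasts for ceil(ratio * total_steps) optimizer steps.
warmup_steps = math.ceil(total_steps * warmup_ratio)

def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)

print(total_steps, warmup_steps, epoch_at(200))
```

Under these assumptions the total comes to 8717 steps, which agrees with the final row of the training logs, and step 200 corresponds to epoch 0.0229, matching the first logged loss.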
| | #### All Hyperparameters |
| | <details><summary>Click to expand</summary> |
| | |
| | - `overwrite_output_dir`: False |
| | - `do_predict`: False |
| | - `prediction_loss_only`: True |
| | - `per_device_train_batch_size`: 64 |
| | - `per_device_eval_batch_size`: 64 |
| | - `per_gpu_train_batch_size`: None |
| | - `per_gpu_eval_batch_size`: None |
| | - `gradient_accumulation_steps`: 1 |
| | - `eval_accumulation_steps`: None |
| | - `learning_rate`: 5e-05 |
| | - `weight_decay`: 0.0 |
| | - `adam_beta1`: 0.9 |
| | - `adam_beta2`: 0.999 |
| | - `adam_epsilon`: 1e-08 |
| | - `max_grad_norm`: 1.0 |
| | - `num_train_epochs`: 1 |
| | - `max_steps`: -1 |
| | - `lr_scheduler_type`: linear |
| | - `lr_scheduler_kwargs`: {} |
| | - `warmup_ratio`: 0.1 |
| | - `warmup_steps`: 0 |
| | - `log_level`: passive |
| | - `log_level_replica`: warning |
| | - `log_on_each_node`: True |
| | - `logging_nan_inf_filter`: True |
| | - `save_safetensors`: True |
| | - `save_on_each_node`: False |
| | - `save_only_model`: False |
| | - `no_cuda`: False |
| | - `use_cpu`: False |
| | - `use_mps_device`: False |
| | - `seed`: 42 |
| | - `data_seed`: None |
| | - `jit_mode_eval`: False |
| | - `use_ipex`: False |
| | - `bf16`: False |
| | - `fp16`: True |
| | - `fp16_opt_level`: O1 |
| | - `half_precision_backend`: auto |
| | - `bf16_full_eval`: False |
| | - `fp16_full_eval`: False |
| | - `tf32`: None |
| | - `local_rank`: 0 |
| | - `ddp_backend`: None |
| | - `tpu_num_cores`: None |
| | - `tpu_metrics_debug`: False |
| | - `debug`: [] |
| | - `dataloader_drop_last`: False |
| | - `dataloader_num_workers`: 0 |
| | - `dataloader_prefetch_factor`: None |
| | - `past_index`: -1 |
| | - `disable_tqdm`: False |
| | - `remove_unused_columns`: True |
| | - `label_names`: None |
| | - `load_best_model_at_end`: False |
| | - `ignore_data_skip`: False |
| | - `fsdp`: [] |
| | - `fsdp_min_num_params`: 0 |
| | - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False} |
| | - `fsdp_transformer_layer_cls_to_wrap`: None |
| | - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'gradient_accumulation_kwargs': None} |
| | - `deepspeed`: None |
| | - `label_smoothing_factor`: 0.0 |
| | - `optim`: adamw_torch |
| | - `optim_args`: None |
| | - `adafactor`: False |
| | - `group_by_length`: False |
| | - `length_column_name`: length |
| | - `ddp_find_unused_parameters`: None |
| | - `ddp_bucket_cap_mb`: None |
| | - `ddp_broadcast_buffers`: False |
| | - `dataloader_pin_memory`: True |
| | - `dataloader_persistent_workers`: False |
| | - `skip_memory_metrics`: True |
| | - `use_legacy_prediction_loop`: False |
| | - `push_to_hub`: False |
| | - `resume_from_checkpoint`: None |
| | - `hub_model_id`: None |
| | - `hub_strategy`: every_save |
| | - `hub_private_repo`: False |
| | - `hub_always_push`: False |
| | - `gradient_checkpointing`: False |
| | - `gradient_checkpointing_kwargs`: None |
| | - `include_inputs_for_metrics`: False |
| | - `eval_do_concat_batches`: True |
| | - `fp16_backend`: auto |
| | - `push_to_hub_model_id`: None |
| | - `push_to_hub_organization`: None |
| | - `mp_parameters`: |
| | - `auto_find_batch_size`: False |
| | - `full_determinism`: False |
| | - `torchdynamo`: None |
| | - `ray_scope`: last |
| | - `ddp_timeout`: 1800 |
| | - `torch_compile`: False |
| | - `torch_compile_backend`: None |
| | - `torch_compile_mode`: None |
| | - `dispatch_batches`: None |
| | - `split_batches`: None |
| | - `include_tokens_per_second`: False |
| | - `include_num_input_tokens_seen`: False |
| | - `neftune_noise_alpha`: None |
| | - `optim_target_modules`: None |
| | - `batch_sampler`: no_duplicates |
| | - `multi_dataset_batch_sampler`: proportional |
| | |
| | </details> |
| | |
| | ### Training Logs |
| | | Epoch | Step | Training Loss | sts-test-128_spearman_cosine | sts-test-256_spearman_cosine | sts-test-512_spearman_cosine | sts-test-64_spearman_cosine | sts-test-768_spearman_cosine | |
| | |:------:|:----:|:-------------:|:----------------------------:|:----------------------------:|:----------------------------:|:---------------------------:|:----------------------------:| |
| | | 0 | 0 | - | 0.7104 | 0.7264 | 0.7224 | 0.6982 | 0.7225 | |
| | | 0.0229 | 200 | 13.1738 | - | - | - | - | - | |
| | | 0.0459 | 400 | 8.8127 | - | - | - | - | - | |
| | | 0.0688 | 600 | 8.0984 | - | - | - | - | - | |
| | | 0.0918 | 800 | 7.2984 | - | - | - | - | - | |
| | | 0.1147 | 1000 | 7.5749 | - | - | - | - | - | |
| | | 0.1377 | 1200 | 7.1292 | - | - | - | - | - | |
| | | 0.1606 | 1400 | 6.6146 | - | - | - | - | - | |
| | | 0.1835 | 1600 | 6.6523 | - | - | - | - | - | |
| | | 0.2065 | 1800 | 6.1095 | - | - | - | - | - | |
| | | 0.2294 | 2000 | 6.0841 | - | - | - | - | - | |
| | | 0.2524 | 2200 | 6.3024 | - | - | - | - | - | |
| | | 0.2753 | 2400 | 6.1941 | - | - | - | - | - | |
| | | 0.2983 | 2600 | 6.1686 | - | - | - | - | - | |
| | | 0.3212 | 2800 | 5.8317 | - | - | - | - | - | |
| | | 0.3442 | 3000 | 6.0597 | - | - | - | - | - | |
| | | 0.3671 | 3200 | 5.7832 | - | - | - | - | - | |
| | | 0.3900 | 3400 | 5.7088 | - | - | - | - | - | |
| | | 0.4130 | 3600 | 5.6988 | - | - | - | - | - | |
| | | 0.4359 | 3800 | 5.5268 | - | - | - | - | - | |
| | | 0.4589 | 4000 | 5.5543 | - | - | - | - | - | |
| | | 0.4818 | 4200 | 5.3152 | - | - | - | - | - | |
| | | 0.5048 | 4400 | 5.2894 | - | - | - | - | - | |
| | | 0.5277 | 4600 | 5.1805 | - | - | - | - | - | |
| | | 0.5506 | 4800 | 5.4559 | - | - | - | - | - | |
| | | 0.5736 | 5000 | 5.3836 | - | - | - | - | - | |
| | | 0.5965 | 5200 | 5.2626 | - | - | - | - | - | |
| | | 0.6195 | 5400 | 5.2511 | - | - | - | - | - | |
| | | 0.6424 | 5600 | 5.3308 | - | - | - | - | - | |
| | | 0.6654 | 5800 | 5.2264 | - | - | - | - | - | |
| | | 0.6883 | 6000 | 5.2881 | - | - | - | - | - | |
| | | 0.7113 | 6200 | 5.1349 | - | - | - | - | - | |
| | | 0.7342 | 6400 | 5.0872 | - | - | - | - | - | |
| | | 0.7571 | 6600 | 4.5515 | - | - | - | - | - | |
| | | 0.7801 | 6800 | 3.4312 | - | - | - | - | - | |
| | | 0.8030 | 7000 | 3.1008 | - | - | - | - | - | |
| | | 0.8260 | 7200 | 2.9582 | - | - | - | - | - | |
| | | 0.8489 | 7400 | 2.8153 | - | - | - | - | - | |
| | | 0.8719 | 7600 | 2.7214 | - | - | - | - | - | |
| | | 0.8948 | 7800 | 2.5392 | - | - | - | - | - | |
| | | 0.9177 | 8000 | 2.584 | - | - | - | - | - | |
| | | 0.9407 | 8200 | 2.5384 | - | - | - | - | - | |
| | | 0.9636 | 8400 | 2.4937 | - | - | - | - | - | |
| | | 0.9866 | 8600 | 2.4155 | - | - | - | - | - | |
| | | 1.0 | 8717 | - | 0.8118 | 0.8159 | 0.8212 | 0.7949 | 0.8205 | |
| | |
| | |
| | ### Framework Versions |
| | - Python: 3.9.18 |
| | - Sentence Transformers: 3.0.1 |
| | - Transformers: 4.40.0 |
| | - PyTorch: 2.2.2+cu121 |
| | - Accelerate: 0.26.1 |
| | - Datasets: 2.19.0 |
| | - Tokenizers: 0.19.1 |
| | |
| | ## Citation |
| | |
| | ### BibTeX |
| | |
| | #### Sentence Transformers |
| | ```bibtex |
| | @inproceedings{reimers-2019-sentence-bert, |
| | title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks", |
| | author = "Reimers, Nils and Gurevych, Iryna", |
| | booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing", |
| | month = "11", |
| | year = "2019", |
| | publisher = "Association for Computational Linguistics", |
| | url = "https://arxiv.org/abs/1908.10084", |
| | } |
| | ``` |
| | |
| | #### MatryoshkaLoss |
| | ```bibtex |
| | @misc{kusupati2024matryoshka, |
| | title={Matryoshka Representation Learning}, |
| | author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi}, |
| | year={2024}, |
| | eprint={2205.13147}, |
| | archivePrefix={arXiv}, |
| | primaryClass={cs.LG} |
| | } |
| | ``` |
| | |
| | #### MultipleNegativesRankingLoss |
| | ```bibtex |
| | @misc{henderson2017efficient, |
| | title={Efficient Natural Language Response Suggestion for Smart Reply}, |
| | author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil}, |
| | year={2017}, |
| | eprint={1705.00652}, |
| | archivePrefix={arXiv}, |
| | primaryClass={cs.CL} |
| | } |
| | ``` |
| | |
| | ## <span style="color:blue">Acknowledgments</span> |
| | |
| | The author would like to thank Prince Sultan University for its invaluable support of this project. The university's contributions and resources were instrumental in developing and fine-tuning these models. |
| | |
| | |
| | |
| | ```markdown |
| | ## Citation |
| | |
| | If you use the Arabic Matryoshka Embeddings Model, please cite it as follows: |
| | |
| | @misc{nacar2024enhancingsemanticsimilarityunderstanding, |
| | title={Enhancing Semantic Similarity Understanding in Arabic NLP with Nested Embedding Learning}, |
| | author={Omer Nacar and Anis Koubaa}, |
| | year={2024}, |
| | eprint={2407.21139}, |
| | archivePrefix={arXiv}, |
| | primaryClass={cs.CL}, |
| | url={https://arxiv.org/abs/2407.21139}, |
| | } |