Update README.md
README.md
CHANGED
@@ -13,6 +13,8 @@ language:
- en
---

# SentenceTransformer based on intfloat/multilingual-e5-small

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) on datasets that include Korean query-passage pairs for improved performance on Korean retrieval tasks. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
@@ -162,6 +164,20 @@ You can finetune this model on your own dataset.
#### Information Retrieval

<!--
- en
---

+ <img src="https://cdn-uploads.huggingface.co/production/uploads/642b0c2fecec03b4464a1d9b/IxcqY5qbGNuGpqDciIcOI.webp" width="600">
+
# SentenceTransformer based on intfloat/multilingual-e5-small

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) on datasets that include Korean query-passage pairs for improved performance on Korean retrieval tasks. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
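
To make the "semantic search" use case concrete, here is a minimal sketch of ranking passages by cosine similarity to a query. It is not taken from the README: the 4-dimensional vectors are made-up stand-ins for the model's 384-dimensional embeddings, and `rank_passages` is a hypothetical helper. In practice the vectors would presumably come from the model itself, for example through the `encode` method of a `SentenceTransformer` instance.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, score) pairs, best match first."""
    scored = [(i, cosine_similarity(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up 4-dimensional vectors standing in for real 384-dimensional embeddings.
query = [0.9, 0.1, 0.0, 0.2]
passages = [
    [0.1, 0.8, 0.3, 0.0],  # mostly unrelated to the query
    [0.8, 0.2, 0.1, 0.3],  # points in nearly the same direction as the query
    [0.0, 0.0, 1.0, 0.0],  # orthogonal to the query
]

ranking = rank_passages(query, passages)
for index, score in ranking:
    print(f"passage {index}: cosine similarity {score:.3f}")
```

Cosine similarity ignores vector length, so only the direction of each embedding affects the ranking.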

#### Information Retrieval

+ | Model | Average | XPQARetrieval | PublicHealthQA | MIRACLRetrieval | Ko-StrategyQA | BelebeleRetrieval | AutoRAGRetrieval | MrTidyRetrieval |
+ |:------------------------------------------------------------|----------:|----------------:|-----------------:|------------------:|----------------:|--------------------:|-------------------:|------------------:|
+ | BAAI/bge-m3 | 0.724169 | 0.36075 | 0.80412 | 0.70146 | 0.79405 | 0.93164 | 0.83008 | 0.64708 |
+ | Snowflake/snowflake-arctic-embed-l-v2.0 | 0.724104 | 0.43018 | 0.81679 | 0.66077 | 0.80455 | 0.9271 | 0.83863 | 0.59071 |
+ | intfloat/multilingual-e5-large | 0.721607 | 0.3571 | 0.82534 | 0.66486 | 0.80348 | 0.94499 | 0.81337 | 0.64211 |
+ | intfloat/multilingual-e5-base | 0.689429 | 0.3607 | 0.77203 | 0.6227 | 0.76355 | 0.92868 | 0.79752 | 0.58082 |
+ | **dragonkue/multilingual-e5-small-ko** | 0.688849 | 0.34892 | 0.79729 | 0.61113 | 0.76173 | 0.9297 | 0.86184 | 0.51133 |
+ | intfloat/multilingual-e5-small | 0.670906 | 0.33003 | 0.73668 | 0.61238 | 0.75157 | 0.90531 | 0.80068 | 0.55969 |
+ | ibm-granite/granite-embedding-278m-multilingual | 0.641935 | 0.23058 | 0.77668 | 0.59216 | 0.71762 | 0.83231 | 0.70226 | nan |
+ | ibm-granite/granite-embedding-107m-multilingual | 0.625862 | 0.23058 | 0.73209 | 0.58413 | 0.70531 | 0.82063 | 0.68243 | nan |
+ | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | 0.456867 | 0.21345 | 0.67409 | 0.25676 | 0.45903 | 0.71491 | 0.42296 | nan |
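
The Average column appears to be the plain mean over the tasks that have a reported score, so rows with `nan` under MrTidyRetrieval are averaged over six tasks rather than seven. The sketch below recomputes two averages from the table under that assumption; `mean_ignoring_nan` is a hypothetical helper, not something the README defines.

```python
import math


def mean_ignoring_nan(scores):
    """Average only the scores that are actual numbers (skip NaN entries)."""
    valid = [s for s in scores if not math.isnan(s)]
    return sum(valid) / len(valid)


# Per-task scores copied from the table, in column order
# (XPQARetrieval ... MrTidyRetrieval).
rows = {
    "dragonkue/multilingual-e5-small-ko": [
        0.34892, 0.79729, 0.61113, 0.76173, 0.9297, 0.86184, 0.51133,
    ],
    "ibm-granite/granite-embedding-278m-multilingual": [
        0.23058, 0.77668, 0.59216, 0.71762, 0.83231, 0.70226, float("nan"),
    ],
}

averages = {model: round(mean_ignoring_nan(s), 6) for model, s in rows.items()}
for model, avg in averages.items():
    print(f"{model}: {avg}")
```

Because the `nan` rows skip one task, their averages are not strictly comparable with those of the models evaluated on all seven.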
+
+ #### Performance Comparison by Model Size (Based on Average NDCG@10)
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/642b0c2fecec03b4464a1d9b/Ba2bVpPlB7egF80USITJ5.png" width="800"/>
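
For readers unfamiliar with the metric named in the heading, the following is a minimal sketch of the textbook NDCG@k formula with linear gain. It is an illustration under stated assumptions, not the benchmark's own implementation: the function names are hypothetical, the input is assumed to hold the relevance labels of every judged document for one query in ranked order, and actual evaluation tooling may treat ties or gains differently.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k results (ranks start at 1)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(ranked_relevances, k=10):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents at all for this query
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Made-up relevance labels: 1 = relevant, 0 = not relevant.
perfect = ndcg_at_k([1, 1, 0, 0])  # both relevant documents ranked first
shuffled = ndcg_at_k([0, 1, 0, 1])  # relevant documents at ranks 2 and 4
print(f"perfect ranking:  {perfect:.4f}")
print(f"shuffled ranking: {shuffled:.4f}")
```

A score of 1.0 means the relevant documents are ranked as high as possible, and pushing them further down the list lowers the score.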
<!--