Update README.md

README.md CHANGED

````diff
@@ -2801,7 +2801,7 @@ model-index:
     value: 85.30624598674467
 license: apache-2.0
 ---
-<h1 align="center">Snowflake's
+<h1 align="center">Snowflake's Artic-embed-m</h1>
 <h4 align="center">
    <p>
        <a href=#news>News</a> |
@@ -2825,7 +2825,7 @@ license: apache-2.0
 ## Models
 
 
-Arctic-Embed is a suite of text embedding models that focuses on creating high-quality
+Arctic-Embed is a suite of text embedding models that focuses on creating high-quality retrieval models optimized for performance.
 
 
 The `arctic-text-embedding` models achieve **state-of-the-art performance on the MTEB/BEIR leaderboard** for each of their size variants. Evaluation is performed using these [scripts](https://github.com/Snowflake-Labs/arctic-embed/tree/main/src). As shown below, each class of model size achieves SOTA retrieval accuracy when compared to other top models.
@@ -2944,8 +2944,8 @@ To use an arctic-embed model, you can use the transformers package, as shown bel
 import torch
 from transformers import AutoModel, AutoTokenizer
 
-tokenizer = AutoTokenizer.from_pretrained('Snowflake/
-model = AutoModel.from_pretrained('Snowflake/
+tokenizer = AutoTokenizer.from_pretrained('Snowflake/arctic-embed-m')
+model = AutoModel.from_pretrained('Snowflake/arctic-embed-m', add_pooling_layer=False)
 model.eval()
 
 query_prefix = 'Represent this sentence for searching relevant passages: '
@@ -2981,7 +2981,7 @@ If you use the long context model and have more than 2048 tokens, ensure that yo
 
 ``` py
-model = AutoModel.from_pretrained('Snowflake/
+model = AutoModel.from_pretrained('Snowflake/arctic-embed-m-long', trust_remote_code=True, rotary_scaling_factor=2)
 ```
````
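The usage snippet in the diff cuts off after `query_prefix`; embedding models of this kind typically continue by taking the first ([CLS]) token's vector from the model's `last_hidden_state` and L2-normalizing it so that dot products become cosine similarities. A self-contained sketch of that pooling step, with random tensors standing in for real model output (the remaining lines of the snippet are not visible in this diff, so this is an assumption about the intended flow, not the model card's exact code):

```python
import torch

# Hypothetical stand-in for model(**tokens)[0], i.e. last_hidden_state:
# a batch of 3 sequences, 5 tokens each, hidden size 8.
last_hidden_state = torch.randn(3, 5, 8)

# Take the first ([CLS]) token's vector as each sequence's embedding.
embeddings = last_hidden_state[:, 0]

# L2-normalize so that a plain dot product equals cosine similarity.
embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)

# Treat the first row as the query and the rest as documents;
# retrieval scores are then a single matrix product.
query_emb, doc_embs = embeddings[:1], embeddings[1:]
scores = query_emb @ doc_embs.T
```

Because the embeddings are unit-length, every score falls in [-1, 1] and ranking documents by `scores` is equivalent to ranking by cosine similarity.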
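The `rotary_scaling_factor=2` argument on the long-context model presumably rescales the rotary position embeddings so sequences beyond the trained length stay in a familiar position range. One common realization of this idea is linear position interpolation, where positions are divided by the scaling factor before the rotary angles are computed. A minimal sketch under that assumption (the `rope_angles` helper is hypothetical and not part of the model's actual code):

```python
import torch

def rope_angles(positions, dim=8, base=10000.0, scaling_factor=1.0):
    # Standard RoPE inverse frequencies, one per pair of hidden dims.
    inv_freq = base ** (-torch.arange(0, dim, 2).float() / dim)
    # Linear interpolation: dividing positions by the factor squeezes a
    # longer sequence into the position range seen during training.
    return torch.outer(positions / scaling_factor, inv_freq)

positions = torch.arange(4096).float()
scaled = rope_angles(positions, scaling_factor=2.0)
# With factor 2, token 4094 receives the angles of unscaled position 2047,
# so a 4096-token input reuses rotations the model saw up to length 2048.
unscaled = rope_angles(torch.tensor([2047.0]))
```

This is only an illustration of why a factor of 2 pairs naturally with doubling the usable context past 2048 tokens; the model's own implementation (loaded via `trust_remote_code=True`) may use a different scaling scheme.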