Does this model use GTE weights?
#14
by nauti16 - opened
Hi, thank you for sharing this great model!!!
I understand that arctic-embed-m-v2.0 builds on GTE-multilingual-base.
To clarify whether this model supports commercial use, could you confirm:
- Does 'arctic-embed-m-v2.0' reuse the pre-trained weights from 'GTE-multilingual-base', or
- Did you train the model entirely from scratch using your own data without pre-trained weights from GTE-multilingual-base?
I ask because GTE-multilingual-base was trained on MS MARCO, which is restricted to non-commercial use.
Thanks in advance!
We trained arctic-embed-m-v2.0 starting from 'Alibaba-NLP/gte-multilingual-mlm-base', which contains the weights from before fine-tuning on MS MARCO.
Thank you for the clarification!!!
pxyu changed discussion status to closed