Uninitialised weights warning when loading with Sentence Transformers
Hi all,
Thanks for the great work on this model. The performance for its size is impressive, and the MRL support is fantastic.
I have a question about a warning that Sentence Transformers emits when loading the model. I am getting the following:
Some weights of BertModel were not initialized from the model checkpoint at Snowflake/snowflake-arctic-embed-m-v1.5 and are newly initialized: ['pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
I think this may be a harmless warning but I wanted to check if this is expected behaviour.
Thanks.
It should be harmless, but it also should not happen. Are you perhaps initializing the model differently from the example snippet in the readme? We do suggest explicitly disabling the pooling layer in that example. I believe the v1.0 model may already have this setting in its config files, too. I'll look at migrating that setting from the sentence bert config JSON file for the v1.0 model to v1.5 when I get the chance, but feel free to open a PR if getting this option applied automatically is urgent for you.
I get the same warning and I'm pretty sure I followed the example in the readme, including disabling the pooling layer.
The warning can indeed be safely ignored. However, it is odd that you still get the warning even with model_kwargs=dict(add_pooling_layer=False).
#5 should make it a bit easier: you won't have to manually specify model_kwargs=dict(add_pooling_layer=False) anymore, and you shouldn't get any warning.
- Tom Aarsen
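For reference, the explicit form discussed above might look like the sketch below. It assumes a sentence-transformers release that accepts the model_kwargs argument; the load_model helper and the constant names are illustrative and do not come from the readme.

```python
MODEL_ID = "Snowflake/snowflake-arctic-embed-m-v1.5"

# Forwarded to the underlying transformers model: asks BertModel to skip
# the pooler head, whose weights the warning reports as missing.
MODEL_KWARGS = dict(add_pooling_layer=False)


def load_model():
    """Load the embedding model without the BERT pooler (illustrative helper)."""
    # Imported here so the constants above can be reused without the dependency.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(MODEL_ID, model_kwargs=MODEL_KWARGS)
```

Calling load_model() downloads the checkpoint on first use. Once #5 is in place, the extra argument should no longer be necessary.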
Oh, it was my mistake: a base class was initialising the model with the wrong arguments, only for it to be overwritten with the correct ones later. Thank you!
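For anyone hitting the same thing, here is a minimal sketch of how that pattern can surface the warning even when the final model is built correctly. All class names are hypothetical stand-ins, not real library classes, and the warning text is only an imitation of the real one.

```python
import warnings


class FakeEncoder:
    """Hypothetical stand-in for a model loader; not a real library class."""

    def __init__(self, add_pooling_layer=True):
        self.add_pooling_layer = add_pooling_layer
        if add_pooling_layer:
            # Imitates the "newly initialized" warning for missing pooler weights.
            warnings.warn("pooler weights were newly initialized")


class BaseWrapper:
    def __init__(self):
        # Bug: the base class builds the model with default arguments.
        self.model = FakeEncoder()


class Wrapper(BaseWrapper):
    def __init__(self):
        super().__init__()  # the warning fires during this call
        # The correct model replaces the first one, but the warning was already shown.
        self.model = FakeEncoder(add_pooling_layer=False)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    wrapper = Wrapper()

warning_count = len(caught)
final_has_pooler = wrapper.model.add_pooling_layer
```

The final model has no pooler, yet one warning is still recorded, so it is worth checking every place the model gets constructed, not only the last one.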