How to use thoddnn/all-MiniLM-L6-v2-4bit with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("thoddnn/all-MiniLM-L6-v2-4bit")
sentences = [
"That is a happy person",
"That is a happy dog",
"That is a very happy person",
"Today is a sunny day"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
How to use thoddnn/all-MiniLM-L6-v2-4bit with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("thoddnn/all-MiniLM-L6-v2-4bit")
model = AutoModel.from_pretrained("thoddnn/all-MiniLM-L6-v2-4bit")
How to use thoddnn/all-MiniLM-L6-v2-4bit with MLX:
# Download the model from the Hub
pip install huggingface_hub[hf_xet]
huggingface-cli download --local-dir all-MiniLM-L6-v2-4bit thoddnn/all-MiniLM-L6-v2-4bit
The model mlx-community/all-MiniLM-L6-v2-4bit was converted to MLX format from sentence-transformers/all-MiniLM-L6-v2 using mlx-lm version 0.0.3.
pip install mlx-embeddings
from mlx_embeddings import load, generate
import mlx.core as mx
model, tokenizer = load("mlx-community/all-MiniLM-L6-v2-4bit")
# For text embeddings
output = generate(model, tokenizer, texts=["I like grapes", "I like fruits"])
embeddings = output.text_embeds # Normalized embeddings
# Compute dot product between normalized embeddings
similarity_matrix = mx.matmul(embeddings, embeddings.T)
print("Similarity matrix between texts:")
print(similarity_matrix)
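The snippet above relies on the fact that, for L2-normalized vectors, the dot product equals the cosine similarity. The following is a minimal pure-Python sketch of that relationship. It does not call the model; the vectors in `toy_embeddings` are hypothetical stand-ins for real embeddings, and `l2_normalize` and `similarity_matrix` are illustrative helper names, not part of mlx-embeddings.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def similarity_matrix(embeddings):
    """Pairwise dot products of L2-normalized rows, i.e. cosine similarities."""
    unit = [l2_normalize(v) for v in embeddings]
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]

# Toy 3-dimensional vectors standing in for real model output (hypothetical values).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 2.0],
]

sims = similarity_matrix(toy_embeddings)
for row in sims:
    print([round(s, 3) for s in row])
```

Each diagonal entry is the similarity of a vector with itself, and orthogonal vectors score zero. With real embeddings the matrix has one row and one column per input text.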