How to use alphaedge-ai/embeddinggemma-pms-16384 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
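# "feature-extraction" returns raw hidden states; an embedding model exposes no fill-mask head
pipe = pipeline("feature-extraction", model="alphaedge-ai/embeddinggemma-pms-16384")
# Load model directly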
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("alphaedge-ai/embeddinggemma-pms-16384")
model = AutoModel.from_pretrained("alphaedge-ai/embeddinggemma-pms-16384")

This model is a 62.3% smaller version of google/embeddinggemma-300m, optimized for the Piedmontese language via vocabulary trimming; the retained vocabulary was mined on Lumberjackk/fineweb-2-trimming.
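The vocabulary trimming mentioned above can be illustrated with a toy sketch. This is not the procedure used to build this model, only one plausible reading of the idea: count how often each token id occurs in a target-language corpus, keep the most frequent ids plus a few reserved ones, and slice the embedding matrix down to those rows. The function name `trim_vocabulary`, its parameters, and the toy data are all hypothetical.

```python
from collections import Counter


def trim_vocabulary(token_ids_corpus, embeddings, keep, always_keep=()):
    """Keep the `keep` most frequent token ids (plus `always_keep`) and re-index them.

    Hypothetical helper for illustration; assumes len(always_keep) <= keep.
    """
    counts = Counter(tid for seq in token_ids_corpus for tid in seq)
    # Reserved ids (e.g. padding or end-of-sequence) come first, order preserved.
    kept = list(dict.fromkeys(always_keep))
    # Most frequent first; ties are broken by the smaller token id.
    for tid in sorted(counts, key=lambda t: (-counts[t], t)):
        if len(kept) >= keep:
            break
        if tid not in kept:
            kept.append(tid)
    old_to_new = {old: new for new, old in enumerate(kept)}
    # The trimmed embedding matrix contains only the rows of the kept ids.
    trimmed = [embeddings[old] for old in kept]
    return old_to_new, trimmed


# Toy data: 8 tokens with 2-dimensional embeddings, and a tiny tokenized corpus.
embeddings = [[float(i), float(i) * 0.5] for i in range(8)]
corpus = [[3, 5, 3, 7], [5, 3, 2], [3, 5]]
old_to_new, trimmed = trim_vocabulary(corpus, embeddings, keep=4, always_keep=(0, 1))
print(old_to_new)
print(len(trimmed), "rows kept out of", len(embeddings))
```

In a real model the tokenizer would also have to be rebuilt so that it emits the new ids; that step is omitted here.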
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("alphaedge-ai/embeddinggemma-pms-16384")
# Run inference with queries and documents
query = "My query"
documents = [
    "Chunk 1",
    "Chunk 2",
    "Chunk 3",
]
query_embeddings = model.encode_query(query)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# Compute similarities to determine a ranking
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
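To show how the similarity scores above can be turned into a ranking, here is a minimal dependency-free sketch. It assumes cosine similarity, which is the Sentence Transformers default unless the model configures another function; the helper names and the toy vectors are hypothetical stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar, plus the scores."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    order = sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy 2-dimensional "embeddings" standing in for real model output.
toy_query = [1.0, 0.0]
toy_docs = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
order, scores = rank_documents(toy_query, toy_docs)
print(order)
print(scores)
```

With real embeddings, the same ordering step can be applied to one row of the matrix returned by `model.similarity`.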
Base model: google/embeddinggemma-300m