Sentence Similarity
sentence-transformers
Safetensors
modernbert
feature-extraction
Generated from Trainer
dataset_size:36864
loss:MatryoshkaLoss
loss:CachedMultipleNegativesRankingLoss
text-embeddings-inference
Instructions to use waris-gill/langcache-embed-v2-local with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use waris-gill/langcache-embed-v2-local with sentence-transformers:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("waris-gill/langcache-embed-v2-local")

sentences = [
    "What are civil cases and what are some examples?",
    "What are criminal cases and what are no examples?",
    "Civil cases involve disputes between individuals or organizations, typically seeking monetary compensation or specific performance, and *do not* include criminal prosecutions by the government.",
    "Criminal cases involve disputes between individuals or organizations, seeking monetary damages or specific performance, while civil cases concern offenses against the state punishable by imprisonment.",
    "What are some examples of civil cases?"
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [5, 5]
- Notebooks
- Google Colab
- Kaggle
Add new SentenceTransformer model
#1
by waris-gill - opened
Hello!
This pull request has been automatically generated from the push_to_hub method from the Sentence Transformers library.
Full Model Architecture:
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
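The Pooling module above enables only pooling_mode_cls_token, so each sentence embedding is the 768-dimensional final hidden state of the first token in the sequence. A minimal NumPy sketch of that step follows. The function name cls_pool is hypothetical (not library API), and random data stands in for the ModernBertModel output:

```python
import numpy as np


def cls_pool(token_embeddings: np.ndarray) -> np.ndarray:
    """Return the first token's vector for each sequence (CLS pooling).

    token_embeddings has shape (batch, seq_len, hidden).
    """
    return token_embeddings[:, 0, :]


# Assumption: random values stand in for real transformer hidden states.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(3, 12, 768))

sentence_embeddings = cls_pool(token_embeddings)
print(sentence_embeddings.shape)
```

With mean pooling the result would instead average over the seq_len axis. Here every token except the first is ignored at pooling time.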
Tip:
Consider testing this pull request before merging by loading the model from this PR with the revision argument:
from sentence_transformers import SentenceTransformer
# PR number of this pull request (#1)
pr_number = 1
model = SentenceTransformer(
    "waris-gill/langcache-embed-v2-local",
    revision=f"refs/pr/{pr_number}",
    backend="torch",
)
# Verify that everything works as expected
embeddings = model.encode(["The weather is lovely today.", "It's so sunny outside!", "He drove to the stadium."])
print(embeddings.shape)
similarities = model.similarity(embeddings, embeddings)
print(similarities)
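For reading the printed matrix: assuming the model's similarity function is cosine (the Sentence Transformers default unless the model configuration overrides it), entry (i, j) is the cosine of the angle between embeddings i and j. The sketch below reproduces that computation in NumPy on random stand-in vectors. The helper cosine_similarity_matrix is hypothetical and is not the library's implementation:

```python
import numpy as np


def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a 2-D array."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Clip to avoid division by zero for an all-zero row.
    normalized = embeddings / np.clip(norms, 1e-12, None)
    return normalized @ normalized.T


# Assumption: random vectors stand in for model.encode(...) output.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(3, 768))

sims = cosine_similarity_matrix(fake_embeddings)
print(sims.shape)
```

A quick sanity check on a real run is that the diagonal is 1 (each sentence compared with itself) and the matrix is symmetric.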
waris-gill changed pull request status to merged