Prompt usage
#1
by tomaarsen - opened
Hello!
I believe the query and document prompts from the config_sentence_transformers.json aren't applied automatically with model.encode, only with model.encode_query (uses prompt "query") and model.encode_document (uses prompt "document").
So then
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("jjp97/laal-embedding-v1")
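q_emb = model.encode("화재 시 대피 방법")  # "How to evacuate in case of fire"
p_emb = model.encode("화재가 발생하면 즉시 119에 신고하고 안전한 경로로 대피해야 한다.")  # "If a fire breaks out, report it to 119 immediately and evacuate along a safe route."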
would not use any instructions.
API Documentation: https://sbert.net/docs/package_reference/sentence_transformer/SentenceTransformer.html#sentence_transformers.SentenceTransformer.encode_query
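For contrast, here is a minimal sketch of the prompt-aware calls. It assumes an installed Sentence Transformers release that provides encode_query and encode_document, and a model config that defines "query" and "document" prompts. The helper name embed_pair is hypothetical and only there to keep the example small.

```python
def embed_pair(model, query: str, document: str):
    """Embed one query and one document with their respective prompts.

    Assumption: `model` is a SentenceTransformer instance that exposes
    encode_query() and encode_document(), and whose config defines
    "query" and "document" prompts.
    """
    q_emb = model.encode_query(query)        # uses the "query" prompt
    d_emb = model.encode_document(document)  # uses the "document" prompt
    return q_emb, d_emb


# Possible usage (downloads the model, so it is left commented out here):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("jjp97/laal-embedding-v0")
# q_emb, d_emb = embed_pair(model, "화재 시 대피 방법", "화재가 발생하면 즉시 119에 신고하고 안전한 경로로 대피해야 한다.")
# print(model.similarity(q_emb, d_emb))
```

Keeping the query and document paths as separate calls makes it harder to embed both sides without instructions by accident, which is the situation the plain encode calls above run into.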
- Tom Aarsen
Hi Tom, thanks for the helpful feedback!
I've made the following updates:
- Changed the prompt key from "passage" to "document" in config_sentence_transformers.json
- Updated the README examples to use encode_query() and encode_document() instead of encode()
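The key rename in the first item can be sketched as a small dictionary transformation. The prompt strings below ("query: " and "passage: ") are assumptions borrowed from the usual E5 convention, since the actual contents of config_sentence_transformers.json are not shown in this thread, and rename_prompt_key is a hypothetical helper.

```python
import json


def rename_prompt_key(config: dict, old: str, new: str) -> dict:
    """Return a copy of `config` with prompts[old] moved to prompts[new]."""
    prompts = dict(config.get("prompts", {}))
    if old in prompts:
        prompts[new] = prompts.pop(old)
    updated = dict(config)
    updated["prompts"] = prompts
    # Keep the default prompt name consistent if it pointed at the old key.
    if updated.get("default_prompt_name") == old:
        updated["default_prompt_name"] = new
    return updated


# Hypothetical config contents; the real prompt strings may differ.
before = {
    "prompts": {"query": "query: ", "passage": "passage: "},
    "default_prompt_name": None,
}
after = rename_prompt_key(before, "passage", "document")
print(json.dumps(after, indent=2))
```

Only the key changes; the prompt text stored under it stays the same, so embeddings produced with that prompt should be unaffected by the rename itself.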
Appreciate you catching this!
Very nice! That's exactly how I would have tackled it as well.
Congratulations on the release!
- Tom Aarsen
tomaarsen changed discussion status to closed