An Efficient Self-Supervised Cross-View Training For Sentence Embedding
Paper • 2311.03228
How to use kornwtp/SCT-model-phayathaibert with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("kornwtp/SCT-model-phayathaibert")
sentences = [
"That is a happy person",
"That is a happy dog",
"That is a very happy person",
"Today is a sunny day"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
How to use kornwtp/SCT-model-phayathaibert with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("kornwtp/SCT-model-phayathaibert")
model = AutoModel.from_pretrained("kornwtp/SCT-model-phayathaibert")
This is a sentence-transformers model: it maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search.
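Loading the model through Transformers gives the raw encoder, which returns one vector per token rather than one per sentence. A common way to collapse these into a single sentence vector is masked mean pooling. The sketch below shows that arithmetic on plain Python lists so it runs without downloading anything. Whether this model's sentence-transformers configuration actually uses mean pooling is an assumption here, and `mean_pool` and the toy values are hypothetical names and data, not part of the model card.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sentence, skipping padded positions.

    token_embeddings: list of sentences, each a list of token vectors.
    attention_mask: list of sentences, each a list of 1 (real token) or 0 (padding).
    NOTE: mean pooling is an assumed strategy, not confirmed for this model.
    """
    pooled = []
    for vectors, mask in zip(token_embeddings, attention_mask):
        dim = len(vectors[0])
        total = [0.0] * dim
        count = 0
        for vector, keep in zip(vectors, mask):
            if keep:
                count += 1
                for i in range(dim):
                    total[i] += vector[i]
        # Guard against an all-padding row to avoid division by zero.
        count = max(count, 1)
        pooled.append([value / count for value in total])
    return pooled


# Toy batch (hypothetical values): 2 sentences, 3 token positions, 2 dimensions.
toy_tokens = [
    [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]],  # last position is padding
    [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]],
]
toy_mask = [[1, 1, 0], [1, 1, 1]]

sentence_vectors = mean_pool(toy_tokens, toy_mask)
print(sentence_vectors)
```

With real model outputs, the same computation would be applied to the encoder's `last_hidden_state` and the tokenizer's `attention_mask`, typically as tensor operations rather than Python loops.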
We use SCT and train the model on Thai Wikipedia.
Using this model becomes easy when you have sentence-transformers installed:
pip install -U sentence-transformers
Then you can use the model like this:
from sentence_transformers import SentenceTransformer
sentences = ["กลุ่มผู้ชายเล่นฟุตบอลบนชายหาด", "กลุ่มเด็กชายกำลังเล่นฟุตบอลบนชายหาด"]
model = SentenceTransformer("kornwtp/SCT-model-phayathaibert")
embeddings = model.encode(sentences)
print(embeddings)
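Printing the raw embeddings is rarely the end goal. For semantic search, the vectors are usually compared with cosine similarity and the candidates ranked by score. The following is a minimal sketch of that step in plain Python, using toy 2-dimensional vectors in place of the 768-dimensional embeddings. The helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, and choosing cosine similarity as the metric is an assumption rather than something the model card specifies.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat its similarity as 0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors (hypothetical values) standing in for real sentence embeddings.
toy_query = [1.0, 0.0]
toy_corpus = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]

ranking = rank_by_similarity(toy_query, toy_corpus)
print([index for index, _ in ranking])
```

The same functions accept rows of the array returned by `model.encode`, for example `cosine_similarity(embeddings[0], embeddings[1])` for the two Thai sentences above.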