kornwtp/ConGen-WangchanBERT-Small
This is a ConGen model. It maps sentences to a 128-dimensional dense vector space and can be used for tasks such as semantic search.
Usage
Using this model is straightforward once ConGen is installed:
pip install -U git+https://github.com/KornWtp/ConGen.git
Then you can use the model like this:
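from sentence_transformers import SentenceTransformer
# Two Thai sentences with closely related meanings
sentences = ["กลุ่มผู้ชายเล่นฟุตบอลบนชายหาด", "กลุ่มเด็กชายกำลังเล่นฟุตบอลบนชายหาด"]
# Load the model from the Hugging Face Hub
model = SentenceTransformer('kornwtp/ConGen-WangchanBERT-Small')
# Encode each sentence into one dense vector
embeddings = model.encode(sentences)
print(embeddings)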
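For semantic search, the embeddings are usually compared with cosine similarity and the candidates sorted by score. The following is a minimal sketch of that step, not part of ConGen itself: the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, and toy 3-dimensional vectors stand in for the model's 128-dimensional output so the snippet runs without downloading the model.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], corpus: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (corpus index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(corpus)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption): stand-ins for rows of model.encode(...) output.
query_vec = [1.0, 0.0, 0.0]
corpus_vecs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # 45 degrees from the query
]

ranking = rank_by_similarity(query_vec, corpus_vecs)
print(ranking)
```

With the real model, `query_vec` would presumably come from `model.encode(query_sentence)` and `corpus_vecs` from `model.encode(corpus_sentences)`; the helpers accept any sequence of floats, so rows of the returned array can be passed directly.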
Evaluation Results
For an automated evaluation of this model, see the Thai Sentence Embeddings Benchmark: Semantic Textual Similarity.
Citing & Authors
@inproceedings{limkonchotiwat-etal-2022-congen,
title = "{ConGen}: Unsupervised Control and Generalization Distillation For Sentence Representation",
author = "Limkonchotiwat, Peerat and
Ponwitayarat, Wuttikorn and
Lowphansirikul, Lalita and
Udomcharoenchaikit, Can and
Chuangsuwanich, Ekapol and
Nutanong, Sarana",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
year = "2022",
publisher = "Association for Computational Linguistics",
}