# SentiCSE

This is a RoBERTa-base model trained on the MR dataset and fine-tuned for sentiment analysis on sentiment tasks.
The model is intended for English text.

- Reference paper: SentiCSE (COLING 2024, main conference).
- Git repo: https://github.com/nayohan/SentiCSE
```python
import torch
from scipy.spatial.distance import cosine
from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and the model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("DILAB-HYU/SentiCSE")
model = AutoModel.from_pretrained("DILAB-HYU/SentiCSE")

# Tokenize input texts
texts = [
    "The food is delicious.",
    "The atmosphere of the restaurant is good.",
    "The food at the restaurant is devoid of flavor.",
    "The restaurant lacks a good ambiance."
]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Get the embeddings (no gradients are needed for inference)
with torch.no_grad():
    embeddings = model(**inputs, output_hidden_states=True, return_dict=True).pooler_output

# Calculate cosine similarities
# scipy's cosine() returns a distance, so 1 - distance gives the similarity.
# Cosine similarities are in [-1, 1]. Higher means more similar
cosine_sim_0_1 = 1 - cosine(embeddings[0], embeddings[1])
cosine_sim_0_2 = 1 - cosine(embeddings[0], embeddings[2])
cosine_sim_0_3 = 1 - cosine(embeddings[0], embeddings[3])

print("Cosine similarity between \"%s\" and \"%s\" is: %.3f" % (texts[0], texts[1], cosine_sim_0_1))
print("Cosine similarity between \"%s\" and \"%s\" is: %.3f" % (texts[0], texts[2], cosine_sim_0_2))
print("Cosine similarity between \"%s\" and \"%s\" is: %.3f" % (texts[0], texts[3], cosine_sim_0_3))
```
Output:

```
Cosine similarity between "The food is delicious." and "The atmosphere of the restaurant is good." is: 0.942
Cosine similarity between "The food is delicious." and "The food at the restaurant is devoid of flavor." is: 0.703
Cosine similarity between "The food is delicious." and "The restaurant lacks a good ambiance." is: 0.656
```
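
The similarity scores above come from `1 - cosine(...)`, which turns SciPy's cosine distance into a cosine similarity. As a rough illustration of what that quantity measures, the sketch below computes it by hand and ranks a few candidates against a query. The helper names `cosine_similarity` and `rank_by_similarity` and the toy 3-dimensional vectors are assumptions made up for this example; they are not part of the SentiCSE code and are not model outputs.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from most to least similar to query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for sentence embeddings (hypothetical values).
query = [1.0, 2.0, 0.5]
candidates = [
    [2.0, 4.0, 1.0],     # same direction as the query
    [-1.0, -2.0, -0.5],  # opposite direction
    [2.0, -1.0, 0.0],    # orthogonal to the query
]

for i in rank_by_similarity(query, candidates):
    print(i, round(cosine_similarity(query, candidates[i]), 3))
```

With real embeddings, each row of `pooler_output` could be converted to a list (for example with `.tolist()`) and passed to the same helpers, so that sentences sharing the query's sentiment would be expected to rank first.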
## BibTeX entry and citation info

Please cite the reference paper if you use this model.

```
@article{2024SentiCES,
  title={SentiCSE: A Sentiment-aware Contrastive Sentence Embedding Framework with Sentiment-guided Textual Similarity},
  author={Kim, Jaemin and Na, Yohan and Kim, Kangmin and Lee, Sangrak and Chae, Dong-Kyu},
  journal={Proceedings of the 30th International Conference on Computational Linguistics (COLING)},
  year={2024},
}
```