How to use kekeappa/kor-static-embedding-512 with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("kekeappa/kor-static-embedding-512")
sentences = [
"The weather is lovely today.",
"It's so sunny outside!",
"He drove to the stadium."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

How to use kekeappa/kor-static-embedding-512 with Model2Vec:
from model2vec import StaticModel
model = StaticModel.from_pretrained("kekeappa/kor-static-embedding-512")

An ultra-lightweight Static Embedding model specialized for Korean: 68MB, 512 dimensions.
kekeappa/kor-static-embedding-512 is a variant built with Matryoshka training and truncated to 512 dimensions. The same model family comes in four dimensions; choose the one that fits your use case:
| Dimensions | Size | Use case |
|---|---|---|
| 64 | 9MB | Browser, mobile, edge |
| 128 | 17MB | Lightweight search and classification |
| 256 | 34MB | Best value for size |
| 512 | 68MB | Highest accuracy |
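Since the variants are described as Matryoshka-trained and truncated, a smaller vector can be thought of as the leading components of a larger one, renormalized. The following is a minimal NumPy sketch of that truncate-and-renormalize step. The `truncate` helper and the random stand-in vectors are illustrative assumptions, not part of the released model or its training code.

```python
import numpy as np

def truncate(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of each row, then L2-normalize the rows."""
    cut = embeddings[:, :dim]
    norms = np.linalg.norm(cut, axis=1, keepdims=True)
    return cut / np.clip(norms, 1e-12, None)  # guard against zero vectors

# Random stand-in for three 512-dimensional embeddings (assumption: not real model output).
rng = np.random.default_rng(0)
full = rng.normal(size=(3, 512)).astype(np.float32)

for dim in (64, 128, 256, 512):
    small = truncate(full, dim)
    print(dim, small.shape)
```

The sizes in the table grow roughly in proportion to the dimension (68MB / 512 is about 0.13MB per dimension), which fits a model that stores one vector per vocabulary entry.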

Benchmark scores:

| Benchmark | Pearson | Spearman |
|---|---|---|
| KorSTS-test | 0.7760 | 0.7718 |
| KorSTS-valid | - | 0.8330 |
| KLUE-STS-val | - | 0.7033 |
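For readers unfamiliar with the two columns: Pearson measures linear agreement between predicted similarities and gold scores, while Spearman measures agreement of their rankings. The sketch below computes both with NumPy on made-up toy numbers. The helper names and the data are hypothetical and are not the evaluation script behind the table.

```python
import numpy as np

def pearson(x, y) -> float:
    """Pearson correlation: linear agreement between two score lists."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

def average_ranks(values) -> np.ndarray:
    """1-based ranks; tied values share the mean of the ranks they span."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman(x, y) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data (assumption): gold similarity labels and model cosine similarities.
gold = [0.0, 1.5, 3.0, 4.2, 5.0]
pred = [0.10, 0.35, 0.30, 0.80, 0.95]

print("Pearson :", round(pearson(gold, pred), 4))
print("Spearman:", round(spearman(gold, pred), 4))
```

Spearman depends only on ordering, so it is unaffected by any monotonic rescaling of the model's similarity scores.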
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("kekeappa/kor-static-embedding-512")
emb = model.encode(["한국어 문장", "임베딩 테스트"], normalize_embeddings=True)  # "Korean sentence", "embedding test"
print(emb.shape) # (2, 512)
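Passing `normalize_embeddings=True` returns unit-length vectors, so a plain dot product between them equals their cosine similarity. A small NumPy sketch of that equivalence follows. It uses random stand-in vectors instead of real model output, and `l2_normalize` is a hypothetical helper written for this illustration.

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit L2 norm."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

# Random stand-in for two raw 512-dimensional embeddings (assumption).
rng = np.random.default_rng(0)
raw = rng.normal(size=(2, 512))

# Cosine similarity computed directly from the raw vectors.
cosine = (raw[0] @ raw[1]) / (np.linalg.norm(raw[0]) * np.linalg.norm(raw[1]))

# After normalization, the dot product gives the same value.
unit = l2_normalize(raw)
dot = unit[0] @ unit[1]

print(np.isclose(cosine, dot))
```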
4-stage training:
- Vocab embeddings of the BM-K/KoSimCSE-roberta-multitask teacher → PCA + Zipf weighting
- kakaobrain/kor_nli (multi_nli + snli) 277K triplets
- MatryoshkaLoss)

Training code: https://github.com/johunsang/kor-static-embedding-512
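The step named above as "PCA + Zipf weighting" on the teacher's vocab embeddings could look roughly like the NumPy sketch below. This is an illustration under stated assumptions, not the author's training code: the teacher matrix is random stand-in data, rows are assumed to be sorted from most to least frequent token, and `log(1 + rank)` is one common form of Zipf weighting chosen here for concreteness.

```python
import numpy as np

def pca_reduce(matrix: np.ndarray, dim: int) -> np.ndarray:
    """Project rows onto the top `dim` principal components (via SVD)."""
    centered = matrix - matrix.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T

def zipf_weight(matrix: np.ndarray) -> np.ndarray:
    """Scale row i by log(1 + rank), with rank 1 for the first row.

    Assumption: row order reflects token frequency (most frequent first),
    so rarer tokens receive larger weights.
    """
    ranks = np.arange(1, matrix.shape[0] + 1)
    return matrix * np.log1p(ranks)[:, None]

# Random stand-in for a teacher's token embedding table (assumption: 1000 tokens, 768 dims).
rng = np.random.default_rng(0)
teacher_vocab = rng.normal(size=(1000, 768))

reduced = pca_reduce(teacher_vocab, 512)
static_table = zipf_weight(reduced)
print(static_table.shape)
```

A sentence embedding from such a table would then typically be the mean of the rows for the sentence's tokens, which is why frequent tokens are down-weighted relative to rare ones.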
License: Apache 2.0
Base model: klue/roberta-base