# Korean Embedding Models
Collection: public Korean embedding models with benchmark results and model cards.
How to use hyunseop/qwen3-embedding with sentence-transformers:
```python
from sentence_transformers import SentenceTransformer

# Load the fine-tuned Korean embedding model from the Hugging Face Hub.
model = SentenceTransformer("hyunseop/qwen3-embedding")

sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day",
]

# Encode all sentences, then compute pairwise cosine similarities.
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([4, 4])
```

H100 fine-tuned Korean embedding model based on Qwen/Qwen3-Embedding-4B.
Of the two uploaded models, this is the stronger one for public Korean retrieval; compared with bge-m3-ko-h100, it is the one to highlight when you want the strongest general Korean retrieval results. A retrieval-style usage sketch follows below.
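For retrieval rather than symmetric sentence similarity, a minimal sketch of query-to-document ranking; the Korean query and documents here are illustrative, not from the model card:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("hyunseop/qwen3-embedding")

query = "서울에서 가볼 만한 곳은 어디인가요?"  # "Where is worth visiting in Seoul?"
documents = [
    "경복궁은 서울의 대표적인 고궁이다.",    # "Gyeongbokgung is a landmark palace in Seoul."
    "부산 해운대는 여름 휴가지로 유명하다.",  # "Haeundae in Busan is a famous summer spot."
    "남산타워에서 서울 야경을 볼 수 있다.",   # "Namsan Tower offers a night view of Seoul."
]

# Embed the query and candidate documents separately.
query_emb = model.encode([query])
doc_embs = model.encode(documents)

# similarity() returns a [1, 3] tensor; rank documents by descending score.
scores = model.similarity(query_emb, doc_embs)[0]
for doc, score in sorted(zip(documents, scores.tolist()), key=lambda x: x[1], reverse=True):
    print(f"{score:.4f}  {doc}")
```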
Base model: Qwen/Qwen3-Embedding-4B
Training run: 111677_20260506_114341_both_2gpu
Benchmark: MIRACL Retrieval (Korean, `ko`)

| cutoff | Precision | Recall | F1 | mAP | MRR | NDCG |
|---|---|---|---|---|---|---|
| @1 | 0.46479 | 0.29452 | 0.32978 | 0.29452 | 0.46479 | 0.46479 |
| @3 | 0.25665 | 0.42305 | 0.28522 | 0.42305 | 0.57732 | 0.45190 |
| @5 | 0.18310 | 0.47493 | 0.23726 | 0.47493 | 0.57732 | 0.45852 |
| @10 | 0.11737 | 0.58359 | 0.17922 | 0.58359 | 0.57732 | 0.49227 |
| @20 | 0.07254 | 0.67332 | 0.12345 | 0.67332 | 0.57732 | 0.52325 |
| @100 | 0.02028 | 0.83695 | 0.03891 | 0.83695 | 0.57732 | 0.56629 |
| @1000 | 0.00246 | 0.95250 | 0.00491 | 0.95250 | 0.57732 | 0.58808 |
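To make the cutoff metrics above concrete, a toy sketch of Precision@k, Recall@k, and reciprocal rank for a single query; this is not the benchmark harness, and the document ids are made up:

```python
def precision_recall_rr(ranked_ids, relevant_ids, k):
    """Compute Precision@k, Recall@k, and reciprocal rank for one query."""
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    precision = hits / k
    recall = hits / len(relevant_ids)
    # Reciprocal rank of the first relevant hit anywhere in the ranking;
    # MRR in the table is this value averaged over all queries.
    rr = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            rr = 1.0 / rank
            break
    return precision, recall, rr

ranked_ids = ["d3", "d7", "d1", "d9", "d4"]
relevant_ids = {"d1", "d4"}
print(precision_recall_rr(ranked_ids, relevant_ids, k=3))
# (0.3333..., 0.5, 0.3333...): one of two relevant docs appears in the top 3, at rank 3.
```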
Artifacts:
- Model output: `output/111677_20260506_114341_both_2gpu/qwen`
- `benchmark_results/autorag_benchmark.json`
- `benchmark_results/qwen_miracl_fast4/miracl_benchmark.txt`
- `benchmark_results/qwen_miracl_fast4/miracl_benchmark.json`

On the Korean retrieval leaderboard cited by the model card, this model scores 0.7484, ahead of dragonkue/snowflake-arctic-embed-l-v2.0-ko (0.740433), dragonkue/BGE-m3-ko (0.729993), nlpai-lab/KURE-v1 (0.727739), and nlpai-lab/KoE5 (0.711356).