SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model finetuned from BAAI/bge-m3. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: BAAI/bge-m3
Maximum Sequence Length: 8192 tokens
Output Dimensionality: 1024 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
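The pipeline above takes the first ([CLS]) token vector and L2-normalizes it, so cosine similarity between two embeddings reduces to a plain dot product. The following is a minimal sketch of that idea on toy vectors; the helper names and the 4-dimensional inputs are illustrative assumptions, not part of the library or the model.

```python
import math

def cls_pool(token_embeddings):
    """Keep only the first token vector, mirroring pooling_mode_cls_token=True."""
    return list(token_embeddings[0])

def l2_normalize(vec):
    """Scale a vector to unit length, mirroring the Normalize() module."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy token embeddings (hypothetical values): 2 tokens x 4 dimensions each.
tokens_a = [[3.0, 4.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
tokens_b = [[4.0, 3.0, 0.0, 0.0], [1.0, 2.0, 3.0, 4.0]]

emb_a = l2_normalize(cls_pool(tokens_a))
emb_b = l2_normalize(cls_pool(tokens_b))

# For unit-length vectors the dot product equals the cosine similarity.
cosine = dot(emb_a, emb_b)
print(round(cosine, 2))  # 0.96
```

Because the normalization is part of the model, downstream code can likely use a dot-product index without a separate normalization step.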

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Ja-ck/bge-m3-yes24-ft")
# Run inference
sentences = [
    '수능특강영어영역영어독해연습',
    'EBS 수능특강 영어영역 영어 (2025년)',
    '2026 E 모든 변형문제 수능특강 영어독해연습 (2025년)',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7262, 0.7143],
#         [0.7262, 1.0000, 0.7726],
#         [0.7143, 0.7726, 1.0000]])
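For search, the first sentence would typically act as the query and the remaining ones as candidates to rank. The sketch below orders candidates by their similarity to the query, reusing the scores from the first row of the matrix printed above; `rank_by_similarity` is a hypothetical helper, not a library function.

```python
def rank_by_similarity(scores, candidates, top_k=10):
    """Return (candidate, score) pairs sorted by descending similarity."""
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [(candidates[i], scores[i]) for i in order[:top_k]]

# Candidates and their similarity to the query, copied from the example output.
candidates = [
    'EBS 수능특강 영어영역 영어 (2025년)',
    '2026 E 모든 변형문제 수능특강 영어독해연습 (2025년)',
]
scores = [0.7262, 0.7143]

for title, score in rank_by_similarity(scores, candidates):
    print(f"{score:.4f}  {title}")
```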

Training Details

Training Dataset

Unnamed Dataset

Size: 922,205 training samples
Columns: sentence_0, sentence_1, and sentence_2

Approximate statistics based on the first 1000 samples:

	sentence_0	sentence_1	sentence_2
type	string	string	string
details	min: 4 tokens, mean: 10.46 tokens, max: 45 tokens	min: 3 tokens, mean: 14.22 tokens, max: 45 tokens	min: 3 tokens, mean: 13.67 tokens, max: 39 tokens

Samples:

sentence_0	sentence_1	sentence_2
`생태프로그램`	`유아숲교육 프로그램`	`지식생태학`
`1984임희선`	`1984`	`1984 최동원`
`이해원N제시즌세트`	`2026 이해원 N제 시즌1 수학 1, 2 세트`	`이해원 N제 시즌1 수학1 (2023년용)`
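The three columns read like (anchor, positive, negative) triplets, and the citation section lists TripletLoss, so the objective was probably a triplet loss. A minimal sketch under that assumption follows; the Euclidean distance and the margin of 5.0 are assumptions, since this card does not state the training hyperparameters.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=5.0):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is. The margin value is assumed.
    """
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)

# Toy 2-dimensional embeddings (hypothetical values).
anchor = [0.0, 0.0]
positive = [3.0, 4.0]        # distance 5 from the anchor
far_negative = [6.0, 8.0]    # distance 10 from the anchor
near_negative = [0.0, 1.0]   # distance 1 from the anchor

print(triplet_loss(anchor, positive, far_negative))   # 0.0
print(triplet_loss(anchor, positive, near_negative))  # 9.0
```

In the samples above, sentence_2 is a plausible but wrong match for sentence_0, which is the kind of hard negative this loss pushes away from the anchor.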

IR Evaluation (RAG Retrieval Performance)

Retrieval performance was evaluated in a realistic RAG scenario.

Evaluation Setup

Item	Value
Evaluation queries	10,000
Models compared	Baseline (bge-m3) vs. Fine-tuned

Results Summary

At K=10:
  ✅        MRR: 0.4159 → 0.5985 (+43.9%)
  ✅     RECALL: 0.6331 → 0.8329 (+31.6%)
  ✅       NDCG: 0.4579 → 0.6453 (+40.9%)
  ✅   HIT_RATE: 0.6777 → 0.8704 (+28.4%)
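The percentages in parentheses are relative gains over the baseline, not percentage-point differences. The short sketch below reproduces them from the four-decimal scores in the summary; `relative_gain` is a hypothetical helper name.

```python
def relative_gain(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# (baseline, fine-tuned) scores at K=10, copied from the summary.
results = {
    "MRR": (0.4159, 0.5985),
    "RECALL": (0.6331, 0.8329),
    "NDCG": (0.4579, 0.6453),
    "HIT_RATE": (0.6777, 0.8704),
}

for name, (before, after) in results.items():
    print(f"{name:>8}: {before:.4f} -> {after:.4f} ({relative_gain(before, after):+.1f}%)")
```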

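MRR (Mean Reciprocal Rank): 0.42 → 0.60 (+43.9%)

Meaning: "At what rank does the first correct answer appear?"

Rank of correct answer	Score
1st	1.0
2nd	0.5
3rd	0.33
10th	0.1

Interpretation:

Before: an MRR of 0.42 corresponds roughly to the first correct answer appearing around rank 2 to 3
After: an MRR of 0.60 corresponds roughly to the first correct answer appearing around rank 1 to 2
Users find the product they want sooner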
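MRR@K can be sketched in a few lines: each query scores the reciprocal of the rank of its first correct result within the top K, or 0 if none appears, and the scores are averaged. The document IDs below are made up for illustration, and this is not the evaluation script used for the numbers in this card.

```python
def mrr_at_k(ranked_lists, relevant_sets, k=10):
    """Mean reciprocal rank of the first relevant result within the top k."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break  # only the first correct result counts
    return total / len(ranked_lists)

# Three toy queries: correct answer at rank 1, at rank 2, and not retrieved.
ranked_lists = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"]]
relevant_sets = [{"a"}, {"y"}, {"missing"}]

print(mrr_at_k(ranked_lists, relevant_sets))  # 0.5
```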

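Recall@10: 0.63 → 0.83 (+31.6%)

Meaning: "What share of the correct answers appears in the top 10?"

Interpretation:

Before: on average, 63% of a query's correct answers appear in the top 10
After: on average, 83% of a query's correct answers appear in the top 10
Missed answers fall from 37% to 17%, a reduction of more than half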
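Recall@K for a single query is the fraction of its correct items that show up in the top K; the reported figure would then be the mean over all queries. This is a minimal sketch with made-up IDs. That a query can have several correct items is an assumption, suggested by Recall@10 being lower than Hit Rate@10.

```python
def recall_at_k(ranked, relevant, k=10):
    """Fraction of the relevant items found within the top k results."""
    if not relevant:
        return 0.0
    found = set(ranked[:k]) & set(relevant)
    return len(found) / len(relevant)

def mean_recall_at_k(ranked_lists, relevant_sets, k=10):
    """Average recall@k over all queries."""
    scores = [recall_at_k(r, rel, k) for r, rel in zip(ranked_lists, relevant_sets)]
    return sum(scores) / len(scores)

ranked = ["a", "b", "c", "d"]
relevant = {"a", "d", "z"}  # "z" is never retrieved

print(recall_at_k(ranked, relevant, k=3))  # 1 of 3 relevant items found
print(recall_at_k(ranked, relevant, k=4))  # 2 of 3 relevant items found
```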

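NDCG@10: 0.46 → 0.65 (+40.9%)

Meaning: "The higher the correct answer ranks, the higher the score" (scale of 0 to 1)

Interpretation:

Correct answer at rank 1 = 1.0; correct answer at rank 10 = a low score
Before: even when the correct answer was retrieved, it often sat in the middle or lower ranks
After: correct answers are placed near the top
Ranking quality improved by about 41%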
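The rank discount behind NDCG can be made concrete with a binary-relevance sketch: each correct result at rank r contributes 1 / log2(r + 1), and the sum is divided by the best achievable sum. Binary relevance and the toy IDs are assumptions; the card does not say how relevance was graded.

```python
import math

def ndcg_at_k(ranked, relevant, k=10):
    """Binary-relevance NDCG@k for a single query."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    # Ideal ordering: all relevant items (up to k) placed at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

ranked = [f"d{i}" for i in range(1, 11)]  # d1 ... d10

print(ndcg_at_k(ranked, {"d1"}))             # 1.0
print(round(ndcg_at_k(ranked, {"d10"}), 3))  # 0.289
```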

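Hit Rate@10: 0.68 → 0.87 (+28.4%)

Meaning: "A search succeeds if at least one correct answer is in the top 10"

Interpretation:

Before: 68 of 100 searches succeed
After: 87 of 100 searches succeed
The search failure rate falls from 32% to 13%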
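Hit Rate@K is the simplest of the four metrics: a query counts as a success if any correct item is in its top K. A minimal sketch with made-up IDs:

```python
def hit_rate_at_k(ranked_lists, relevant_sets, k=10):
    """Share of queries with at least one relevant result in the top k."""
    hits = sum(
        1
        for ranked, relevant in zip(ranked_lists, relevant_sets)
        if any(doc_id in relevant for doc_id in ranked[:k])
    )
    return hits / len(ranked_lists)

# Four toy queries; only the last one retrieves nothing relevant.
ranked_lists = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]
relevant_sets = [{"a"}, {"d"}, {"e", "zz"}, {"zz"}]

print(hit_rate_at_k(ranked_lists, relevant_sets, k=2))  # 0.75
```

The third query counts as a full hit here even though one of its two correct items is missing, which is why hit rate can exceed recall at the same K.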

Full Results by K

K	MRR (Δ%)	Recall (Δ%)	NDCG (Δ%)	Hit Rate (Δ%)
1	0.30→0.46 (+53.7%)	0.26→0.40 (+51.6%)	0.30→0.46 (+53.7%)	0.30→0.46 (+53.7%)
5	0.40→0.59 (+46.1%)	0.53→0.74 (+39.8%)	0.42→0.61 (+45.1%)	0.57→0.79 (+37.4%)
10	0.42→0.60 (+43.9%)	0.63→0.83 (+31.6%)	0.46→0.65 (+40.9%)	0.68→0.87 (+28.4%)
20	0.42→0.60 (+42.7%)	0.72→0.90 (+24.0%)	0.48→0.66 (+37.4%)	0.77→0.92 (+20.7%)
50	0.42→0.60 (+42.0%)	0.82→0.95 (+15.3%)	0.51→0.68 (+33.7%)	0.85→0.96 (+12.7%)
100	0.43→0.60 (+41.9%)	0.88→0.97 (+10.0%)	0.52→0.68 (+31.7%)	0.90→0.98 (+8.3%)

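Conclusion

Aspect	Before	After	Improvement
Typical rank of first correct answer	2nd to 3rd	1st to 2nd	Found sooner
Recall@10	63%	83%	+20 percentage points
Search failure rate (no hit in top 10)	32%	13%	Reduced by more than half

Conclusion: After fine-tuning, retrieval accuracy over a catalog of 2.21 million products improved by roughly 30 to 40% (28% to 44% across the four metrics at K=10). In a RAG system, the model can retrieve relevant products more accurately.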

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}