# yes24-task-targeted-embedding-0.6B

A 0.6B-parameter Korean embedding model specialized for YES24 book-commerce search.

It is built on Qwen3-0.6B and trained in two stages:
- Stage 1: query-document distillation on MAKI search-click logs (4.5M pairs)
- Stage 2 (v7): a retrieval LoRA trained on 584k hard-negative triplets mined with NV-Retriever TopK-PercPos, then merged into the base model
## Performance

Evaluated on a MAKI holdout set of 1,000 queries against 13,476 documents:
| Model | MRR | Recall@1 | nDCG@1 | nDCG@10 |
|---|---|---|---|---|
| yes24-task-targeted-0.6B (v7) | 0.6746 | 0.558 | 0.7638 | 0.8644 |
| bge-m3-yes24-ft (568M) | 0.6692 | 0.548 | 0.7585 | 0.8612 |
| yes24-task-targeted-0.6B (v6) | 0.6635 | 0.534 | 0.7507 | 0.8631 |
| stage1-base (no LoRA) | 0.6557 | 0.523 | 0.7438 | 0.8603 |
### v7 vs v6 Improvements
| Metric | v6 | v7 | Delta |
|---|---|---|---|
| MRR | 0.6635 | 0.6746 | +0.0111 |
| Recall@1 | 0.534 | 0.558 | +0.024 |
| nDCG@1 | 0.7507 | 0.7638 | +0.0131 |
| nDCG@10 | 0.8631 | 0.8644 | +0.0013 |
v7 improves every metric while training on only 63% as much data (584k vs. 927k triplets), indicating that TopK-PercPos mining effectively removed false negatives.
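
For clarity on how these numbers are defined, here is a minimal per-query sketch of the reported metrics. The actual evaluation script is not part of this card, and graded relevance labels are assumed (since nDCG@1 and Recall@1 differ, relevance is evidently not binary single-gold):

```python
import math

def dcg(gains):
    """Discounted cumulative gain of a relevance list in ranked order."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def eval_query(gains, k=10):
    """gains: graded relevance of the retrieved docs for one query, in
    retrieval order. Returns (reciprocal rank, hit rate at rank 1, nDCG@k);
    the table reports means over all 1,000 holdout queries."""
    first_rel = next((i for i, g in enumerate(gains) if g > 0), None)
    rr = 0.0 if first_rel is None else 1.0 / (first_rel + 1)
    recall_1 = 1.0 if gains and gains[0] > 0 else 0.0
    ideal = sorted(gains, reverse=True)
    ndcg = dcg(gains[:k]) / dcg(ideal[:k]) if dcg(ideal[:k]) > 0 else 0.0
    return rr, recall_1, ndcg

print(eval_query([0, 2, 1, 0, 0]))  # relevant docs at ranks 2 and 3
```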
## Model Details
- Architecture: Qwen3-0.6B + LoRA (rank=32) merged
- Embedding Dimension: 1024
- Max Sequence Length: 384 tokens
- Pooling: Last token
- Normalization: L2 normalize
- Similarity Function: Cosine similarity (equals the dot product after normalization; see the sketch below)
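
For intuition, a minimal PyTorch sketch of what the pooling and normalization stages do (illustrative only; it assumes right-padded batches, and the packaged SentenceTransformer modules listed under Full Model Architecture handle this internally):

```python
import torch
import torch.nn.functional as F

def embed(hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Last-token pooling: take the hidden state of each sequence's final
    non-padding token (assumes right padding), then L2-normalize it."""
    last_idx = attention_mask.sum(dim=1) - 1              # (batch,)
    batch_idx = torch.arange(hidden_states.size(0))
    emb = hidden_states[batch_idx, last_idx]              # (batch, 1024)
    return F.normalize(emb, p=2, dim=1)

# On unit vectors, cosine similarity reduces to a dot product:
#   sims = q_emb @ d_emb.T
```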
## Usage
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Ja-ck/yes24-task-targeted-embedding-0.6B")

# Inputs must carry their role prefixes (see the note below).
queries = ["Query: 개브리얼 제빈"]
documents = [
    "Document: Title: 내일 또 내일 또 내일\nCategory: 영미소설",
    "Document: Title: 파친코\nCategory: 영미소설",
]

q_emb = model.encode(queries)
d_emb = model.encode(documents)

# Cosine similarity (a dot product, since embeddings are L2-normalized)
similarities = model.similarity(q_emb, d_emb)
print(similarities)
```
Note: each query must be prefixed with `Query: ` and each document with `Document: `.
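
Alternatively, the `prompt` argument of `SentenceTransformer.encode` can prepend the prefix for you, so the raw strings stay prefix-free (a usage sketch; this card does not confirm any built-in prompt configuration for the repo):

```python
# Same model instance as above; the prefix is prepended per call.
q_emb = model.encode(["개브리얼 제빈"], prompt="Query: ")
d_emb = model.encode(["Title: 파친코\nCategory: 영미소설"], prompt="Document: ")
print(model.similarity(q_emb, d_emb))
```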
## Training Details

### Stage 2 (v7) Training
- Hard Negative Mining: NV-Retriever TopK-PercPos (relative_margin=0.95, absolute_threshold=0.90); see the mining sketch below
- Teacher Model: nlpai-lab/KURE-v1 (1024d, Title+Author+Category embedding)
- Triplets: 583,923 (hard 84%, medium 16%)
- Loss: InfoNCE + 2× Cosine Distillation (Qwen3-Embedding-4B teacher) + GOR; see the loss sketch below
- LoRA: rank=32, alpha=32, target=q/k/v/o/gate/up/down_proj
- Training: 2 epochs, batch size 160 × 4 gradient-accumulation steps, lr=1.5e-4, bf16
- Hardware: RunPod H100 SXM 80GB
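
A minimal sketch of the TopK-PercPos filtering rule with the thresholds listed above. The candidate retrieval, the teacher scoring (KURE-v1), and the exact interplay of the relative and absolute thresholds are assumptions; `candidates` is a hypothetical per-query list:

```python
def mine_hard_negatives(pos_score, candidates, rel_margin=0.95,
                        abs_threshold=0.90, k=8):
    """candidates: (doc_id, teacher_score) pairs for one query, sorted by
    score descending, with known positives already removed. Any candidate
    scoring above rel_margin * pos_score (the TopK-PercPos rule) or above
    abs_threshold is discarded as a likely false negative."""
    ceiling = min(rel_margin * pos_score, abs_threshold)
    return [(doc, score) for doc, score in candidates if score < ceiling][:k]
```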
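
And a sketch of the training objective, assuming InfoNCE over the mined negatives and reading "2×" as a weight of 2.0 on the distillation term; the GOR term is omitted because its exact form is not given in this card:

```python
import torch
import torch.nn.functional as F

def stage2_loss(q, d_pos, d_neg, t_q, t_d, temperature=0.05, distill_weight=2.0):
    """q, d_pos: (B, D) L2-normalized student query/positive embeddings;
    d_neg: (B, N, D) embeddings of the mined hard negatives;
    t_q, t_d: teacher embeddings (Qwen3-Embedding-4B) for the same texts.
    The temperature and distill_weight values are assumptions."""
    # InfoNCE: the positive must out-score the mined hard negatives.
    pos_sim = (q * d_pos).sum(dim=-1, keepdim=True)            # (B, 1)
    neg_sim = torch.einsum("bd,bnd->bn", q, d_neg)             # (B, N)
    logits = torch.cat([pos_sim, neg_sim], dim=1) / temperature
    labels = torch.zeros(q.size(0), dtype=torch.long, device=logits.device)
    info_nce = F.cross_entropy(logits, labels)
    # Cosine distillation: pull student embeddings toward the teacher's.
    distill = (1 - F.cosine_similarity(q, t_q)).mean() + \
              (1 - F.cosine_similarity(d_pos, t_d)).mean()
    return info_nce + distill_weight * distill
```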
## Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False, 'architecture': 'Qwen3Model'})
  (1): Pooling({'pooling_mode_lasttoken': True})
  (2): Normalize()
)
```
## Framework Versions
- Sentence Transformers: 5.x
- Transformers: 5.x
- PyTorch: 2.10+
- PEFT: 0.18+