
yes24-task-targeted-embedding-0.6B

A 0.6B Korean embedding model specialized for YES24 book-commerce search.

Built on Qwen3-0.6B with two training stages:

  1. Stage 1: query-document distillation on MAKI search-click logs (4.5M pairs)
  2. Stage 2 (v7): a retrieval LoRA trained on 584k hard-negative triplets mined with NV-Retriever TopK-PercPos, then merged into the base model

Performance

Evaluated on a MAKI holdout set of 1,000 queries against 13,476 documents:

| Model | MRR | Recall@1 | nDCG@1 | nDCG@10 |
|---|---|---|---|---|
| yes24-task-targeted-0.6B (v7) | 0.6746 | 0.558 | 0.7638 | 0.8644 |
| bge-m3-yes24-ft (568M) | 0.6692 | 0.548 | 0.7585 | 0.8612 |
| yes24-task-targeted-0.6B (v6) | 0.6635 | 0.534 | 0.7507 | 0.8631 |
| stage1-base (no LoRA) | 0.6557 | 0.523 | 0.7438 | 0.8603 |
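For reference, the metrics in the table can be computed roughly as in the sketch below. This is an illustrative implementation assuming exactly one relevant document per query (the actual holdout protocol is not specified on this card); `ranks` holds the 1-based position of the relevant document in each query's result list.

```python
import numpy as np

# Illustrative metric implementations, assuming one relevant doc per query.

def mrr(ranks):
    """Mean reciprocal rank of the relevant document."""
    return float(np.mean(1.0 / np.asarray(ranks)))

def recall_at_k(ranks, k):
    """Fraction of queries whose relevant doc appears in the top k."""
    return float(np.mean(np.asarray(ranks) <= k))

def ndcg_at_k(ranks, k):
    """nDCG@k; with a single relevant doc the ideal DCG is 1."""
    ranks = np.asarray(ranks)
    gains = np.where(ranks <= k, 1.0 / np.log2(ranks + 1), 0.0)
    return float(np.mean(gains))

ranks = [1, 2, 1, 5]  # toy example
print(mrr(ranks), recall_at_k(ranks, 1), ndcg_at_k(ranks, 10))
```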

v7 vs v6 Improvements

| Metric | v6 | v7 | Delta |
|---|---|---|---|
| MRR | 0.6635 | 0.6746 | +0.0111 |
| Recall@1 | 0.534 | 0.558 | +0.024 |
| nDCG@1 | 0.7507 | 0.7638 | +0.0131 |
| nDCG@10 | 0.8631 | 0.8644 | +0.0013 |

v7 improves on every metric while using only 63% of the training data (584k vs 927k triplets), indicating that TopK-PercPos mining effectively removes false negatives.

Model Details

  • Architecture: Qwen3-0.6B + LoRA (rank=32) merged
  • Embedding Dimension: 1024
  • Max Sequence Length: 384 tokens
  • Pooling: Last token
  • Normalization: L2 normalize
  • Similarity Function: Cosine similarity (= dot product after normalization)
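The pooling and normalization steps above can be sketched as follows. This is an illustrative numpy version, not the library's internals: last-token pooling takes the hidden state of the final non-padding token, then L2-normalizes it, so cosine similarity reduces to a plain dot product.

```python
import numpy as np

def last_token_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """hidden_states: (batch, seq, dim); attention_mask: (batch, seq) of 0/1."""
    last_idx = attention_mask.sum(axis=1) - 1               # index of last real token
    pooled = hidden_states[np.arange(len(last_idx)), last_idx]
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)  # L2 normalize

rng = np.random.default_rng(0)
h = rng.normal(size=(2, 5, 1024))                # dummy hidden states
mask = np.array([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])
emb = last_token_pool(h, mask)

# After normalization, the dot product equals cosine similarity:
cos = emb @ emb.T
assert np.allclose(np.diag(cos), 1.0)
```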

Usage

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Ja-ck/yes24-task-targeted-embedding-0.6B")

queries = ["Query: 개브리얼 제빈"]
documents = [
    "Document: Title: 내일 또 내일 또 내일\nCategory: 영미소설",
    "Document: Title: 파친코\nCategory: 영미소설",
]

q_emb = model.encode(queries)
d_emb = model.encode(documents)

similarities = model.similarity(q_emb, d_emb)
print(similarities)
```

Note: queries must carry the "Query: " prefix and documents the "Document: " prefix.
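Building on the snippet above, ranking a corpus is a single matrix product plus an argsort: because the model L2-normalizes its embeddings, the dot product is already the cosine score. In this self-contained sketch, random normalized vectors stand in for real model.encode() output.

```python
import numpy as np

# Minimal retrieval sketch (illustrative; random vectors replace model output).
rng = np.random.default_rng(42)
q_emb = rng.normal(size=(1, 1024))
d_emb = rng.normal(size=(5, 1024))
q_emb /= np.linalg.norm(q_emb, axis=1, keepdims=True)   # model does this internally
d_emb /= np.linalg.norm(d_emb, axis=1, keepdims=True)

scores = (q_emb @ d_emb.T)[0]        # cosine score per document
ranking = np.argsort(-scores)        # best-first document indices
print(ranking[:3])                   # top-3 document indices
```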

Training Details

Stage 2 (v7) Training

  • Hard Negative Mining: NV-Retriever TopK-PercPos (relative_margin=0.95, absolute_threshold=0.90)
  • Teacher Model: nlpai-lab/KURE-v1 (1024d, Title+Author+Category embedding)
  • Triplets: 583,923 (hard 84%, medium 16%)
  • Loss: InfoNCE + 2×Cosine Distillation (Qwen3-Embedding-4B teacher) + GOR
  • LoRA: rank=32, alpha=32, target=q/k/v/o/gate/up/down_proj
  • Training: 2 epochs, batch=160×4 grad accum, lr=1.5e-4, bf16
  • Hardware: RunPod H100 SXM 80GB
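The TopK-PercPos filtering step can be sketched roughly as below. This is a hedged illustration using the thresholds listed above (relative_margin=0.95, absolute_threshold=0.90); the exact NV-Retriever procedure, e.g. how the top-k candidate pool is drawn, may differ. The idea: a mined candidate scoring too close to the positive under the teacher is likely a false negative and is dropped.

```python
import numpy as np

def topk_percpos_filter(pos_score, cand_scores,
                        relative_margin=0.95, absolute_threshold=0.90):
    """Keep candidates whose teacher similarity stays below both caps.

    Illustrative sketch; parameter names follow the card, not NV-Retriever code.
    """
    cand_scores = np.asarray(cand_scores)
    cap = min(relative_margin * pos_score, absolute_threshold)
    return cand_scores[cand_scores < cap]

# Toy example: positive scores 0.92 with the teacher, cap = 0.95 * 0.92 = 0.874,
# so the 0.95 and 0.91 candidates are discarded as likely false negatives.
negs = topk_percpos_filter(0.92, [0.95, 0.91, 0.86, 0.70])
print(negs)
```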

Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False, 'architecture': 'Qwen3Model'})
  (1): Pooling({'pooling_mode_lasttoken': True})
  (2): Normalize()
)
```

Framework Versions

  • Sentence Transformers: 5.x
  • Transformers: 5.x
  • PyTorch: 2.10+
  • PEFT: 0.18+