---
language:
- en
license: apache-2.0
library_name: sentence-transformers
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense-retrieval
- information-retrieval
- job-skill-matching
- esco
- talentclef
- xlm-roberta
base_model: jjzha/esco-xlm-roberta-large
pipeline_tag: sentence-similarity
model-index:
- name: skillscout-large
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: TalentCLEF 2026 Task B Validation
      type: talentclef-2026-taskb-validation
    metrics:
    - type: cosine_ndcg_at_10
      value: 0.4830
      name: nDCG@10
    - type: cosine_map_at_100
      value: 0.1825
      name: MAP@100
    - type: cosine_mrr_at_10
      value: 0.6657
      name: MRR@10
    - type: cosine_accuracy_at_10
      value: 0.9474
      name: Accuracy@10
---

# SkillScout Large - Job-to-Skill Dense Retriever

**SkillScout Large** is a dense bi-encoder for retrieving relevant skills from a job title. Given a job title (e.g., *"Data Scientist"*), it produces a 1024-dimensional embedding and retrieves the most semantically relevant skills from the [ESCO](https://esco.ec.europa.eu/) skill gazetteer (9,052 skills) via cosine similarity.

This is **Stage 1** of the TalentGuide two-stage job-skill matching pipeline, trained for [TalentCLEF 2026 Task B](https://talentclef.github.io/).

> **Best pipeline result (TalentCLEF 2026 validation set):**
> nDCG@10 graded = **0.6896** | nDCG@10 binary = **0.7330**
> when combined with a fine-tuned cross-encoder at blend alpha=0.7.
> Bi-encoder alone: nDCG@10 graded = **0.3621** | MAP = **0.4545**

---

## Model Summary

| Property | Value |
|---|---|
| Base model | [jjzha/esco-xlm-roberta-large](https://huggingface.co/jjzha/esco-xlm-roberta-large) |
| Architecture | XLM-RoBERTa-large + mean pooling |
| Embedding dimension | 1024 |
| Max sequence length | 64 tokens |
| Training loss | Multiple Negatives Ranking (MNR) |
| Training pairs | 93,720 (ESCO job-skill pairs, essential + optional) |
| Epochs | 3 |
| Best checkpoint | Step 3500 (by validation nDCG@10) |
| Hardware | NVIDIA RTX 3070 8GB, fp16 AMP |

---

## What is TalentCLEF Task B?

**TalentCLEF 2026 Task B** is a graded information-retrieval shared task:

- **Query**: a job title (e.g., *"Electrician"*)
- **Corpus**: 9,052 ESCO skills (e.g., *"install electric switches"*)
- **Relevance levels**: `2` = Core, `1` = Contextual, `0` = Non-relevant
- **Primary metric**: nDCG with graded relevance (core=2, contextual=1)

---

## Usage

### Installation

```bash
pip install sentence-transformers faiss-cpu
```

### Encode and Compare

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("talentguide/skillscout-large")

job = "Data Scientist"
skills = ["data science", "machine learning", "install electric switches"]

# With L2-normalised embeddings, the dot product equals cosine similarity
embs = model.encode([job] + skills, normalize_embeddings=True)
scores = embs[0] @ embs[1:].T

for skill, score in zip(skills, scores):
    print(f"{score:.3f}  {skill}")
# 0.872  data science
# 0.731  machine learning
# 0.112  install electric switches
```

### Full Retrieval with FAISS (Recommended)

```python
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np

model = SentenceTransformer("talentguide/skillscout-large")

# Build index once over your skill corpus
skill_texts = [...]  # list of skill names
embs = model.encode(
    skill_texts,
    batch_size=128,
    normalize_embeddings=True,
    show_progress_bar=True,
).astype(np.float32)

index = faiss.IndexFlatIP(embs.shape[1])  # inner product on L2-normed = cosine
index.add(embs)

job_title = "Software Engineer"
q = model.encode([job_title], normalize_embeddings=True).astype(np.float32)
scores, idxs = index.search(q, k=50)

for rank, (idx, score) in enumerate(zip(idxs[0], scores[0]), 1):
    print(f"{rank:3d}. [{score:.4f}] {skill_texts[idx]}")
```

### Demo Output

```
Software Engineer
  1. [0.942] define software architecture
  2. [0.938] software frameworks
  3. [0.935] create software design

Data Scientist
  1. [0.951] data science
  2. [0.921] establish data processes
  3. [0.919] create data models

Electrician
  1. [0.944] install electric switches
  2. [0.938] install electricity sockets
  3. [0.930] use electrical wire tools
```

---

## Two-Stage Pipeline Integration

```
Job title
    |
    v
[SkillScout Large]   <- this model
    |  top-200 candidates via FAISS ANN
    v
[Cross-encoder re-ranker]
    |  fine-grained re-scoring
    v
Final ranked list (graded: core > contextual > irrelevant)
```

Blend formula (alpha=0.7 gives best validation results):

```python
final_score = alpha * biencoder_score + (1 - alpha) * crossencoder_score
```

---

## Training Details

### Data

Source: [ESCO occupational ontology](https://esco.ec.europa.eu/), TalentCLEF 2026 training split.

| | Count |
|---|---|
| Job-skill pairs (essential) | ~57,500 |
| Job-skill pairs (optional) | ~57,200 |
| Total InputExamples | **93,720** |
| Validation queries | 304 |
| Validation corpus | 9,052 skills |
| Validation qrels | 56,417 |

Each ESCO job has 5-15 title aliases; skills have multiple phrasings. Optional pairs are downsampled to 50% of essential count to maintain class balance.
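
The downsampling step described above could be sketched as follows. This is a minimal illustration, not the authors' actual preprocessing: the function name `build_training_pairs`, the `optional_ratio` and `seed` parameters, and the toy pairs (which do not reflect real ESCO essential/optional labels) are all assumptions introduced here.

```python
import random


def build_training_pairs(essential, optional, optional_ratio=0.5, seed=42):
    """Combine all essential pairs with a random subset of optional pairs.

    `essential` and `optional` are lists of (job_title, skill_name) tuples.
    The optional subset is capped at `optional_ratio` times the number of
    essential pairs (hypothetical helper, for illustration only).
    """
    rng = random.Random(seed)
    n_optional = min(len(optional), int(len(essential) * optional_ratio))
    sampled = rng.sample(optional, n_optional)
    pairs = list(essential) + sampled
    rng.shuffle(pairs)  # mix essential and optional pairs before batching
    return pairs


# Toy data, hypothetical labels
essential = [
    ("electrician", "install electric switches"),
    ("electrician", "use electrical wire tools"),
    ("data scientist", "data science"),
    ("data scientist", "create data models"),
]
optional = [
    ("electrician", "install electricity sockets"),
    ("data scientist", "establish data processes"),
    ("software engineer", "software frameworks"),
]

pairs = build_training_pairs(essential, optional)
print(len(pairs))  # 4 essential + 2 sampled optional = 6
```

Each resulting tuple could then be wrapped as `InputExample(texts=[job_title, skill_name])` for use with `MultipleNegativesRankingLoss`; fixing the seed keeps the sampled subset reproducible across runs.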
### Hyperparameters

```
Loss          : MultipleNegativesRankingLoss (scale=20, cos_sim)
Batch size    : 64  (63 in-batch negatives per anchor)
Epochs        : 3
Warmup        : 10% of steps (~440 steps)
Optimizer     : AdamW fused
Learning rate : 5e-5, linear decay
Precision     : fp16 AMP
Max seq len   : 64 tokens
Best model    : saved by cosine-nDCG@10 on validation
```

### Training Curve

| Epoch | Step | Train Loss | nDCG@10 val | MAP@100 val |
|---|---|---|---|---|
| 0.34 | 500 | 2.9232 | 0.3430 | - |
| 0.68 | 1000 | 2.1179 | 0.3424 | - |
| 1.00 | 1465 | - | 0.3676 | 0.1758 |
| 1.37 | 2000 | 1.7070 | 0.3692 | - |
| 1.71 | 2500 | 1.6366 | 0.3744 | - |
| 2.00 | 2930 | - | 0.3717 | 0.1780 |
| **2.39** | **3500** | **1.4540** | **0.3769** | **0.1808** |

### Validation Metrics (best checkpoint, step 3500)

| Metric | Value |
|---|---|
| **nDCG@10** | **0.4830** |
| nDCG@50 | 0.4240 |
| nDCG@100 | 0.3769 |
| **MAP@100** | **0.1825** |
| **MRR@10** | **0.6657** |
| Accuracy@1 | 0.5099 |
| Accuracy@3 | 0.7993 |
| Accuracy@5 | 0.8914 |
| Accuracy@10 | 0.9474 |

Evaluated with `InformationRetrievalEvaluator` (binary: any qrel > 0 = relevant).

### Pipeline Results (graded relevance, full 9052-skill ranking)

| Run | nDCG@10 graded | nDCG@10 binary | MAP |
|---|---|---|---|
| Zero-shot `jjzha/esco-xlm-roberta-large` | 0.2039 | 0.2853 | 0.2663 |
| **SkillScout Large (bi-encoder only)** | **0.3621** | **0.4830** | **0.4545** |
| SkillScout Large + cross-encoder (alpha=0.7) | **0.6896** | **0.7330** | 0.2481 |

---

## Competitive Context (TalentCLEF 2025 Task B)

| Team | MAP (test) | Approach |
|---|---|---|
| pjmathematician (winner 2025) | 0.36 | GTE 7B + contrastive + LLM-augmented data |
| NLPnorth (3rd of 14, 2025) | 0.29 | 3-class discriminative classification |
| **SkillScout Large (2026 val, Stage 1 only)** | **0.4545** | MNR fine-tuned bi-encoder |

---

## Limitations

- **English only** - trained on ESCO EN labels.
- **ESCO-domain optimised** - transfer to O*NET or custom taxonomies may require fine-tuning.
- **Max 64 tokens** - reduce long descriptions to a concise job title.
- **Graded distinction** - the bi-encoder alone does not reliably separate core vs contextual skills; a cross-encoder re-ranker is recommended for graded nDCG.

---

## Citation

```bibtex
@misc{talentguide-skillscout-2026,
  title  = {SkillScout Large: Dense Job-to-Skill Retrieval for TalentCLEF 2026},
  author = {TalentGuide},
  year   = {2026},
  url    = {https://huggingface.co/talentguide/skillscout-large}
}

@misc{talentclef2026taskb,
  title  = {TalentCLEF 2026 Task B: Job-Skill Matching},
  author = {TalentCLEF Organizers},
  year   = {2026},
  url    = {https://talentclef.github.io/}
}
```

---

## Framework Versions

- Python 3.12.10 | Sentence Transformers 5.3.0 | Transformers 5.5.0
- PyTorch 2.11.0+cu128 | Accelerate 1.13.0 | Tokenizers 0.22.2