---
language:
- en
license: apache-2.0
library_name: sentence-transformers
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense-retrieval
- information-retrieval
- job-skill-matching
- esco
- talentclef
- xlm-roberta
base_model: jjzha/esco-xlm-roberta-large
pipeline_tag: sentence-similarity
model-index:
- name: skillscout-large
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: TalentCLEF 2026 Task B Validation
      type: talentclef-2026-taskb-validation
    metrics:
    - type: cosine_ndcg_at_10
      value: 0.4830
      name: nDCG@10
    - type: cosine_map_at_100
      value: 0.1825
      name: MAP@100
    - type: cosine_mrr_at_10
      value: 0.6657
      name: MRR@10
    - type: cosine_accuracy_at_10
      value: 0.9474
      name: Accuracy@10
---

# SkillScout Large - Job-to-Skill Dense Retriever

**SkillScout Large** is a dense bi-encoder for retrieving relevant skills from a job title.
Given a job title (e.g., *"Data Scientist"*), it produces a 1024-dimensional embedding and
retrieves the most semantically relevant skills from the [ESCO](https://esco.ec.europa.eu/)
skill gazetteer (9,052 skills) via cosine similarity.

This is **Stage 1** of the TalentGuide two-stage job-skill matching pipeline, trained for
[TalentCLEF 2026 Task B](https://talentclef.github.io/).

> **Best pipeline result (TalentCLEF 2026 validation set):**
> nDCG@10 graded = **0.6896** | nDCG@10 binary = **0.7330**
> when combined with a fine-tuned cross-encoder at blend alpha=0.7.
> Bi-encoder alone: nDCG@10 graded = **0.3621** | MAP = **0.4545**

---

## Model Summary

| Property | Value |
|---|---|
| Base model | [jjzha/esco-xlm-roberta-large](https://huggingface.co/jjzha/esco-xlm-roberta-large) |
| Architecture | XLM-RoBERTa-large + mean pooling |
| Embedding dimension | 1024 |
| Max sequence length | 64 tokens |
| Training loss | Multiple Negatives Ranking (MNR) |
| Training pairs | 93,720 (ESCO job-skill pairs, essential + optional) |
| Epochs | 3 |
| Best checkpoint | Step 3500 (by validation nDCG@10) |
| Hardware | NVIDIA RTX 3070 8GB, fp16 AMP |

---

## What is TalentCLEF Task B?

**TalentCLEF 2026 Task B** is a graded information-retrieval shared task:

- **Query**: a job title (e.g., *"Electrician"*)
- **Corpus**: 9,052 ESCO skills (e.g., *"install electric switches"*)
- **Relevance levels**: `2` = Core, `1` = Contextual, `0` = Non-relevant
- **Primary metric**: nDCG with graded relevance (core=2, contextual=1)

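The graded metric rewards rankings that place core skills above contextual ones. A minimal sketch of graded nDCG@k (standard log2-discounted formulation; the official task scorer may differ in detail):

```python
import math

def dcg_at_k(gains, k):
    """Discounted cumulative gain over the top-k gains of a ranking."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(ranked_gains, k=10):
    """nDCG@k with graded gains (core=2, contextual=1, non-relevant=0)."""
    ideal_dcg = dcg_at_k(sorted(ranked_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Toy ranking that puts a contextual skill above a core one:
print(round(ndcg_at_k([1, 2, 0, 1]), 4))  # ≈ 0.86
```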
---

## Usage

### Installation

```bash
pip install sentence-transformers faiss-cpu
```

### Encode and Compare

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("talentguide/skillscout-large")

job = "Data Scientist"
skills = ["data science", "machine learning", "install electric switches"]

embs = model.encode([job] + skills, normalize_embeddings=True)
scores = embs[0] @ embs[1:].T

for skill, score in zip(skills, scores):
    print(f"{score:.3f}  {skill}")
# 0.872  data science
# 0.731  machine learning
# 0.112  install electric switches
```

### Full Retrieval with FAISS (Recommended)

```python
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np

model = SentenceTransformer("talentguide/skillscout-large")

# Build the index once over your skill corpus
skill_texts = [...]  # list of skill names

embs = model.encode(skill_texts, batch_size=128,
                    normalize_embeddings=True,
                    show_progress_bar=True).astype(np.float32)

index = faiss.IndexFlatIP(embs.shape[1])  # inner product on L2-normed vectors = cosine
index.add(embs)

job_title = "Software Engineer"
q = model.encode([job_title], normalize_embeddings=True).astype(np.float32)
scores, idxs = index.search(q, k=50)

for rank, (idx, score) in enumerate(zip(idxs[0], scores[0]), 1):
    print(f"{rank:3d}. [{score:.4f}] {skill_texts[idx]}")
```

### Demo Output

```
Software Engineer
  1. [0.942] define software architecture
  2. [0.938] software frameworks
  3. [0.935] create software design

Data Scientist
  1. [0.951] data science
  2. [0.921] establish data processes
  3. [0.919] create data models

Electrician
  1. [0.944] install electric switches
  2. [0.938] install electricity sockets
  3. [0.930] use electrical wire tools
```

---

## Two-Stage Pipeline Integration

```
Job title
    |
    v
[SkillScout Large]            <- this model
    |  top-200 candidates via FAISS ANN
    v
[Cross-encoder re-ranker]
    |  fine-grained re-scoring
    v
Final ranked list (graded: core > contextual > irrelevant)
```

Blend formula (alpha=0.7 gives the best validation results):

```python
final_score = alpha * biencoder_score + (1 - alpha) * crossencoder_score
```
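To make the blend concrete, here is a small sketch with hypothetical scores for three candidate skills (the names and values are illustrative, not actual model outputs):

```python
alpha = 0.7  # blend weight that gave the best validation results

# Hypothetical stage scores for a "Data Scientist" query
biencoder_score = {"data science": 0.95, "create data models": 0.91, "install electric switches": 0.12}
crossencoder_score = {"data science": 0.99, "create data models": 0.80, "install electric switches": 0.02}

final_score = {
    skill: alpha * biencoder_score[skill] + (1 - alpha) * crossencoder_score[skill]
    for skill in biencoder_score
}

ranking = sorted(final_score, key=final_score.get, reverse=True)
print(ranking)  # ['data science', 'create data models', 'install electric switches']
```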
---

## Training Details

### Data

Source: [ESCO occupational ontology](https://esco.ec.europa.eu/), TalentCLEF 2026 training split.

| Statistic | Count |
|---|---|
| Job-skill pairs (essential) | ~57,500 |
| Job-skill pairs (optional) | ~57,200 |
| Total InputExamples | **93,720** |
| Validation queries | 304 |
| Validation corpus | 9,052 skills |
| Validation qrels | 56,417 |

Each ESCO job has 5-15 title aliases; skills have multiple phrasings.
Optional pairs are downsampled to 50% of the essential count to maintain class balance.
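The downsampling step can be sketched as follows; the pair lists below are tiny stand-ins for the real ESCO dumps:

```python
import random

random.seed(0)  # deterministic subsample

# Stand-ins for the ~57,500 essential and ~57,200 optional ESCO pairs
essential = [("data scientist", "data science")] * 6
optional = [("data scientist", "public speaking")] * 10

# Keep optional pairs at 50% of the essential count
n_keep = len(essential) // 2
optional_kept = random.sample(optional, min(n_keep, len(optional)))

train_pairs = essential + optional_kept
print(len(train_pairs))  # 9
```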
### Hyperparameters

```
Loss          : MultipleNegativesRankingLoss (scale=20, cos_sim)
Batch size    : 64 (63 in-batch negatives per anchor)
Epochs        : 3
Warmup        : 10% of steps (~440 steps)
Optimizer     : AdamW (fused)
Learning rate : 5e-5, linear decay
Precision     : fp16 AMP
Max seq len   : 64 tokens
Best model    : saved by cosine-nDCG@10 on validation
```
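For intuition, MNR is an in-batch contrastive loss: each anchor must rank its own positive above every other positive in the batch, with cosine similarities scaled by 20 before the softmax. A dependency-free sketch (plain Python, not the sentence-transformers implementation):

```python
import math

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v)))

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy: anchor i must pick positives[i] out of all
    positives in the batch (the other batch positives act as negatives)."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [scale * cosine(a, p) for p in positives]
        log_denom = math.log(sum(math.exp(z) for z in logits))
        total += log_denom - logits[i]
    return total / len(anchors)

# Toy batch of 2 well-separated pairs -> loss is near zero
print(round(mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]), 4))
```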
### Training Curve

| Epoch | Step | Train Loss | nDCG@10 val | MAP@100 val |
|---|---|---|---|---|
| 0.34 | 500 | 2.9232 | 0.3430 | - |
| 0.68 | 1000 | 2.1179 | 0.3424 | - |
| 1.00 | 1465 | - | 0.3676 | 0.1758 |
| 1.37 | 2000 | 1.7070 | 0.3692 | - |
| 1.71 | 2500 | 1.6366 | 0.3744 | - |
| 2.00 | 2930 | - | 0.3717 | 0.1780 |
| **2.39** | **3500** | **1.4540** | **0.3769** | **0.1808** |

### Validation Metrics (best checkpoint, step 3500)

| Metric | Value |
|---|---|
| **nDCG@10** | **0.4830** |
| nDCG@50 | 0.4240 |
| nDCG@100 | 0.3769 |
| **MAP@100** | **0.1825** |
| **MRR@10** | **0.6657** |
| Accuracy@1 | 0.5099 |
| Accuracy@3 | 0.7993 |
| Accuracy@5 | 0.8914 |
| Accuracy@10 | 0.9474 |

Evaluated with `InformationRetrievalEvaluator` (binary relevance: any qrel grade > 0 counts as relevant).
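The binary conversion this evaluator applies can be sketched as follows (toy data; the real evaluator handles this internally):

```python
def binarize_qrels(graded):
    """Any grade > 0 (core or contextual) counts as relevant."""
    return {q: {doc for doc, g in docs.items() if g > 0} for q, docs in graded.items()}

def accuracy_at_k(ranked, relevant, k=10):
    """Fraction of queries with at least one relevant doc in the top k."""
    hits = sum(1 for q in ranked if set(ranked[q][:k]) & relevant.get(q, set()))
    return hits / len(ranked)

graded = {"electrician": {"install electric switches": 2, "public speaking": 0}}
relevant = binarize_qrels(graded)
ranked = {"electrician": ["public speaking", "install electric switches"]}
print(accuracy_at_k(ranked, relevant, k=2))  # 1.0
```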
### Pipeline Results (graded relevance, full 9,052-skill ranking)

| Run | nDCG@10 graded | nDCG@10 binary | MAP |
|---|---|---|---|
| Zero-shot `jjzha/esco-xlm-roberta-large` | 0.2039 | 0.2853 | 0.2663 |
| **SkillScout Large (bi-encoder only)** | **0.3621** | **0.4830** | **0.4545** |
| SkillScout Large + cross-encoder (alpha=0.7) | **0.6896** | **0.7330** | 0.2481 |

---

## Competitive Context (TalentCLEF 2025 Task B)

| Team | MAP (test) | Approach |
|---|---|---|
| pjmathematician (winner 2025) | 0.36 | GTE 7B + contrastive + LLM-augmented data |
| NLPnorth (3rd of 14, 2025) | 0.29 | 3-class discriminative classification |
| **SkillScout Large (2026 val, Stage 1 only)** | **0.4545** | MNR fine-tuned bi-encoder |

Note: the 2025 figures are test-set scores while the SkillScout figure is a 2026 validation score, so the numbers are indicative rather than directly comparable.

---
## Limitations

- **English only** - trained on ESCO EN labels.
- **ESCO-domain optimised** - transfer to O*NET or custom taxonomies may require fine-tuning.
- **Max 64 tokens** - long job descriptions should be reduced to a concise job title before encoding.
- **Graded distinction** - the bi-encoder alone does not reliably separate core from contextual skills; a cross-encoder re-ranker is recommended for graded nDCG.

---

## Citation

```bibtex
@misc{talentguide-skillscout-2026,
  title  = {SkillScout Large: Dense Job-to-Skill Retrieval for TalentCLEF 2026},
  author = {TalentGuide},
  year   = {2026},
  url    = {https://huggingface.co/talentguide/skillscout-large}
}

@misc{talentclef2026taskb,
  title  = {TalentCLEF 2026 Task B: Job-Skill Matching},
  author = {TalentCLEF Organizers},
  year   = {2026},
  url    = {https://talentclef.github.io/}
}
```

---

## Framework Versions

- Python 3.12.10 | Sentence Transformers 5.3.0 | Transformers 5.5.0
- PyTorch 2.11.0+cu128 | Accelerate 1.13.0 | Tokenizers 0.22.2