How to use pj-mathematician/JobSkillGTE-7b-lora with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("pj-mathematician/JobSkillGTE-7b-lora")
sentences = [
"Bus drivers, including those operating in various sectors like public transit, intercity, private, or school services, need strong driving skills, knowledge of traffic laws, and the ability to operate safely in diverse conditions. Additionally, effective communication skills and the ability to handle passenger inquiries and emergencies are crucial.\n['bus driver', 'intercity bus driver', 'private bus operator', 'transit bus driver', 'public service vehicle operator', 'passenger driver', 'international bus driver', 'public bus operator', 'touristic bus driver', 'coach driver', 'private coach driver', 'public bus driver', 'bus operator', 'driver of bus', 'bus driving operator', 'schoolbus driver']",
"The skill of determining shreds sizes percentage in cigarettes is primarily required by tobacco processing technicians and quality control specialists in the cigarette manufacturing industry, who ensure that the tobacco shreds meet specific size and quality standards for consistent product performance.\n['determine shreds sizes percentage in cigarettes', 'determine shreds sizes percentage in cigarettes', 'determine the shreds sizes percentage of cigarettes', 'determine shreds size percentages in cigarettes', 'agree shreds sizes percentage in cigarettes', 'determine the shreds sizes percentage in cigarettes', 'confirm shreds sizes percentage in cigarettes', 'sort shreds sizes percentage in cigarettes']",
"Job roles such as curriculum developers, educational consultants, and instructional designers require skills like analyzing, evaluating, and scrutinizing curriculums to improve educational outcomes. For legislative programmes, roles including policy analysts, legislative aides, and compliance officers use skills to test, evaluate, and scrutinize legislative processes to ensure effective and efficient policy implementation.\n['analyse curriculum', 'test legislative programmes', 'evaluate legislative programmes', 'evaluate curriculum', 'test curriculum', 'investigate curriculum', 'scrutinise curriculum', 'analyze curriculum', 'scrutinise legislative processes', 'investigate legislative programmes']",
"Job roles such as customer service representatives, flight attendants, and hotel concierges require a strong focus on passengers or customers, ensuring their needs and comfort are prioritized to provide excellent service and support.\n['focus on passengers', 'prioritise passengers', 'ensure passenger prioritisation', 'make passengers a priority', 'maintain a focus on passengers', 'ensure passengers are the priority focus', 'ensure passengers are prioritised', 'attend to passengers', 'ensure a focus on passengers']"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
Top-performing model on TalentCLEF 2025 Task B. Use it for job title <-> skill set matching.
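The example inputs above appear to share one pattern: a short description, a newline, and then a Python-style list of alternative titles or skill phrasings. The sketch below reproduces that pattern. The helper name `build_input` is hypothetical, and the exact formatting used during training is an assumption inferred from the samples, not something the model card states.

```python
def build_input(description: str, aliases: list[str]) -> str:
    """Join a description and its alias list the way the sample inputs look.

    Assumption: the samples are description + newline + the Python repr of
    the alias list. This helper is hypothetical; it is not part of
    sentence-transformers or of the model repository.
    """
    return f"{description}\n{aliases!r}"


# Toy values for illustration only.
text = build_input(
    "Coach drivers need strong driving skills and knowledge of traffic laws.",
    ["coach driver", "bus driver"],
)
print(text)
```

The resulting string can be passed to `model.encode` in the same way as the hand-written sentences.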
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: Qwen2Model
(1): Pooling({'word_embedding_dimension': 3584, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
(2): Normalize()
)
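The architecture printout above shows last-token pooling (`pooling_mode_lasttoken: True`) followed by `Normalize()`. As a rough illustration of what those two stages do, here is a minimal pure-Python sketch on toy data. The function names and the 3-dimensional vectors are assumptions for illustration; the real model produces 3584-dimensional vectors and runs these steps on tensors inside the library.

```python
import math


def last_token_pool(token_vectors, attention_mask):
    """Return the vector of the last non-padding token.

    Scanning for the last position whose mask is 1 works for both
    left-padded and right-padded sequences.
    """
    last_index = max(i for i, keep in enumerate(attention_mask) if keep == 1)
    return token_vectors[last_index]


def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


# Toy sequence: 4 token positions, 3 dimensions, last position is padding.
token_vectors = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [3.0, 4.0, 0.0],
    [9.0, 9.0, 9.0],  # padding position, ignored by the pooling step
]
attention_mask = [1, 1, 1, 0]

sentence_embedding = l2_normalize(last_token_pool(token_vectors, attention_mask))
print(sentence_embedding)
```

Because the output is unit length, the dot product of two such embeddings equals their cosine similarity.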
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("pj-mathematician/JobSkillGTE-7b-lora")
# Run inference
sentences = [
"An insulation supervisor, regardless of the specific type of insulation material or installation area, requires strong project management skills, knowledge of building codes and safety regulations, and expertise in insulation techniques to oversee the installation process effectively and ensure quality standards are met.\n['insulation supervisor', 'supervisor of installation of insulating materials', 'supervisor of insulation materials installation', 'supervisor of installation of insulation', 'solid wall insulation installation supervisor', 'insulation installers supervisor', 'cavity wall insulation installation supervisor', 'loft insulation installation supervisor']",
"The skill of installing insulation material is primarily required by job roles such as insulation workers, HVAC technicians, and construction specialists, who are responsible for improving energy efficiency and thermal comfort in buildings by correctly fitting and fixing insulation materials in various structures.\n['install insulation material', 'insulate structure', 'fix insulation', 'insulation material installation', 'installation of insulation material', 'fitting insulation', 'insulating structure', 'installing insulation material', 'fixing insulation', 'fit insulation']",
"Job roles such as Food Safety Inspector, Public Health Officer, and Environmental Health Specialist require the skill of taking action on food safety violations to ensure compliance with health regulations and maintain public safety standards.\n['take action on food safety violations', 'invoke action on food safety violations', 'agree action on food safety violations', 'pursue action on food safety violations', 'determine action on food safety violations']",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 3584]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
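For job title <-> skill matching, the similarity matrix is typically read row by row: take one job embedding and rank the skill embeddings by score. The sketch below shows that ranking step with cosine similarity on toy 3-dimensional vectors. The vectors and the helper names `cosine` and `rank_candidates` are assumptions for illustration; in practice the vectors would come from `model.encode`, and `model.similarity` is assumed to use cosine similarity unless the model configuration says otherwise.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, candidate_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(idx, cosine(query_vec, vec)) for idx, vec in enumerate(candidate_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for one job embedding and three skill embeddings.
job_vec = [0.6, 0.8, 0.0]
skill_vecs = [
    [0.0, 0.0, 1.0],
    [0.8, 0.6, 0.0],
    [0.6, 0.8, 0.0],
]

ranking = rank_candidates(job_vec, skill_vecs)
print([idx for idx, _ in ranking])
```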
Columns: anchor and positive
| | anchor | positive |
|---|---|---|
| type | string | string |
| details | | |
| anchor | positive |
|---|---|
| A technical director or any of its synonyms requires a strong blend of technical expertise and leadership skills, including the ability to oversee technical operations, manage teams, and ensure the successful execution of technical projects while maintaining operational efficiency and innovation. | Job roles that require promoting health and safety include occupational health and safety specialists, safety managers, and public health educators, all of whom work to ensure safe and healthy environments in workplaces and communities. |
| A technical director or any of its synonyms requires a strong blend of technical expertise and leadership skills, including the ability to oversee technical operations, manage teams, and ensure the successful execution of technical projects while maintaining operational efficiency and innovation. | Job roles that require organizing rehearsals include directors, choreographers, and conductors in theater, dance, and music ensembles, who must efficiently plan and schedule practice sessions to prepare performers for a successful final performance. |
| A technical director or any of its synonyms requires a strong blend of technical expertise and leadership skills, including the ability to oversee technical operations, manage teams, and ensure the successful execution of technical projects while maintaining operational efficiency and innovation. | Job roles such as Health and Safety Managers, Environmental Health Officers, and Risk Management Specialists often require the skill of negotiating health and safety issues with third parties to ensure compliance and protection standards are met across different organizations and sites. |
CachedGISTEmbedLoss with these parameters:
{'guide': SentenceTransformer(
(0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
), 'temperature': 0.01, 'mini_batch_size': 48, 'margin_strategy': 'absolute', 'margin': 0.0}
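The parameters above pair a small guide model with `temperature: 0.01`, `margin_strategy: 'absolute'`, and `margin: 0.0`. The idea behind guided in-batch negatives can be sketched as follows: an in-batch candidate is dropped as a likely false negative when the guide scores it above the true pair's guide score minus the margin, and the remaining scores are divided by the temperature before a softmax cross-entropy. This is a simplified pure-Python sketch under those assumptions, not the library's implementation: it only covers anchor-positive scores and omits the gradient caching that `mini_batch_size` controls. All function names and toy matrices are hypothetical.

```python
import math


def guided_mask(guide_sim, margin=0.0):
    """mask[i][j] is True when candidate j is dropped for anchor i.

    Absolute margin strategy (assumed): drop j if the guide scores it
    above (guide score of the true pair - margin). The true pair (j == i)
    is never dropped.
    """
    n = len(guide_sim)
    mask = []
    for i in range(n):
        threshold = guide_sim[i][i] - margin
        mask.append([j != i and guide_sim[i][j] > threshold for j in range(n)])
    return mask


def guided_contrastive_loss(student_sim, guide_sim, temperature=0.01, margin=0.0):
    """Mean cross-entropy over anchors, skipping guide-masked candidates."""
    mask = guided_mask(guide_sim, margin)
    losses = []
    for i, row in enumerate(student_sim):
        logits = [s / temperature for j, s in enumerate(row) if not mask[i][j]]
        target = row[i] / temperature
        peak = max(logits)  # subtract the peak for a stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_sum_exp - target)
    return sum(losses) / len(losses)


# Toy 2x2 similarity matrices: rows are anchors, columns are positives.
student_sim = [[0.50, 0.52], [0.10, 0.60]]
guide_flagging = [[0.70, 0.90], [0.20, 0.80]]  # guide flags (0, 1) as a false negative
guide_neutral = [[0.70, 0.10], [0.20, 0.80]]   # guide flags nothing

loss_masked = guided_contrastive_loss(student_sim, guide_flagging)
loss_neutral = guided_contrastive_loss(student_sim, guide_neutral)
print(loss_masked, loss_neutral)
```

With the low temperature, a single unmasked hard candidate dominates the loss, which is why removing suspected false negatives changes the result so sharply in the toy case.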
Non-default hyperparameters:
per_device_train_batch_size: 128
per_device_eval_batch_size: 128
gradient_accumulation_steps: 2
num_train_epochs: 2
warmup_ratio: 0.05
log_on_each_node: False
fp16: True
dataloader_num_workers: 4
fsdp: ['full_shard', 'auto_wrap']
fsdp_config: {'transformer_layer_cls_to_wrap': ['Qwen2DecoderLayer'], 'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
ddp_find_unused_parameters: True
gradient_checkpointing: True
batch_sampler: no_duplicates

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: no
prediction_loss_only: True
per_device_train_batch_size: 128
per_device_eval_batch_size: 128
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 2
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 2
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.05
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: False
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: True
dataloader_num_workers: 4
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: ['full_shard', 'auto_wrap']
fsdp_min_num_params: 0
fsdp_config: {'transformer_layer_cls_to_wrap': ['Qwen2DecoderLayer'], 'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
tp_size: 0
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: True
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: True
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss |
|---|---|---|
| 0.0156 | 1 | 21.5186 |
| 0.0312 | 2 | 21.4075 |
| 0.0469 | 3 | 21.0309 |
| 0.0625 | 4 | 20.7294 |
| 0.0781 | 5 | 20.9851 |
| 0.0938 | 6 | 21.3215 |
| 0.1094 | 7 | 19.8458 |
| 0.125 | 8 | 18.52 |
| 0.1406 | 9 | 17.622 |
| 0.1562 | 10 | 17.5794 |
| 0.1719 | 11 | 15.8784 |
| 0.1875 | 12 | 14.5842 |
| 0.2031 | 13 | 13.3324 |
| 0.2188 | 14 | 12.3194 |
| 0.2344 | 15 | 11.2523 |
| 0.25 | 16 | 10.7172 |
| 0.2656 | 17 | 10.0063 |
| 0.2812 | 18 | 9.5643 |
| 0.2969 | 19 | 9.2463 |
| 0.3125 | 20 | 8.6533 |
| 0.3281 | 21 | 8.0588 |
| 0.3438 | 22 | 8.1866 |
| 0.3594 | 23 | 7.6767 |
| 0.375 | 24 | 6.9832 |
| 0.3906 | 25 | 6.7932 |
| 0.4062 | 26 | 6.292 |
| 0.4219 | 27 | 6.1263 |
| 0.4375 | 28 | 5.8976 |
| 0.4531 | 29 | 5.7214 |
| 0.4688 | 30 | 5.6451 |
| 0.4844 | 31 | 5.6232 |
| 0.5 | 32 | 5.2984 |
| 0.5156 | 33 | 5.0322 |
| 0.5312 | 34 | 4.9435 |
| 0.5469 | 35 | 4.737 |
| 0.5625 | 36 | 4.4266 |
| 0.5781 | 37 | 4.5082 |
| 0.5938 | 38 | 4.315 |
| 0.6094 | 39 | 4.269 |
| 0.625 | 40 | 4.2473 |
| 0.6406 | 41 | 4.2054 |
| 0.6562 | 42 | 4.2172 |
| 0.6719 | 43 | 3.8311 |
| 0.6875 | 44 | 4.0803 |
| 0.7031 | 45 | 4.2809 |
| 0.7188 | 46 | 4.1843 |
| 0.7344 | 47 | 3.9913 |
| 0.75 | 48 | 3.9465 |
| 0.7656 | 49 | 4.0828 |
| 0.7812 | 50 | 4.0018 |
| 0.7969 | 51 | 3.8023 |
| 0.8125 | 52 | 3.897 |
| 0.8281 | 53 | 3.8941 |
| 0.8438 | 54 | 3.7708 |
| 0.8594 | 55 | 3.8051 |
| 0.875 | 56 | 3.7117 |
| 0.8906 | 57 | 3.8584 |
| 0.9062 | 58 | 3.6421 |
| 0.9219 | 59 | 3.7097 |
| 0.9375 | 60 | 3.6906 |
| 0.9531 | 61 | 3.7011 |
| 0.9688 | 62 | 3.744 |
| 0.9844 | 63 | 3.6493 |
| 1.0 | 64 | 3.5659 |
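The log above reaches epoch 1.0 at step 64, and the hyperparameters list `num_train_epochs: 2`, `warmup_ratio: 0.05`, `lr_scheduler_type: linear`, and `learning_rate: 5e-05`. The arithmetic below sketches what that implies for the schedule. It rests on several assumptions: that the run continued for a second epoch of the same length (the log shown stops at epoch 1.0), that the step column counts optimizer steps, and that the warmup step count is the ratio times the total steps, rounded up. The function `linear_lr` is a hypothetical re-derivation of a linear warmup and decay, not a call into the training library.

```python
import math

STEPS_PER_EPOCH = 64   # from the log: step 64 corresponds to epoch 1.0
NUM_EPOCHS = 2         # num_train_epochs
WARMUP_RATIO = 0.05    # warmup_ratio
PEAK_LR = 5e-05        # learning_rate

# Assumed total length of the run and of the warmup phase.
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step):
    """Linear warmup from 0 to PEAK_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = total_steps - step
    return PEAK_LR * max(0.0, remaining / max(1, total_steps - warmup_steps))


print(total_steps, warmup_steps)
print(linear_lr(warmup_steps), linear_lr(total_steps))

# The epoch column is consistent with step / STEPS_PER_EPOCH to 4 decimals.
print(round(1 / STEPS_PER_EPOCH, 4), round(3 / STEPS_PER_EPOCH, 4))
```

Under these assumptions, the learning rate would peak within the first handful of steps, which is consistent with the loss starting to fall sharply after roughly step 7 in the table.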
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Base model
Alibaba-NLP/gte-Qwen2-7B-instruct