SentenceTransformer based on sentence-transformers/paraphrase-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/paraphrase-MiniLM-L6-v2 on the ssf-train-valid-splits dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

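Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/paraphrase-MiniLM-L6-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Training Dataset: ssf-train-valid-splits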

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
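
The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence embedding is the average of the token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name mean_pool and the toy 4-dimensional vectors are illustrative assumptions (the real model works with 384-dimensional vectors).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1 (padding is skipped)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("at least one unmasked token is required")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input (hypothetical values): 3 tokens, 4 dimensions each.
tokens = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]  # the third token is padding and must not affect the result
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [2.0, 2.0, 2.0, 2.0]
```

Because padding tokens are masked out, the embedding should not depend on how much padding a batch adds to a short sentence.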

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Fatini/ssf-retriever-modernbert-embed-base-paraphrase-MiniLM-L6-v2")
# Run inference
sentences = [
    'The Traffic Coordinator/Dispatch Coordinator is responsible for supporting the execution of general transportation operations and activities including transport fleet management documentation, receiving and communicating schedules to transport operators and cargo loaders, and gathering general information from customers to support transport order fulfilments. Systematic and logical, he/she is required to record documentation and ensure schedules are communicated and received. He is also expected to work in rotating shifts with high accuracy and precision, and to work with internal and external stakeholders to accomplish his work.',
    'The Traffic Coordinator/Dispatch Coordinator plays a crucial role in facilitating transportation operations by managing fleet documentation, effectively communicating schedules to transport operators and cargo loaders, and collecting essential information from customers to ensure smooth order fulfillment. This position demands a systematic and logical approach to record-keeping and schedule management. The individual must demonstrate high accuracy and precision while working in rotating shifts and collaborating with both internal and external stakeholders to achieve operational goals.',
    'The Traffic Coordinator/Dispatch Coordinator is tasked with overseeing the execution of general transportation operations and activities, including managing transport fleet communication, receiving and documenting schedules from cargo loaders and transport operators, and gathering data from customers to aid in transport order processing. Methodical and organized, he/she is expected to maintain documentation and ensure that all schedules are accurately communicated and received. He is also required to work in fixed shifts with minimal attention to detail, and to engage with external and internal customers to fulfill his duties.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8714, 0.8014],
#         [0.8714, 1.0000, 0.7370],
#         [0.8014, 0.7370, 1.0000]])
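
In the matrix above, the anchor scores higher against its paraphrase (0.8714) than against the altered description (0.8014). Assuming the model uses the library's default cosine similarity, the sketch below shows how such scores can be computed and used to rank candidates. The helper names cosine_similarity and rank_by_similarity and the 2-dimensional toy vectors are illustrative assumptions, not part of the model or the library.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: Sequence[float], candidates: Sequence[Sequence[float]]) -> List[int]:
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy vectors (hypothetical); real embeddings from this model have 384 dimensions.
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
order = rank_by_similarity(query, candidates)
print(order)  # [1, 2, 0]
```

Cosine similarity ignores vector length, which is why the candidate [2.0, 0.0] ranks first even though it is not identical to the query.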

Training Details

Training Dataset

ssf-train-valid-splits

  • Dataset: ssf-train-valid-splits at 5d1e49b
  • Size: 6,032 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
              anchor          positive        negative
    type      string          string          string
    min       54 tokens       55 tokens       53 tokens
    mean      123.25 tokens   122.54 tokens   120.0 tokens
    max       128 tokens      128 tokens      128 tokens
  • Samples:
    Sample 1
      anchor: The Assistant Civil and Structural Engineer/Technical Executive (Civil and Structural Engineering) supports planning and development of projects and assists in the development of engineering designs based on project requirements, from conceptual to schematic and detailed designs. He/She assists in the designing and coordination of design models. He also executes risk assessments to identify risks associated with the projects. He is meticulous and highly detail-oriented. He possesses good knowledge in civil and structural practices, is analytical and has good problem-solving skills. He is required to work both in office and at project sites.
      positive: The Assistant Civil and Structural Engineer/Technical Executive (Civil and Structural Engineering) plays a crucial role in the planning and development of engineering projects, contributing to the creation of designs that meet project specifications, from initial concepts to detailed schematics. He/She is involved in the design and coordination of various engineering models. Additionally, he/she conducts risk assessments to pinpoint potential project risks. He is known for his meticulous attention to detail and possesses strong analytical and problem-solving abilities. This role requires a balance of office work and field activities at project sites.
      negative: The Assistant Civil and Structural Engineer/Technical Executive (Civil and Structural Engineering) focuses on the maintenance and evaluation of existing structures and aids in the analysis of engineering specifications based on operational needs, from preliminary assessments to final evaluations. He/She is responsible for the review and adjustment of engineering plans. He also performs quality checks to determine compliance with regulatory standards. He is recognized for his thoroughness and has extensive knowledge in structural assessments, is critical and has strong evaluation skills. He is expected to operate solely in office environments without site visits.
    Sample 2
      anchor: The Market and Liquidity Risk Manager is responsible for the implementation of market and liquidity risk management frameworks. He/She conducts analyses and assessments of various market and liquidity scenarios and how it impacts the organisation's risk appetite and funding capacity. He oversees the monitoring of risk controls and thresholds. The Market and Liquidity Risk Manager's duties may require him to be contactable after office hours. He has excellent analytical, strategic planning, problem resolution and communication skills. He is comfortable working in deadline-driven environments, and can manage multiple responsibilities while effectively focusing on priority issues.
      positive: The Market Risk and Liquidity Management Lead is tasked with the development and execution of comprehensive market and liquidity risk management strategies. This role involves conducting thorough analyses and evaluations of diverse market and liquidity conditions and their implications for the organization's risk tolerance and financial stability. The Lead supervises the assessment of risk controls and limits, ensuring compliance and effectiveness. Availability after standard working hours may be necessary for this position. The ideal candidate possesses strong analytical abilities, strategic planning expertise, adept problem-solving skills, and exceptional communication capabilities. They thrive in fast-paced environments and can juggle various responsibilities while maintaining a focus on key priorities.
      negative: The Office Maintenance Coordinator is responsible for managing the upkeep and cleanliness of office spaces. This role includes scheduling routine cleaning services, ensuring that supplies are stocked, and addressing maintenance requests from staff. The Coordinator typically works during regular business hours and does not require after-hours availability. Candidates should have basic organizational skills, attention to detail, and the ability to follow simple instructions. They may work independently but primarily focus on maintaining a tidy environment rather than dealing with complex problem-solving or strategic planning.
    Sample 3
      anchor: The Client Implementation Analyst is responsible for handling clients' queries and processing issues. He/She is responsible for coordinating communications with clients in order to understand their needs, expectations and potential conflicts. He provides support in compiling documentation and completing administrative tasks for the implementation process as well as in facilitating interactions with internal stakeholders. The Client Implementation Analyst excels at communicating effectively and builds strong relationships with customers and internal stakeholders. He prioritises clients' needs and is committed to supporting the delivery of timely client solutions.
      positive: The Client Solutions Manager plays a pivotal role in addressing client inquiries and resolving issues efficiently. He/She is tasked with coordinating client communications to gain insights into their requirements, expectations, and any potential challenges. This role involves supporting the preparation of documentation and managing administrative responsibilities throughout the client implementation process, as well as enhancing collaboration with internal teams. The Client Solutions Manager is skilled in effective communication and fosters strong partnerships with both clients and internal stakeholders. He prioritizes client satisfaction and is dedicated to ensuring the timely delivery of solutions to meet client needs.
      negative: The Data Entry Specialist is primarily focused on inputting large volumes of data into databases and ensuring accuracy in records. He/She is responsible for maintaining data integrity and performing routine audits to verify information. This role does not involve direct client interaction or the management of client expectations. Instead, the Data Entry Specialist works independently, following established protocols to complete data entry tasks efficiently. Attention to detail is crucial, but the position is limited to administrative functions and does not require strategic decision-making or collaboration with external stakeholders.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
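
To make the parameters above concrete, here is a simplified sketch of a multiple-negatives ranking objective with "scale": 20.0 and cosine similarity. It is an illustration under stated assumptions, not the library's code: each anchor is scored against every positive and every negative in the batch, and a cross-entropy term rewards the anchor's own positive. The names cos_sim and mnr_loss and the 2-dimensional toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors: List[List[float]], positives: List[List[float]],
             negatives: List[List[float]], scale: float = 20.0) -> float:
    """Mean cross-entropy where anchor i should pick positive i among all candidates."""
    candidates = positives + negatives  # in-batch positives plus the explicit negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy batch (hypothetical values): two anchors, their positives, and one negative each.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]
good_loss = mnr_loss(anchors, positives, negatives)
bad_loss = mnr_loss(anchors, positives[::-1], negatives)  # positives paired with the wrong anchors
print(good_loss < bad_loss)  # True
```

The scale factor sharpens the softmax: with cosine scores confined to [-1, 1], multiplying by 20 lets a well-separated positive dominate the other candidates.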
    

Evaluation Dataset

ssf-train-valid-splits

  • Dataset: ssf-train-valid-splits at 5d1e49b
  • Size: 1,508 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
              anchor          positive        negative
    type      string          string          string
    min       60 tokens       68 tokens       62 tokens
    mean      123.03 tokens   122.25 tokens   119.82 tokens
    max       128 tokens      128 tokens      128 tokens
  • Samples:
    Sample 1
      anchor: The Assistant Chartering Broker/Trainee Chartering Broker supports senior chartering brokers by identifying ships that meet clients requirements and are available for charter, performing voyage calculations, and preparing contracts and/or charter parties for both cargo owners and ship owners. To do so, he/she monitors the freight, ship hire and cargo rates closely, analyses market data to identify potential clients, and ensures that service standards are met to build and maintain relationships with existing clientele. He has initiative and with a flair for numeracy and accuracy.
      positive: The Junior Chartering Broker assists senior brokers in locating vessels that satisfy client specifications and are ready for chartering. This role involves conducting voyage calculations, drafting contracts, and preparing charter agreements for both cargo owners and ship owners. The Junior Broker closely monitors freight, vessel hire, and cargo rates, analyzes market trends to uncover potential clients, and upholds service excellence to foster and sustain relationships with current clients. A strong sense of initiative and keen numerical skills are essential for success in this position.
      negative: The Data Entry Specialist is responsible for inputting and managing data within the company's database. This role emphasizes accuracy and attention to detail, requiring the individual to handle routine tasks such as updating records, verifying information, and generating basic reports. The Data Entry Specialist works independently and does not interact with clients or stakeholders, focusing solely on maintaining the integrity of the data. Proficiency in typing and familiarity with database software are key skills needed for this position.
    Sample 2
      anchor: The Quality Assurance Tester participates in the development process for games to ensure design quality and adherence to the standards. He/She is involved in tasks that include game design, source code development, review and control, configuration management and integration of different game elements. Prior to the release of games, he is involved in analysis of game playtesting to ensure that games meet or exceed specified standards and end user requirements. He spends most of his time in playtesting and evaluating games for various projects. He also spends a significant amount of time in aligning internal stakeholders on the quality assurance aspects of the game. He should have an eye for detail to spot and identify errors and discrepancies. He is systematic and highly organised, with the ability to work on his own and function as part of a team. He should also be able to think creatively to solve problems.
      positive: The Game Quality Assurance Specialist plays a crucial role in the game development lifecycle, ensuring that games meet design specifications and quality standards. This position encompasses responsibilities such as game design evaluation, source code analysis, review and oversight of game elements, and the integration of various components. Before games are launched, the specialist conducts thorough playtesting and analysis to verify that the final product aligns with user expectations and industry benchmarks. A significant portion of the role is dedicated to evaluating game performance across multiple projects while collaborating with internal stakeholders to uphold quality assurance protocols. Attention to detail is essential for identifying errors and inconsistencies, and the specialist must be both methodical and organized, capable of working independently as well as collaboratively within a team. Creative problem-solving skills are also vital for overcoming challenges in the devel...
      negative: The Data Entry Clerk is responsible for inputting large volumes of information into databases and maintaining accurate records. This role involves tasks such as verifying data integrity, creating spreadsheets, and managing documentation for various administrative functions. The clerk spends most of their time performing repetitive data entry tasks and ensuring that all information is updated and organized. They work independently with minimal interaction with other departments, focusing on meeting daily quotas rather than collaborating on projects. Attention to detail is important for avoiding errors, but the role does not require creative thinking or problem-solving, as tasks are strictly defined and routine.
    Sample 3
      anchor: The Bellhop/Bell Attendant creates the first impression to arriving property guests. He/She directs vehicular flow at the driveway, greets guests and directs them to the check-in desk. He provides luggage and item delivery assistance, escorts guests to their designated rooms, explains the use of room amenities and facilities, as well as addresses guests' queries and requests. As a service ambassador, he maintains a professional image at all times and possesses a wealth of knowledge of the tourist areas and attractions around the property to provide general direction and tourist information to guests. He assists guests with physical disabilities or special needs at the entrance or lobby. He complies with organisational and regulatory requirements as he carries out his duties and stays vigilant to report any suspicious characters, activities and items to ensure workplace safety and the security of the property. He is well-groomed, confident and passionate in delivering excellent guest se...
      positive: The Bellhop/Bell Attendant is responsible for creating a welcoming atmosphere for guests upon their arrival at the property. He/She manages the flow of vehicles at the entrance, warmly greets guests, and guides them to the check-in area. The role includes assisting with luggage and delivering items, escorting guests to their rooms, explaining the amenities and facilities available, and addressing any inquiries or requests they may have. As a representative of the service, he maintains a polished appearance at all times and is knowledgeable about local attractions and tourist information to assist guests effectively. Additionally, he provides support to guests with physical disabilities or special needs in the lobby area. He adheres to organizational policies and safety regulations, remaining alert to report any suspicious activities or individuals to ensure the security of the property. He exemplifies professionalism, is dedicated to providing outstanding guest service, and possesses e...
      negative: The Bellhop/Bell Attendant is tasked with creating the final impression for departing property guests. He/She manages the flow of luggage at the loading dock, bids farewell to guests, and directs them to their vehicles. He provides cleaning and maintenance assistance, ensures guests have completed their check-out procedures, explains the return of room keys and charges, as well as addresses guests' complaints and dissatisfaction. As a service representative, he maintains a casual demeanor at times and possesses little knowledge of the property or the surrounding areas to provide limited direction and tourist information to guests. He does not assist guests with physical disabilities or special needs at the exit or parking area. He disregards organizational and regulatory requirements as he carries out his duties and fails to report any unusual occurrences, compromising workplace safety and the security of the property. He is unkempt, indifferent, and lacks enthusiasm in delivering sati...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 5
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: False
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
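
The hyperparameters above imply the step counts seen in the training logs. The sketch below works through that arithmetic and a cosine schedule with linear warmup. It assumes a single device, rounds the warmup step count (the trainer's exact rounding may differ, though the product here is a whole number), and uses the common half-cosine decay formula; the function name lr_at is hypothetical.

```python
import math

train_samples = 6032          # training set size from this card
per_device_batch = 32         # per_device_train_batch_size
grad_accum = 16               # gradient_accumulation_steps
epochs = 5                    # num_train_epochs
warmup_ratio = 0.1
base_lr = 2e-5

# Assumption: one device, so 32 * 16 = 512 samples contribute to each optimizer step.
batches_per_epoch = math.ceil(train_samples / per_device_batch)
steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
total_steps = steps_per_epoch * epochs
warmup_steps = round(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps: linear warmup, then half-cosine decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(steps_per_epoch, total_steps, warmup_steps)  # 12 60 6
```

Under these assumptions the peak learning rate of 2e-05 is reached at step 6, halfway through the first epoch, and decays to zero by step 60.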

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: False
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch  Step  Training Loss  Validation Loss
1.0    12    0.2957         0.0722
2.0    24    0.079          0.0446
3.0    36    0.0517         0.0350
4.0    48    0.0443         0.0323
5.0    60    0.0438         0.032
  • The bold row denotes the saved checkpoint.
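
With load_best_model_at_end: True, the trainer keeps the checkpoint that scores best on its selection metric. The card does not state that metric, so the sketch below assumes it is the validation loss (lower is better) and picks the matching row from the logs above; the variable names are hypothetical.

```python
# Rows copied from the training logs: (epoch, step, training loss, validation loss).
logs = [
    (1.0, 12, 0.2957, 0.0722),
    (2.0, 24, 0.079, 0.0446),
    (3.0, 36, 0.0517, 0.0350),
    (4.0, 48, 0.0443, 0.0323),
    (5.0, 60, 0.0438, 0.032),
]

# Assumption: the best checkpoint is the one with the lowest validation loss.
best_epoch, best_step, _, best_val_loss = min(logs, key=lambda row: row[3])
print(best_epoch, best_step, best_val_loss)  # 5.0 60 0.032
```

Validation loss falls at every epoch, so under this assumption the final checkpoint is also the best one.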

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.0
  • Transformers: 4.55.4
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 22.7M params (Safetensors, tensor type F32)