metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:5000
  - loss:CosineSimilarityLoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
  - source_sentence: >-
      looking Product Manager expertise AWS Cybersecurity JavaScript Cloud
      Architecture candidate responsible designing implementing maintaining
      solutions using modern technologies
    sentences:
      - >-
        Emily Barry professional skilled JavaScript Machine Learning Kubernetes
        Computer Vision Experienced working multiple projects involving cloud
        technologies modern software development practices
      - >-
        Stephen Baker professional skilled React AWS Node.js NLP Experienced
        working multiple projects involving cloud technologies modern software
        development practices
      - >-
        James Jackson professional skilled Node.js Cybersecurity Kubernetes
        Docker Experienced working multiple projects involving cloud
        technologies modern software development practices
  - source_sentence: >-
      looking Software Engineer expertise AWS TensorFlow NLP Node.js candidate
      responsible designing implementing maintaining solutions using modern
      technologies
    sentences:
      - >-
        Jennifer Thompson professional skilled JavaScript TensorFlow Computer
        Vision Django Experienced working multiple projects involving cloud
        technologies modern software development practices
      - >-
        Lisa Bell professional skilled Python TensorFlow Computer Vision Machine
        Learning Experienced working multiple projects involving cloud
        technologies modern software development practices
      - >-
        Susan Rogers professional skilled Docker Cybersecurity Machine Learning
        Python Experienced working multiple projects involving cloud
        technologies modern software development practices
  - source_sentence: >-
      looking DevOps Engineer expertise Cybersecurity Machine Learning SQL
      TensorFlow candidate responsible designing implementing maintaining
      solutions using modern technologies
    sentences:
      - >-
        Kenneth Jones professional skilled NLP Node.js Cybersecurity Cloud
        Architecture Experienced working multiple projects involving cloud
        technologies modern software development practices
      - >-
        Matthew Mcintyre professional skilled NoSQL Kubernetes React Docker
        Experienced working multiple projects involving cloud technologies
        modern software development practices
      - >-
        William Wilson professional skilled SQL Kubernetes CI/CD Security
        Analysis Experienced working multiple projects involving cloud
        technologies modern software development practices
  - source_sentence: >-
      looking Software Engineer expertise Cybersecurity NLP SQL Django candidate
      responsible designing implementing maintaining solutions using modern
      technologies
    sentences:
      - >-
        Daniel Stewart professional skilled JavaScript Python Cybersecurity
        TensorFlow Experienced working multiple projects involving cloud
        technologies modern software development practices
      - >-
        Kristy Massey MD professional skilled Django Security Analysis
        JavaScript Cybersecurity Experienced working multiple projects involving
        cloud technologies modern software development practices
      - >-
        Melanie Sutton professional skilled Django CI/CD JavaScript SQL
        Experienced working multiple projects involving cloud technologies
        modern software development practices
  - source_sentence: >-
      looking AI Researcher expertise CI/CD Docker TensorFlow JavaScript
      candidate responsible designing implementing maintaining solutions using
      modern technologies
    sentences:
      - >-
        Dr. William Ramirez professional skilled NoSQL React CI/CD Cloud
        Architecture Experienced working multiple projects involving cloud
        technologies modern software development practices
      - >-
        Rebecca Wiley professional skilled Python Kubernetes Node.js JavaScript
        Experienced working multiple projects involving cloud technologies
        modern software development practices
      - >-
        Roberta Graham professional skilled Flask Machine Learning Node.js
        Docker Experienced working multiple projects involving cloud
        technologies modern software development practices
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
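
The architecture above chains three modules: a Transformer encoder, mean pooling over the token embeddings, and L2 normalization. As a rough illustration of what the last two steps compute, here is a dependency-free sketch. The function names `mean_pool` and `l2_normalize` and the 2-dimensional toy vectors are assumptions made for illustration; this is not the library's implementation, which operates on batched 384-dimensional tensors.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask value is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [s / count for s in summed]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return list(vector)
    return [v / norm for v in vector]


# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
```

Because the output has unit length, the cosine similarity between two such embeddings equals their dot product.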

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'looking AI Researcher expertise CI/CD Docker TensorFlow JavaScript candidate responsible designing implementing maintaining solutions using modern technologies',
    'Roberta Graham professional skilled Flask Machine Learning Node.js Docker Experienced working multiple projects involving cloud technologies modern software development practices',
    'Rebecca Wiley professional skilled Python Kubernetes Node.js JavaScript Experienced working multiple projects involving cloud technologies modern software development practices',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
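
In the snippet above, the first sentence is a job description and the other two are candidate profiles, so the first row of the similarity matrix can be used to order candidates. The helper below is a minimal sketch of that step. The name `rank_candidates`, the candidate labels, and the scores are hypothetical placeholders, not outputs of this model.

```python
def rank_candidates(similarity_row, candidate_texts):
    """Return (text, score) pairs sorted from most to least similar."""
    pairs = zip(candidate_texts, (float(score) for score in similarity_row))
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Placeholder scores standing in for a row of similarity values (not real results).
example_scores = [0.62, 0.81]
example_candidates = ["candidate A", "candidate B"]
ranking = rank_candidates(example_scores, example_candidates)
for text, score in ranking:
    print(f"{score:.2f}  {text}")
```

With real embeddings, one plausible call would be `rank_candidates(similarities[0][1:], sentences[1:])`, which skips the job description's similarity to itself.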

Training Details

Training Dataset

Unnamed Dataset

  • Size: 5,000 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:

    |         | sentence_0 | sentence_1 | label |
    |---------|------------|------------|-------|
    | type    | string     | string     | float |
    | details | min: 20 tokens; mean: 24.72 tokens; max: 32 tokens | min: 22 tokens; mean: 26.26 tokens; max: 34 tokens | min: 0.4; mean: 0.71; max: 1.0 |
  • Samples:

    | sentence_0 | sentence_1 | label |
    |------------|------------|-------|
    | looking AI Researcher expertise CI/CD Python Computer Vision Flask candidate responsible designing implementing maintaining solutions using modern technologies | Deanna Gibson professional skilled Security Analysis Node.js Machine Learning Kubernetes Experienced working multiple projects involving cloud technologies modern software development practices | 0.481 |
    | looking Machine Learning Engineer expertise AWS Kubernetes Python Django candidate responsible designing implementing maintaining solutions using modern technologies | Amanda Johnson professional skilled AWS NLP Node.js Security Analysis Experienced working multiple projects involving cloud technologies modern software development practices | 0.982 |
    | looking Cybersecurity Analyst expertise JavaScript Python Node.js NoSQL candidate responsible designing implementing maintaining solutions using modern technologies | Alicia Patton professional skilled Node.js TensorFlow SQL NoSQL Experienced working multiple projects involving cloud technologies modern software development practices | 0.597 |
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
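
To make the objective concrete: CosineSimilarityLoss with an MSE loss function compares the cosine similarity of each embedding pair with its float label and averages the squared differences over the batch. The sketch below reproduces that arithmetic in plain Python. The function names and the toy 2-dimensional embeddings are illustrative assumptions, not the library's code or data from this training run.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_mse_loss(embeddings_a, embeddings_b, labels):
    """Mean squared error between predicted cosine similarities and labels."""
    squared_errors = [
        (cosine_similarity(u, v) - label) ** 2
        for u, v, label in zip(embeddings_a, embeddings_b, labels)
    ]
    return sum(squared_errors) / len(squared_errors)


# Toy batch of two pairs (illustrative values only).
batch_a = [[1.0, 0.0], [1.0, 0.0]]
batch_b = [[1.0, 0.0], [0.0, 1.0]]
targets = [1.0, 0.5]
loss = cosine_similarity_mse_loss(batch_a, batch_b, targets)
```

In this toy batch the first pair matches its label exactly and the second is off by 0.5, so the averaged loss is 0.125.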
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 30
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 30
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
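
The combination of learning_rate 5e-05, lr_scheduler_type linear, and zero warmup means the learning rate decays in a straight line from 5e-05 to 0 over the run. The sketch below shows that schedule in plain Python. The total of 9390 steps is an assumption derived from 5,000 samples, a batch size of 16, and 30 epochs on a single device; the function name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=5e-05, total_steps=9390, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midpoint, and end of the assumed 9390-step run.
for step in (0, 4695, 9390):
    print(step, linear_lr(step))
```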

Training Logs

Epoch Step Training Loss
1.5974 500 0.0324
3.1949 1000 0.0298
4.7923 1500 0.028
6.3898 2000 0.025
7.9872 2500 0.0229
9.5847 3000 0.0198
11.1821 3500 0.0179
12.7796 4000 0.0156
14.3770 4500 0.014
15.9744 5000 0.0127
17.5719 5500 0.0115
19.1693 6000 0.0104
20.7668 6500 0.0098
22.3642 7000 0.009
23.9617 7500 0.0086
25.5591 8000 0.0082
27.1565 8500 0.0078
28.7540 9000 0.0076
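
The fractional Epoch values in the table follow from the step count and the number of optimizer steps per epoch. A short sketch of that arithmetic is below, assuming a single training device, no gradient accumulation, and a final partial batch that is kept (dataloader_drop_last: False); the helper names are hypothetical.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


def epoch_at_step(step, num_samples=5000, batch_size=16):
    """Fractional epoch reached after a given number of optimizer steps."""
    return step / steps_per_epoch(num_samples, batch_size)


per_epoch = steps_per_epoch(5000, 16)                # 313 batches per epoch
total_steps = per_epoch * 30                         # 9390 steps over 30 epochs
first_logged_epoch = round(epoch_at_step(500), 4)    # 1.5974
last_logged_epoch = round(epoch_at_step(9000), 4)    # 28.754
```

Under these assumptions the run ends at step 9390, which is why the last logged row (step 9000) stops short of epoch 30.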

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 3.4.1
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}