SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
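The architecture above reads left to right: the BERT encoder produces one 384-dimensional vector per token, the Pooling module averages the non-padding token vectors (mean pooling), and Normalize() L2-normalizes the result. Here is a minimal pure-Python sketch of that pooling-and-normalize step on toy 4-dimensional vectors; it is illustrative only, not the library's actual implementation:

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average only the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, mask in zip(token_embeddings, attention_mask):
        if mask:
            count += 1
            for i, x in enumerate(vec):
                summed[i] += x
    return [x / count for x in summed]

def l2_normalize(vec):
    """Scale a vector to unit length, as the Normalize module does."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Three token vectors (dim 4 here; the real model uses 384); the last is padding.
tokens = [[1.0, 0.0, 2.0, 0.0], [3.0, 2.0, 0.0, 1.0], [9.9, 9.9, 9.9, 9.9]]
mask = [1, 1, 0]
embedding = l2_normalize(mean_pool(tokens, mask))
# embedding is [0.8, 0.4, 0.4, 0.2] — unit length, padding ignored
```

Because the final embeddings are unit-length, cosine similarity between two of them is simply their dot product.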

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("kpat3149/all_minilm_finetuned")
# Run inference
sentences = [
    'experienced professional skilled in excel, future, president. heavy own politics goal smile. during benefit eight beat pick allow test break. this dark why later gun.',
    'data analyst needed with experience in data cleaning, power bi, sql. agreement meet coach team production concern. politics happy challenge challenge want.',
    'data analyst needed with experience in sql, data cleaning, tableau. movie lead so those moment blue. outside work tree pick man fear administration strong.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, -0.1204, -0.1657],
#         [-0.1204,  1.0000,  0.8512],
#         [-0.1657,  0.8512,  1.0000]])
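Since the model L2-normalizes its outputs (the Normalize module in the architecture), cosine similarity reduces to a plain dot product, which is how a simple semantic-search ranking can work. A pure-Python sketch with dummy 3-dimensional unit vectors standing in for real 384-dimensional embeddings:

```python
def dot(u, v):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(a * b for a, b in zip(u, v))

query = [0.6, 0.8, 0.0]
corpus = [[0.6, 0.8, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]

scores = [dot(query, doc) for doc in corpus]
# Rank corpus entries by similarity to the query, highest first.
ranking = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
```

In practice you would pass real embeddings from `model.encode(...)` instead of the hand-written vectors above.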

Training Details

Training Dataset

Unnamed Dataset

  • Size: 16,000 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:

                sentence_0     sentence_1     label
    type        string         string         float
    min         20 tokens      22 tokens      0.0
    mean        45.17 tokens   73.51 tokens   0.62
    max         64 tokens      138 tokens     1.0
  • Samples:

    sentence_0: proficient in problem solving, git, agile, unit testing, data structures, with mid-level experience in the field. holds a phd degree. holds certifications such as microsoft certified azure developer associate. skilled in delivering results and adapting to dynamic environments.
    sentence_1: as a software engineer, you will leverage your advanced programming skills to develop cutting-edge software solutions that shape the future of technology. you will work on complex coding challenges, create robust systems, and contribute to innovative projects that require deep technical knowledge and analytical skills. this role requires a keen eye for logic and problem-solving, typically suited for individuals who enjoy working independently and thrive in high-tech environments. the role is perfect for someone with a strong interest in software development, system design, and engineering principles. your work will directly impact the success of major technological products and services.
    label: 1.0

    sentence_0: proficient in policy analysis, sustainability, urban development, urban design, zoning laws, with senior-level experience in the field. holds a phd degree. holds certifications such as geographic information systems gis certificate. skilled in delivering results and adapting to dynamic environments.
    sentence_1: an urban planner is responsible for designing and developing land use plans and policies that promote sustainable growth and improve the quality of life in urban areas. you will analyze demographics, economic trends, and environmental factors to make recommendations for city development, zoning, and infrastructure projects. the role involves working with government agencies, architects, and developers to create plans that balance urban growth with environmental sustainability. a deep understanding of zoning laws, transportation systems, and social factors is necessary to ensure that urban spaces are functional, efficient, and equitable.
    label: 0.0

    sentence_0: proficient in critical thinking, medical terminology, patient care, surgical skills, clinical research, with mid-level experience in the field. holds a masters degree. holds certifications such as basic life support bls. skilled in delivering results and adapting to dynamic environments.
    sentence_1: diagnose and treat illnesses, prescribe medication, and provide ongoing patient care. work in various specialties, including surgery, pediatrics, or internal medicine. perform physical exams, order tests, and interpret medical results. collaborate with other healthcare providers to ensure comprehensive care for patients. requires medical expertise, empathy, and strong communication skills. must stay updated on the latest medical research and treatments.
    label: 1.0
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
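CosineSimilarityLoss with an MSELoss `loss_fct` penalizes the squared difference between the cosine similarity of the two sentence embeddings and the gold label. A toy pure-Python sketch of that objective (illustrative only, not the library's actual implementation):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def cosine_similarity_loss(u, v, label):
    """Squared error between cosine similarity and the target label."""
    return (cosine(u, v) - label) ** 2

# Identical embeddings with label 1.0 incur zero loss.
loss = cosine_similarity_loss([1.0, 0.0], [1.0, 0.0], 1.0)
```

During training this pushes embeddings of pairs labeled 1.0 together and pairs labeled 0.0 apart in cosine space.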
    

Training Hyperparameters

Non-Default Hyperparameters

  • num_train_epochs: 5
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
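As a sanity check, the step counts in the Training Logs follow directly from the dataset size and hyperparameters above (16,000 samples, per_device_train_batch_size 8, 5 epochs, no gradient accumulation, dataloader_drop_last False):

```python
import math

num_samples = 16_000   # training set size (see Training Dataset)
batch_size = 8         # per_device_train_batch_size
epochs = 5             # num_train_epochs

steps_per_epoch = math.ceil(num_samples / batch_size)  # 2000 optimizer steps
total_steps = steps_per_epoch * epochs                 # 10000 steps overall
```

This matches the logs: epoch 1.0 at step 2000 and epoch 5.0 at the final step 10000.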

Training Logs

Epoch Step Training Loss
0.25 500 0.1856
0.5 1000 0.1724
0.75 1500 0.1714
1.0 2000 0.1666
1.25 2500 0.1595
1.5 3000 0.159
1.75 3500 0.1613
2.0 4000 0.157
2.25 4500 0.154
2.5 5000 0.1541
2.75 5500 0.1511
3.0 6000 0.1547
3.25 6500 0.1502
3.5 7000 0.1469
3.75 7500 0.149
4.0 8000 0.1473
4.25 8500 0.1437
4.5 9000 0.1441
4.75 9500 0.1409
5.0 10000 0.1463

Framework Versions

  • Python: 3.12.2
  • Sentence Transformers: 5.1.0
  • Transformers: 4.55.0
  • PyTorch: 2.8.0
  • Accelerate: 1.10.0
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
Model size: 22.7M parameters (F32, Safetensors)
Model: kpat3149/all_minilm_finetuned, finetuned from sentence-transformers/all-MiniLM-L6-v2