veton-berisha committed
Commit 9cf3a86 · verified · 1 Parent(s): 84ff9db

mse=0.1016
README.md CHANGED
@@ -3,20 +3,69 @@ tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  ---

- # SentenceTransformer

- This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
- <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
- - **Maximum Sequence Length:** 128 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
  <!-- - **Training Dataset:** Unknown -->
@@ -33,7 +82,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps

  ```
  SentenceTransformer(
- (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  )
  ```
@@ -56,9 +105,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("sentence_transformers_model_id")
  # Run inference
  sentences = [
- 'The weather is lovely today.',
- "It's so sunny outside!",
- 'He drove to the stadium.',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -94,6 +143,20 @@ You can finetune this model on your own dataset.
  *List how the model may foreseeably be misused and address what users ought not to do with the model.*
  -->

  <!--
  ## Bias, Risks and Limitations

@@ -108,6 +171,166 @@ You can finetune this model on your own dataset.

  ## Training Details

  ### Framework Versions
  - Python: 3.12.9
  - Sentence Transformers: 4.1.0
@@ -121,6 +344,31 @@ You can finetune this model on your own dataset.

  ### BibTeX

  <!--
  ## Glossary

  - sentence-transformers
  - sentence-similarity
  - feature-extraction
+ - generated_from_trainer
+ - dataset_size:1621
+ - loss:MultipleNegativesRankingLoss
+ base_model: sentence-transformers/all-mpnet-base-v2
+ widget:
+ - source_sentence: Liveblocks, real-time collaboration infrastructure
+   sentences:
+   - Serverless routing patterns
+   - Socket.io for basic real-time features
+   - Neutral platform development only
+ - source_sentence: Positive attitude and team spirit
+   sentences:
+   - 6 years Android development, Java and Kotlin, Google Play publications
+   - Maintains team morale during challenging projects
+   - Lucky platforms only
+ - source_sentence: Experience with .NET Core and C# development required
+   sentences:
+   - Organized team building activities and fostered inclusive environment
+   - iptables, firewall rule management
+   - 10 years C# development with .NET Framework and .NET Core 3.1+
+ - source_sentence: Onion Routing, Tor support
+   sentences:
+   - Privacy-focused architecture design
+   - Led global teams across 6 countries effectively
+   - Business aware, context driven, strategic thinker
+ - source_sentence: Must have expertise in Angular and TypeScript
+   sentences:
+   - React developer with JavaScript ES6+ experience
+   - Mobile app developer with no AR/VR experience
+   - Owns errors, learns from mistakes, transparent
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
+ metrics:
+ - pearson_cosine
+ - spearman_cosine
+ model-index:
+ - name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
+   results:
+   - task:
+       type: semantic-similarity
+       name: Semantic Similarity
+     dataset:
+       name: val
+       type: val
+     metrics:
+     - type: pearson_cosine
+       value: 0.33261488496356484
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.3462323228018911
+       name: Spearman Cosine
  ---

+ # SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) <!-- at revision 12e86a3c702fc3c50205a8db88f0ec7c0b6b94a0 -->
+ - **Maximum Sequence Length:** 256 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
  <!-- - **Training Dataset:** Unknown -->

  ```
  SentenceTransformer(
+ (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  )
  ```
 
  model = SentenceTransformer("sentence_transformers_model_id")
  # Run inference
  sentences = [
+ 'Must have expertise in Angular and TypeScript',
+ 'React developer with JavaScript ES6+ experience',
+ 'Mobile app developer with no AR/VR experience',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
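Once `model.encode` has produced the embedding matrix, pairwise cosine similarity is just row normalization followed by a dot product. A minimal NumPy sketch of that computation, using toy 3-dimensional vectors in place of the model's real 768-dimensional embeddings:

```python
import numpy as np

def cos_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # Normalize each row to unit length; the dot product is then cosine similarity.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Toy stand-ins for model.encode(sentences) output (3-dim instead of 768-dim).
emb = np.array([[1.0, 0.0, 0.0],
                [0.9, 0.1, 0.0],
                [0.0, 0.0, 1.0]])
sims = cos_sim(emb, emb)
print(sims.shape)  # (3, 3)
```

With real embeddings, `model.similarity(embeddings, embeddings)` in recent sentence-transformers releases returns the same kind of square similarity matrix.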
 
  *List how the model may foreseeably be misused and address what users ought not to do with the model.*
  -->

+ ## Evaluation
+
+ ### Metrics
+
+ #### Semantic Similarity
+
+ * Dataset: `val`
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | pearson_cosine      | 0.3326     |
+ | **spearman_cosine** | **0.3462** |
+
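The `pearson_cosine` and `spearman_cosine` numbers in the table are ordinary Pearson and Spearman correlations between the model's cosine similarities and the gold labels. A hedged sketch of that computation with `scipy.stats`, using made-up toy scores (not this model's actual predictions):

```python
from scipy.stats import pearsonr, spearmanr

# Toy cosine similarities vs. gold similarity labels (illustrative values only).
predicted = [0.91, 0.40, 0.88, 0.15, 0.55]
gold      = [0.90, 0.40, 0.90, 0.10, 0.60]

pearson_cosine = pearsonr(predicted, gold)[0]    # linear correlation
spearman_cosine = spearmanr(predicted, gold)[0]  # rank correlation
print(round(pearson_cosine, 4), round(spearman_cosine, 4))
```

Spearman is the highlighted metric because it only depends on ranking, which is usually what matters when similarity scores are used for retrieval.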
  <!--
  ## Bias, Risks and Limitations

  ## Training Details

+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 1,621 training samples
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | sentence_0                                                                       | sentence_1                                                                       | label                                                          |
+   |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------|
+   | type    | string                                                                           | string                                                                           | float                                                          |
+   | details | <ul><li>min: 4 tokens</li><li>mean: 8.46 tokens</li><li>max: 21 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 9.85 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.59</li><li>max: 1.0</li></ul> |
+ * Samples:
+   | sentence_0                                          | sentence_1                                                            | label            |
+   |:----------------------------------------------------|:----------------------------------------------------------------------|:-----------------|
+   | <code>Authenticity in team relationships</code>     | <code>Genuine connections, real person, authentic leader</code>       | <code>0.9</code> |
+   | <code>Keyless SSL, private key security</code>      | <code>HSM integration, key management</code>                          | <code>0.4</code> |
+   | <code>Need expertise in database replication</code> | <code>Set up master-slave replication with automatic failover</code>  | <code>0.9</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
+
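MultipleNegativesRankingLoss with `scale=20.0` and `cos_sim` treats every other positive in the batch as a negative: it scales the anchor-to-positive cosine similarity matrix and applies cross-entropy with the diagonal as the label. A minimal NumPy sketch of that objective (illustrative only, not the library's implementation):

```python
import numpy as np

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    # Cosine similarity between every anchor and every positive in the batch.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)  # (batch, batch)
    # Cross-entropy where each anchor's own positive (the diagonal) is the
    # correct class and every other in-batch positive acts as a negative.
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Perfectly aligned toy batch: each anchor matches its own positive,
# so the loss should be near zero.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
positives = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = mnr_loss(anchors, positives)
print(loss)
```

Note that this loss ignores the float `label` column above; only the pairing of `sentence_0` with `sentence_1` inside each batch matters.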
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 32
+ - `num_train_epochs`: 5
+ - `multi_dataset_batch_sampler`: round_robin
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 32
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 5
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `tp_size`: 0
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+
+ </details>
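The non-default hyperparameters above map directly onto `SentenceTransformerTrainingArguments`. A sketch of reproducing them (the `output_dir` path is a hypothetical placeholder, not from this repository):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="./out",  # hypothetical path, choose your own
    eval_strategy="steps",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=5,
    multi_dataset_batch_sampler="round_robin",
)
```

All remaining values in the expandable list are the transformers/sentence-transformers defaults.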
+
+ ### Training Logs
+ | Epoch  | Step | val_spearman_cosine |
+ |:------:|:----:|:-------------------:|
+ | 0.9804 | 50   | 0.3462              |
+
  ### Framework Versions
  - Python: 3.12.9
  - Sentence Transformers: 4.1.0

  ### BibTeX

+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
  <!--
  ## Glossary

eval/similarity_evaluation_val_results.csv ADDED
@@ -0,0 +1,6 @@
+ epoch,steps,cosine_pearson,cosine_spearman
+ 1.0,51,0.333061348383918,0.34606382932875346
+ 2.0,102,0.2896842112210425,0.29871199430927403
+ 3.0,153,0.31861828044212254,0.32684568868246433
+ 4.0,204,0.298435297570077,0.3068966237124457
+ 5.0,255,0.28717771168468886,0.2960869240364453
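The added CSV is worth inspecting: the spearman score peaks after the first epoch and declines thereafter. A small sketch that parses the evaluator's CSV (rows inlined here for illustration) and picks the best epoch by `cosine_spearman`:

```python
import csv
import io

# The evaluator's CSV rows from this commit, inlined for the sketch.
raw = """epoch,steps,cosine_pearson,cosine_spearman
1.0,51,0.333061348383918,0.34606382932875346
2.0,102,0.2896842112210425,0.29871199430927403
3.0,153,0.31861828044212254,0.32684568868246433
4.0,204,0.298435297570077,0.3068966237124457
5.0,255,0.28717771168468886,0.2960869240364453
"""

rows = list(csv.DictReader(io.StringIO(raw)))
best = max(rows, key=lambda r: float(r["cosine_spearman"]))
print(best["epoch"], best["cosine_spearman"])  # the first epoch scores highest
```

Since `load_best_model_at_end` was False, the uploaded weights come from the end of training rather than this best checkpoint.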
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a6172635c9d5c46dc7779a4fe3e442816a1214be169ebea95d5ee46a9f4581dc
- size 1112197096
+ oid sha256:f8ee34bf80e7a842dc955d3be4f15bac3990a4f92341572bfbf67713c2903c61
+ size 437967672