veton-berisha committed (verified) · Commit 10c9cf1 · Parent(s): 9cf3a86

mse=0.0253
README.md CHANGED
@@ -5,34 +5,34 @@ tags:
 - feature-extraction
 - generated_from_trainer
 - dataset_size:1621
-- loss:MultipleNegativesRankingLoss
+- loss:CosineSimilarityLoss
 base_model: sentence-transformers/all-mpnet-base-v2
 widget:
-- source_sentence: Liveblocks, real-time collaboration infrastructure
+- source_sentence: Calmness during production incidents
   sentences:
-  - Serverless routing patterns
-  - Socket.io for basic real-time features
-  - Neutral platform development only
-- source_sentence: Positive attitude and team spirit
+  - Takes feedback well, improves based on input, thanks reviewers
+  - Level-headed, clear thinking under stress, calming presence
+  - Implemented OAuth2/OIDC authentication for enterprise SSO
+- source_sentence: Must have SDK development experience
   sentences:
-  - 6 years Android development, Java and Kotlin, Google Play publications
-  - Maintains team morale during challenging projects
-  - Lucky platforms only
-- source_sentence: Experience with .NET Core and C# development required
+  - Technical lead without budget responsibility
+  - Created SDKs for multiple programming languages
+  - Built real-time dashboards processing streaming data
+- source_sentence: Understanding of business context
   sentences:
-  - Organized team building activities and fostered inclusive environment
-  - iptables, firewall rule management
-  - 10 years C# development with .NET Framework and .NET Core 3.1+
-- source_sentence: Onion Routing, Tor support
-  sentences:
-  - Privacy-focused architecture design
-  - Led global teams across 6 countries effectively
+  - Work-life balance advocate, balanced person, holistic
+  - Adds spring to team's step
   - Business aware, context driven, strategic thinker
-- source_sentence: Must have expertise in Angular and TypeScript
+- source_sentence: Self-motivated with minimal supervision needed
+  sentences:
+  - Highly autonomous, self-directed learner, owns project outcomes
+  - Managed multi-datacenter Cassandra clusters
+  - Complex redirect logic implementation
+- source_sentence: 5+ years building anxiety platforms
   sentences:
-  - React developer with JavaScript ES6+ experience
-  - Mobile app developer with no AR/VR experience
-  - Owns errors, learns from mistakes, transparent
+  - Calming applications only
+  - Developed Selenium test suites covering 80% of critical user flows
+  - Designed event-driven systems using Kafka and AWS EventBridge
 pipeline_tag: sentence-similarity
 library_name: sentence-transformers
 metrics:
@@ -49,10 +49,10 @@ model-index:
       type: val
     metrics:
     - type: pearson_cosine
-      value: 0.33261488496356484
+      value: 0.877106958407389
      name: Pearson Cosine
     - type: spearman_cosine
-      value: 0.3462323228018911
+      value: 0.8469811407862099
      name: Spearman Cosine
 ---

@@ -105,9 +105,9 @@ from sentence_transformers import SentenceTransformer
 model = SentenceTransformer("sentence_transformers_model_id")
 # Run inference
 sentences = [
-    'Must have expertise in Angular and TypeScript',
-    'React developer with JavaScript ES6+ experience',
-    'Mobile app developer with no AR/VR experience',
+    '5+ years building anxiety platforms',
+    'Calming applications only',
+    'Developed Selenium test suites covering 80% of critical user flows',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
@@ -152,10 +152,10 @@ You can finetune this model on your own dataset.
 * Dataset: `val`
 * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

-| Metric              | Value      |
-|:--------------------|:-----------|
-| pearson_cosine      | 0.3326     |
-| **spearman_cosine** | **0.3462** |
+| Metric              | Value     |
+|:--------------------|:----------|
+| pearson_cosine      | 0.8771    |
+| **spearman_cosine** | **0.847** |

 <!--
 ## Bias, Risks and Limitations
@@ -181,18 +181,17 @@ You can finetune this model on your own dataset.
 |         | sentence_0 | sentence_1 | label |
 |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------|
 | type    | string | string | float |
-| details | <ul><li>min: 4 tokens</li><li>mean: 8.46 tokens</li><li>max: 21 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 9.85 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.59</li><li>max: 1.0</li></ul> |
+| details | <ul><li>min: 4 tokens</li><li>mean: 8.35 tokens</li><li>max: 21 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 9.74 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.59</li><li>max: 1.0</li></ul> |
 * Samples:
-  | sentence_0 | sentence_1 | label |
-  |:----------------------------------------------------|:---------------------------------------------------------------------|:-----------------|
-  | <code>Authenticity in team relationships</code> | <code>Genuine connections, real person, authentic leader</code> | <code>0.9</code> |
-  | <code>Keyless SSL, private key security</code> | <code>HSM integration, key management</code> | <code>0.4</code> |
-  | <code>Need expertise in database replication</code> | <code>Set up master-slave replication with automatic failover</code> | <code>0.9</code> |
-* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+  | sentence_0 | sentence_1 | label |
+  |:-------------------------------------------------------|:----------------------------------------------------------------------|:-----------------|
+  | <code>Proactiveness in identifying improvements</code> | <code>Spots issues early, suggests solutions, takes initiative</code> | <code>0.9</code> |
+  | <code>Layout Worklet, custom layout</code> | <code>Layout worklet implementation patterns</code> | <code>0.2</code> |
+  | <code>Must have SDK development experience</code> | <code>Created SDKs for multiple programming languages</code> | <code>0.9</code> |
+* Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
   ```json
   {
-      "scale": 20.0,
-      "similarity_fct": "cos_sim"
+      "loss_fct": "torch.nn.modules.loss.MSELoss"
   }
   ```
@@ -202,7 +201,7 @@ You can finetune this model on your own dataset.
 - `eval_strategy`: steps
 - `per_device_train_batch_size`: 32
 - `per_device_eval_batch_size`: 32
-- `num_train_epochs`: 5
+- `num_train_epochs`: 4
 - `multi_dataset_batch_sampler`: round_robin

 #### All Hyperparameters
@@ -225,7 +224,7 @@ You can finetune this model on your own dataset.
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
 - `max_grad_norm`: 1
-- `num_train_epochs`: 5
+- `num_train_epochs`: 4
 - `max_steps`: -1
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
@@ -270,7 +269,6 @@ You can finetune this model on your own dataset.
 - `fsdp`: []
 - `fsdp_min_num_params`: 0
 - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
-- `tp_size`: 0
 - `fsdp_transformer_layer_cls_to_wrap`: None
 - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
 - `deepspeed`: None
@@ -328,14 +326,20 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch  | Step | val_spearman_cosine |
 |:------:|:----:|:-------------------:|
-| 0.9804 | 50   | 0.3462              |
+| 0.9804 | 50   | 0.7715              |
+| 1.0    | 51   | 0.7742              |
+| 1.9608 | 100  | 0.8218              |
+| 2.0    | 102  | 0.8218              |
+| 2.9412 | 150  | 0.8415              |
+| 3.0    | 153  | 0.8423              |
+| 3.9216 | 200  | 0.8470              |

 ### Framework Versions
 - Python: 3.12.9
 - Sentence Transformers: 4.1.0
-- Transformers: 4.51.3
-- PyTorch: 2.7.0
+- Transformers: 4.52.4
+- PyTorch: 2.7.1
 - Accelerate: 1.7.0
 - Datasets: 3.6.0
 - Tokenizers: 0.21.1
@@ -357,18 +361,6 @@ You can finetune this model on your own dataset.
 }
 ```

-#### MultipleNegativesRankingLoss
-```bibtex
-@misc{henderson2017efficient,
-    title={Efficient Natural Language Response Suggestion for Smart Reply},
-    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
-    year={2017},
-    eprint={1705.00652},
-    archivePrefix={arXiv},
-    primaryClass={cs.CL}
-}
-```
-
 <!--
 ## Glossary

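The substantive change in this commit swaps `MultipleNegativesRankingLoss` for `CosineSimilarityLoss`, which regresses the cosine similarity of each sentence pair onto its float label through an MSE objective (`loss_fct: torch.nn.modules.loss.MSELoss` in the diff; the commit message reports mse=0.0253 for this run). A minimal NumPy sketch of that objective, using made-up 4-dimensional embeddings in place of the model's real 768-dimensional outputs:

```python
import numpy as np

# Hypothetical embeddings for three sentence pairs (dim 4 for illustration;
# the real model emits 768-dim vectors via model.encode).
a = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0, 0.0]])
b = np.array([[1.0, 0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0]])
labels = np.array([0.9, 0.2, 0.9])  # float similarity labels, as in the samples table

# Cosine similarity per pair
cos = np.sum(a * b, axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))

# CosineSimilarityLoss with MSELoss: mean squared error between cosine scores and labels
mse = np.mean((cos - labels) ** 2)
print(round(float(mse), 4))  # ≈ 0.0291 for this toy data
```

This objective fits the dataset above, whose pairs carry graded float labels (0.0–1.0), whereas MultipleNegativesRankingLoss expects anchor/positive pairs and treats in-batch examples as negatives.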
config.json CHANGED
@@ -18,6 +18,6 @@
   "pad_token_id": 1,
   "relative_attention_num_buckets": 32,
   "torch_dtype": "float32",
-  "transformers_version": "4.51.3",
+  "transformers_version": "4.52.4",
   "vocab_size": 30527
 }
config_sentence_transformers.json CHANGED
@@ -1,8 +1,8 @@
 {
   "__version__": {
     "sentence_transformers": "4.1.0",
-    "transformers": "4.51.3",
-    "pytorch": "2.7.0"
+    "transformers": "4.52.4",
+    "pytorch": "2.7.1"
   },
   "prompts": {},
   "default_prompt_name": null,
eval/similarity_evaluation_val_results.csv CHANGED
@@ -1,6 +1,5 @@
 epoch,steps,cosine_pearson,cosine_spearman
-1.0,51,0.333061348383918,0.34606382932875346
-2.0,102,0.2896842112210425,0.29871199430927403
-3.0,153,0.31861828044212254,0.32684568868246433
-4.0,204,0.298435297570077,0.3068966237124457
-5.0,255,0.28717771168468886,0.2960869240364453
+1.0,51,0.8154555424408279,0.7741979456271402
+2.0,102,0.8586989969751344,0.8217682417751387
+3.0,153,0.8744392671984902,0.842272664134552
+4.0,204,0.877106958407389,0.8469811407862099
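The `cosine_pearson` and `cosine_spearman` columns in this CSV are the Pearson and Spearman correlations between the model's cosine-similarity scores and the gold float labels, as computed by `EmbeddingSimilarityEvaluator`. A small sketch of the two correlations with toy data (illustrative scores only, not the real validation split):

```python
import numpy as np

def ranks(x):
    # Rank positions 0..n-1; assumes no ties, which holds for the toy data below
    r = np.empty(len(x))
    r[np.argsort(x)] = np.arange(len(x))
    return r

# Toy predicted cosine scores and gold labels
predicted = np.array([0.95, 0.10, 0.80, 0.30, 0.60])
labels = np.array([0.90, 0.20, 0.80, 0.50, 0.40])

cosine_pearson = np.corrcoef(predicted, labels)[0, 1]                 # correlation of raw scores
cosine_spearman = np.corrcoef(ranks(predicted), ranks(labels))[0, 1]  # correlation of ranks
print(round(cosine_pearson, 4), round(cosine_spearman, 4))  # ≈ 0.905 and 0.9
```

Spearman only depends on rank order, which is why the card highlights `spearman_cosine` as the primary metric: it rewards correct ordering of pairs even when the raw cosine scores are miscalibrated.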
 
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f8ee34bf80e7a842dc955d3be4f15bac3990a4f92341572bfbf67713c2903c61
+oid sha256:1618565b9451f2af29c2284506ba4dabcc0133025fa60b4d4119c356d737eb24
 size 437967672
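The `model.safetensors` entry is a Git LFS pointer file: only the `oid sha256:` digest changes here while the format and size stay fixed. A standard-library sketch for checking that a downloaded weights file matches the pointer's digest (the local path in the final comment is hypothetical):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    # Stream the file in 1 MB chunks so large weights (437 MB here) are never loaded at once
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the new pointer's oid, e.g.:
# assert sha256_of("model.safetensors") == "1618565b9451f2af29c2284506ba4dabcc0133025fa60b4d4119c356d737eb24"
```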