veton-berisha committed
Commit b105b5b · verified · 1 parent: 10c9cf1

mse=0.0234

README.md CHANGED
@@ -4,35 +4,35 @@ tags:
 - sentence-similarity
 - feature-extraction
 - generated_from_trainer
-- dataset_size:1621
+- dataset_size:3072
 - loss:CosineSimilarityLoss
 base_model: sentence-transformers/all-mpnet-base-v2
 widget:
-- source_sentence: Calmness during production incidents
+- source_sentence: Build type system for programming language from scratch
   sentences:
-  - Takes feedback well, improves based on input, thanks reviewers
-  - Level-headed, clear thinking under stress, calming presence
-  - Implemented OAuth2/OIDC authentication for enterprise SSO
-- source_sentence: Must have SDK development experience
+  - Uses TypeScript for type-safe JavaScript
+  - Led architecture decision meetings resulting in consensus
+  - Integrated Stripe, PayPal, and custom payment solutions
+- source_sentence: Privacy engineering skills
   sentences:
-  - Technical lead without budget responsibility
-  - Created SDKs for multiple programming languages
-  - Built real-time dashboards processing streaming data
-- source_sentence: Understanding of business context
+  - Implemented differential privacy
+  - Technical implementation without vendor management
+  - Created developer-friendly APIs with Swagger docs
+- source_sentence: Privacy Pass, privacy protocol
   sentences:
-  - Work-life balance advocate, balanced person, holistic
-  - Adds spring to team's step
-  - Business aware, context driven, strategic thinker
-- source_sentence: Self-motivated with minimal supervision needed
+  - Modern development tools only
+  - Excellent at breaking down complex topics for junior developers
+  - Privacy-preserving authentication methods
+- source_sentence: JVM tuning and profiling
   sentences:
-  - Highly autonomous, self-directed learner, owns project outcomes
-  - Managed multi-datacenter Cassandra clusters
-  - Complex redirect logic implementation
-- source_sentence: 5+ years building anxiety platforms
+  - Performance monitoring patterns
+  - Optimized GC settings reducing pause times
+  - Senior developer with proven track record debugging distributed system race conditions
+- source_sentence: Knowledge sharing enthusiasm
   sentences:
-  - Calming applications only
-  - Developed Selenium test suites covering 80% of critical user flows
-  - Designed event-driven systems using Kafka and AWS EventBridge
+  - Regular meetup speaker and blogger
+  - Optimized Spark jobs processing terabytes of data daily
+  - Configured database partitioning
 pipeline_tag: sentence-similarity
 library_name: sentence-transformers
 metrics:
@@ -49,10 +49,10 @@ model-index:
       type: val
     metrics:
     - type: pearson_cosine
-      value: 0.877106958407389
+      value: 0.8977247913414342
       name: Pearson Cosine
     - type: spearman_cosine
-      value: 0.8469811407862099
+      value: 0.8052388814564073
       name: Spearman Cosine
 ---
 
@@ -105,9 +105,9 @@ from sentence_transformers import SentenceTransformer
 model = SentenceTransformer("sentence_transformers_model_id")
 # Run inference
 sentences = [
-    '5+ years building anxiety platforms',
-    'Calming applications only',
-    'Developed Selenium test suites covering 80% of critical user flows',
+    'Knowledge sharing enthusiasm',
+    'Regular meetup speaker and blogger',
+    'Configured database partitioning',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
@@ -152,10 +152,10 @@ You can finetune this model on your own dataset.
 * Dataset: `val`
 * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
 
-| Metric              | Value     |
-|:--------------------|:----------|
-| pearson_cosine      | 0.8771    |
-| **spearman_cosine** | **0.847** |
+| Metric              | Value      |
+|:--------------------|:-----------|
+| pearson_cosine      | 0.8977     |
+| **spearman_cosine** | **0.8052** |
 
 <!--
 ## Bias, Risks and Limitations
@@ -175,19 +175,19 @@ You can finetune this model on your own dataset.
 
 #### Unnamed Dataset
 
-* Size: 1,621 training samples
+* Size: 3,072 training samples
 * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
-  |         | sentence_0 | sentence_1 | label |
-  |:--------|:-----------|:-----------|:------|
-  | type    | string | string | float |
-  | details | <ul><li>min: 4 tokens</li><li>mean: 8.35 tokens</li><li>max: 21 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 9.74 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.59</li><li>max: 1.0</li></ul> |
+  |         | sentence_0 | sentence_1 | label |
+  |:--------|:-----------|:-----------|:------|
+  | type    | string | string | float |
+  | details | <ul><li>min: 4 tokens</li><li>mean: 9.74 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 11.06 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.67</li><li>max: 1.0</li></ul> |
 * Samples:
-  | sentence_0 | sentence_1 | label |
-  |:-----------|:-----------|:------|
-  | <code>Proactiveness in identifying improvements</code> | <code>Spots issues early, suggests solutions, takes initiative</code> | <code>0.9</code> |
-  | <code>Layout Worklet, custom layout</code> | <code>Layout worklet implementation patterns</code> | <code>0.2</code> |
-  | <code>Must have SDK development experience</code> | <code>Created SDKs for multiple programming languages</code> | <code>0.9</code> |
+  | sentence_0 | sentence_1 | label |
+  |:-----------|:-----------|:------|
+  | <code>Boundary-value testing and equivalence partitioning expertise</code> | <code>QA engineer designing test cases with boundary value analysis</code> | <code>0.9</code> |
+  | <code>Must have strong decision-making skills</code> | <code>Makes timely decisions based on available information</code> | <code>0.7</code> |
+  | <code>8+ years building real-time collaboration tools</code> | <code>Traditional request-response application development</code> | <code>0.2</code> |
 * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
   ```json
   {
@@ -326,20 +326,24 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch  | Step | val_spearman_cosine |
 |:------:|:----:|:-------------------:|
-| 0.9804 | 50   | 0.7715              |
-| 1.0    | 51   | 0.7742              |
-| 1.9608 | 100  | 0.8218              |
-| 2.0    | 102  | 0.8218              |
-| 2.9412 | 150  | 0.8415              |
-| 3.0    | 153  | 0.8423              |
-| 3.9216 | 200  | 0.8470              |
+| 0.5208 | 50   | 0.6737              |
+| 1.0    | 96   | 0.7384              |
+| 1.0417 | 100  | 0.7431              |
+| 1.5625 | 150  | 0.7703              |
+| 2.0    | 192  | 0.7790              |
+| 2.0833 | 200  | 0.7817              |
+| 2.6042 | 250  | 0.8011              |
+| 3.0    | 288  | 0.7967              |
+| 3.125  | 300  | 0.7963              |
+| 3.6458 | 350  | 0.8046              |
+| 4.0    | 384  | 0.8052              |
 
 
 ### Framework Versions
-- Python: 3.12.9
+- Python: 3.12.10
 - Sentence Transformers: 4.1.0
 - Transformers: 4.52.4
-- PyTorch: 2.7.1
+- PyTorch: 2.7.1+cu126
 - Accelerate: 1.7.0
 - Datasets: 3.6.0
 - Tokenizers: 0.21.1
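The README's inference snippet encodes three sentences and prints the embedding shape; scoring a pair then comes down to cosine similarity between the two embeddings. A minimal sketch in plain Python, where the vectors are hypothetical stand-ins for `model.encode(...)` output:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (plain lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical stand-ins for model.encode(...) outputs.
query = [0.1, 0.3, 0.5]
match = [0.2, 0.6, 1.0]    # exactly 2x query, so perfectly aligned
other = [0.5, -0.3, 0.0]

print(round(cosine_similarity(query, match), 4))   # 1.0
print(cosine_similarity(query, other) < 0)         # True (vectors point apart)
```

Because cosine similarity ignores magnitude, scaling an embedding does not change its score — which is why the `CosineSimilarityLoss` used here compares directions of the two sentence embeddings against the label.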
config_sentence_transformers.json CHANGED
@@ -2,7 +2,7 @@
   "__version__": {
     "sentence_transformers": "4.1.0",
     "transformers": "4.52.4",
-    "pytorch": "2.7.1"
+    "pytorch": "2.7.1+cu126"
   },
   "prompts": {},
   "default_prompt_name": null,
eval/similarity_evaluation_val_results.csv CHANGED
@@ -1,5 +1,5 @@
 epoch,steps,cosine_pearson,cosine_spearman
-1.0,51,0.8154555424408279,0.7741979456271402
-2.0,102,0.8586989969751344,0.8217682417751387
-3.0,153,0.8744392671984902,0.842272664134552
-4.0,204,0.877106958407389,0.8469811407862099
+1.0,96,0.8272483418053012,0.7384040919120075
+2.0,192,0.8806144722889805,0.7789630856263889
+3.0,288,0.8940053252264049,0.7967165513263559
+4.0,384,0.8977247913414342,0.8052388814564073
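The CSV tracks `cosine_pearson` and `cosine_spearman` per epoch: Pearson correlates the raw cosine scores with the gold labels, while Spearman correlates their ranks. A toy sketch of the difference, assuming untied values and hypothetical scores (tie handling omitted for brevity):

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson computed on ranks (no tie handling)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    return pearson(ranks(x), ranks(y))

# Hypothetical model scores vs. gold labels: monotone but not linear,
# so Spearman is exactly 1.0 while Pearson stays below 1.
scores = [0.9, 0.2, 0.7, 0.4]
labels = [1.0, 0.0, 0.8, 0.3]
print(round(spearman(scores, labels), 6))  # 1.0
print(pearson(scores, labels) < 1.0)       # True
```

This is why Spearman is the headline metric for similarity models: it rewards getting the ordering of pairs right, not matching the labels' scale.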
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1618565b9451f2af29c2284506ba4dabcc0133025fa60b4d4119c356d737eb24
+oid sha256:628af632e016d61b250d100cdf4a3b0b13f3c1b2802767ceea7fd31e83f3ebfa
 size 437967672
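The `model.safetensors` entry above is a Git LFS pointer file rather than the weights themselves: plain `key value` lines giving the spec version, the content hash, and the byte size. A small sketch of reading those fields:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file: one 'key value' pair per line."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer contents as shown in the diff above (new oid).
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:628af632e016d61b250d100cdf4a3b0b13f3c1b2802767ceea7fd31e83f3ebfa\n"
    "size 437967672\n"
)
fields = parse_lfs_pointer(pointer)
print(fields["size"])                    # 437967672
print(fields["oid"].partition(":")[0])   # sha256
```

The unchanged `size 437967672` alongside a new `oid` is consistent with retraining: same architecture and parameter count, different weight values.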