oneryalcin committed on
Commit
e0ce95b
·
verified ·
1 Parent(s): 3f781ef

Training in progress, step 396

README.md CHANGED
@@ -7,50 +7,64 @@ tags:
7
  - sentence-similarity
8
  - feature-extraction
9
  - generated_from_trainer
10
- - dataset_size:5832592
11
  - loss:MatryoshkaLoss
12
  - loss:MultipleNegativesRankingLoss
13
  widget:
14
- - source_sentence: crushing middlegame sacrifice short
 
15
  sentences:
16
- - themes advantage middlegame short moves f4f7 c4d5 f7d5 b3d5 f4f7+c4d5 c4d5+f7d5
17
- f7d5+b3d5
18
- - themes advantage fork middlegame short opening Four Knights Game Four Knights
19
- Game Italian Variation moves c8f5 d5e7 g8h8 e7f5 c8f5+d5e7 d5e7+g8h8 g8h8+e7f5
20
- - themes crushing middlegame sacrifice short moves g6g4 e1e6 f7e6 d2h6 g6g4+e1e6
21
- e1e6+f7e6 f7e6+d2h6
22
- - source_sentence: crushing endgame long
 
 
23
  sentences:
24
- - themes crushing endgame long moves e2c2 f5g5 c2g2 g5h6 g2h2 h6g7 e2c2+f5g5 f5g5+c2g2
25
- c2g2+g5h6 g5h6+g2h2 g2h2+h6g7
26
- - themes crushing endgame fork hangingPiece long moves c7c3 b2c3 d5f7 g5g7 f7g7
27
- f8g7 c7c3+b2c3 b2c3+d5f7 d5f7+g5g7 g5g7+f7g7 f7g7+f8g7
28
- - themes crushing intermezzo middlegame short moves c5b4 d1d3 f6e7 a3b4 c5b4+d1d3
29
- d1d3+f6e7 f6e7+a3b4
30
- - source_sentence: crushing endgame fork short
 
 
 
31
  sentences:
32
- - themes crushing endgame rookEndgame short skewer moves b4b3 h7h8 f8g7 h8b8 b4b3+h7h8
33
- h7h8+f8g7 f8g7+h8b8
34
- - themes crushing endgame fork short moves f2f1 f3d2 f1e2 d2c4 f2f1+f3d2 f3d2+f1e2
35
- f1e2+d2c4
36
- - themes mate mateIn1 middlegame oneMove moves d7d6 g3g7 d7d6+g3g7
37
- - source_sentence: crushing fork middlegame veryLong
 
 
38
  sentences:
39
- - themes crushing endgame fork master short moves f7f5 a6g6 g5g6 h4g6 f7f5+a6g6
40
- a6g6+g5g6 g5g6+h4g6
41
- - themes attraction discoveredCheck doubleCheck long mate mateIn3 opening operaMate
42
- sacrifice opening Bishops Opening Bishops Opening Ponziani Gambit moves h8g8 f6d8
43
- e8d8 d2g5 d8e8 d1d8 h8g8+f6d8 f6d8+e8d8 e8d8+d2g5 d2g5+d8e8 d8e8+d1d8
44
- - themes crushing fork middlegame veryLong moves h6h7 e8h5 f3g3 c5e3 h7h8q e3f4
45
- g3g2 h5g4 g2h1 f4d2 a1g1 g4f3 h6h7+e8h5 e8h5+f3g3 f3g3+c5e3 c5e3+h7h8q h7h8q+e3f4
46
- e3f4+g3g2 g3g2+h5g4 h5g4+g2h1 g2h1+f4d2 f4d2+a1g1 a1g1+g4f3
47
- - source_sentence: endgame mate mateIn2 pillsburysMate short
 
 
 
48
  sentences:
49
- - themes bishopEndgame crushing defensiveMove endgame master short moves g3g4 h5h4
50
- f4g5 h6g5 g3g4+h5h4 h5h4+f4g5 f4g5+h6g5
51
- - themes endgame mate mateIn2 pillsburysMate short moves c4e3 b5b8 f5c8 b8c8 c4e3+b5b8
52
- b5b8+f5c8 f5c8+b8c8
53
- - themes endgame mate mateIn1 oneMove moves e5f4 g3g1 e5f4+g3g1
 
 
 
54
  pipeline_tag: sentence-similarity
55
  library_name: sentence-transformers
56
  metrics:
@@ -74,31 +88,31 @@ model-index:
74
  type: chess-ir
75
  metrics:
76
  - type: cosine_accuracy@1
77
- value: 0.01
78
  name: Cosine Accuracy@1
79
  - type: cosine_accuracy@10
80
- value: 0.055
81
  name: Cosine Accuracy@10
82
  - type: cosine_precision@1
83
- value: 0.01
84
  name: Cosine Precision@1
85
  - type: cosine_precision@10
86
- value: 0.006
87
  name: Cosine Precision@10
88
  - type: cosine_recall@1
89
- value: 0.003333333333333333
90
  name: Cosine Recall@1
91
  - type: cosine_recall@10
92
- value: 0.019999999999999997
93
  name: Cosine Recall@10
94
  - type: cosine_ndcg@10
95
- value: 0.014141653573050736
96
  name: Cosine Ndcg@10
97
  - type: cosine_mrr@10
98
- value: 0.02086111111111111
99
  name: Cosine Mrr@10
100
  - type: cosine_map@100
101
- value: 0.012561680163147302
102
  name: Cosine Map@100
103
  - task:
104
  type: information-retrieval
@@ -108,31 +122,31 @@ model-index:
108
  type: chess-ir-tokens
109
  metrics:
110
  - type: cosine_accuracy@1
111
- value: 0.037037037037037035
112
  name: Cosine Accuracy@1
113
  - type: cosine_accuracy@10
114
- value: 0.21164021164021163
115
  name: Cosine Accuracy@10
116
  - type: cosine_precision@1
117
- value: 0.037037037037037035
118
  name: Cosine Precision@1
119
  - type: cosine_precision@10
120
- value: 0.047619047619047616
121
  name: Cosine Precision@10
122
  - type: cosine_recall@1
123
- value: 0.0025144161912381744
124
  name: Cosine Recall@1
125
  - type: cosine_recall@10
126
- value: 0.02212990521949281
127
  name: Cosine Recall@10
128
  - type: cosine_ndcg@10
129
- value: 0.0517090496324674
130
  name: Cosine Ndcg@10
131
  - type: cosine_mrr@10
132
- value: 0.08710842361636012
133
  name: Cosine Mrr@10
134
  - type: cosine_map@100
135
- value: 0.028156284478181654
136
  name: Cosine Map@100
137
  ---
138
 
@@ -184,12 +198,12 @@ from sentence_transformers import SentenceTransformer
184
  model = SentenceTransformer("oneryalcin/static-embedding-chess")
185
  # Run inference
186
  queries = [
187
- 'endgame mate mateIn2 pillsburysMate short',
188
  ]
189
  documents = [
190
- 'themes endgame mate mateIn2 pillsburysMate short moves c4e3 b5b8 f5c8 b8c8 c4e3+b5b8 b5b8+f5c8 f5c8+b8c8',
191
- 'themes bishopEndgame crushing defensiveMove endgame master short moves g3g4 h5h4 f4g5 h6g5 g3g4+h5h4 h5h4+f4g5 f4g5+h6g5',
192
- 'themes endgame mate mateIn1 oneMove moves e5f4 g3g1 e5f4+g3g1',
193
  ]
194
  query_embeddings = model.encode_query(queries)
195
  document_embeddings = model.encode_document(documents)
@@ -199,7 +213,7 @@ print(query_embeddings.shape, document_embeddings.shape)
199
  # Get the similarity scores for the embeddings
200
  similarities = model.similarity(query_embeddings, document_embeddings)
201
  print(similarities)
202
- # tensor([[ 0.9826, -0.1530, 0.0366]])
203
  ```
204
  <!--
205
  ### Direct Usage (Transformers)
@@ -234,17 +248,17 @@ You can finetune this model on your own dataset.
234
  * Datasets: `chess-ir` and `chess-ir-tokens`
235
  * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.sentence_transformer.evaluation.InformationRetrievalEvaluator)
236
 
237
- | Metric | chess-ir | chess-ir-tokens |
238
- |:--------------------|:-----------|:----------------|
239
- | cosine_accuracy@1 | 0.01 | 0.037 |
240
- | cosine_accuracy@10 | 0.055 | 0.2116 |
241
- | cosine_precision@1 | 0.01 | 0.037 |
242
- | cosine_precision@10 | 0.006 | 0.0476 |
243
- | cosine_recall@1 | 0.0033 | 0.0025 |
244
- | cosine_recall@10 | 0.02 | 0.0221 |
245
- | **cosine_ndcg@10** | **0.0141** | **0.0517** |
246
- | cosine_mrr@10 | 0.0209 | 0.0871 |
247
- | cosine_map@100 | 0.0126 | 0.0282 |
248
 
249
  <!--
250
  ## Bias, Risks and Limitations
@@ -264,20 +278,20 @@ You can finetune this model on your own dataset.
264
 
265
  #### Unnamed Dataset
266
 
267
- * Size: 5,832,592 training samples
268
  * Columns: <code>anchor</code> and <code>positive</code>
269
  * Approximate statistics based on the first 100 samples:
270
  | | anchor | positive |
271
  |:---------|:------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|
272
  | type | string | string |
273
  | modality | text | text |
274
- | details | <ul><li>min: 14 characters</li><li>mean: 45.72 characters</li><li>max: 107 characters</li></ul> | <ul><li>min: 61 characters</li><li>mean: 121.98 characters</li><li>max: 233 characters</li></ul> |
275
  * Samples:
276
- | anchor | positive |
277
- |:-----------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------|
278
- | <code>crushing endgame fork short</code> | <code>themes crushing endgame fork short moves f7f6 g5e6 g7h6 e6c5 f7f6+g5e6 g5e6+g7h6 g7h6+e6c5</code> |
279
- | <code>crushing discoveredAttack kingsideAttack middlegame short</code> | <code>themes crushing discoveredAttack kingsideAttack middlegame short moves e4g3 f3g3 f2g3 h5e2 e4g3+f3g3 f3g3+f2g3 f2g3+h5e2</code> |
280
- | <code>crushing middlegame short</code> | <code>themes crushing middlegame short moves d7c8 e2g4 c8c7 c3b5 d7c8+e2g4 e2g4+c8c7 c8c7+c3b5</code> |
281
  * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
282
  ```json
283
  {
@@ -303,12 +317,12 @@ You can finetune this model on your own dataset.
303
  ### Training Hyperparameters
304
  #### Non-Default Hyperparameters
305
 
306
- - `per_device_train_batch_size`: 2048
307
- - `num_train_epochs`: 1
308
- - `learning_rate`: 0.05
309
  - `warmup_steps`: 0.1
310
  - `weight_decay`: 0.01
311
- - `per_device_eval_batch_size`: 2048
312
  - `push_to_hub`: True
313
  - `hub_model_id`: oneryalcin/static-embedding-chess
314
  - `load_best_model_at_end`: True
@@ -317,10 +331,10 @@ You can finetune this model on your own dataset.
317
  #### All Hyperparameters
318
  <details><summary>Click to expand</summary>
319
 
320
- - `per_device_train_batch_size`: 2048
321
- - `num_train_epochs`: 1
322
  - `max_steps`: -1
323
- - `learning_rate`: 0.05
324
  - `lr_scheduler_type`: linear
325
  - `lr_scheduler_kwargs`: None
326
  - `warmup_steps`: 0.1
@@ -361,7 +375,7 @@ You can finetune this model on your own dataset.
361
  - `trackio_space_id`: None
362
  - `trackio_bucket_id`: None
363
  - `trackio_static_space_id`: None
364
- - `per_device_eval_batch_size`: 2048
365
  - `prediction_loss_only`: True
366
  - `eval_on_start`: False
367
  - `eval_do_concat_batches`: True
@@ -422,46 +436,19 @@ You can finetune this model on your own dataset.
422
  ### Training Logs
423
  | Epoch | Step | Training Loss | chess-ir_cosine_ndcg@10 | chess-ir-tokens_cosine_ndcg@10 |
424
  |:------:|:----:|:-------------:|:-----------------------:|:------------------------------:|
425
- | -1 | -1 | - | 0.0087 | 0.0476 |
426
- | 0.0004 | 1 | 25.5090 | - | - |
427
- | 0.0102 | 29 | 24.7398 | - | - |
428
- | 0.0204 | 58 | 20.8309 | - | - |
429
- | 0.0305 | 87 | 16.5176 | - | - |
430
- | 0.0407 | 116 | 12.8534 | - | - |
431
- | 0.0509 | 145 | 10.2759 | - | - |
432
- | 0.0611 | 174 | 8.7313 | - | - |
433
- | 0.0713 | 203 | 7.8373 | - | - |
434
- | 0.0815 | 232 | 7.3665 | - | - |
435
- | 0.0916 | 261 | 7.0534 | - | - |
436
- | 0.1001 | 285 | - | 0.0403 | 0.0964 |
437
- | 0.1018 | 290 | 6.8225 | - | - |
438
- | 0.1120 | 319 | 6.6948 | - | - |
439
- | 0.1222 | 348 | 6.6811 | - | - |
440
- | 0.1324 | 377 | 6.5559 | - | - |
441
- | 0.1426 | 406 | 6.6007 | - | - |
442
- | 0.1527 | 435 | 6.5704 | - | - |
443
- | 0.1629 | 464 | 6.4524 | - | - |
444
- | 0.1731 | 493 | 6.4562 | - | - |
445
- | 0.1833 | 522 | 6.5016 | - | - |
446
- | 0.1935 | 551 | 6.4405 | - | - |
447
- | 0.2001 | 570 | - | 0.0165 | 0.0624 |
448
- | 0.2037 | 580 | 6.5354 | - | - |
449
- | 0.2138 | 609 | 6.4492 | - | - |
450
- | 0.2240 | 638 | 6.4807 | - | - |
451
- | 0.2342 | 667 | 6.4568 | - | - |
452
- | 0.2444 | 696 | 6.4335 | - | - |
453
- | 0.2546 | 725 | 6.4693 | - | - |
454
- | 0.2647 | 754 | 6.4870 | - | - |
455
- | 0.2749 | 783 | 6.4468 | - | - |
456
- | 0.2851 | 812 | 6.4680 | - | - |
457
- | 0.2953 | 841 | 6.3538 | - | - |
458
- | 0.3002 | 855 | - | 0.0141 | 0.0517 |
459
 
460
 
461
  ### Training Time
462
- - **Training**: 49.8 seconds
463
  - **Evaluation**: 0.1 seconds
464
- - **Total**: 49.9 seconds
465
 
466
  ### Framework Versions
467
  - Python: 3.12.10
 
7
  - sentence-similarity
8
  - feature-extraction
9
  - generated_from_trainer
10
+ - dataset_size:1619946
11
  - loss:MatryoshkaLoss
12
  - loss:MultipleNegativesRankingLoss
13
  widget:
14
+ - source_sentence: kingsideAttack master [UNK] mateIn1 oneMove [UNK] [UNK] Defense
15
+ Sicilian Defense [UNK] Attack
16
  sentences:
17
+ - themes kingsideAttack master mate mateIn1 oneMove opening opening Sicilian Defense
18
+ Sicilian Defense Nyezhmetdinov-Rossolimo Attack moves f3e5 c6g2 f3e5+c6g2
19
+ - themes crushing middlegame queensideAttack sacrifice veryLong moves d7c7 b3e6
20
+ f7e6 e1e6 c8b8 f6d7 c7d7 e6d7 d7c7+b3e6 b3e6+f7e6 f7e6+e1e6 e1e6+c8b8 c8b8+f6d7
21
+ f6d7+c7d7 c7d7+e6d7
22
+ - themes advancedPawn crushing endgame veryLong zugzwang moves d4e6 c4e6 f7e6 h7g6
23
+ f8g8 f6f7 g8f8 g6f6 e6e5 f6e5 d4e6+c4e6 c4e6+f7e6 f7e6+h7g6 h7g6+f8g8 f8g8+f6f7
24
+ f6f7+g8f8 g8f8+g6f6 g6f6+e6e5 e6e5+f6e5
25
+ - source_sentence: crushing intermezzo master middlegame sacrifice veryLong
26
  sentences:
27
+ - themes crushing endgame master masterVsMaster veryLong moves f5f6 c5e6 h5g6 h7g6
28
+ c3f3 d5b4 f3c6 b4c6 f5f6+c5e6 c5e6+h5g6 h5g6+h7g6 h7g6+c3f3 c3f3+d5b4 d5b4+f3c6
29
+ f3c6+b4c6
30
+ - themes advancedPawn advantage endgame long master promotion rookEndgame moves
31
+ h3h2 g1g2 g3g2 a6a7 h2h1q a7b8q h3h2+g1g2 g1g2+g3g2 g3g2+a6a7 a6a7+h2h1q h2h1q+a7b8q
32
+ - themes crushing intermezzo master middlegame sacrifice veryLong moves a6c4 d6f6
33
+ f1f6 h6h1 g1f2 h8f6 f2e2 f6e7 a6c4+d6f6 d6f6+f1f6 f1f6+h6h1 h6h1+g1f2 g1f2+h8f6
34
+ h8f6+f2e2 f2e2+f6e7
35
+ - source_sentence: advantage hangingPiece middlegame short Nimzo-Larsen Attack Nimzo-Larsen
36
+ Attack Modern [UNK]
37
  sentences:
38
+ - themes hangingPiece mate mateIn1 middlegame oneMove opening Trompowsky Attack
39
+ Trompowsky Attack Classical Defense moves f4g4 d8d1 f4g4+d8d1
40
+ - themes advancedPawn crushing defensiveMove endgame master quietMove veryLong moves
41
+ f1e1 h3h2 f8h8 f5h4 h8e5 g3g2 e5e4 h4f3 f1e1+h3h2 h3h2+f8h8 f8h8+f5h4 f5h4+h8e5
42
+ h8e5+g3g2 g3g2+e5e4 e5e4+h4f3
43
+ - themes advantage hangingPiece middlegame short opening Nimzo-Larsen Attack Nimzo-Larsen
44
+ Attack Modern Variation moves f5d7 b5g5 e3e2 d1d2 f5d7+b5g5 b5g5+e3e2 e3e2+d1d2
45
+ - source_sentence: '[UNK] defensiveMove [UNK] [UNK] veryLong'
46
  sentences:
47
+ - themes advantage discoveredAttack exposedKing middlegame trappedPiece veryLong
48
+ opening French Defense French Defense Orthoschnapp Gambit moves e2d1 c4e3 d2e3
49
+ b5f1 d1d2 f1g2 g1e2 g2h1 e2d1+c4e3 c4e3+d2e3 d2e3+b5f1 b5f1+d1d2 d1d2+f1g2 f1g2+g1e2
50
+ g1e2+g2h1
51
+ - themes crushing defensiveMove enPassant middlegame veryLong moves g2e2 a3f3 f7f5
52
+ e5f6 c4f4 g3f4 e2g2 f3g3 g2e2+a3f3 a3f3+f7f5 f7f5+e5f6 e5f6+c4f4 c4f4+g3f4 g3f4+e2g2
53
+ e2g2+f3g3
54
+ - themes advancedPawn bishopEndgame crushing defensiveMove endgame veryLong moves
55
+ f3e4 a3a2 g6g7 e6f7 e5e6 f7g8 e6e7 c5e7 f3e4+a3a2 a3a2+g6g7 g6g7+e6f7 e6f7+e5e6
56
+ e5e6+f7g8 f7g8+e6e7 e6e7+c5e7
57
+ - source_sentence: '[UNK] deflection discoveredAttack [UNK] queensideAttack short
58
+ Philidor Defense [UNK] Defense Other variations'
59
  sentences:
60
+ - themes crushing middlegame pin queensideAttack short opening Sicilian Defense
61
+ Sicilian Defense Najdorf Variation moves c3d5 c5b3 c1b1 b3d2 c3d5+c5b3 c5b3+c1b1
62
+ c1b1+b3d2
63
+ - themes crushing deflection discoveredAttack middlegame queensideAttack short opening
64
+ Philidor Defense Philidor Defense Other variations moves d3c3 d4b3 c1b1 d7d1 d3c3+d4b3
65
+ d4b3+c1b1 c1b1+d7d1
66
+ - themes advantage discoveredAttack middlegame short opening Philidor Defense Philidor
67
+ Defense Other variations moves e4d4 d3f5 c8b8 d1d4 e4d4+d3f5 d3f5+c8b8 c8b8+d1d4
68
  pipeline_tag: sentence-similarity
69
  library_name: sentence-transformers
70
  metrics:
 
88
  type: chess-ir
89
  metrics:
90
  - type: cosine_accuracy@1
91
+ value: 0.06
92
  name: Cosine Accuracy@1
93
  - type: cosine_accuracy@10
94
+ value: 0.255
95
  name: Cosine Accuracy@10
96
  - type: cosine_precision@1
97
+ value: 0.06
98
  name: Cosine Precision@1
99
  - type: cosine_precision@10
100
+ value: 0.032
101
  name: Cosine Precision@10
102
  - type: cosine_recall@1
103
+ value: 0.02
104
  name: Cosine Recall@1
105
  - type: cosine_recall@10
106
+ value: 0.10666666666666665
107
  name: Cosine Recall@10
108
  - type: cosine_ndcg@10
109
+ value: 0.07998649265394674
110
  name: Cosine Ndcg@10
111
  - type: cosine_mrr@10
112
+ value: 0.11224206349206348
113
  name: Cosine Mrr@10
114
  - type: cosine_map@100
115
+ value: 0.06593273410392075
116
  name: Cosine Map@100
117
  - task:
118
  type: information-retrieval
 
122
  type: chess-ir-tokens
123
  metrics:
124
  - type: cosine_accuracy@1
125
+ value: 0.12698412698412698
126
  name: Cosine Accuracy@1
127
  - type: cosine_accuracy@10
128
+ value: 0.3544973544973545
129
  name: Cosine Accuracy@10
130
  - type: cosine_precision@1
131
+ value: 0.12698412698412698
132
  name: Cosine Precision@1
133
  - type: cosine_precision@10
134
+ value: 0.10476190476190476
135
  name: Cosine Precision@10
136
  - type: cosine_recall@1
137
+ value: 0.0066613186633905
138
  name: Cosine Recall@1
139
  - type: cosine_recall@10
140
+ value: 0.0462228099305809
141
  name: Cosine Recall@10
142
  - type: cosine_ndcg@10
143
+ value: 0.11807198905104373
144
  name: Cosine Ndcg@10
145
  - type: cosine_mrr@10
146
+ value: 0.18598303518938442
147
  name: Cosine Mrr@10
148
  - type: cosine_map@100
149
+ value: 0.06497812950052975
150
  name: Cosine Map@100
151
  ---
152
 
 
198
  model = SentenceTransformer("oneryalcin/static-embedding-chess")
199
  # Run inference
200
  queries = [
201
+ '[UNK] deflection discoveredAttack [UNK] queensideAttack short Philidor Defense [UNK] Defense Other variations',
202
  ]
203
  documents = [
204
+ 'themes crushing deflection discoveredAttack middlegame queensideAttack short opening Philidor Defense Philidor Defense Other variations moves d3c3 d4b3 c1b1 d7d1 d3c3+d4b3 d4b3+c1b1 c1b1+d7d1',
205
+ 'themes advantage discoveredAttack middlegame short opening Philidor Defense Philidor Defense Other variations moves e4d4 d3f5 c8b8 d1d4 e4d4+d3f5 d3f5+c8b8 c8b8+d1d4',
206
+ 'themes crushing middlegame pin queensideAttack short opening Sicilian Defense Sicilian Defense Najdorf Variation moves c3d5 c5b3 c1b1 b3d2 c3d5+c5b3 c5b3+c1b1 c1b1+b3d2',
207
  ]
208
  query_embeddings = model.encode_query(queries)
209
  document_embeddings = model.encode_document(documents)
 
213
  # Get the similarity scores for the embeddings
214
  similarities = model.similarity(query_embeddings, document_embeddings)
215
  print(similarities)
216
+ # tensor([[0.6231, 0.4530, 0.1689]])
217
  ```
218
  <!--
219
  ### Direct Usage (Transformers)
 
248
  * Datasets: `chess-ir` and `chess-ir-tokens`
249
  * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.sentence_transformer.evaluation.InformationRetrievalEvaluator)
250
 
251
+ | Metric | chess-ir | chess-ir-tokens |
252
+ |:--------------------|:---------|:----------------|
253
+ | cosine_accuracy@1 | 0.06 | 0.127 |
254
+ | cosine_accuracy@10 | 0.255 | 0.3545 |
255
+ | cosine_precision@1 | 0.06 | 0.127 |
256
+ | cosine_precision@10 | 0.032 | 0.1048 |
257
+ | cosine_recall@1 | 0.02 | 0.0067 |
258
+ | cosine_recall@10 | 0.1067 | 0.0462 |
259
+ | **cosine_ndcg@10** | **0.08** | **0.1181** |
260
+ | cosine_mrr@10 | 0.1122 | 0.186 |
261
+ | cosine_map@100 | 0.0659 | 0.065 |
262
 
263
  <!--
264
  ## Bias, Risks and Limitations
 
278
 
279
  #### Unnamed Dataset
280
 
281
+ * Size: 1,619,946 training samples
282
  * Columns: <code>anchor</code> and <code>positive</code>
283
  * Approximate statistics based on the first 100 samples:
284
  | | anchor | positive |
285
  |:---------|:------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|
286
  | type | string | string |
287
  | modality | text | text |
288
+ | details | <ul><li>min: 21 characters</li><li>mean: 75.57 characters</li><li>max: 122 characters</li></ul> | <ul><li>min: 86 characters</li><li>mean: 158.13 characters</li><li>max: 256 characters</li></ul> |
289
  * Samples:
290
+ | anchor | positive |
291
+ |:---------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
292
+ | <code>kingsideAttack mate mateIn1 middlegame oneMove Horwitz Defense Horwitz Defense [UNK] variations</code> | <code>themes kingsideAttack mate mateIn1 middlegame oneMove opening Horwitz Defense Horwitz Defense Other variations moves f7h8 g6g2 f7h8+g6g2</code> |
293
+ | <code>backRankMate endgame mate mateIn2 short Kings Knight Opening Kings Knight Opening [UNK] [UNK]</code> | <code>themes backRankMate endgame mate mateIn2 short opening Kings Knight Opening Kings Knight Opening Other variations moves c5d4 c3c8 g5d8 c8d8 c5d4+c3c8 c3c8+g5d8 g5d8+c8d8</code> |
294
+ | <code>kingsideAttack mate mateIn1 middlegame oneMove Sicilian Defense Sicilian Defense Paulsen-Basman Defense</code> | <code>themes kingsideAttack mate mateIn1 middlegame oneMove opening Sicilian Defense Sicilian Defense Paulsen-Basman Defense moves g3f3 c7h2 g3f3+c7h2</code> |
295
  * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
296
  ```json
297
  {
 
317
  ### Training Hyperparameters
318
  #### Non-Default Hyperparameters
319
 
320
+ - `per_device_train_batch_size`: 4096
321
+ - `num_train_epochs`: 20
322
+ - `learning_rate`: 0.01
323
  - `warmup_steps`: 0.1
324
  - `weight_decay`: 0.01
325
+ - `per_device_eval_batch_size`: 4096
326
  - `push_to_hub`: True
327
  - `hub_model_id`: oneryalcin/static-embedding-chess
328
  - `load_best_model_at_end`: True
 
331
  #### All Hyperparameters
332
  <details><summary>Click to expand</summary>
333
 
334
+ - `per_device_train_batch_size`: 4096
335
+ - `num_train_epochs`: 20
336
  - `max_steps`: -1
337
+ - `learning_rate`: 0.01
338
  - `lr_scheduler_type`: linear
339
  - `lr_scheduler_kwargs`: None
340
  - `warmup_steps`: 0.1
 
375
  - `trackio_space_id`: None
376
  - `trackio_bucket_id`: None
377
  - `trackio_static_space_id`: None
378
+ - `per_device_eval_batch_size`: 4096
379
  - `prediction_loss_only`: True
380
  - `eval_on_start`: False
381
  - `eval_do_concat_batches`: True
 
436
  ### Training Logs
437
  | Epoch | Step | Training Loss | chess-ir_cosine_ndcg@10 | chess-ir-tokens_cosine_ndcg@10 |
438
  |:------:|:----:|:-------------:|:-----------------------:|:------------------------------:|
439
+ | -1 | -1 | - | 0.0123 | 0.0561 |
440
+ | 0.0025 | 1 | 27.3123 | - | - |
441
+ | 0.2020 | 80 | 26.3304 | - | - |
442
+ | 0.4040 | 160 | 22.2114 | - | - |
443
+ | 0.6061 | 240 | 17.4522 | - | - |
444
+ | 0.8081 | 320 | 12.8864 | - | - |
445
+ | 1.0 | 396 | - | 0.0800 | 0.1181 |
 
446
 
447
 
448
  ### Training Time
449
+ - **Training**: 57.6 seconds
450
  - **Evaluation**: 0.1 seconds
451
+ - **Total**: 57.7 seconds
452
 
453
  ### Framework Versions
454
  - Python: 3.12.10
eval/Information-Retrieval_evaluation_chess-ir-tokens_results.csv CHANGED
@@ -1,4 +1,2 @@
1
  epoch,steps,cosine-Accuracy@1,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
2
- 0.10007022471910113,285,0.1111111111111111,0.30158730158730157,0.1111111111111111,0.008191309640952804,0.0835978835978836,0.03797928598263959,0.16048962794994542,0.0963937043281825,0.05480807151213741
3
- 0.20014044943820225,570,0.05291005291005291,0.21164021164021163,0.05291005291005291,0.0032049522325313766,0.056613756613756616,0.023108435943979263,0.09312379272696733,0.062386658509055025,0.0369514194632888
4
- 0.30021067415730335,855,0.037037037037037035,0.21164021164021163,0.037037037037037035,0.0025144161912381744,0.047619047619047616,0.02212990521949281,0.08710842361636012,0.0517090496324674,0.028156284478181654
 
1
  epoch,steps,cosine-Accuracy@1,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
2
+ 1.0,396,0.12698412698412698,0.3544973544973545,0.12698412698412698,0.0066613186633905,0.10476190476190476,0.0462228099305809,0.18598303518938442,0.11807198905104373,0.06497812950052975
 
 
eval/Information-Retrieval_evaluation_chess-ir_results.csv CHANGED
@@ -1,4 +1,2 @@
1
  epoch,steps,cosine-Accuracy@1,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
2
- 0.10007022471910113,285,0.02,0.135,0.02,0.006666666666666666,0.0175,0.05833333333333333,0.05090277777777777,0.040260232965004236,0.03468285594907049
3
- 0.20014044943820225,570,0.01,0.06,0.01,0.003333333333333333,0.006999999999999999,0.02333333333333333,0.021797619047619052,0.0165414546823231,0.01826039464782554
4
- 0.30021067415730335,855,0.01,0.055,0.01,0.003333333333333333,0.006,0.019999999999999997,0.02086111111111111,0.014141653573050736,0.012561680163147302
 
1
  epoch,steps,cosine-Accuracy@1,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
2
+ 1.0,396,0.06,0.255,0.06,0.02,0.032,0.10666666666666665,0.11224206349206348,0.07998649265394674,0.06593273410392075
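The `chess-ir` CSV rows in this diff can be compared directly. The sketch below is only an illustration: it copies the header, the last removed row (step 855) and the added row (step 396) from the diff and computes the ratio of their `cosine-NDCG@10` values. The variable names are hypothetical, and because the two rows come from runs with different dataset sizes, batch sizes and learning rates, the ratio is a rough indication rather than a controlled comparison.

```python
import csv
import io

# Header plus two rows copied from the chess-ir results diff:
# the last removed row (step 855) and the added row (step 396).
CSV_TEXT = """
epoch,steps,cosine-Accuracy@1,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
0.30021067415730335,855,0.01,0.055,0.01,0.003333333333333333,0.006,0.019999999999999997,0.02086111111111111,0.014141653573050736,0.012561680163147302
1.0,396,0.06,0.255,0.06,0.02,0.032,0.10666666666666665,0.11224206349206348,0.07998649265394674,0.06593273410392075
"""

# strip() removes the leading blank line so the header is read as field names.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT.strip())))
old_row, new_row = rows


def ndcg_at_10(row):
    """Return the cosine NDCG@10 value of one CSV row as a float."""
    return float(row["cosine-NDCG@10"])


ratio = ndcg_at_10(new_row) / ndcg_at_10(old_row)
print(f"step {old_row['steps']}: {ndcg_at_10(old_row):.4f}")
print(f"step {new_row['steps']}: {ndcg_at_10(new_row):.4f}")
print(f"ratio new/old: {ratio:.2f}")
```

The rounded values match the `0.0141` and `0.08` entries shown for `chess-ir` in the README metrics tables.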
 
 
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:a585943d76bec5e52fc185f5414dba80093b3693261e4921542d31ea01c10fb8
3
  size 8880224
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0946dae682df6739a9cd9ab6a2c4699a9557dcff45cc062b465309d6d403b2e3
3
  size 8880224
training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:d1f79a123f09dc75fd3488fe5caef388a8c542815dabe7ec16811867955b17a2
3
  size 5713
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:426fc88cc7388ad3485f0a0e98b7edcbc0f7e7ad469707d5448cc9275c652053
3
  size 5713
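
The step counts in this commit are consistent with the dataset sizes and batch sizes recorded in the README diff. The sketch below checks that arithmetic. It assumes a single device, no gradient accumulation, and that the final partial batch is kept; none of these are stated in the diff. `steps_per_epoch` is a hypothetical helper, not part of any library.

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Optimizer steps in one epoch, assuming the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


# New configuration: 1,619,946 samples, per_device_train_batch_size 4096.
new_steps = steps_per_epoch(1_619_946, 4096)
# Previous configuration: 5,832,592 samples, per_device_train_batch_size 2048.
old_steps = steps_per_epoch(5_832_592, 2048)

print(new_steps)  # 396, the step named in the commit message
print(old_steps)  # 2848

# The removed CSV rows logged step 285 at epoch 0.10007..., i.e. 285 / 2848.
print(285 / old_steps)
```

Under these assumptions, step 396 is exactly the end of epoch 1.0 in the new run, which agrees with the single evaluation row added to both CSV files.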