Sentence Similarity
sentence-transformers
Safetensors
English
static-embedding
chess
retrieval
exploratory
Instructions to use oneryalcin/static-embedding-chess with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use oneryalcin/static-embedding-chess with sentence-transformers:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("oneryalcin/static-embedding-chess")

sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day"
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
- Notebooks
- Google Colab
- Kaggle
Training in progress, step 285
- README.md +158 -147
- chess_tokenizer.json +0 -0
- config_sentence_transformers.json +2 -2
- eval/Information-Retrieval_evaluation_chess-ir-tokens_results.csv +2 -0
- eval/Information-Retrieval_evaluation_chess-ir_results.csv +1 -6
- model.safetensors +2 -2
- tokenizer.json +0 -0
- training_args.bin +1 -1
README.md
CHANGED
|
@@ -8,39 +8,49 @@ tags:
|
|
| 8 |
- feature-extraction
|
| 9 |
- generated_from_trainer
|
| 10 |
- dataset_size:5832592
|
|
|
|
| 11 |
- loss:MultipleNegativesRankingLoss
|
| 12 |
widget:
|
| 13 |
- source_sentence: crushing middlegame sacrifice short
|
| 14 |
sentences:
|
| 15 |
-
- themes
|
| 16 |
-
|
| 17 |
-
- themes
|
| 18 |
- source_sentence: crushing endgame long
|
| 19 |
sentences:
|
| 20 |
-
- themes crushing endgame long moves e2c2 f5g5 c2g2 g5h6 g2h2 h6g7
|
| 21 |
-
|
| 22 |
-
- themes endgame
|
| 23 |
-
|
| 24 |
sentences:
|
| 25 |
-
- themes
|
| 26 |
-
|
| 27 |
-
- themes crushing endgame
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
- source_sentence: crushing
|
| 31 |
sentences:
|
| 32 |
-
- themes
|
| 33 |
-
|
| 34 |
-
- themes
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
-
|
| 38 |
sentences:
|
| 39 |
-
- themes crushing
|
| 40 |
-
|
| 41 |
-
|
| 42 |
-
|
| 43 |
-
|
| 44 |
pipeline_tag: sentence-similarity
|
| 45 |
library_name: sentence-transformers
|
| 46 |
metrics:
|
|
@@ -54,7 +64,7 @@ metrics:
|
|
| 54 |
- cosine_mrr@10
|
| 55 |
- cosine_map@100
|
| 56 |
model-index:
|
| 57 |
-
- name: Static chess embedding (
|
| 58 |
results:
|
| 59 |
- task:
|
| 60 |
type: information-retrieval
|
|
@@ -64,37 +74,71 @@ model-index:
|
|
| 64 |
type: chess-ir
|
| 65 |
metrics:
|
| 66 |
- type: cosine_accuracy@1
|
| 67 |
-
value: 0.
|
| 68 |
name: Cosine Accuracy@1
|
| 69 |
- type: cosine_accuracy@10
|
| 70 |
-
value: 0.
|
| 71 |
name: Cosine Accuracy@10
|
| 72 |
- type: cosine_precision@1
|
| 73 |
-
value: 0.
|
| 74 |
name: Cosine Precision@1
|
| 75 |
- type: cosine_precision@10
|
| 76 |
-
value: 0.
|
| 77 |
name: Cosine Precision@10
|
| 78 |
- type: cosine_recall@1
|
| 79 |
-
value: 0.
|
| 80 |
name: Cosine Recall@1
|
| 81 |
- type: cosine_recall@10
|
| 82 |
-
value: 0.
|
| 83 |
name: Cosine Recall@10
|
| 84 |
- type: cosine_ndcg@10
|
| 85 |
-
value: 0.
|
| 86 |
name: Cosine Ndcg@10
|
| 87 |
- type: cosine_mrr@10
|
| 88 |
-
value: 0.
|
| 89 |
name: Cosine Mrr@10
|
| 90 |
- type: cosine_map@100
|
| 91 |
-
value: 0.
|
| 92 |
name: Cosine Map@100
|
| 93 |
---
|
| 94 |
|
| 95 |
-
# Static chess embedding (
|
| 96 |
|
| 97 |
-
This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps sentences & paragraphs to a
|
| 98 |
|
| 99 |
## Model Details
|
| 100 |
|
|
@@ -102,7 +146,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps
|
|
| 102 |
- **Model Type:** Sentence Transformer
|
| 103 |
<!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
|
| 104 |
- **Maximum Sequence Length:** inf tokens
|
| 105 |
-
- **Output Dimensionality:**
|
| 106 |
- **Similarity Function:** Cosine Similarity
|
| 107 |
- **Supported Modality:** Text
|
| 108 |
<!-- - **Training Dataset:** Unknown -->
|
|
@@ -140,22 +184,22 @@ from sentence_transformers import SentenceTransformer
|
|
| 140 |
model = SentenceTransformer("oneryalcin/static-embedding-chess")
|
| 141 |
# Run inference
|
| 142 |
queries = [
|
| 143 |
-
'
|
| 144 |
]
|
| 145 |
documents = [
|
| 146 |
-
'themes
|
| 147 |
-
'themes crushing
|
| 148 |
-
'themes
|
| 149 |
]
|
| 150 |
query_embeddings = model.encode_query(queries)
|
| 151 |
document_embeddings = model.encode_document(documents)
|
| 152 |
print(query_embeddings.shape, document_embeddings.shape)
|
| 153 |
-
# [1,
|
| 154 |
|
| 155 |
# Get the similarity scores for the embeddings
|
| 156 |
similarities = model.similarity(query_embeddings, document_embeddings)
|
| 157 |
print(similarities)
|
| 158 |
-
# tensor([[ 0.
|
| 159 |
```
|
| 160 |
<!--
|
| 161 |
### Direct Usage (Transformers)
|
|
@@ -187,20 +231,20 @@ You can finetune this model on your own dataset.
|
|
| 187 |
|
| 188 |
#### Information Retrieval
|
| 189 |
|
| 190 |
-
*
|
| 191 |
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.sentence_transformer.evaluation.InformationRetrievalEvaluator)
|
| 192 |
|
| 193 |
-
| Metric |
|
| 194 |
-
|:--------------------|:-----------|
|
| 195 |
-
| cosine_accuracy@1 | 0.
|
| 196 |
-
| cosine_accuracy@10 | 0.
|
| 197 |
-
| cosine_precision@1 | 0.
|
| 198 |
-
| cosine_precision@10 | 0.
|
| 199 |
-
| cosine_recall@1 | 0.
|
| 200 |
-
| cosine_recall@10 | 0.
|
| 201 |
-
| **cosine_ndcg@10** | **0.
|
| 202 |
-
| cosine_mrr@10 | 0.
|
| 203 |
-
| cosine_map@100 | 0.
|
| 204 |
|
| 205 |
<!--
|
| 206 |
## Bias, Risks and Limitations
|
|
@@ -223,29 +267,36 @@ You can finetune this model on your own dataset.
|
|
| 223 |
* Size: 5,832,592 training samples
|
| 224 |
* Columns: <code>anchor</code> and <code>positive</code>
|
| 225 |
* Approximate statistics based on the first 100 samples:
|
| 226 |
-
| | anchor | positive
|
| 227 |
-
|:---------|:------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------|
|
| 228 |
-
| type | string | string
|
| 229 |
-
| modality | text | text
|
| 230 |
-
| details | <ul><li>min: 14 characters</li><li>mean: 45.72 characters</li><li>max: 107 characters</li></ul> | <ul><li>min:
|
| 231 |
* Samples:
|
| 232 |
-
| anchor | positive
|
| 233 |
-
|:-----------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------|
|
| 234 |
-
| <code>crushing endgame fork short</code> | <code>themes crushing endgame fork short moves f7f6 g5e6 g7h6 e6c5</code> |
|
| 235 |
-
| <code>crushing discoveredAttack kingsideAttack middlegame short</code> | <code>themes crushing discoveredAttack kingsideAttack middlegame short moves e4g3 f3g3 f2g3 h5e2</code> |
|
| 236 |
-
| <code>crushing middlegame short</code> | <code>themes crushing middlegame short moves d7c8 e2g4 c8c7 c3b5</code> |
|
| 237 |
-
* Loss: [<code>
|
| 238 |
```json
|
| 239 |
{
|
| 240 |
-
"
|
| 241 |
-
"
|
| 242 |
-
|
| 243 |
-
|
| 244 |
-
|
| 245 |
],
|
| 246 |
-
"
|
| 247 |
-
|
| 248 |
-
|
| 249 |
}
|
| 250 |
```
|
| 251 |
|
|
@@ -254,11 +305,9 @@ You can finetune this model on your own dataset.
|
|
| 254 |
|
| 255 |
- `per_device_train_batch_size`: 2048
|
| 256 |
- `num_train_epochs`: 1
|
| 257 |
-
- `max_steps`: 500
|
| 258 |
- `learning_rate`: 0.05
|
| 259 |
- `warmup_steps`: 0.1
|
| 260 |
- `weight_decay`: 0.01
|
| 261 |
-
- `bf16`: True
|
| 262 |
- `per_device_eval_batch_size`: 2048
|
| 263 |
- `push_to_hub`: True
|
| 264 |
- `hub_model_id`: oneryalcin/static-embedding-chess
|
|
@@ -270,7 +319,7 @@ You can finetune this model on your own dataset.
|
|
| 270 |
|
| 271 |
- `per_device_train_batch_size`: 2048
|
| 272 |
- `num_train_epochs`: 1
|
| 273 |
-
- `max_steps`:
|
| 274 |
- `learning_rate`: 0.05
|
| 275 |
- `lr_scheduler_type`: linear
|
| 276 |
- `lr_scheduler_kwargs`: None
|
|
@@ -286,7 +335,7 @@ You can finetune this model on your own dataset.
|
|
| 286 |
- `average_tokens_across_devices`: True
|
| 287 |
- `max_grad_norm`: 1.0
|
| 288 |
- `label_smoothing_factor`: 0.0
|
| 289 |
-
- `bf16`:
|
| 290 |
- `fp16`: False
|
| 291 |
- `bf16_full_eval`: False
|
| 292 |
- `fp16_full_eval`: False
|
|
@@ -371,82 +420,32 @@ You can finetune this model on your own dataset.
|
|
| 371 |
</details>
|
| 372 |
|
| 373 |
### Training Logs
|
| 374 |
-
| Epoch | Step | Training Loss | chess-ir_cosine_ndcg@10 |
|
| 375 |
-
|:------:|:----:|:-------------:|:-----------------------:|
|
| 376 |
-
| -1 | -1 | - | 0.
|
| 377 |
-
| 0.0004 | 1 |
|
| 378 |
-
| 0.
|
| 379 |
-
| 0.
|
| 380 |
-
| 0.
|
| 381 |
-
| 0.
|
| 382 |
-
| 0.
|
| 383 |
-
| 0.
|
| 384 |
-
| 0.
|
| 385 |
-
| 0.
|
| 386 |
-
| 0.
|
| 387 |
-
| 0.
|
| 388 |
-
| 0.0193 | 55 | 1.4075 | - |
|
| 389 |
-
| 0.0211 | 60 | 1.4012 | - |
|
| 390 |
-
| 0.0228 | 65 | 1.4055 | - |
|
| 391 |
-
| 0.0246 | 70 | 1.3977 | - |
|
| 392 |
-
| 0.0263 | 75 | 1.3597 | - |
|
| 393 |
-
| 0.0281 | 80 | 1.3765 | - |
|
| 394 |
-
| 0.0298 | 85 | 1.3657 | - |
|
| 395 |
-
| 0.0316 | 90 | 1.3138 | - |
|
| 396 |
-
| 0.0334 | 95 | 1.3596 | - |
|
| 397 |
-
| 0.0351 | 100 | 1.3428 | 0.0335 |
|
| 398 |
-
| 0.0369 | 105 | 1.3302 | - |
|
| 399 |
-
| 0.0386 | 110 | 1.3281 | - |
|
| 400 |
-
| 0.0404 | 115 | 1.3520 | - |
|
| 401 |
-
| 0.0421 | 120 | 1.3127 | - |
|
| 402 |
-
| 0.0439 | 125 | 1.3362 | - |
|
| 403 |
-
| 0.0456 | 130 | 1.3174 | - |
|
| 404 |
-
| 0.0474 | 135 | 1.3103 | - |
|
| 405 |
-
| 0.0492 | 140 | 1.3428 | - |
|
| 406 |
-
| 0.0509 | 145 | 1.2886 | - |
|
| 407 |
-
| 0.0527 | 150 | 1.2895 | 0.0345 |
|
| 408 |
-
| 0.0544 | 155 | 1.3418 | - |
|
| 409 |
-
| 0.0562 | 160 | 1.3498 | - |
|
| 410 |
-
| 0.0579 | 165 | 1.3033 | - |
|
| 411 |
-
| 0.0597 | 170 | 1.2958 | - |
|
| 412 |
-
| 0.0614 | 175 | 1.3081 | - |
|
| 413 |
-
| 0.0632 | 180 | 1.3154 | - |
|
| 414 |
-
| 0.0650 | 185 | 1.3129 | - |
|
| 415 |
-
| 0.0667 | 190 | 1.3124 | - |
|
| 416 |
-
| 0.0685 | 195 | 1.3237 | - |
|
| 417 |
-
| 0.0702 | 200 | 1.3051 | 0.0451 |
|
| 418 |
-
| 0.0720 | 205 | 1.2801 | - |
|
| 419 |
-
| 0.0737 | 210 | 1.3404 | - |
|
| 420 |
-
| 0.0755 | 215 | 1.2916 | - |
|
| 421 |
-
| 0.0772 | 220 | 1.2981 | - |
|
| 422 |
-
| 0.0790 | 225 | 1.3321 | - |
|
| 423 |
-
| 0.0808 | 230 | 1.3369 | - |
|
| 424 |
-
| 0.0825 | 235 | 1.3059 | - |
|
| 425 |
-
| 0.0843 | 240 | 1.3213 | - |
|
| 426 |
-
| 0.0860 | 245 | 1.3127 | - |
|
| 427 |
-
| 0.0878 | 250 | 1.2801 | 0.0374 |
|
| 428 |
-
| 0.0895 | 255 | 1.2940 | - |
|
| 429 |
-
| 0.0913 | 260 | 1.3423 | - |
|
| 430 |
-
| 0.0930 | 265 | 1.2860 | - |
|
| 431 |
-
| 0.0948 | 270 | 1.3022 | - |
|
| 432 |
-
| 0.0966 | 275 | 1.3040 | - |
|
| 433 |
-
| 0.0983 | 280 | 1.2921 | - |
|
| 434 |
-
| 0.1001 | 285 | 1.2940 | - |
|
| 435 |
-
| 0.1018 | 290 | 1.3064 | - |
|
| 436 |
-
| 0.1036 | 295 | 1.3042 | - |
|
| 437 |
-
| 0.1053 | 300 | 1.3058 | 0.0392 |
|
| 438 |
|
| 439 |
|
| 440 |
### Training Time
|
| 441 |
-
- **Training**:
|
| 442 |
-
- **Evaluation**: 0.
|
| 443 |
-
- **Total**:
|
| 444 |
|
| 445 |
### Framework Versions
|
| 446 |
- Python: 3.12.10
|
| 447 |
- Sentence Transformers: 5.5.0
|
| 448 |
-
- Transformers: 5.8.
|
| 449 |
-
- PyTorch: 2.
|
| 450 |
- Accelerate: 1.13.0
|
| 451 |
- Datasets: 4.8.5
|
| 452 |
- Tokenizers: 0.22.2
|
|
@@ -468,6 +467,18 @@ You can finetune this model on your own dataset.
|
|
| 468 |
}
|
| 469 |
```
|
| 470 |
|
| 471 |
#### MultipleNegativesRankingLoss
|
| 472 |
```bibtex
|
| 473 |
@misc{oord2019representationlearningcontrastivepredictive,
|
|
|
|
| 8 |
- feature-extraction
|
| 9 |
- generated_from_trainer
|
| 10 |
- dataset_size:5832592
|
| 11 |
+
- loss:MatryoshkaLoss
|
| 12 |
- loss:MultipleNegativesRankingLoss
|
| 13 |
widget:
|
| 14 |
- source_sentence: crushing middlegame sacrifice short
|
| 15 |
sentences:
|
| 16 |
+
- themes advantage middlegame short moves f4f7 c4d5 f7d5 b3d5 f4f7+c4d5 c4d5+f7d5
|
| 17 |
+
f7d5+b3d5
|
| 18 |
+
- themes advantage fork middlegame short opening Four Knights Game Four Knights
|
| 19 |
+
Game Italian Variation moves c8f5 d5e7 g8h8 e7f5 c8f5+d5e7 d5e7+g8h8 g8h8+e7f5
|
| 20 |
+
- themes crushing middlegame sacrifice short moves g6g4 e1e6 f7e6 d2h6 g6g4+e1e6
|
| 21 |
+
e1e6+f7e6 f7e6+d2h6
|
| 22 |
- source_sentence: crushing endgame long
|
| 23 |
sentences:
|
| 24 |
+
- themes crushing endgame long moves e2c2 f5g5 c2g2 g5h6 g2h2 h6g7 e2c2+f5g5 f5g5+c2g2
|
| 25 |
+
c2g2+g5h6 g5h6+g2h2 g2h2+h6g7
|
| 26 |
+
- themes crushing endgame fork hangingPiece long moves c7c3 b2c3 d5f7 g5g7 f7g7
|
| 27 |
+
f8g7 c7c3+b2c3 b2c3+d5f7 d5f7+g5g7 g5g7+f7g7 f7g7+f8g7
|
| 28 |
+
- themes crushing intermezzo middlegame short moves c5b4 d1d3 f6e7 a3b4 c5b4+d1d3
|
| 29 |
+
d1d3+f6e7 f6e7+a3b4
|
| 30 |
+
- source_sentence: crushing endgame fork short
|
| 31 |
sentences:
|
| 32 |
+
- themes crushing endgame rookEndgame short skewer moves b4b3 h7h8 f8g7 h8b8 b4b3+h7h8
|
| 33 |
+
h7h8+f8g7 f8g7+h8b8
|
| 34 |
+
- themes crushing endgame fork short moves f2f1 f3d2 f1e2 d2c4 f2f1+f3d2 f3d2+f1e2
|
| 35 |
+
f1e2+d2c4
|
| 36 |
+
- themes mate mateIn1 middlegame oneMove moves d7d6 g3g7 d7d6+g3g7
|
| 37 |
+
- source_sentence: crushing fork middlegame veryLong
|
| 38 |
sentences:
|
| 39 |
+
- themes crushing endgame fork master short moves f7f5 a6g6 g5g6 h4g6 f7f5+a6g6
|
| 40 |
+
a6g6+g5g6 g5g6+h4g6
|
| 41 |
+
- themes attraction discoveredCheck doubleCheck long mate mateIn3 opening operaMate
|
| 42 |
+
sacrifice opening Bishops Opening Bishops Opening Ponziani Gambit moves h8g8 f6d8
|
| 43 |
+
e8d8 d2g5 d8e8 d1d8 h8g8+f6d8 f6d8+e8d8 e8d8+d2g5 d2g5+d8e8 d8e8+d1d8
|
| 44 |
+
- themes crushing fork middlegame veryLong moves h6h7 e8h5 f3g3 c5e3 h7h8q e3f4
|
| 45 |
+
g3g2 h5g4 g2h1 f4d2 a1g1 g4f3 h6h7+e8h5 e8h5+f3g3 f3g3+c5e3 c5e3+h7h8q h7h8q+e3f4
|
| 46 |
+
e3f4+g3g2 g3g2+h5g4 h5g4+g2h1 g2h1+f4d2 f4d2+a1g1 a1g1+g4f3
|
| 47 |
+
- source_sentence: endgame mate mateIn2 pillsburysMate short
|
| 48 |
sentences:
|
| 49 |
+
- themes bishopEndgame crushing defensiveMove endgame master short moves g3g4 h5h4
|
| 50 |
+
f4g5 h6g5 g3g4+h5h4 h5h4+f4g5 f4g5+h6g5
|
| 51 |
+
- themes endgame mate mateIn2 pillsburysMate short moves c4e3 b5b8 f5c8 b8c8 c4e3+b5b8
|
| 52 |
+
b5b8+f5c8 f5c8+b8c8
|
| 53 |
+
- themes endgame mate mateIn1 oneMove moves e5f4 g3g1 e5f4+g3g1
|
| 54 |
pipeline_tag: sentence-similarity
|
| 55 |
library_name: sentence-transformers
|
| 56 |
metrics:
|
|
|
|
| 64 |
- cosine_mrr@10
|
| 65 |
- cosine_map@100
|
| 66 |
model-index:
|
| 67 |
+
- name: Static chess embedding (512d) -- themes/openings <-> positions
|
| 68 |
results:
|
| 69 |
- task:
|
| 70 |
type: information-retrieval
|
|
|
|
| 74 |
type: chess-ir
|
| 75 |
metrics:
|
| 76 |
- type: cosine_accuracy@1
|
| 77 |
+
value: 0.02
|
| 78 |
name: Cosine Accuracy@1
|
| 79 |
- type: cosine_accuracy@10
|
| 80 |
+
value: 0.135
|
| 81 |
name: Cosine Accuracy@10
|
| 82 |
- type: cosine_precision@1
|
| 83 |
+
value: 0.02
|
| 84 |
name: Cosine Precision@1
|
| 85 |
- type: cosine_precision@10
|
| 86 |
+
value: 0.0175
|
| 87 |
name: Cosine Precision@10
|
| 88 |
- type: cosine_recall@1
|
| 89 |
+
value: 0.006666666666666666
|
| 90 |
name: Cosine Recall@1
|
| 91 |
- type: cosine_recall@10
|
| 92 |
+
value: 0.05833333333333333
|
| 93 |
name: Cosine Recall@10
|
| 94 |
- type: cosine_ndcg@10
|
| 95 |
+
value: 0.040260232965004236
|
| 96 |
name: Cosine Ndcg@10
|
| 97 |
- type: cosine_mrr@10
|
| 98 |
+
value: 0.05090277777777777
|
| 99 |
name: Cosine Mrr@10
|
| 100 |
- type: cosine_map@100
|
| 101 |
+
value: 0.03468285594907049
|
| 102 |
+
name: Cosine Map@100
|
| 103 |
+
- task:
|
| 104 |
+
type: information-retrieval
|
| 105 |
+
name: Information Retrieval
|
| 106 |
+
dataset:
|
| 107 |
+
name: chess ir tokens
|
| 108 |
+
type: chess-ir-tokens
|
| 109 |
+
metrics:
|
| 110 |
+
- type: cosine_accuracy@1
|
| 111 |
+
value: 0.1111111111111111
|
| 112 |
+
name: Cosine Accuracy@1
|
| 113 |
+
- type: cosine_accuracy@10
|
| 114 |
+
value: 0.30158730158730157
|
| 115 |
+
name: Cosine Accuracy@10
|
| 116 |
+
- type: cosine_precision@1
|
| 117 |
+
value: 0.1111111111111111
|
| 118 |
+
name: Cosine Precision@1
|
| 119 |
+
- type: cosine_precision@10
|
| 120 |
+
value: 0.0835978835978836
|
| 121 |
+
name: Cosine Precision@10
|
| 122 |
+
- type: cosine_recall@1
|
| 123 |
+
value: 0.008191309640952804
|
| 124 |
+
name: Cosine Recall@1
|
| 125 |
+
- type: cosine_recall@10
|
| 126 |
+
value: 0.03797928598263959
|
| 127 |
+
name: Cosine Recall@10
|
| 128 |
+
- type: cosine_ndcg@10
|
| 129 |
+
value: 0.0963937043281825
|
| 130 |
+
name: Cosine Ndcg@10
|
| 131 |
+
- type: cosine_mrr@10
|
| 132 |
+
value: 0.16048962794994542
|
| 133 |
+
name: Cosine Mrr@10
|
| 134 |
+
- type: cosine_map@100
|
| 135 |
+
value: 0.05480807151213741
|
| 136 |
name: Cosine Map@100
|
| 137 |
---
|
| 138 |
|
| 139 |
+
# Static chess embedding (512d) -- themes/openings <-> positions
|
| 140 |
|
| 141 |
+
This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps sentences & paragraphs to a 512-dimensional dense vector space and can be used for retrieval.
|
| 142 |
|
| 143 |
## Model Details
|
| 144 |
|
|
|
|
| 146 |
- **Model Type:** Sentence Transformer
|
| 147 |
<!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
|
| 148 |
- **Maximum Sequence Length:** inf tokens
|
| 149 |
+
- **Output Dimensionality:** 512 dimensions
|
| 150 |
- **Similarity Function:** Cosine Similarity
|
| 151 |
- **Supported Modality:** Text
|
| 152 |
<!-- - **Training Dataset:** Unknown -->
|
|
|
|
| 184 |
model = SentenceTransformer("oneryalcin/static-embedding-chess")
|
| 185 |
# Run inference
|
| 186 |
queries = [
|
| 187 |
+
'endgame mate mateIn2 pillsburysMate short',
|
| 188 |
]
|
| 189 |
documents = [
|
| 190 |
+
'themes endgame mate mateIn2 pillsburysMate short moves c4e3 b5b8 f5c8 b8c8 c4e3+b5b8 b5b8+f5c8 f5c8+b8c8',
|
| 191 |
+
'themes bishopEndgame crushing defensiveMove endgame master short moves g3g4 h5h4 f4g5 h6g5 g3g4+h5h4 h5h4+f4g5 f4g5+h6g5',
|
| 192 |
+
'themes endgame mate mateIn1 oneMove moves e5f4 g3g1 e5f4+g3g1',
|
| 193 |
]
|
| 194 |
query_embeddings = model.encode_query(queries)
|
| 195 |
document_embeddings = model.encode_document(documents)
|
| 196 |
print(query_embeddings.shape, document_embeddings.shape)
|
| 197 |
+
# [1, 512] [3, 512]
|
| 198 |
|
| 199 |
# Get the similarity scores for the embeddings
|
| 200 |
similarities = model.similarity(query_embeddings, document_embeddings)
|
| 201 |
print(similarities)
|
| 202 |
+
# tensor([[ 0.8014, -0.0485, 0.0709]])
|
| 203 |
```
|
| 204 |
<!--
|
| 205 |
### Direct Usage (Transformers)
|
|
|
|
| 231 |
|
| 232 |
#### Information Retrieval
|
| 233 |
|
| 234 |
+
* Datasets: `chess-ir` and `chess-ir-tokens`
|
| 235 |
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.sentence_transformer.evaluation.InformationRetrievalEvaluator)
|
| 236 |
|
| 237 |
+
| Metric | chess-ir | chess-ir-tokens |
|
| 238 |
+
|:--------------------|:-----------|:----------------|
|
| 239 |
+
| cosine_accuracy@1 | 0.02 | 0.1111 |
|
| 240 |
+
| cosine_accuracy@10 | 0.135 | 0.3016 |
|
| 241 |
+
| cosine_precision@1 | 0.02 | 0.1111 |
|
| 242 |
+
| cosine_precision@10 | 0.0175 | 0.0836 |
|
| 243 |
+
| cosine_recall@1 | 0.0067 | 0.0082 |
|
| 244 |
+
| cosine_recall@10 | 0.0583 | 0.038 |
|
| 245 |
+
| **cosine_ndcg@10** | **0.0403** | **0.0964** |
|
| 246 |
+
| cosine_mrr@10 | 0.0509 | 0.1605 |
|
| 247 |
+
| cosine_map@100 | 0.0347 | 0.0548 |
|
| 248 |
|
| 249 |
<!--
|
| 250 |
## Bias, Risks and Limitations
|
|
|
|
| 267 |
* Size: 5,832,592 training samples
|
| 268 |
* Columns: <code>anchor</code> and <code>positive</code>
|
| 269 |
* Approximate statistics based on the first 100 samples:
|
| 270 |
+
| | anchor | positive |
|
| 271 |
+
|:---------|:------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|
|
| 272 |
+
| type | string | string |
|
| 273 |
+
| modality | text | text |
|
| 274 |
+
| details | <ul><li>min: 14 characters</li><li>mean: 45.72 characters</li><li>max: 107 characters</li></ul> | <ul><li>min: 61 characters</li><li>mean: 121.98 characters</li><li>max: 233 characters</li></ul> |
|
| 275 |
* Samples:
|
| 276 |
+
| anchor | positive |
|
| 277 |
+
|:-----------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------|
|
| 278 |
+
| <code>crushing endgame fork short</code> | <code>themes crushing endgame fork short moves f7f6 g5e6 g7h6 e6c5 f7f6+g5e6 g5e6+g7h6 g7h6+e6c5</code> |
|
| 279 |
+
| <code>crushing discoveredAttack kingsideAttack middlegame short</code> | <code>themes crushing discoveredAttack kingsideAttack middlegame short moves e4g3 f3g3 f2g3 h5e2 e4g3+f3g3 f3g3+f2g3 f2g3+h5e2</code> |
|
| 280 |
+
| <code>crushing middlegame short</code> | <code>themes crushing middlegame short moves d7c8 e2g4 c8c7 c3b5 d7c8+e2g4 e2g4+c8c7 c8c7+c3b5</code> |
|
| 281 |
+
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
|
| 282 |
```json
|
| 283 |
{
|
| 284 |
+
"loss": "MultipleNegativesRankingLoss",
|
| 285 |
+
"matryoshka_dims": [
|
| 286 |
+
512,
|
| 287 |
+
256,
|
| 288 |
+
128,
|
| 289 |
+
64,
|
| 290 |
+
32
|
| 291 |
],
|
| 292 |
+
"matryoshka_weights": [
|
| 293 |
+
1,
|
| 294 |
+
1,
|
| 295 |
+
1,
|
| 296 |
+
1,
|
| 297 |
+
1
|
| 298 |
+
],
|
| 299 |
+
"n_dims_per_step": -1
|
| 300 |
}
|
| 301 |
```
|
| 302 |
|
|
|
|
| 305 |
|
| 306 |
- `per_device_train_batch_size`: 2048
|
| 307 |
- `num_train_epochs`: 1
|
|
|
|
| 308 |
- `learning_rate`: 0.05
|
| 309 |
- `warmup_steps`: 0.1
|
| 310 |
- `weight_decay`: 0.01
|
|
|
|
| 311 |
- `per_device_eval_batch_size`: 2048
|
| 312 |
- `push_to_hub`: True
|
| 313 |
- `hub_model_id`: oneryalcin/static-embedding-chess
|
|
|
|
| 319 |
|
| 320 |
- `per_device_train_batch_size`: 2048
|
| 321 |
- `num_train_epochs`: 1
|
| 322 |
+
- `max_steps`: -1
|
| 323 |
- `learning_rate`: 0.05
|
| 324 |
- `lr_scheduler_type`: linear
|
| 325 |
- `lr_scheduler_kwargs`: None
|
|
|
|
| 335 |
- `average_tokens_across_devices`: True
|
| 336 |
- `max_grad_norm`: 1.0
|
| 337 |
- `label_smoothing_factor`: 0.0
|
| 338 |
+
- `bf16`: False
|
| 339 |
- `fp16`: False
|
| 340 |
- `bf16_full_eval`: False
|
| 341 |
- `fp16_full_eval`: False
|
|
|
|
| 420 |
</details>
|
| 421 |
|
| 422 |
### Training Logs
|
| 423 |
+
| Epoch | Step | Training Loss | chess-ir_cosine_ndcg@10 | chess-ir-tokens_cosine_ndcg@10 |
|
| 424 |
+
|:------:|:----:|:-------------:|:-----------------------:|:------------------------------:|
|
| 425 |
+
| -1 | -1 | - | 0.0087 | 0.0476 |
|
| 426 |
+
| 0.0004 | 1 | 25.5090 | - | - |
|
| 427 |
+
| 0.0102 | 29 | 24.7398 | - | - |
|
| 428 |
+
| 0.0204 | 58 | 20.8309 | - | - |
|
| 429 |
+
| 0.0305 | 87 | 16.5176 | - | - |
|
| 430 |
+
| 0.0407 | 116 | 12.8534 | - | - |
|
| 431 |
+
| 0.0509 | 145 | 10.2759 | - | - |
|
| 432 |
+
| 0.0611 | 174 | 8.7313 | - | - |
|
| 433 |
+
| 0.0713 | 203 | 7.8373 | - | - |
|
| 434 |
+
| 0.0815 | 232 | 7.3665 | - | - |
|
| 435 |
+
| 0.0916 | 261 | 7.0534 | - | - |
|
| 436 |
+
| 0.1001 | 285 | - | 0.0403 | 0.0964 |
|
| 437 |
|
| 438 |
|
| 439 |
### Training Time
|
| 440 |
+
- **Training**: 16.5 seconds
|
| 441 |
+
- **Evaluation**: 0.1 seconds
|
| 442 |
+
- **Total**: 16.6 seconds
|
| 443 |
|
| 444 |
### Framework Versions
|
| 445 |
- Python: 3.12.10
|
| 446 |
- Sentence Transformers: 5.5.0
|
| 447 |
+
- Transformers: 5.8.0
|
| 448 |
+
- PyTorch: 2.11.0
|
| 449 |
- Accelerate: 1.13.0
|
| 450 |
- Datasets: 4.8.5
|
| 451 |
- Tokenizers: 0.22.2
|
|
|
|
| 467 |
}
|
| 468 |
```
|
| 469 |
|
| 470 |
+
#### MatryoshkaLoss
|
| 471 |
+
```bibtex
|
| 472 |
+
@misc{kusupati2024matryoshka,
|
| 473 |
+
title={Matryoshka Representation Learning},
|
| 474 |
+
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
|
| 475 |
+
year={2024},
|
| 476 |
+
eprint={2205.13147},
|
| 477 |
+
archivePrefix={arXiv},
|
| 478 |
+
primaryClass={cs.LG}
|
| 479 |
+
}
|
| 480 |
+
```
|
| 481 |
+
|
| 482 |
#### MultipleNegativesRankingLoss
|
| 483 |
```bibtex
|
| 484 |
@misc{oord2019representationlearningcontrastivepredictive,
|
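The loss configuration in this README diff wraps MultipleNegativesRankingLoss in MatryoshkaLoss with dimensions 512, 256, 128, 64, and 32. The intent of that setup is that the leading components of each embedding remain usable on their own. The following is a minimal sketch of comparing truncated embeddings. It uses toy 8-dimensional vectors in place of real 512-dimensional ones, and the helper names `truncate_and_normalize` and `cosine` are hypothetical, not part of the repository.

```python
import math


def truncate_and_normalize(vector, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for a query embedding and a document embedding.
query = [0.9, 0.1, 0.0, 0.2, 0.05, -0.03, 0.01, 0.02]
document = [0.8, 0.2, 0.1, 0.1, -0.04, 0.02, 0.03, -0.01]

# Compare the pair at full size and at two truncated sizes.
for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    d = truncate_and_normalize(document, dim)
    print(dim, round(cosine(q, d), 4))
```

In sentence-transformers the same truncation is usually requested by passing `truncate_dim` when loading the model. The commit reports metrics only at the full 512 dimensions, so retrieval quality at the smaller sizes is not known from this page.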
chess_tokenizer.json
CHANGED
|
The diff for this file is too large to render.
|
|
|
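The tokenizer diff is not rendered, but the README samples in this commit show the new document format: the move list is followed by adjacent move pairs joined with `+` (for example `f7f6+g5e6`). The sketch below reproduces that string layout from one of the README samples. The function name `build_document` is hypothetical, and the optional `opening ...` segment seen in some documents is left out.

```python
def build_document(themes, moves):
    """Join themes, moves, and adjacent move pairs into one document string."""
    # Pair each move with the move that follows it, e.g. "f7f6+g5e6".
    pairs = [f"{a}+{b}" for a, b in zip(moves, moves[1:])]
    return " ".join(["themes", *themes, "moves", *moves, *pairs])


doc = build_document(
    ["crushing", "endgame", "fork", "short"],
    ["f7f6", "g5e6", "g7h6", "e6c5"],
)
print(doc)
```

A list of n moves yields n - 1 pairs, so a single-move puzzle would produce no pair tokens under this sketch.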
config_sentence_transformers.json
CHANGED
|
@@ -1,8 +1,8 @@
|
|
| 1 |
{
|
| 2 |
"__version__": {
|
| 3 |
-
"pytorch": "2.
|
| 4 |
"sentence_transformers": "5.5.0",
|
| 5 |
-
"transformers": "5.8.
|
| 6 |
},
|
| 7 |
"default_prompt_name": null,
|
| 8 |
"model_type": "SentenceTransformer",
|
|
|
|
| 1 |
{
|
| 2 |
"__version__": {
|
| 3 |
+
"pytorch": "2.11.0",
|
| 4 |
"sentence_transformers": "5.5.0",
|
| 5 |
+
"transformers": "5.8.0"
|
| 6 |
},
|
| 7 |
"default_prompt_name": null,
|
| 8 |
"model_type": "SentenceTransformer",
|
eval/Information-Retrieval_evaluation_chess-ir-tokens_results.csv
ADDED
|
@@ -0,0 +1,2 @@
|
|
| 1 |
+
epoch,steps,cosine-Accuracy@1,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
|
| 2 |
+
0.10007022471910113,285,0.1111111111111111,0.30158730158730157,0.1111111111111111,0.008191309640952804,0.0835978835978836,0.03797928598263959,0.16048962794994542,0.0963937043281825,0.05480807151213741
|
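The results file above lists its columns in a different order from the README metrics table (for instance `cosine-Recall@1` comes before `cosine-Precision@10`), so it is safer to read values by column name than by position. The sketch below parses the header and row copied from this commit and rounds them to four decimals, as the README table does. The variable names are illustrative only.

```python
import csv
import io

# Header and data row copied from the chess-ir-tokens results file in this commit.
RESULTS_CSV = (
    "epoch,steps,cosine-Accuracy@1,cosine-Accuracy@10,cosine-Precision@1,"
    "cosine-Recall@1,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,"
    "cosine-NDCG@10,cosine-MAP@100\n"
    "0.10007022471910113,285,0.1111111111111111,0.30158730158730157,"
    "0.1111111111111111,0.008191309640952804,0.0835978835978836,"
    "0.03797928598263959,0.16048962794994542,0.0963937043281825,"
    "0.05480807151213741\n"
)

rows = list(csv.DictReader(io.StringIO(RESULTS_CSV)))
latest = rows[-1]

# Keep only the metric columns and round them for display.
summary = {
    name: round(float(value), 4)
    for name, value in latest.items()
    if name.startswith("cosine-")
}
print(int(latest["steps"]), summary["cosine-NDCG@10"], summary["cosine-MRR@10"])
```

The rounded values agree with the `chess-ir-tokens` column of the README table in this commit (NDCG@10 of 0.0964, MRR@10 of 0.1605).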
eval/Information-Retrieval_evaluation_chess-ir_results.csv
CHANGED
|
@@ -1,7 +1,2 @@
|
|
| 1 |
epoch,steps,cosine-Accuracy@1,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
|
| 2 |
-
0.
|
| 3 |
-
0.0351123595505618,100,0.015,0.135,0.015,0.005,0.016,0.05333333333333333,0.04136111111111111,0.03352606053277749,0.025214543549657912
|
| 4 |
-
0.05266853932584269,150,0.02,0.12,0.02,0.006666666666666666,0.0155,0.051666666666666666,0.04391468253968253,0.034539315152376744,0.02851338765635309
|
| 5 |
-
0.0702247191011236,200,0.03,0.16,0.03,0.009999999999999998,0.02,0.06666666666666667,0.05857142857142858,0.045080933582823335,0.033163497941181515
|
| 6 |
-
0.0877808988764045,250,0.025,0.14,0.025,0.008333333333333333,0.017,0.056666666666666664,0.049240079365079355,0.037406426241984,0.02874627448743367
|
| 7 |
-
0.10533707865168539,300,0.025,0.125,0.025,0.008333333333333333,0.016,0.05333333333333333,0.053103174603174604,0.03923902062478621,0.03190843674305716
|
|
|
|
| 1 |
epoch,steps,cosine-Accuracy@1,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
|
| 2 |
+
0.10007022471910113,285,0.02,0.135,0.02,0.006666666666666666,0.0175,0.05833333333333333,0.05090277777777777,0.040260232965004236,0.03468285594907049
|
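The columns in this file are standard ranking metrics. The sketch below computes four of them for a single query with the usual textbook definitions; the evaluator averages such per-query values over all queries. The document IDs and function names are made up for illustration.

```python
def accuracy_at_k(ranked, relevant, k):
    """1.0 if at least one of the top-k results is relevant, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k slots filled by relevant documents."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents found in the top k."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant result within the top k."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical ranking for one query with three relevant documents.
ranked = ["d7", "d2", "d9", "d4", "d1"]
relevant = {"d2", "d4", "d8"}

print(accuracy_at_k(ranked, relevant, 1))
print(precision_at_k(ranked, relevant, 5))
print(recall_at_k(ranked, relevant, 5))
print(mrr_at_k(ranked, relevant, 10))
```

With a fixed number R of relevant documents per query, recall@k equals precision@k times k / R. In the chess-ir row above, recall@1 (0.00667) is one third of precision@1 (0.02), and recall@10 (0.0583) is 10/3 of precision@10 (0.0175). That would be consistent with three relevant documents per query, though this is an inference from the numbers and not something the commit states.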
model.safetensors
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
-
size
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c50ae1fdd13646f6ccd8502b934c8d7f1ac91ee33935efefe06cbb8bd4c6cdd4
|
| 3 |
+
size 8880224
|
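The `model.safetensors` entry above is a Git LFS pointer, not the weights themselves: three `key value` lines giving the spec version, the object hash, and the size in bytes. The sketch below parses such a pointer into a dictionary. The function name `parse_lfs_pointer` and the output keys are hypothetical choices.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the new side of the model.safetensors diff.
POINTER = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:c50ae1fdd13646f6ccd8502b934c8d7f1ac91ee33935efefe06cbb8bd4c6cdd4\n"
    "size 8880224\n"
)

info = parse_lfs_pointer(POINTER)
print(info["hash_algorithm"], info["size_bytes"])
```

Comparing `digest` and `size_bytes` against a downloaded file is one way to confirm that the local weights match this commit.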
tokenizer.json
CHANGED
|
The diff for this file is too large to render.
|
|
|
training_args.bin
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
size 5713
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:d1f79a123f09dc75fd3488fe5caef388a8c542815dabe7ec16811867955b17a2
|
| 3 |
size 5713
|