---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:46338
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: Snowflake/snowflake-arctic-embed-l
widget:
- source_sentence: What criteria must Member States consider when establishing penalties
for infringements of the specified Regulation, and what is the deadline for notifying
the Commission about these rules?
sentences:
- 'Enforcement
1.
Member States shall lay down the rules on penalties applicable to infringements
of this Regulation and shall take all measures necessary to ensure that they are
implemented. The penalties provided for must be effective, proportionate and dissuasive
taking into account, in particular, the nature, duration, recurrence and gravity
of the infringement. Member States shall, by 31 December 2024, notify the Commission
of those rules and of those measures and shall notify it without delay of any
subsequent amendment affecting them.
2.'
- Within the transitional periods established, Member States shall progressively
reduce their respective gaps with regard to the new minimum levels of taxation.
However, where the difference between the national level and the minimum level
does not exceed 3 % of that minimum level, the Member State concerned may wait
until the end of the period to adjust its national level.
- 'AR 10. ‘Indirect political contribution’ refers to those political contributions
made through an intermediary organisation such as a lobbyist or charity, or support
given to an organisation such as a think tank or trade association linked to or
supporting particular political parties or causes.
AR 11. When determining ‘comparable position’ in this standard, the undertaking
shall consider various factors, including level of responsibility and scope of
activities undertaken.
AR 12. The undertaking may provide the following information on its financial
or in-kind contributions with regard to its lobbying expenses:
(a)
the total monetary amount of such internal and external expenses; and
(b)'
- source_sentence: How does the use of AI systems impact access to essential public
assistance benefits and services?
sentences:
- X. Among these substances there are ‘priority hazardous substances’ which means
substances identified in accordance with Article 16(3) and (6) for which measures
have to be taken in accordance with Article 16(1) and (8). --- --- 31. ‘Pollutant’means
any substance liable to cause pollution, in particular those listed in Annex VIII.
--- --- 32. ‘Direct discharge to groundwater’means discharge of pollutants into
groundwater without percolation throughout the soil or subsoil. --- --- 33. ‘Pollution’means
the direct or indirect introduction, as a result of human activity, of substances
or heat into the air, water or land which may be harmful to human health or the
quality of aquatic ecosystems or terrestrial ecosystems directly depending on
- 'The competent authorities shall inform the requesting competent authorities of
any decision taken under the first subparagraph, stating the reasons therefor.
4.
In order to ensure uniform application of this Article, ESMA may develop draft
implementing technical standards to establish common procedures for competent
authorities to cooperate in on-the-spot verifications and investigations.
Power is conferred on the Commission to adopt the implementing technical standards
referred to in the first subparagraph in accordance with Article 15 of Regulation
(EU) No 1095/2010.
Article 55
Dispute settlement'
- (58) Another area in which the use of AI systems deserves special consideration
is the access to and enjoyment of certain essential private and public services
and benefits necessary for people to fully participate in society or to improve
one’s standard of living. In particular, natural persons applying for or receiving
essential public assistance benefits and services from public authorities namely
healthcare services, social security benefits, social services providing protection
in cases such as maternity, illness, industrial accidents, dependency or old age
and loss of employment and social and housing assistance, are typically dependent
on those benefits and services and in a vulnerable position in relation to the
responsible
- source_sentence: How does the context suggest promoting vulnerable customers' active
engagement in the energy market?
sentences:
- energy efficiency improvement measures as priority actions; --- --- (c) carry
out early, forward-looking investments in energy efficiency improvement measures
before distributional impacts from other policies and measures show their effect;
--- --- (d) foster technical assistance and the roll-out of enabling funding and
financial tools, such as on-bill schemes, local loan-loss reserve, guarantee funds,
funds targeting deep renovations and renovations with minimum energy gains; ---
--- (e) foster technical assistance for social actors to promote vulnerable customer’s
active engagement in the energy market, and positive changes in their energy consumption
behaviour; --- --- (f) ensure access to finance, grants or subsidies bound to minimum
- '4.
To the extent that the tasks relating to the implementation of the Innovation
Fund are not delegated to an implementing body, the Commission shall carry out
those tasks.
Article 18
Tasks of the implementing body
►M2 The implementing body designated in accordance with Article 17(1) of this
Regulation to implement the Innovation Fund in accordance with Article 17(2) may
be entrusted with the overall management of the calls for proposals, the disbursement
of the Innovation Fund support and the monitoring of the implementation of selected
projects. ◄ For that purpose, the implementing body may be entrusted with the
following tasks:
(a)
organising the call for proposals;
(b)'
- 'Calculation
Calculations of emissions shall be performed using the formula:
Activity data × Emission factor × Oxidation factor
Activity data (fuel used, production rate etc.) shall be monitored on the basis
of supply data or measurement.'
- source_sentence: What is the purpose of Directive 2004/109/EC of the European Parliament
and of the Council of 15 December 2004?
sentences:
- '3.7. Uses advised against ►M7 (see Section 1 of the safety data sheet) ◄
Where applicable, an indication of the uses which the registrant advises against
and why (i.e. non-statutory recommendations by supplier). This need not be an
exhaustive list.
4. CLASSIFICATION AND LABELLING
▼M3
4.1 The hazard classification of the substance(s), resulting from the application
of Title I and II of Regulation (EC) No 1272/2008 for all hazard classes and categories
in that Regulation,
In addition, for each entry, the reasons why no classification is given for a
hazard class or differentiation of a hazard class should be provided (i.e. if
data are lacking, inconclusive, or conclusive but not sufficient for classification),'
- '(b)
operations by which the user of an energy product makes its reuse possible in
his own undertaking provided that the taxation already paid on such product is
not less than the taxation which would be due if the reused energy product were
again to be liable to taxation;
(c)
an operation consisting of mixing, outside a production establishment or a tax
warehouse, energy products with other energy products or other materials, provided
that:
(i)
taxation on the components has been paid previously; and
(ii)
the amount paid is not less than the amount of the tax which would be chargeable
on the mixture.
The condition under (i) shall not apply where the mixture is exempted for a specific
use.
Article 22'
- '( 15 ) Directive 2004/109/EC of the European Parliament and of the Council of
15 December 2004 on the harmonisation of transparency requirements in relation
to information about issuers whose securities are admitted to trading on a regulated
market and amending Directive 2001/34/EC (OJ L 390, 31.12.2004, p. 38).
( 16 ) Regulation (EU) 2020/852 of the European Parliament and of the Council
of 18 June 2020 on the establishment of a framework to facilitate sustainable
investment, and amending Regulation (EU) 2019/2088 (OJ L 198, 22.6.2020, p. 13).
( 17 ) OJ L 142, 30.4.2004, p. 12.
( 18 ) OJ L 340, 22.12.2007, p. 66.'
- source_sentence: What are the main objectives of the directives mentioned in the
text regarding greenhouse gas emissions and carbon dioxide storage, and how do
they relate to environmental protection and sustainability within the European
Union?
sentences:
- '(24) Directive 2003/87/EC of the European Parliament and of the Council of 13
October 2003 establishing a scheme for greenhouse gas emission allowance trading
within the Union and amending Council Directive 96/61/EC (OJ L 275, 25.10.2003,
p. 32).
(25) Directive 2009/31/EC of the European Parliament and of the Council of 23
April 2009 on the geological storage of carbon dioxide and amending Council Directive
85/337/EEC, European Parliament and Council Directives 2000/60/EC, 2001/80/EC,
2004/35/EC, 2006/12/EC, 2008/1/EC and Regulation (EC) No 1013/2006 (OJ L 140,
5.6.2009, p. 114).
(26) Directive 2014/23/EU of the European Parliament and of the Council of 26
February 2014 on the award of concession contracts (OJ L 94, 28.3.2014, p. 1).'
- 'Article 33
Responsibility and liability for drawing up and publishing the financial statements
and the management report
▼M4
1.'
- '(b)
risks related to the undertaking’s dependencies on consumers and/or end-users
may include the loss of business continuity where an economic crisis makes consumers
unable to afford certain products or services;
(c)
►C1 opportunities related to the undertaking’s impacts on consumers and/or end-
users may include market differentiation and greater customer appeal from offering
safe products or privacy-respecting services; and ◄
(d)'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: Unknown
type: unknown
metrics:
- type: cosine_accuracy@1
value: 0.6659761781460383
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.8841705506645952
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.9312963921974797
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9672017952701536
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.6659761781460383
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.29472351688819837
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.1862592784394959
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09672017952701535
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.6659761781460383
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.8841705506645952
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.9312963921974797
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9672017952701536
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.8278291318026204
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7818480980055302
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.783515504381956
name: Cosine Map@100
---
# SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [Snowflake/snowflake-arctic-embed-l](https://huggingface.co/Snowflake/snowflake-arctic-embed-l). It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [Snowflake/snowflake-arctic-embed-l](https://huggingface.co/Snowflake/snowflake-arctic-embed-l) <!-- at revision d8fb21ca8d905d2832ee8b96c894d3298964346b -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
```
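The three modules above can be read as a small pipeline: the transformer produces one vector per token, the pooling layer keeps only the first (`[CLS]`) token vector because `pooling_mode_cls_token` is `True`, and `Normalize()` rescales that vector to unit length. The following is a minimal pure-Python sketch of the last two steps on toy data; the function name `cls_pool_and_normalize` and the 4-dimensional vectors are illustrative assumptions, not part of the library.

```python
import math

def cls_pool_and_normalize(token_embeddings):
    """Keep the first ([CLS]) token vector and rescale it to unit L2 norm."""
    cls_vector = token_embeddings[0]  # mirrors pooling_mode_cls_token=True
    norm = math.sqrt(sum(x * x for x in cls_vector))
    if norm == 0:
        return list(cls_vector)  # avoid dividing by zero on an all-zero vector
    return [x / norm for x in cls_vector]  # mirrors Normalize()

# Toy token embeddings: 3 tokens, 4 dimensions (the real model uses 1024).
tokens = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS]
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.2, 0.1, 0.9],
]
sentence_embedding = cls_pool_and_normalize(tokens)
print(sentence_embedding)
```

Because the output has unit norm, the dot product of two such embeddings equals their cosine similarity.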
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub ("sentence_transformers_model_id" is a placeholder; replace it with this model's Hub ID)
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'What are the main objectives of the directives mentioned in the text regarding greenhouse gas emissions and carbon dioxide storage, and how do they relate to environmental protection and sustainability within the European Union?',
'(24) Directive 2003/87/EC of the European Parliament and of the Council of 13 October 2003 establishing a scheme for greenhouse gas emission allowance trading within the Union and amending Council Directive 96/61/EC (OJ L 275, 25.10.2003, p. 32).\n\n(25) Directive 2009/31/EC of the European Parliament and of the Council of 23 April 2009 on the geological storage of carbon dioxide and amending Council Directive 85/337/EEC, European Parliament and Council Directives 2000/60/EC, 2001/80/EC, 2004/35/EC, 2006/12/EC, 2008/1/EC and Regulation (EC) No 1013/2006 (OJ L 140, 5.6.2009, p. 114).\n\n(26) Directive 2014/23/EU of the European Parliament and of the Council of 26 February 2014 on the award of concession contracts (OJ L 94, 28.3.2014, p. 1).',
'Article 33\n\nResponsibility and liability for drawing up and publishing the financial statements and the management report\n\n▼M4\n\n1.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
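Since the model was trained with `MatryoshkaLoss` over the dimensions 1024, 768, 512, 256, 128 and 64, it should be possible to shorten the embeddings to one of those sizes by keeping the leading values and re-normalizing. In Sentence Transformers this is usually done by passing `truncate_dim` when constructing `SentenceTransformer`. The sketch below shows only the underlying arithmetic on toy 8-dimensional vectors; the helper names `truncate_and_normalize` and `cosine` are hypothetical, and no retrieval quality at reduced dimensions is reported in this card.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale them to unit L2 norm."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for two full-size embeddings (8 dimensions instead of 1024).
full_a = [0.5, 0.4, 0.3, 0.2, 0.1, 0.1, 0.05, 0.05]
full_b = [0.45, 0.42, 0.28, 0.25, 0.0, 0.2, 0.1, 0.0]

# Truncate to 4 dimensions (in practice: one of the trained Matryoshka sizes).
short_a = truncate_and_normalize(full_a, 4)
short_b = truncate_and_normalize(full_b, 4)

print(cosine(full_a, full_b), cosine(short_a, short_b))
```

Shorter vectors reduce index size and search cost; whether the accuracy trade-off is acceptable has to be measured on your own data.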
<!--
### Direct Usage (Transformers)
<details><summary>Click to see the direct usage in Transformers</summary>
</details>
-->
<!--
### Downstream Usage (Sentence Transformers)
You can finetune this model on your own dataset.
<details><summary>Click to expand</summary>
</details>
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
## Evaluation
### Metrics
#### Information Retrieval
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.666 |
| cosine_accuracy@3 | 0.8842 |
| cosine_accuracy@5 | 0.9313 |
| cosine_accuracy@10 | 0.9672 |
| cosine_precision@1 | 0.666 |
| cosine_precision@3 | 0.2947 |
| cosine_precision@5 | 0.1863 |
| cosine_precision@10 | 0.0967 |
| cosine_recall@1 | 0.666 |
| cosine_recall@3 | 0.8842 |
| cosine_recall@5 | 0.9313 |
| cosine_recall@10 | 0.9672 |
| **cosine_ndcg@10** | **0.8278** |
| cosine_mrr@10 | 0.7818 |
| cosine_map@100 | 0.7835 |
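In the table above, `cosine_recall@k` equals `cosine_accuracy@k` and `cosine_precision@k` equals `cosine_accuracy@k` divided by `k`, which is consistent with each evaluation query having exactly one relevant passage. Under that assumption, the metrics can be sketched from the rank at which the relevant passage is retrieved. This is an illustrative re-implementation with a hypothetical `retrieval_metrics` function, not the evaluator's own code.

```python
import math

def retrieval_metrics(ranks, k):
    """Compute @k metrics when every query has exactly one relevant passage.

    ranks: 1-based rank of the relevant passage per query, or None if it
           was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most 1 relevant result among k
        "recall": accuracy,         # only 1 relevant passage exists
        "mrr": sum(1.0 / r for r, h in zip(ranks, hits) if h) / n,
        # With a single relevant passage the ideal DCG is 1, so NDCG = DCG.
        "ndcg": sum(1.0 / math.log2(r + 1) for r, h in zip(ranks, hits) if h) / n,
    }

# Five toy queries: relevant passage found at ranks 1, 2, 1, 4, and not found.
metrics_at_3 = retrieval_metrics([1, 2, 1, 4, None], k=3)
print(metrics_at_3)
```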
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Training Dataset
#### Unnamed Dataset
* Size: 46,338 training samples
* Columns: <code>sentence_0</code> and <code>sentence_1</code>
* Approximate statistics based on the first 1000 samples:
| | sentence_0 | sentence_1 |
|:--------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 11 tokens</li><li>mean: 35.24 tokens</li><li>max: 206 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 193.39 tokens</li><li>max: 512 tokens</li></ul> |
* Samples:
| sentence_0 | sentence_1 |
|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>How is materiality defined in the context of an entity's sustainability reporting as per QC 4?</code> | <code>QC 4. Materiality is an entity-specific aspect of relevance based on the nature or magnitude, or both, of the items to which the information relates, as assessed in the context of the undertaking’s sustainability reporting (see chapter 3 of this Standard).<br><br>Faithful representation<br><br>QC 5. To be useful, the information must not only represent relevant phenomena, it must also faithfully represent the substance of the phenomena that it purports to represent. Faithful representation requires information to be (i) complete, (ii) neutral and (iii) accurate.</code> |
| <code>What procedure must be followed for the adoption of implementing acts as mentioned in the text?</code> | <code>Those implementing acts shall be adopted in accordance with the examination procedure referred to in Article 22a(2).<br><br>3.<br><br>Articles 9, 9a and 10 shall apply to maritime transport activities in the same manner as they apply to other activities covered by the EU ETS with the following exception with regard to the application of Article 10.</code> |
| <code>How should monitoring points be distributed for groundwater bodies that flow across Member State boundaries to effectively estimate groundwater flow?</code> | <code>The network shall include sufficient representative monitoring points to estimate the groundwater level in each groundwater body or group of bodies taking into account short and long-term variations in recharge and in particular:<br><br>— for groundwater bodies identified as being at risk of failing to achieve environmental objectives under Article 4, ensure sufficient density of monitoring points to assess the impact of abstractions and discharges on the groundwater level,<br><br>— for groundwater bodies within which groundwater flows across a Member State boundary, ensure sufficient monitoring points are provided to estimate the direction and rate of groundwater flow across the Member State boundary.<br><br>2.2.3. Monitoring frequency</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
```json
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
1024,
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
```
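The configuration above wraps `MultipleNegativesRankingLoss` in `MatryoshkaLoss`: the ranking loss is evaluated on the embeddings truncated to each listed dimension, and the results are combined as a weighted sum (all weights are 1 here). The following is a simplified pure-Python sketch of that idea on toy vectors. The function names are hypothetical, the similarity scale of 20 is assumed to match the library default, and the real implementation operates on batched tensors.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mnr_loss(queries, passages, scale=20.0):
    """Cross-entropy where passages[i] is the positive for queries[i] and
    every other passage in the batch acts as a negative."""
    total = 0.0
    for i, query in enumerate(queries):
        logits = [scale * cosine(query, passage) for passage in passages]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(queries)

def matryoshka_loss(queries, passages, dims, weights):
    """Weighted sum of the ranking loss over truncated embeddings."""
    return sum(
        w * mnr_loss([q[:d] for q in queries], [p[:d] for p in passages])
        for d, w in zip(dims, weights)
    )

# Toy batch: 2 (question, passage) pairs with 4-dimensional embeddings.
queries = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
passages = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]

aligned = matryoshka_loss(queries, passages, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(queries, passages[::-1], dims=[4, 2], weights=[1, 1])
print(aligned, swapped)
```

The loss is low when each question is closest to its own passage at every truncation size, which is what encourages the leading dimensions to carry most of the signal.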
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `multi_dataset_batch_sampler`: round_robin
#### All Hyperparameters
<details><summary>Click to expand</summary>
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin
</details>
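With `lr_scheduler_type: linear`, `learning_rate: 5e-05` and no warmup, the learning rate decays in a straight line from 5e-05 to 0 over the run. The sketch below reproduces that schedule as a plain function; `linear_lr` is a hypothetical helper, and the total of 17,379 steps is taken from the final row of the training logs.

```python
def linear_lr(step, base_lr=5e-5, total_steps=17379, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, at the end of each epoch, and at the final step.
for step in (0, 5793, 11586, 17379):
    print(step, linear_lr(step))
```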
### Training Logs
| Epoch | Step | Training Loss | cosine_ndcg@10 |
|:------:|:-----:|:-------------:|:--------------:|
| 0.0863 | 500 | 0.938 | - |
| 0.1726 | 1000 | 0.2188 | - |
| 0.2589 | 1500 | 0.1998 | - |
| 0.3452 | 2000 | 0.2162 | 0.7843 |
| 0.4316 | 2500 | 0.1921 | - |
| 0.5179 | 3000 | 0.1749 | - |
| 0.6042 | 3500 | 0.1741 | - |
| 0.6905 | 4000 | 0.2007 | 0.7779 |
| 0.7768 | 4500 | 0.1456 | - |
| 0.8631 | 5000 | 0.1034 | - |
| 0.9494 | 5500 | 0.1285 | - |
| 1.0 | 5793 | - | 0.7806 |
| 1.0357 | 6000 | 0.1011 | 0.7879 |
| 1.1220 | 6500 | 0.065 | - |
| 1.2084 | 7000 | 0.0754 | - |
| 1.2947 | 7500 | 0.067 | - |
| 1.3810 | 8000 | 0.059 | 0.7953 |
| 1.4673 | 8500 | 0.0644 | - |
| 1.5536 | 9000 | 0.0705 | - |
| 1.6399 | 9500 | 0.0425 | - |
| 1.7262 | 10000 | 0.0515 | 0.8171 |
| 1.8125 | 10500 | 0.0358 | - |
| 1.8988 | 11000 | 0.0515 | - |
| 1.9852 | 11500 | 0.043 | - |
| 2.0 | 11586 | - | 0.8201 |
| 2.0715 | 12000 | 0.0257 | 0.8208 |
| 2.1578 | 12500 | 0.0343 | - |
| 2.2441 | 13000 | 0.0307 | - |
| 2.3304 | 13500 | 0.0324 | - |
| 2.4167 | 14000 | 0.0225 | 0.8236 |
| 2.5030 | 14500 | 0.0362 | - |
| 2.5893 | 15000 | 0.0255 | - |
| 2.6756 | 15500 | 0.0203 | - |
| 2.7620 | 16000 | 0.0244 | 0.8240 |
| 2.8483 | 16500 | 0.0461 | - |
| 2.9346 | 17000 | 0.0226 | - |
| 3.0 | 17379 | - | 0.8278 |
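The step counts in the log follow from the dataset size and batch size reported above. The short calculation below assumes a single device and no gradient accumulation (`gradient_accumulation_steps` is 1), so one step consumes one batch of 8 samples.

```python
import math

train_samples = 46338  # training set size from this card
batch_size = 8         # per_device_train_batch_size
epochs = 3             # num_train_epochs

# The last batch of each epoch is partial and still counts as a step
# because dataloader_drop_last is False.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

# The "Epoch" column is the step index divided by the steps per epoch.
epoch_at_step_500 = round(500 / steps_per_epoch, 4)

print(steps_per_epoch, total_steps, epoch_at_step_500)
```

These values match the log rows at steps 5793, 11586 and 17379, and the first row at epoch 0.0863.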
### Framework Versions
- Python: 3.10.15
- Sentence Transformers: 3.4.1
- Transformers: 4.49.0
- PyTorch: 2.6.0+cu126
- Accelerate: 1.5.2
- Datasets: 3.4.1
- Tokenizers: 0.21.1
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```
#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->