---

tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:46338
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: Snowflake/snowflake-arctic-embed-l
widget:
- source_sentence: What criteria must Member States consider when establishing penalties
    for infringements of the specified Regulation, and what is the deadline for notifying
    the Commission about these rules?
  sentences:
  - 'Enforcement





    1.





    Member States shall lay down the rules on penalties applicable to infringements

    of this Regulation and shall take all measures necessary to ensure that they are

    implemented. The penalties provided for must be effective, proportionate and dissuasive

    taking into account, in particular, the nature, duration, recurrence and gravity

    of the infringement. Member States shall, by 31 December 2024, notify the Commission

    of those rules and of those measures and shall notify it without delay of any

    subsequent amendment affecting them.





    2.'
  - Within the transitional periods established, Member States shall progressively
    reduce their respective gaps with regard to the new minimum levels of taxation.
    However, where the difference between the national level and the minimum level
    does not exceed 3 % of that minimum level, the Member State concerned may wait
    until the end of the period to adjust its national level.
  - 'AR 10. ‘Indirect political contribution’ refers to those political contributions

    made through an intermediary organisation such as a lobbyist or charity, or support

    given to an organisation such as a think tank or trade association linked to or

    supporting particular political parties or causes.





    AR 11. When determining ‘comparable position’ in this standard, the undertaking

    shall consider various factors, including level of responsibility and scope of

    activities undertaken.





    AR 12. The undertaking may provide the following information on its financial

    or in-kind contributions with regard to its lobbying expenses:





    (a)





    the total monetary amount of such internal and external expenses; and





    (b)'
- source_sentence: How does the use of AI systems impact access to essential public
    assistance benefits and services?
  sentences:
  - X. Among these substances there are ‘priority hazardous substances’ which means
    substances identified in accordance with Article 16(3) and (6) for which measures
    have to be taken in accordance with Article 16(1) and (8). --- --- 31. ‘Pollutant’means
    any substance liable to cause pollution, in particular those listed in Annex VIII.
    --- --- 32. ‘Direct discharge to groundwater’means discharge of pollutants into
    groundwater without percolation throughout the soil or subsoil. --- --- 33. ‘Pollution’means
    the direct or indirect introduction, as a result of human activity, of substances
    or heat into the air, water or land which may be harmful to human health or the
    quality of aquatic ecosystems or terrestrial ecosystems directly depending on
  - 'The competent authorities shall inform the requesting competent authorities of

    any decision taken under the first subparagraph, stating the reasons therefor.





    4.





    In order to ensure uniform application of this Article, ESMA may develop draft

    implementing technical standards to establish common procedures for competent

    authorities to cooperate in on-the-spot verifications and investigations.





    Power is conferred on the Commission to adopt the implementing technical standards

    referred to in the first subparagraph in accordance with Article 15 of Regulation

    (EU) No 1095/2010.





    Article 55





    Dispute settlement'
  - (58) Another area in which the use of AI systems deserves special consideration
    is the access to and enjoyment of certain essential private and public services
    and benefits necessary for people to fully participate in society or to improve
    one’s standard of living. In particular, natural persons applying for or receiving
    essential public assistance benefits and services from public authorities namely
    healthcare services, social security benefits, social services providing protection
    in cases such as maternity, illness, industrial accidents, dependency or old age
    and loss of employment and social and housing assistance, are typically dependent
    on those benefits and services and in a vulnerable position in relation to the
    responsible
- source_sentence: How does the context suggest promoting vulnerable customers' active
    engagement in the energy market?
  sentences:
  - energy efficiency improvement measures as priority actions; --- --- (c) carry
    out early, forward-looking investments in energy efficiency improvement measures
    before distributional impacts from other policies and measures show their effect;
    --- --- (d) foster technical assistance and the roll-out of enabling funding and
    financial tools, such as on-bill schemes, local loan-loss reserve, guarantee funds,
    funds targeting deep renovations and renovations with minimum energy gains; ---
    --- (e) foster technical assistance for social actors to promote vulnerable customer’s
    active engagement in the energy market, and positive changes in their energy consumption
    behaviour; --- --- (f) ensure access to finance, grants or subsidies bound to minimum
  - '4.





    To the extent that the tasks relating to the implementation of the Innovation

    Fund are not delegated to an implementing body, the Commission shall carry out

    those tasks.





    Article 18





    Tasks of the implementing body





    ►M2 The implementing body designated in accordance with Article 17(1) of this

    Regulation to implement the Innovation Fund in accordance with Article 17(2) may

    be entrusted with the overall management of the calls for proposals, the disbursement

    of the Innovation Fund support and the monitoring of the implementation of selected

    projects. ◄ For that purpose, the implementing body may be entrusted with the

    following tasks:





    (a)





    organising the call for proposals;





    (b)'
  - 'Calculation





    Calculations of emissions shall be performed using the formula:





    Activity data × Emission factor × Oxidation factor





    Activity data (fuel used, production rate etc.) shall be monitored on the basis

    of supply data or measurement.'
- source_sentence: What is the purpose of Directive 2004/109/EC of the European Parliament
    and of the Council of 15 December 2004?
  sentences:
  - '3.7. Uses advised against ►M7 (see Section 1 of the safety data sheet) ◄





    Where applicable, an indication of the uses which the registrant advises against

    and why (i.e. non-statutory recommendations by supplier). This need not be an

    exhaustive list.





    4. CLASSIFICATION AND LABELLING





    ▼M3





    4.1 The hazard classification of the substance(s), resulting from the application

    of Title I and II of Regulation (EC) No 1272/2008 for all hazard classes and categories

    in that Regulation,





    In addition, for each entry, the reasons why no classification is given for a

    hazard class or differentiation of a hazard class should be provided (i.e. if

    data are lacking, inconclusive, or conclusive but not sufficient for classification),'
  - '(b)





    operations by which the user of an energy product makes its reuse possible in

    his own undertaking provided that the taxation already paid on such product is

    not less than the taxation which would be due if the reused energy product were

    again to be liable to taxation;





    (c)





    an operation consisting of mixing, outside a production establishment or a tax

    warehouse, energy products with other energy products or other materials, provided

    that:





    (i)





    taxation on the components has been paid previously; and





    (ii)





    the amount paid is not less than the amount of the tax which would be chargeable

    on the mixture.





    The condition under (i) shall not apply where the mixture is exempted for a specific

    use.





    Article 22'
  - '( 15 ) Directive 2004/109/EC of the European Parliament and of the Council of

    15 December 2004 on the harmonisation of transparency requirements in relation

    to information about issuers whose securities are admitted to trading on a regulated

    market and amending Directive 2001/34/EC (OJ L 390, 31.12.2004, p. 38).





    ( 16 ) Regulation (EU) 2020/852 of the European Parliament and of the Council

    of 18 June 2020 on the establishment of a framework to facilitate sustainable

    investment, and amending Regulation (EU) 2019/2088 (OJ L 198, 22.6.2020, p. 13).





    ( 17 ) OJ L 142, 30.4.2004, p. 12.





    ( 18 ) OJ L 340, 22.12.2007, p. 66.'
- source_sentence: What are the main objectives of the directives mentioned in the
    text regarding greenhouse gas emissions and carbon dioxide storage, and how do
    they relate to environmental protection and sustainability within the European
    Union?
  sentences:
  - '(24) Directive 2003/87/EC of the European Parliament and of the Council of 13

    October 2003 establishing a scheme for greenhouse gas emission allowance trading

    within the Union and amending Council Directive 96/61/EC (OJ L 275, 25.10.2003,

    p. 32).





    (25) Directive 2009/31/EC of the European Parliament and of the Council of 23

    April 2009 on the geological storage of carbon dioxide and amending Council Directive

    85/337/EEC, European Parliament and Council Directives 2000/60/EC, 2001/80/EC,

    2004/35/EC, 2006/12/EC, 2008/1/EC and Regulation (EC) No 1013/2006 (OJ L 140,

    5.6.2009, p. 114).





    (26) Directive 2014/23/EU of the European Parliament and of the Council of 26

    February 2014 on the award of concession contracts (OJ L 94, 28.3.2014, p. 1).'
  - 'Article 33





    Responsibility and liability for drawing up and publishing the financial statements

    and the management report





    ▼M4





    1.'
  - '(b)





    risks related to the undertaking’s dependencies on consumers and/or end-users

    may include the loss of business continuity where an economic crisis makes consumers

    unable to afford certain products or services;





    (c)





    ►C1 opportunities related to the undertaking’s impacts on consumers and/or end-

    users may include market differentiation and greater customer appeal from offering

    safe products or privacy-respecting services; and ◄





    (d)'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: Unknown
      type: unknown
    metrics:
    - type: cosine_accuracy@1
      value: 0.6659761781460383
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8841705506645952
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9312963921974797
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9672017952701536
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.6659761781460383
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.29472351688819837
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1862592784394959
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09672017952701535
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.6659761781460383
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.8841705506645952
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.9312963921974797
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9672017952701536
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8278291318026204
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7818480980055302
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.783515504381956
      name: Cosine Map@100
---


# SentenceTransformer based on Snowflake/snowflake-arctic-embed-l

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Snowflake/snowflake-arctic-embed-l](https://huggingface.co/Snowflake/snowflake-arctic-embed-l). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [Snowflake/snowflake-arctic-embed-l](https://huggingface.co/Snowflake/snowflake-arctic-embed-l) <!-- at revision d8fb21ca8d905d2832ee8b96c894d3298964346b -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
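
The architecture above pools with `pooling_mode_cls_token: True` and then applies `Normalize()`. As a rough illustration of those two steps, the sketch below takes the first token's vector from a toy token matrix and rescales it to unit length. The helper names `cls_pool` and `l2_normalize` and the 4-dimensional toy data are made up for this example; the real model works on 1024-dimensional BERT outputs.

```python
import math


def cls_pool(token_embeddings):
    """Return the vector of the first ([CLS]) token of one sequence."""
    return token_embeddings[0]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length, as a Normalize() layer would."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


# Toy "transformer output": 3 tokens x 4 dimensions (hypothetical values).
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.2, 0.1, 0.9],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output is unit length, the dot product of two such embeddings equals their cosine similarity, which matches the similarity function listed in the model description.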

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'What are the main objectives of the directives mentioned in the text regarding greenhouse gas emissions and carbon dioxide storage, and how do they relate to environmental protection and sustainability within the European Union?',
    '(24) Directive 2003/87/EC of the European Parliament and of the Council of 13 October 2003 establishing a scheme for greenhouse gas emission allowance trading within the Union and amending Council Directive 96/61/EC (OJ L 275, 25.10.2003, p. 32).\n\n(25) Directive 2009/31/EC of the European Parliament and of the Council of 23 April 2009 on the geological storage of carbon dioxide and amending Council Directive 85/337/EEC, European Parliament and Council Directives 2000/60/EC, 2001/80/EC, 2004/35/EC, 2006/12/EC, 2008/1/EC and Regulation (EC) No 1013/2006 (OJ L 140, 5.6.2009, p. 114).\n\n(26) Directive 2014/23/EU of the European Parliament and of the Council of 26 February 2014 on the award of concession contracts (OJ L 94, 28.3.2014, p. 1).',
    'Article 33\n\nResponsibility and liability for drawing up and publishing the financial statements and the management report\n\n▼M4\n\n1.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
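
The model was trained with `MatryoshkaLoss` over the dimensions 1024, 768, 512, 256, 128 and 64 (see Training Details), so shortened embeddings should remain usable. The sketch below shows the idea with plain Python lists: keep the first `dim` values and re-normalize before comparing. The helper names and the 4-dimensional toy vectors are assumptions for illustration only; recent Sentence Transformers releases also accept a `truncate_dim` argument when loading a model, if your installed version supports it.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


def cosine(a, b):
    """Dot product of two unit-length vectors, i.e. their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical unit-length "full" embeddings (the real ones have 1024 values).
full_a = [0.5, 0.5, 0.5, 0.5]
full_b = [0.5, 0.5, -0.5, 0.5]

short_a = truncate_and_normalize(full_a, 2)
short_b = truncate_and_normalize(full_b, 2)

print(cosine(full_a, full_b))    # 0.5
print(cosine(short_a, short_b))  # similarity using only the first 2 dimensions
```

Re-normalizing after truncation matters because a prefix of a unit vector is generally no longer unit length. Smaller dimensions trade some retrieval quality for lower storage and faster search; this card reports metrics for the full 1024 dimensions only.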

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval

* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.666      |
| cosine_accuracy@3   | 0.8842     |
| cosine_accuracy@5   | 0.9313     |
| cosine_accuracy@10  | 0.9672     |
| cosine_precision@1  | 0.666      |
| cosine_precision@3  | 0.2947     |
| cosine_precision@5  | 0.1863     |
| cosine_precision@10 | 0.0967     |
| cosine_recall@1     | 0.666      |
| cosine_recall@3     | 0.8842     |
| cosine_recall@5     | 0.9313     |
| cosine_recall@10    | 0.9672     |
| **cosine_ndcg@10**  | **0.8278** |
| cosine_mrr@10       | 0.7818     |
| cosine_map@100      | 0.7835     |
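
In the table above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k (for example 0.8842 / 3 ≈ 0.2947). That pattern is what you would expect if every query has exactly one relevant passage, although the card does not state this, so treat it as an inference. The sketch below computes the metrics under that assumption; `retrieval_metrics` and the toy ranks are hypothetical and are not the `InformationRetrievalEvaluator` implementation.

```python
def retrieval_metrics(ranks, k):
    """Compute accuracy, precision, recall and MRR at cutoff k.

    ranks: 1-based rank of the single relevant passage for each query,
           or None if it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r for r in ranks if r is not None and r <= k]
    accuracy = len(hits) / n          # share of queries with a hit in the top k
    recall = len(hits) / n            # identical when there is one relevant passage
    precision = len(hits) / (n * k)   # one hit out of k retrieved results
    mrr = sum(1.0 / r for r in hits) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "mrr": mrr}


# Five toy queries: relevant passage ranked 1st, 2nd, 1st, 4th, and not found.
ranks = [1, 2, 1, 4, None]
metrics_at_3 = retrieval_metrics(ranks, k=3)
print(metrics_at_3)
```

With these toy ranks, three of five queries have a hit in the top 3, so accuracy and recall are 0.6, precision is 0.2 and MRR is 0.5.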



<!--

## Bias, Risks and Limitations



*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*

-->



<!--

### Recommendations



*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*

-->



## Training Details



### Training Dataset



#### Unnamed Dataset



* Size: 46,338 training samples
* Columns: <code>sentence_0</code> and <code>sentence_1</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence_0 | sentence_1 |
  |:--------|:-----------|:-----------|
  | type    | string | string |
  | details | <ul><li>min: 11 tokens</li><li>mean: 35.24 tokens</li><li>max: 206 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 193.39 tokens</li><li>max: 512 tokens</li></ul> |
* Samples:
  | sentence_0 | sentence_1 |
  |:-----------|:-----------|
  | <code>How is materiality defined in the context of an entity's sustainability reporting as per QC 4?</code> | <code>QC 4. Materiality is an entity-specific aspect of relevance based on the nature or magnitude, or both, of the items to which the information relates, as assessed in the context of the undertaking’s sustainability reporting (see chapter 3 of this Standard).<br><br>Faithful representation<br><br>QC 5. To be useful, the information must not only represent relevant phenomena, it must also faithfully represent the substance of the phenomena that it purports to represent. Faithful representation requires information to be (i) complete, (ii) neutral and (iii) accurate.</code> |
  | <code>What procedure must be followed for the adoption of implementing acts as mentioned in the text?</code> | <code>Those implementing acts shall be adopted in accordance with the examination procedure referred to in Article 22a(2).<br><br>3.<br><br>Articles 9, 9a and 10 shall apply to maritime transport activities in the same manner as they apply to other activities covered by the EU ETS with the following exception with regard to the application of Article 10.</code> |
  | <code>How should monitoring points be distributed for groundwater bodies that flow across Member State boundaries to effectively estimate groundwater flow?</code> | <code>The network shall include sufficient representative monitoring points to estimate the groundwater level in each groundwater body or group of bodies taking into account short and long-term variations in recharge and in particular:<br><br>— for groundwater bodies identified as being at risk of failing to achieve environmental objectives under Article 4, ensure sufficient density of monitoring points to assess the impact of abstractions and discharges on the groundwater level,<br><br>— for groundwater bodies within which groundwater flows across a Member State boundary, ensure sufficient monitoring points are provided to estimate the direction and rate of groundwater flow across the Member State boundary.<br><br>2.2.3. Monitoring frequency</code> |

* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:

  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          1024,
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
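
To make the loss configuration above more concrete, here is a minimal pure-Python sketch of the idea: an in-batch-negatives ranking loss is evaluated on prefix-truncated embeddings at each Matryoshka dimension and the results are combined with the listed weights. This is an illustration under assumptions, not the library's code: the function names are made up, the similarity scale of 20.0 is assumed, and the toy vectors use 4 and 2 dimensions instead of 1024 down to 64.

```python
import math


def normalize(v):
    """Rescale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def mnr_loss(queries, passages, scale=20.0):
    """Cross-entropy over in-batch negatives: passage i is the positive for query i."""
    qs = [normalize(q) for q in queries]
    ps = [normalize(p) for p in passages]
    total = 0.0
    for i, q in enumerate(qs):
        logits = [scale * sum(a * b for a, b in zip(q, p)) for p in ps]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(qs)


def matryoshka_loss(queries, passages, dims, weights):
    """Weighted sum of the base loss over embeddings truncated to each dimension."""
    return sum(
        w * mnr_loss([q[:d] for q in queries], [p[:d] for p in passages])
        for d, w in zip(dims, weights)
    )


# Hypothetical batch of two (query, passage) pairs with 4-dimensional embeddings.
queries = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
passages = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]

loss = matryoshka_loss(queries, passages, dims=[4, 2], weights=[1, 1])
print(loss)
```

Since every prefix of the embedding is trained to rank the correct passage first, the leading dimensions carry most of the useful signal, which is what makes truncated embeddings practical.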



### Training Hyperparameters

#### Non-Default Hyperparameters



- `eval_strategy`: steps

- `multi_dataset_batch_sampler`: round_robin



#### All Hyperparameters

<details><summary>Click to expand</summary>



- `overwrite_output_dir`: False

- `do_predict`: False

- `eval_strategy`: steps

- `prediction_loss_only`: True

- `per_device_train_batch_size`: 8

- `per_device_eval_batch_size`: 8

- `per_gpu_train_batch_size`: None

- `per_gpu_eval_batch_size`: None

- `gradient_accumulation_steps`: 1

- `eval_accumulation_steps`: None

- `torch_empty_cache_steps`: None

- `learning_rate`: 5e-05

- `weight_decay`: 0.0

- `adam_beta1`: 0.9

- `adam_beta2`: 0.999

- `adam_epsilon`: 1e-08

- `max_grad_norm`: 1

- `num_train_epochs`: 3

- `max_steps`: -1

- `lr_scheduler_type`: linear

- `lr_scheduler_kwargs`: {}

- `warmup_ratio`: 0.0

- `warmup_steps`: 0

- `log_level`: passive

- `log_level_replica`: warning

- `log_on_each_node`: True

- `logging_nan_inf_filter`: True

- `save_safetensors`: True

- `save_on_each_node`: False

- `save_only_model`: False

- `restore_callback_states_from_checkpoint`: False

- `no_cuda`: False

- `use_cpu`: False

- `use_mps_device`: False

- `seed`: 42

- `data_seed`: None

- `jit_mode_eval`: False

- `use_ipex`: False

- `bf16`: False

- `fp16`: False

- `fp16_opt_level`: O1

- `half_precision_backend`: auto

- `bf16_full_eval`: False

- `fp16_full_eval`: False

- `tf32`: None

- `local_rank`: 0

- `ddp_backend`: None

- `tpu_num_cores`: None

- `tpu_metrics_debug`: False

- `debug`: []

- `dataloader_drop_last`: False

- `dataloader_num_workers`: 0

- `dataloader_prefetch_factor`: None

- `past_index`: -1

- `disable_tqdm`: False

- `remove_unused_columns`: True

- `label_names`: None

- `load_best_model_at_end`: False

- `ignore_data_skip`: False

- `fsdp`: []

- `fsdp_min_num_params`: 0

- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}

- `fsdp_transformer_layer_cls_to_wrap`: None

- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}

- `deepspeed`: None

- `label_smoothing_factor`: 0.0

- `optim`: adamw_torch

- `optim_args`: None

- `adafactor`: False

- `group_by_length`: False

- `length_column_name`: length

- `ddp_find_unused_parameters`: None

- `ddp_bucket_cap_mb`: None

- `ddp_broadcast_buffers`: False

- `dataloader_pin_memory`: True

- `dataloader_persistent_workers`: False

- `skip_memory_metrics`: True

- `use_legacy_prediction_loop`: False

- `push_to_hub`: False

- `resume_from_checkpoint`: None

- `hub_model_id`: None

- `hub_strategy`: every_save

- `hub_private_repo`: None

- `hub_always_push`: False

- `gradient_checkpointing`: False

- `gradient_checkpointing_kwargs`: None

- `include_inputs_for_metrics`: False

- `include_for_metrics`: []

- `eval_do_concat_batches`: True

- `fp16_backend`: auto

- `push_to_hub_model_id`: None

- `push_to_hub_organization`: None

- `mp_parameters`: 

- `auto_find_batch_size`: False

- `full_determinism`: False

- `torchdynamo`: None

- `ray_scope`: last

- `ddp_timeout`: 1800

- `torch_compile`: False

- `torch_compile_backend`: None

- `torch_compile_mode`: None

- `dispatch_batches`: None

- `split_batches`: None

- `include_tokens_per_second`: False

- `include_num_input_tokens_seen`: False

- `neftune_noise_alpha`: None

- `optim_target_modules`: None

- `batch_eval_metrics`: False

- `eval_on_start`: False

- `use_liger_kernel`: False

- `eval_use_gather_object`: False

- `average_tokens_across_devices`: False

- `prompts`: None

- `batch_sampler`: batch_sampler

- `multi_dataset_batch_sampler`: round_robin



</details>



### Training Logs

| Epoch  | Step  | Training Loss | cosine_ndcg@10 |
|:------:|:-----:|:-------------:|:--------------:|
| 0.0863 | 500   | 0.938         | -              |
| 0.1726 | 1000  | 0.2188        | -              |
| 0.2589 | 1500  | 0.1998        | -              |
| 0.3452 | 2000  | 0.2162        | 0.7843         |
| 0.4316 | 2500  | 0.1921        | -              |
| 0.5179 | 3000  | 0.1749        | -              |
| 0.6042 | 3500  | 0.1741        | -              |
| 0.6905 | 4000  | 0.2007        | 0.7779         |
| 0.7768 | 4500  | 0.1456        | -              |
| 0.8631 | 5000  | 0.1034        | -              |
| 0.9494 | 5500  | 0.1285        | -              |
| 1.0    | 5793  | -             | 0.7806         |
| 1.0357 | 6000  | 0.1011        | 0.7879         |
| 1.1220 | 6500  | 0.065         | -              |
| 1.2084 | 7000  | 0.0754        | -              |
| 1.2947 | 7500  | 0.067         | -              |
| 1.3810 | 8000  | 0.059         | 0.7953         |
| 1.4673 | 8500  | 0.0644        | -              |
| 1.5536 | 9000  | 0.0705        | -              |
| 1.6399 | 9500  | 0.0425        | -              |
| 1.7262 | 10000 | 0.0515        | 0.8171         |
| 1.8125 | 10500 | 0.0358        | -              |
| 1.8988 | 11000 | 0.0515        | -              |
| 1.9852 | 11500 | 0.043         | -              |
| 2.0    | 11586 | -             | 0.8201         |
| 2.0715 | 12000 | 0.0257        | 0.8208         |
| 2.1578 | 12500 | 0.0343        | -              |
| 2.2441 | 13000 | 0.0307        | -              |
| 2.3304 | 13500 | 0.0324        | -              |
| 2.4167 | 14000 | 0.0225        | 0.8236         |
| 2.5030 | 14500 | 0.0362        | -              |
| 2.5893 | 15000 | 0.0255        | -              |
| 2.6756 | 15500 | 0.0203        | -              |
| 2.7620 | 16000 | 0.0244        | 0.8240         |
| 2.8483 | 16500 | 0.0461        | -              |
| 2.9346 | 17000 | 0.0226        | -              |
| 3.0    | 17379 | -             | 0.8278         |
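
The Epoch and Step columns in the log above are related by a fixed number of steps per epoch, which can be read off the row where epoch 1.0 is logged at step 5793. The following is a minimal sketch of that arithmetic and of picking the best evaluation row; the helper name `epoch_at` and the `evals` list are illustrative assumptions, not part of the training code, and the values are copied by hand from the table.

```python
# Sketch: recover the "Epoch" column from the "Step" column of the log table.
# STEPS_PER_EPOCH is read off the table (epoch 1.0 is logged at step 5793).
# `epoch_at` is a hypothetical helper, not a Sentence Transformers function.
STEPS_PER_EPOCH = 5793


def epoch_at(step: int, steps_per_epoch: int = STEPS_PER_EPOCH) -> float:
    """Return the fractional epoch reached after `step` training steps."""
    return round(step / steps_per_epoch, 4)


# (step, cosine_ndcg@10) pairs copied from the rows that report an evaluation.
evals = [
    (2000, 0.7843), (4000, 0.7779), (5793, 0.7806), (6000, 0.7879),
    (8000, 0.7953), (10000, 0.8171), (11586, 0.8201), (12000, 0.8208),
    (14000, 0.8236), (16000, 0.8240), (17379, 0.8278),
]

# Pick the evaluation row with the highest cosine_ndcg@10.
best_step, best_ndcg = max(evals, key=lambda pair: pair[1])

print(epoch_at(500), epoch_at(best_step), best_ndcg)  # 0.0863 3.0 0.8278
```

In this log the highest logged cosine_ndcg@10 is the final evaluation at the end of epoch 3.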





### Framework Versions

- Python: 3.10.15

- Sentence Transformers: 3.4.1

- Transformers: 4.49.0

- PyTorch: 2.6.0+cu126

- Accelerate: 1.5.2

- Datasets: 3.4.1

- Tokenizers: 0.21.1
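
To check whether a local environment matches the versions listed above, one option is to compare them against the installed package metadata. This is a minimal sketch using only the standard library; the PyPI distribution names in `EXPECTED` and the helper names are assumptions made for illustration, and a PyTorch build for a different CUDA version (or CPU only) will report a different local suffix than `+cu126`.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above. The distribution names (left-hand side)
# are assumed to be the usual PyPI names for these libraries.
EXPECTED = {
    "sentence-transformers": "3.4.1",
    "transformers": "4.49.0",
    "torch": "2.6.0+cu126",
    "accelerate": "1.5.2",
    "datasets": "3.4.1",
    "tokenizers": "0.21.1",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def compare_versions(expected):
    """Return {name: (expected, installed)} for every package that differs."""
    mismatches = {}
    for name, want in expected.items():
        have = installed_version(name)
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


for name, (want, have) in compare_versions(EXPECTED).items():
    print(f"{name}: card lists {want}, environment has {have or 'nothing installed'}")
```

A mismatch does not necessarily mean the model will fail to load; the list only records the environment used for training.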



## Citation



### BibTeX



#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```



#### MatryoshkaLoss

```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```



#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```



<!--

## Glossary



*Clearly define terms in order to be accessible across audiences.*

-->



<!--

## Model Card Authors



*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*

-->



<!--

## Model Card Contact



*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*

-->