Janari01 committed on
Commit d2c8b18 · verified · 1 Parent(s): 5a73c6b

Update README.md

Files changed (1)
  1. README.md +451 -451
README.md CHANGED
@@ -1,452 +1,452 @@
1
- ---
2
- tags:
3
- - sentence-transformers
4
- - cross-encoder
5
- - generated_from_trainer
6
- - dataset_size:554403
7
- - loss:BinaryCrossEntropyLoss
8
- base_model: answerdotai/ModernBERT-base
9
- pipeline_tag: text-ranking
10
- library_name: sentence-transformers
11
- metrics:
12
- - map
13
- - mrr@10
14
- - ndcg@10
15
- model-index:
16
- - name: CrossEncoder based on answerdotai/ModernBERT-base
17
- results:
18
- - task:
19
- type: cross-encoder-reranking
20
- name: Cross Encoder Reranking
21
- dataset:
22
- name: s2orc dev
23
- type: s2orc-dev
24
- metrics:
25
- - type: map
26
- value: 0.8712
27
- name: Map
28
- - type: mrr@10
29
- value: 0.8711
30
- name: Mrr@10
31
- - type: ndcg@10
32
- value: 0.8765
33
- name: Ndcg@10
34
- - task:
35
- type: cross-encoder-reranking
36
- name: Cross Encoder Reranking
37
- dataset:
38
- name: NanoMSMARCO R100
39
- type: NanoMSMARCO_R100
40
- metrics:
41
- - type: map
42
- value: 0.4941
43
- name: Map
44
- - type: mrr@10
45
- value: 0.482
46
- name: Mrr@10
47
- - type: ndcg@10
48
- value: 0.5529
49
- name: Ndcg@10
50
- - task:
51
- type: cross-encoder-nano-beir
52
- name: Cross Encoder Nano BEIR
53
- dataset:
54
- name: NanoBEIR R100 mean
55
- type: NanoBEIR_R100_mean
56
- metrics:
57
- - type: map
58
- value: 0.4941
59
- name: Map
60
- - type: mrr@10
61
- value: 0.482
62
- name: Mrr@10
63
- - type: ndcg@10
64
- value: 0.5529
65
- name: Ndcg@10
66
- ---
67
-
68
- # CrossEncoder based on answerdotai/ModernBERT-base
69
-
70
- This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
71
-
72
- ## Model Details
73
-
74
- ### Model Description
75
- - **Model Type:** Cross Encoder
76
- - **Base model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) <!-- at revision 8949b909ec900327062f0ebf497f51aef5e6f0c8 -->
77
- - **Maximum Sequence Length:** 8192 tokens
78
- - **Number of Output Labels:** 1 label
79
- <!-- - **Training Dataset:** Unknown -->
80
- <!-- - **Language:** Unknown -->
81
- <!-- - **License:** Unknown -->
82
-
83
- ### Model Sources
84
-
85
- - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
86
- - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
87
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
88
- - **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
89
-
90
- ## Usage
91
-
92
- ### Direct Usage (Sentence Transformers)
93
-
94
- First install the Sentence Transformers library:
95
-
96
- ```bash
97
- pip install -U sentence-transformers
98
- ```
99
-
100
- Then you can load this model and run inference.
101
- ```python
102
- from sentence_transformers import CrossEncoder
103
-
104
- # Download from the 🤗 Hub
105
- model = CrossEncoder("cross_encoder_model_id")
106
- # Get scores for pairs of texts
107
- pairs = [
108
- ["Engineering students' understanding of the role of experimentation", 'Resource constraints have forced engineering schools to reduce laboratory provisions in undergraduate courses. In many instances hands-on experimentation has been replaced by demonstrations or computer simulations. Many engineering educators have cautioned against replacing experiments with simulations on the basis that this will lead to a misunderstanding of the role of experimentation in engineering practice. However, little is known about how students conceptualize the role of experimentation in developing engineering understanding. This study is based on interviews with third-year mechanical engineering students. Findings are presented on their perceptions in relation to the role of experimentation in developing engineering knowledge and practice.'],
109
- ["Engineering students' understanding of the role of experimentation", '"Excellent engineer training plan"was a core problem for cultivating students\' engineering ability,but at present the students in engineering ability and the enterprise demand disjointed phenomenon had more commons.Based on process equipment and control engineering as an example,for the general undergraduate colleges and universities to cultivate students\' engineering ability and enterprise demand disjointed phenomenon and the existing problems were analyzed,and the relevant approach was put forward,in order to improve students\' engineering ability to provide reference ideas.'],
110
- ["Engineering students' understanding of the role of experimentation", 'This paper contributes to the discussion of pedagogical training of engineering teachers based on a case study carried out in higher education institutions in Brazil, namely in Electrical Engineering. For this purpose, the authors chose to articulate two research methods: document analysis of the courses offered in the postgraduate programs (Master and PhD) in Electrical Engineering and a survey conducted with students and teachers from 58 of these postgraduate electrical engineering programs. The data analysis indicated that most of the teachers agreed that pedagogical training should be offered to engineering students. Postgraduate students also showed interest in enrolling courses with pedagogic focus. With this analysis we can state that there is a need to rethink engineering education, in order to create conditions for the development of competences related with teaching and learning innovation. This study shows the needs and presents some recommendations to deal with these issues in this field.'],
111
- ["Engineering students' understanding of the role of experimentation", 'Engineering practical teaching reform in higher institutions centers on improving students’ comprehensive quality,developing their innovative spirit and engineering practice ability,building teaching system for engineering training and demonstration center for engineering training.The article implements practical teaching reform on metalworking practice and electronic practice and provides students with a platform for integrated engineering training,leading them toward competence,quality and innovation development.'],
112
- ["Engineering students' understanding of the role of experimentation", 'Lisa Benson is an Associate Professor of Engineering and Science Education at Clemson University, with a joint appointment in Bioengineering. Her research focuses on the interactions between student motivation and their learning experiences. Her projects involve the study of student perceptions, beliefs and attitudes towards becoming engineers and scientists, and their problem solving processes. Other projects in the Benson group include effects of student-centered active learning, self-regulated learning, and incorporating engineering into secondary science and mathematics classrooms. Her education includes a B.S. in Bioengineering from the University of Vermont, and M.S. and Ph.D. in Bioengineering from Clemson University.'],
113
- ]
114
- scores = model.predict(pairs)
115
- print(scores.shape)
116
- # (5,)
117
-
118
- # Or rank different texts based on similarity to a single text
119
- ranks = model.rank(
120
- "Engineering students' understanding of the role of experimentation",
121
- [
122
- 'Resource constraints have forced engineering schools to reduce laboratory provisions in undergraduate courses. In many instances hands-on experimentation has been replaced by demonstrations or computer simulations. Many engineering educators have cautioned against replacing experiments with simulations on the basis that this will lead to a misunderstanding of the role of experimentation in engineering practice. However, little is known about how students conceptualize the role of experimentation in developing engineering understanding. This study is based on interviews with third-year mechanical engineering students. Findings are presented on their perceptions in relation to the role of experimentation in developing engineering knowledge and practice.',
123
- '"Excellent engineer training plan"was a core problem for cultivating students\' engineering ability,but at present the students in engineering ability and the enterprise demand disjointed phenomenon had more commons.Based on process equipment and control engineering as an example,for the general undergraduate colleges and universities to cultivate students\' engineering ability and enterprise demand disjointed phenomenon and the existing problems were analyzed,and the relevant approach was put forward,in order to improve students\' engineering ability to provide reference ideas.',
124
- 'This paper contributes to the discussion of pedagogical training of engineering teachers based on a case study carried out in higher education institutions in Brazil, namely in Electrical Engineering. For this purpose, the authors chose to articulate two research methods: document analysis of the courses offered in the postgraduate programs (Master and PhD) in Electrical Engineering and a survey conducted with students and teachers from 58 of these postgraduate electrical engineering programs. The data analysis indicated that most of the teachers agreed that pedagogical training should be offered to engineering students. Postgraduate students also showed interest in enrolling courses with pedagogic focus. With this analysis we can state that there is a need to rethink engineering education, in order to create conditions for the development of competences related with teaching and learning innovation. This study shows the needs and presents some recommendations to deal with these issues in this field.',
125
- 'Engineering practical teaching reform in higher institutions centers on improving students’ comprehensive quality,developing their innovative spirit and engineering practice ability,building teaching system for engineering training and demonstration center for engineering training.The article implements practical teaching reform on metalworking practice and electronic practice and provides students with a platform for integrated engineering training,leading them toward competence,quality and innovation development.',
126
- 'Lisa Benson is an Associate Professor of Engineering and Science Education at Clemson University, with a joint appointment in Bioengineering. Her research focuses on the interactions between student motivation and their learning experiences. Her projects involve the study of student perceptions, beliefs and attitudes towards becoming engineers and scientists, and their problem solving processes. Other projects in the Benson group include effects of student-centered active learning, self-regulated learning, and incorporating engineering into secondary science and mathematics classrooms. Her education includes a B.S. in Bioengineering from the University of Vermont, and M.S. and Ph.D. in Bioengineering from Clemson University.',
127
- ]
128
- )
129
- # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
130
- ```
131
-
132
- <!--
133
- ### Direct Usage (Transformers)
134
-
135
- <details><summary>Click to see the direct usage in Transformers</summary>
136
-
137
- </details>
138
- -->
139
-
140
- <!--
141
- ### Downstream Usage (Sentence Transformers)
142
-
143
- You can finetune this model on your own dataset.
144
-
145
- <details><summary>Click to expand</summary>
146
-
147
- </details>
148
- -->
149
-
150
- <!--
151
- ### Out-of-Scope Use
152
-
153
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
154
- -->
155
-
156
- ## Evaluation
157
-
158
- ### Metrics
159
-
160
- #### Cross Encoder Reranking
161
-
162
- * Dataset: `s2orc-dev`
163
- * Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
164
- ```json
165
- {
166
- "at_k": 10,
167
- "always_rerank_positives": false
168
- }
169
- ```
170
-
171
- | Metric | Value |
172
- |:------------|:---------------------|
173
- | map | 0.8712 (+0.1333) |
174
- | mrr@10 | 0.8711 (+0.1351) |
175
- | **ndcg@10** | **0.8765 (+0.1106)** |
176
-
177
- #### Cross Encoder Reranking
178
-
179
- * Dataset: `NanoMSMARCO_R100`
180
- * Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
181
- ```json
182
- {
183
- "at_k": 10,
184
- "always_rerank_positives": true
185
- }
186
- ```
187
-
188
- | Metric | Value |
189
- |:------------|:---------------------|
190
- | map | 0.4941 (+0.0045) |
191
- | mrr@10 | 0.4820 (+0.0045) |
192
- | **ndcg@10** | **0.5529 (+0.0124)** |
193
-
194
- #### Cross Encoder Nano BEIR
195
-
196
- * Dataset: `NanoBEIR_R100_mean`
197
- * Evaluated with [<code>CrossEncoderNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderNanoBEIREvaluator) with these parameters:
198
- ```json
199
- {
200
- "dataset_names": [
201
- "msmarco"
202
- ],
203
- "rerank_k": 100,
204
- "at_k": 10,
205
- "always_rerank_positives": true
206
- }
207
- ```
208
-
209
- | Metric | Value |
210
- |:------------|:---------------------|
211
- | map | 0.4941 (+0.0045) |
212
- | mrr@10 | 0.4820 (+0.0045) |
213
- | **ndcg@10** | **0.5529 (+0.0124)** |
214
-
215
- <!--
216
- ## Bias, Risks and Limitations
217
-
218
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
219
- -->
220
-
221
- <!--
222
- ### Recommendations
223
-
224
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
225
- -->
226
-
227
- ## Training Details
228
-
229
- ### Training Dataset
230
-
231
- #### Unnamed Dataset
232
-
233
- * Size: 554,403 training samples
234
- * Columns: <code>title</code>, <code>abstract</code>, and <code>label</code>
235
- * Approximate statistics based on the first 1000 samples:
236
- | | title | abstract | label |
237
- |:--------|:------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------|:------------------------------------------------|
238
- | type | string | string | int |
239
- | details | <ul><li>min: 33 characters</li><li>mean: 83.77 characters</li><li>max: 162 characters</li></ul> | <ul><li>min: 91 characters</li><li>mean: 678.94 characters</li><li>max: 1023 characters</li></ul> | <ul><li>0: ~81.80%</li><li>1: ~18.20%</li></ul> |
240
- * Samples:
241
- | title | abstract | label |
242
- |:--------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
243
- | <code>Engineering students' understanding of the role of experimentation</code> | <code>Resource constraints have forced engineering schools to reduce laboratory provisions in undergraduate courses. In many instances hands-on experimentation has been replaced by demonstrations or computer simulations. Many engineering educators have cautioned against replacing experiments with simulations on the basis that this will lead to a misunderstanding of the role of experimentation in engineering practice. However, little is known about how students conceptualize the role of experimentation in developing engineering understanding. This study is based on interviews with third-year mechanical engineering students. Findings are presented on their perceptions in relation to the role of experimentation in developing engineering knowledge and practice.</code> | <code>1</code> |
244
- | <code>Engineering students' understanding of the role of experimentation</code> | <code>"Excellent engineer training plan"was a core problem for cultivating students' engineering ability,but at present the students in engineering ability and the enterprise demand disjointed phenomenon had more commons.Based on process equipment and control engineering as an example,for the general undergraduate colleges and universities to cultivate students' engineering ability and enterprise demand disjointed phenomenon and the existing problems were analyzed,and the relevant approach was put forward,in order to improve students' engineering ability to provide reference ideas.</code> | <code>0</code> |
245
- | <code>Engineering students' understanding of the role of experimentation</code> | <code>This paper contributes to the discussion of pedagogical training of engineering teachers based on a case study carried out in higher education institutions in Brazil, namely in Electrical Engineering. For this purpose, the authors chose to articulate two research methods: document analysis of the courses offered in the postgraduate programs (Master and PhD) in Electrical Engineering and a survey conducted with students and teachers from 58 of these postgraduate electrical engineering programs. The data analysis indicated that most of the teachers agreed that pedagogical training should be offered to engineering students. Postgraduate students also showed interest in enrolling courses with pedagogic focus. With this analysis we can state that there is a need to rethink engineering education, in order to create conditions for the development of competences related with teaching and learning innovation. This study shows the needs and presents some recommendations to deal with these issues...</code> | <code>0</code> |
246
- * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
247
- ```json
248
- {
249
- "activation_fn": "torch.nn.modules.linear.Identity",
250
- "pos_weight": 5
251
- }
252
- ```
253
-
254
- ### Training Hyperparameters
255
- #### Non-Default Hyperparameters
256
-
257
- - `eval_strategy`: steps
258
- - `per_device_train_batch_size`: 16
259
- - `per_device_eval_batch_size`: 16
260
- - `learning_rate`: 2e-05
261
- - `num_train_epochs`: 1
262
- - `warmup_ratio`: 0.1
263
- - `seed`: 12
264
- - `bf16`: True
265
- - `dataloader_num_workers`: 6
266
- - `load_best_model_at_end`: True
267
-
268
- #### All Hyperparameters
269
- <details><summary>Click to expand</summary>
270
-
271
- - `overwrite_output_dir`: False
272
- - `do_predict`: False
273
- - `eval_strategy`: steps
274
- - `prediction_loss_only`: True
275
- - `per_device_train_batch_size`: 16
276
- - `per_device_eval_batch_size`: 16
277
- - `per_gpu_train_batch_size`: None
278
- - `per_gpu_eval_batch_size`: None
279
- - `gradient_accumulation_steps`: 1
280
- - `eval_accumulation_steps`: None
281
- - `torch_empty_cache_steps`: None
282
- - `learning_rate`: 2e-05
283
- - `weight_decay`: 0.0
284
- - `adam_beta1`: 0.9
285
- - `adam_beta2`: 0.999
286
- - `adam_epsilon`: 1e-08
287
- - `max_grad_norm`: 1.0
288
- - `num_train_epochs`: 1
289
- - `max_steps`: -1
290
- - `lr_scheduler_type`: linear
291
- - `lr_scheduler_kwargs`: {}
292
- - `warmup_ratio`: 0.1
293
- - `warmup_steps`: 0
294
- - `log_level`: passive
295
- - `log_level_replica`: warning
296
- - `log_on_each_node`: True
297
- - `logging_nan_inf_filter`: True
298
- - `save_safetensors`: True
299
- - `save_on_each_node`: False
300
- - `save_only_model`: False
301
- - `restore_callback_states_from_checkpoint`: False
302
- - `no_cuda`: False
303
- - `use_cpu`: False
304
- - `use_mps_device`: False
305
- - `seed`: 12
306
- - `data_seed`: None
307
- - `jit_mode_eval`: False
308
- - `use_ipex`: False
309
- - `bf16`: True
310
- - `fp16`: False
311
- - `fp16_opt_level`: O1
312
- - `half_precision_backend`: auto
313
- - `bf16_full_eval`: False
314
- - `fp16_full_eval`: False
315
- - `tf32`: None
316
- - `local_rank`: 0
317
- - `ddp_backend`: None
318
- - `tpu_num_cores`: None
319
- - `tpu_metrics_debug`: False
320
- - `debug`: []
321
- - `dataloader_drop_last`: False
322
- - `dataloader_num_workers`: 6
323
- - `dataloader_prefetch_factor`: None
324
- - `past_index`: -1
325
- - `disable_tqdm`: False
326
- - `remove_unused_columns`: True
327
- - `label_names`: None
328
- - `load_best_model_at_end`: True
329
- - `ignore_data_skip`: False
330
- - `fsdp`: []
331
- - `fsdp_min_num_params`: 0
332
- - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
333
- - `fsdp_transformer_layer_cls_to_wrap`: None
334
- - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
335
- - `deepspeed`: None
336
- - `label_smoothing_factor`: 0.0
337
- - `optim`: adamw_torch
338
- - `optim_args`: None
339
- - `adafactor`: False
340
- - `group_by_length`: False
341
- - `length_column_name`: length
342
- - `ddp_find_unused_parameters`: None
343
- - `ddp_bucket_cap_mb`: None
344
- - `ddp_broadcast_buffers`: False
345
- - `dataloader_pin_memory`: True
346
- - `dataloader_persistent_workers`: False
347
- - `skip_memory_metrics`: True
348
- - `use_legacy_prediction_loop`: False
349
- - `push_to_hub`: False
350
- - `resume_from_checkpoint`: None
351
- - `hub_model_id`: None
352
- - `hub_strategy`: every_save
353
- - `hub_private_repo`: None
354
- - `hub_always_push`: False
355
- - `gradient_checkpointing`: False
356
- - `gradient_checkpointing_kwargs`: None
357
- - `include_inputs_for_metrics`: False
358
- - `include_for_metrics`: []
359
- - `eval_do_concat_batches`: True
360
- - `fp16_backend`: auto
361
- - `push_to_hub_model_id`: None
362
- - `push_to_hub_organization`: None
363
- - `mp_parameters`:
364
- - `auto_find_batch_size`: False
365
- - `full_determinism`: False
366
- - `torchdynamo`: None
367
- - `ray_scope`: last
368
- - `ddp_timeout`: 1800
369
- - `torch_compile`: False
370
- - `torch_compile_backend`: None
371
- - `torch_compile_mode`: None
372
- - `include_tokens_per_second`: False
373
- - `include_num_input_tokens_seen`: False
374
- - `neftune_noise_alpha`: None
375
- - `optim_target_modules`: None
376
- - `batch_eval_metrics`: False
377
- - `eval_on_start`: False
378
- - `use_liger_kernel`: False
379
- - `eval_use_gather_object`: False
380
- - `average_tokens_across_devices`: False
381
- - `prompts`: None
382
- - `batch_sampler`: batch_sampler
383
- - `multi_dataset_batch_sampler`: proportional
384
-
385
- </details>
386
-
387
- ### Training Logs
388
- | Epoch | Step | Training Loss | s2orc-dev_ndcg@10 | NanoMSMARCO_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
389
- |:------:|:----:|:-------------:|:-----------------:|:------------------------:|:--------------------------:|
390
- | -1 | -1 | - | 0.1165 (-0.6495) | 0.0426 (-0.4978) | 0.0426 (-0.4978) |
391
- | 0.0000 | 1 | 1.0682 | - | - | - |
392
- | 0.0144 | 500 | 1.1555 | - | - | - |
393
- | 0.0289 | 1000 | 0.7743 | - | - | - |
394
- | 0.0433 | 1500 | 0.538 | - | - | - |
395
- | 0.0577 | 2000 | 0.5771 | - | - | - |
396
- | 0.0721 | 2500 | 0.5345 | - | - | - |
397
- | 0.0866 | 3000 | 0.4394 | - | - | - |
398
- | 0.1010 | 3500 | 0.4607 | - | - | - |
399
- | 0.1154 | 4000 | 0.3866 | 0.8685 (+0.1025) | 0.5469 (+0.0064) | 0.5469 (+0.0064) |
400
- | 0.1299 | 4500 | 0.4222 | - | - | - |
401
- | 0.1443 | 5000 | 0.3734 | - | - | - |
402
- | 0.1587 | 5500 | 0.3558 | - | - | - |
403
- | 0.1732 | 6000 | 0.3968 | - | - | - |
404
- | 0.1876 | 6500 | 0.3203 | - | - | - |
405
- | 0.2020 | 7000 | 0.3354 | - | - | - |
406
- | 0.2164 | 7500 | 0.3579 | - | - | - |
407
- | 0.2309 | 8000 | 0.3349 | 0.8765 (+0.1106) | 0.5529 (+0.0124) | 0.5529 (+0.0124) |
408
-
409
-
410
- ### Framework Versions
411
- - Python: 3.9.13
412
- - Sentence Transformers: 4.1.0
413
- - Transformers: 4.52.4
414
- - PyTorch: 2.7.1+cu118
415
- - Accelerate: 1.7.0
416
- - Datasets: 3.6.0
417
- - Tokenizers: 0.21.1
418
-
419
- ## Citation
420
-
421
- ### BibTeX
422
-
423
- #### Sentence Transformers
424
- ```bibtex
425
- @inproceedings{reimers-2019-sentence-bert,
426
- title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
427
- author = "Reimers, Nils and Gurevych, Iryna",
428
- booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
429
- month = "11",
430
- year = "2019",
431
- publisher = "Association for Computational Linguistics",
432
- url = "https://arxiv.org/abs/1908.10084",
433
- }
434
- ```
435
-
436
- <!--
437
- ## Glossary
438
-
439
- *Clearly define terms in order to be accessible across audiences.*
440
- -->
441
-
442
- <!--
443
- ## Model Card Authors
444
-
445
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
446
- -->
447
-
448
- <!--
449
- ## Model Card Contact
450
-
451
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
452
  -->
 
1
+ ---
2
+ tags:
3
+ - sentence-transformers
4
+ - cross-encoder
5
+ - generated_from_trainer
6
+ - dataset_size:554403
7
+ - loss:BinaryCrossEntropyLoss
8
+ base_model: answerdotai/ModernBERT-base
9
+ pipeline_tag: text-ranking
10
+ library_name: sentence-transformers
11
+ metrics:
12
+ - map
13
+ - mrr@10
14
+ - ndcg@10
15
+ model-index:
16
+ - name: CrossEncoder based on answerdotai/ModernBERT-base
17
+ results:
18
+ - task:
19
+ type: cross-encoder-reranking
20
+ name: Cross Encoder Reranking
21
+ dataset:
22
+ name: s2orc dev
23
+ type: s2orc-dev
24
+ metrics:
25
+ - type: map
26
+ value: 0.8712
27
+ name: Map
28
+ - type: mrr@10
29
+ value: 0.8711
30
+ name: Mrr@10
31
+ - type: ndcg@10
32
+ value: 0.8765
33
+ name: Ndcg@10
34
+ - task:
35
+ type: cross-encoder-reranking
36
+ name: Cross Encoder Reranking
37
+ dataset:
38
+ name: NanoMSMARCO R100
39
+ type: NanoMSMARCO_R100
40
+ metrics:
41
+ - type: map
42
+ value: 0.4941
43
+ name: Map
44
+ - type: mrr@10
45
+ value: 0.482
46
+ name: Mrr@10
47
+ - type: ndcg@10
48
+ value: 0.5529
49
+ name: Ndcg@10
50
+ - task:
51
+ type: cross-encoder-nano-beir
52
+ name: Cross Encoder Nano BEIR
53
+ dataset:
54
+ name: NanoBEIR R100 mean
55
+ type: NanoBEIR_R100_mean
56
+ metrics:
57
+ - type: map
58
+ value: 0.4941
59
+ name: Map
60
+ - type: mrr@10
61
+ value: 0.482
62
+ name: Mrr@10
63
+ - type: ndcg@10
64
+ value: 0.5529
65
+ name: Ndcg@10
66
+ ---
67
+
68
+ # CrossEncoder based on answerdotai/ModernBERT-base
69
+
70
+ This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
71
+
72
+ ## Model Details
73
+
74
+ ### Model Description
75
+ - **Model Type:** Cross Encoder
76
+ - **Base model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) <!-- at revision 8949b909ec900327062f0ebf497f51aef5e6f0c8 -->
77
+ - **Maximum Sequence Length:** 8192 tokens
78
+ - **Number of Output Labels:** 1 label
79
+ <!-- - **Training Dataset:** Unknown -->
80
+ <!-- - **Language:** Unknown -->
81
+ <!-- - **License:** Unknown -->
82
+
83
+ ### Model Sources
84
+
85
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
86
+ - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
87
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
88
+ - **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
89
+
90
+ ## Usage
91
+
92
+ ### Direct Usage (Sentence Transformers)
93
+
94
+ First install the Sentence Transformers library:
95
+
96
+ ```bash
97
+ pip install -U sentence-transformers
98
+ ```
99
+
100
+ Then you can load this model and run inference.
101
+ ```python
102
+ from sentence_transformers import CrossEncoder
103
+
104
+ # Download from the 🤗 Hub
105
+ model = CrossEncoder("Janari01/reranker-ModernBERT-base-s2orc")
106
+ # Get scores for pairs of texts
107
+ pairs = [
108
+ ["Engineering students' understanding of the role of experimentation", 'Resource constraints have forced engineering schools to reduce laboratory provisions in undergraduate courses. In many instances hands-on experimentation has been replaced by demonstrations or computer simulations. Many engineering educators have cautioned against replacing experiments with simulations on the basis that this will lead to a misunderstanding of the role of experimentation in engineering practice. However, little is known about how students conceptualize the role of experimentation in developing engineering understanding. This study is based on interviews with third-year mechanical engineering students. Findings are presented on their perceptions in relation to the role of experimentation in developing engineering knowledge and practice.'],
109
+ ["Engineering students' understanding of the role of experimentation", '"Excellent engineer training plan"was a core problem for cultivating students\' engineering ability,but at present the students in engineering ability and the enterprise demand disjointed phenomenon had more commons.Based on process equipment and control engineering as an example,for the general undergraduate colleges and universities to cultivate students\' engineering ability and enterprise demand disjointed phenomenon and the existing problems were analyzed,and the relevant approach was put forward,in order to improve students\' engineering ability to provide reference ideas.'],
110
+ ["Engineering students' understanding of the role of experimentation", 'This paper contributes to the discussion of pedagogical training of engineering teachers based on a case study carried out in higher education institutions in Brazil, namely in Electrical Engineering. For this purpose, the authors chose to articulate two research methods: document analysis of the courses offered in the postgraduate programs (Master and PhD) in Electrical Engineering and a survey conducted with students and teachers from 58 of these postgraduate electrical engineering programs. The data analysis indicated that most of the teachers agreed that pedagogical training should be offered to engineering students. Postgraduate students also showed interest in enrolling courses with pedagogic focus. With this analysis we can state that there is a need to rethink engineering education, in order to create conditions for the development of competences related with teaching and learning innovation. This study shows the needs and presents some recommendations to deal with these issues in this field.'],
111
+ ["Engineering students' understanding of the role of experimentation", 'Engineering practical teaching reform in higher institutions centers on improving students’ comprehensive quality,developing their innovative spirit and engineering practice ability,building teaching system for engineering training and demonstration center for engineering training.The article implements practical teaching reform on metalworking practice and electronic practice and provides students with a platform for integrated engineering training,leading them toward competence,quality and innovation development.'],
112
+ ["Engineering students' understanding of the role of experimentation", 'Lisa Benson is an Associate Professor of Engineering and Science Education at Clemson University, with a joint appointment in Bioengineering. Her research focuses on the interactions between student motivation and their learning experiences. Her projects involve the study of student perceptions, beliefs and attitudes towards becoming engineers and scientists, and their problem solving processes. Other projects in the Benson group include effects of student-centered active learning, self-regulated learning, and incorporating engineering into secondary science and mathematics classrooms. Her education includes a B.S. in Bioengineering from the University of Vermont, and M.S. and Ph.D. in Bioengineering from Clemson University.'],
113
+ ]
114
+ scores = model.predict(pairs)
115
+ print(scores.shape)
116
+ # (5,)
117
+
118
+ # Or rank different texts based on similarity to a single text
119
+ ranks = model.rank(
120
+ "Engineering students' understanding of the role of experimentation",
121
+ [
122
+ 'Resource constraints have forced engineering schools to reduce laboratory provisions in undergraduate courses. In many instances hands-on experimentation has been replaced by demonstrations or computer simulations. Many engineering educators have cautioned against replacing experiments with simulations on the basis that this will lead to a misunderstanding of the role of experimentation in engineering practice. However, little is known about how students conceptualize the role of experimentation in developing engineering understanding. This study is based on interviews with third-year mechanical engineering students. Findings are presented on their perceptions in relation to the role of experimentation in developing engineering knowledge and practice.',
123
+ '"Excellent engineer training plan"was a core problem for cultivating students\' engineering ability,but at present the students in engineering ability and the enterprise demand disjointed phenomenon had more commons.Based on process equipment and control engineering as an example,for the general undergraduate colleges and universities to cultivate students\' engineering ability and enterprise demand disjointed phenomenon and the existing problems were analyzed,and the relevant approach was put forward,in order to improve students\' engineering ability to provide reference ideas.',
124
+ 'This paper contributes to the discussion of pedagogical training of engineering teachers based on a case study carried out in higher education institutions in Brazil, namely in Electrical Engineering. For this purpose, the authors chose to articulate two research methods: document analysis of the courses offered in the postgraduate programs (Master and PhD) in Electrical Engineering and a survey conducted with students and teachers from 58 of these postgraduate electrical engineering programs. The data analysis indicated that most of the teachers agreed that pedagogical training should be offered to engineering students. Postgraduate students also showed interest in enrolling courses with pedagogic focus. With this analysis we can state that there is a need to rethink engineering education, in order to create conditions for the development of competences related with teaching and learning innovation. This study shows the needs and presents some recommendations to deal with these issues in this field.',
125
+ 'Engineering practical teaching reform in higher institutions centers on improving students’ comprehensive quality,developing their innovative spirit and engineering practice ability,building teaching system for engineering training and demonstration center for engineering training.The article implements practical teaching reform on metalworking practice and electronic practice and provides students with a platform for integrated engineering training,leading them toward competence,quality and innovation development.',
126
+ 'Lisa Benson is an Associate Professor of Engineering and Science Education at Clemson University, with a joint appointment in Bioengineering. Her research focuses on the interactions between student motivation and their learning experiences. Her projects involve the study of student perceptions, beliefs and attitudes towards becoming engineers and scientists, and their problem solving processes. Other projects in the Benson group include effects of student-centered active learning, self-regulated learning, and incorporating engineering into secondary science and mathematics classrooms. Her education includes a B.S. in Bioengineering from the University of Vermont, and M.S. and Ph.D. in Bioengineering from Clemson University.',
127
+ ]
128
+ )
129
+ # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
130
+ ```
131
+
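To make the shape of the `rank` result concrete, the sketch below rebuilds the same kind of list from a plain list of scores: candidates are sorted by score in descending order, and each entry keeps the index of the text in the original list as `corpus_id`. The helper `rank_from_scores` and the score values are hypothetical placeholders, not output from this model.

```python
def rank_from_scores(scores):
    """Sort candidate indices by score, highest first (illustrative helper)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": scores[i]} for i in order]


# Placeholder scores for five candidate abstracts; real values come from model.predict(pairs).
placeholder_scores = [2.1, -3.4, -1.0, -2.2, -4.8]

ranks = rank_from_scores(placeholder_scores)
for entry in ranks:
    print(entry["corpus_id"], entry["score"])
```

Because the loss configuration lists an identity activation, the scores may be unbounded logits rather than probabilities; only their relative order matters for reranking.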
132
+ <!--
133
+ ### Direct Usage (Transformers)
134
+
135
+ <details><summary>Click to see the direct usage in Transformers</summary>
136
+
137
+ </details>
138
+ -->
139
+
140
+ <!--
141
+ ### Downstream Usage (Sentence Transformers)
142
+
143
+ You can finetune this model on your own dataset.
144
+
145
+ <details><summary>Click to expand</summary>
146
+
147
+ </details>
148
+ -->
149
+
150
+ <!--
151
+ ### Out-of-Scope Use
152
+
153
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
154
+ -->
155
+
156
+ ## Evaluation
157
+
158
+ ### Metrics
159
+
160
+ #### Cross Encoder Reranking
161
+
162
+ * Dataset: `s2orc-dev`
163
+ * Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
164
+ ```json
165
+ {
166
+ "at_k": 10,
167
+ "always_rerank_positives": false
168
+ }
169
+ ```
170
+
171
+ | Metric | Value |
172
+ |:------------|:---------------------|
173
+ | map | 0.8712 (+0.1333) |
174
+ | mrr@10 | 0.8711 (+0.1351) |
175
+ | **ndcg@10** | **0.8765 (+0.1106)** |
176
+
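For readers unfamiliar with the three metrics, here is a minimal sketch of their standard definitions for a single query with binary relevance labels, listed in the order the reranker returned the documents. It is an illustration only: the evaluator's own implementation may differ in details, such as how positives outside the reranked set are treated, and the reported values are averages over all queries.

```python
import math


def mrr_at_k(ranked_labels, k=10):
    """Reciprocal rank of the first relevant document within the top k."""
    for rank, label in enumerate(ranked_labels[:k], start=1):
        if label:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked_labels, k=10):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    def dcg(labels):
        return sum(l / math.log2(r + 1) for r, l in enumerate(labels[:k], start=1))

    ideal = dcg(sorted(ranked_labels, reverse=True))
    return dcg(ranked_labels) / ideal if ideal > 0 else 0.0


def average_precision(ranked_labels):
    """Mean of precision values taken at each relevant document's rank."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical ranking: relevant documents at positions 2 and 5.
labels = [0, 1, 0, 0, 1]
print(mrr_at_k(labels), average_precision(labels), ndcg_at_k(labels))
```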
177
+ #### Cross Encoder Reranking
178
+
179
+ * Dataset: `NanoMSMARCO_R100`
180
+ * Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
181
+ ```json
182
+ {
183
+ "at_k": 10,
184
+ "always_rerank_positives": true
185
+ }
186
+ ```
187
+
188
+ | Metric | Value |
189
+ |:------------|:---------------------|
190
+ | map | 0.4941 (+0.0045) |
191
+ | mrr@10 | 0.4820 (+0.0045) |
192
+ | **ndcg@10** | **0.5529 (+0.0124)** |
193
+
194
+ #### Cross Encoder Nano BEIR
195
+
196
+ * Dataset: `NanoBEIR_R100_mean`
197
+ * Evaluated with [<code>CrossEncoderNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderNanoBEIREvaluator) with these parameters:
198
+ ```json
199
+ {
200
+ "dataset_names": [
201
+ "msmarco"
202
+ ],
203
+ "rerank_k": 100,
204
+ "at_k": 10,
205
+ "always_rerank_positives": true
206
+ }
207
+ ```
208
+
209
+ | Metric | Value |
210
+ |:------------|:---------------------|
211
+ | map | 0.4941 (+0.0045) |
212
+ | mrr@10 | 0.4820 (+0.0045) |
213
+ | **ndcg@10** | **0.5529 (+0.0124)** |
214
+
215
+ <!--
216
+ ## Bias, Risks and Limitations
217
+
218
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
219
+ -->
220
+
221
+ <!--
222
+ ### Recommendations
223
+
224
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
225
+ -->
226
+
227
+ ## Training Details
228
+
229
+ ### Training Dataset
230
+
231
+ #### Unnamed Dataset
232
+
233
+ * Size: 554,403 training samples
234
+ * Columns: <code>title</code>, <code>abstract</code>, and <code>label</code>
235
+ * Approximate statistics based on the first 1000 samples:
236
+ | | title | abstract | label |
237
+ |:--------|:------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------|:------------------------------------------------|
238
+ | type | string | string | int |
239
+ | details | <ul><li>min: 33 characters</li><li>mean: 83.77 characters</li><li>max: 162 characters</li></ul> | <ul><li>min: 91 characters</li><li>mean: 678.94 characters</li><li>max: 1023 characters</li></ul> | <ul><li>0: ~81.80%</li><li>1: ~18.20%</li></ul> |
240
+ * Samples:
241
+ | title | abstract | label |
242
+ |:--------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
243
+ | <code>Engineering students' understanding of the role of experimentation</code> | <code>Resource constraints have forced engineering schools to reduce laboratory provisions in undergraduate courses. In many instances hands-on experimentation has been replaced by demonstrations or computer simulations. Many engineering educators have cautioned against replacing experiments with simulations on the basis that this will lead to a misunderstanding of the role of experimentation in engineering practice. However, little is known about how students conceptualize the role of experimentation in developing engineering understanding. This study is based on interviews with third-year mechanical engineering students. Findings are presented on their perceptions in relation to the role of experimentation in developing engineering knowledge and practice.</code> | <code>1</code> |
244
+ | <code>Engineering students' understanding of the role of experimentation</code> | <code>"Excellent engineer training plan"was a core problem for cultivating students' engineering ability,but at present the students in engineering ability and the enterprise demand disjointed phenomenon had more commons.Based on process equipment and control engineering as an example,for the general undergraduate colleges and universities to cultivate students' engineering ability and enterprise demand disjointed phenomenon and the existing problems were analyzed,and the relevant approach was put forward,in order to improve students' engineering ability to provide reference ideas.</code> | <code>0</code> |
245
+ | <code>Engineering students' understanding of the role of experimentation</code> | <code>This paper contributes to the discussion of pedagogical training of engineering teachers based on a case study carried out in higher education institutions in Brazil, namely in Electrical Engineering. For this purpose, the authors chose to articulate two research methods: document analysis of the courses offered in the postgraduate programs (Master and PhD) in Electrical Engineering and a survey conducted with students and teachers from 58 of these postgraduate electrical engineering programs. The data analysis indicated that most of the teachers agreed that pedagogical training should be offered to engineering students. Postgraduate students also showed interest in enrolling courses with pedagogic focus. With this analysis we can state that there is a need to rethink engineering education, in order to create conditions for the development of competences related with teaching and learning innovation. This study shows the needs and presents some recommendations to deal with these issues...</code> | <code>0</code> |
246
+ * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
247
+ ```json
248
+ {
249
+ "activation_fn": "torch.nn.modules.linear.Identity",
250
+ "pos_weight": 5
251
+ }
252
+ ```
253
+
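The `pos_weight` of 5 is close to the negative-to-positive ratio in the sampled label statistics (about 81.8% to 18.2%, roughly 4.5 to 1), so positive pairs contribute to the loss about as much as negatives overall. The sketch below writes out the standard weighted binary cross-entropy on a raw logit, assuming the loss follows that formula. The function name is hypothetical and this is not the library's code.

```python
import math


def weighted_bce_with_logits(logit, label, pos_weight=5.0):
    """Weighted BCE on a raw logit; positives are scaled by pos_weight."""
    # Numerically stable log(sigmoid(logit)).
    if logit >= 0:
        log_sig = -math.log1p(math.exp(-logit))
    else:
        log_sig = logit - math.log1p(math.exp(logit))
    # log(1 - sigmoid(x)) = log(sigmoid(x)) - x
    log_one_minus_sig = log_sig - logit
    return -(pos_weight * label * log_sig + (1 - label) * log_one_minus_sig)


# At logit 0 the model is undecided; a missed positive costs 5x a missed negative.
print(weighted_bce_with_logits(0.0, 1), weighted_bce_with_logits(0.0, 0))

# Negative-to-positive ratio from the label statistics above.
neg_pos_ratio = 0.818 / 0.182
```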
254
+ ### Training Hyperparameters
255
+ #### Non-Default Hyperparameters
256
+
257
+ - `eval_strategy`: steps
258
+ - `per_device_train_batch_size`: 16
259
+ - `per_device_eval_batch_size`: 16
260
+ - `learning_rate`: 2e-05
261
+ - `num_train_epochs`: 1
262
+ - `warmup_ratio`: 0.1
263
+ - `seed`: 12
264
+ - `bf16`: True
265
+ - `dataloader_num_workers`: 6
266
+ - `load_best_model_at_end`: True
267
+
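These settings imply a step count and warmup length that can be worked out by hand. The sketch below assumes a single device and no gradient accumulation; under those assumptions the epoch fractions it produces match the Epoch column in the Training Logs table. The `linear_lr` function is the usual linear warmup followed by linear decay, written out for illustration rather than taken from the trainer.

```python
import math

num_samples = 554_403   # training set size
batch_size = 16         # per_device_train_batch_size, single device assumed
warmup_ratio = 0.1
peak_lr = 2e-5

steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch  # num_train_epochs is 1
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_lr(step):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


print(steps_per_epoch, warmup_steps)    # 34651 3466
print(round(8000 / steps_per_epoch, 4)) # 0.2309, the epoch fraction logged at step 8000
```

Under this schedule the learning rate is still ramping up at step 3000 and has started to decay by step 4000, the first logged evaluation.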
268
+ #### All Hyperparameters
269
+ <details><summary>Click to expand</summary>
270
+
271
+ - `overwrite_output_dir`: False
272
+ - `do_predict`: False
273
+ - `eval_strategy`: steps
274
+ - `prediction_loss_only`: True
275
+ - `per_device_train_batch_size`: 16
276
+ - `per_device_eval_batch_size`: 16
277
+ - `per_gpu_train_batch_size`: None
278
+ - `per_gpu_eval_batch_size`: None
279
+ - `gradient_accumulation_steps`: 1
280
+ - `eval_accumulation_steps`: None
281
+ - `torch_empty_cache_steps`: None
282
+ - `learning_rate`: 2e-05
283
+ - `weight_decay`: 0.0
284
+ - `adam_beta1`: 0.9
285
+ - `adam_beta2`: 0.999
286
+ - `adam_epsilon`: 1e-08
287
+ - `max_grad_norm`: 1.0
288
+ - `num_train_epochs`: 1
289
+ - `max_steps`: -1
290
+ - `lr_scheduler_type`: linear
291
+ - `lr_scheduler_kwargs`: {}
292
+ - `warmup_ratio`: 0.1
293
+ - `warmup_steps`: 0
294
+ - `log_level`: passive
295
+ - `log_level_replica`: warning
296
+ - `log_on_each_node`: True
297
+ - `logging_nan_inf_filter`: True
298
+ - `save_safetensors`: True
299
+ - `save_on_each_node`: False
300
+ - `save_only_model`: False
301
+ - `restore_callback_states_from_checkpoint`: False
302
+ - `no_cuda`: False
303
+ - `use_cpu`: False
304
+ - `use_mps_device`: False
305
+ - `seed`: 12
306
+ - `data_seed`: None
307
+ - `jit_mode_eval`: False
308
+ - `use_ipex`: False
309
+ - `bf16`: True
310
+ - `fp16`: False
311
+ - `fp16_opt_level`: O1
312
+ - `half_precision_backend`: auto
313
+ - `bf16_full_eval`: False
314
+ - `fp16_full_eval`: False
315
+ - `tf32`: None
316
+ - `local_rank`: 0
317
+ - `ddp_backend`: None
318
+ - `tpu_num_cores`: None
319
+ - `tpu_metrics_debug`: False
320
+ - `debug`: []
321
+ - `dataloader_drop_last`: False
322
+ - `dataloader_num_workers`: 6
323
+ - `dataloader_prefetch_factor`: None
324
+ - `past_index`: -1
325
+ - `disable_tqdm`: False
326
+ - `remove_unused_columns`: True
327
+ - `label_names`: None
328
+ - `load_best_model_at_end`: True
329
+ - `ignore_data_skip`: False
330
+ - `fsdp`: []
331
+ - `fsdp_min_num_params`: 0
332
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
333
+ - `fsdp_transformer_layer_cls_to_wrap`: None
334
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
335
+ - `deepspeed`: None
336
+ - `label_smoothing_factor`: 0.0
337
+ - `optim`: adamw_torch
338
+ - `optim_args`: None
339
+ - `adafactor`: False
340
+ - `group_by_length`: False
341
+ - `length_column_name`: length
342
+ - `ddp_find_unused_parameters`: None
343
+ - `ddp_bucket_cap_mb`: None
344
+ - `ddp_broadcast_buffers`: False
345
+ - `dataloader_pin_memory`: True
346
+ - `dataloader_persistent_workers`: False
347
+ - `skip_memory_metrics`: True
348
+ - `use_legacy_prediction_loop`: False
349
+ - `push_to_hub`: False
350
+ - `resume_from_checkpoint`: None
351
+ - `hub_model_id`: None
352
+ - `hub_strategy`: every_save
353
+ - `hub_private_repo`: None
354
+ - `hub_always_push`: False
355
+ - `gradient_checkpointing`: False
356
+ - `gradient_checkpointing_kwargs`: None
357
+ - `include_inputs_for_metrics`: False
358
+ - `include_for_metrics`: []
359
+ - `eval_do_concat_batches`: True
360
+ - `fp16_backend`: auto
361
+ - `push_to_hub_model_id`: None
362
+ - `push_to_hub_organization`: None
363
+ - `mp_parameters`:
364
+ - `auto_find_batch_size`: False
365
+ - `full_determinism`: False
366
+ - `torchdynamo`: None
367
+ - `ray_scope`: last
368
+ - `ddp_timeout`: 1800
369
+ - `torch_compile`: False
370
+ - `torch_compile_backend`: None
371
+ - `torch_compile_mode`: None
372
+ - `include_tokens_per_second`: False
373
+ - `include_num_input_tokens_seen`: False
374
+ - `neftune_noise_alpha`: None
375
+ - `optim_target_modules`: None
376
+ - `batch_eval_metrics`: False
377
+ - `eval_on_start`: False
378
+ - `use_liger_kernel`: False
379
+ - `eval_use_gather_object`: False
380
+ - `average_tokens_across_devices`: False
381
+ - `prompts`: None
382
+ - `batch_sampler`: batch_sampler
383
+ - `multi_dataset_batch_sampler`: proportional
384
+
385
+ </details>
386
+
387
+ ### Training Logs
388
+ | Epoch | Step | Training Loss | s2orc-dev_ndcg@10 | NanoMSMARCO_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
389
+ |:------:|:----:|:-------------:|:-----------------:|:------------------------:|:--------------------------:|
390
+ | -1 | -1 | - | 0.1165 (-0.6495) | 0.0426 (-0.4978) | 0.0426 (-0.4978) |
391
+ | 0.0000 | 1 | 1.0682 | - | - | - |
392
+ | 0.0144 | 500 | 1.1555 | - | - | - |
393
+ | 0.0289 | 1000 | 0.7743 | - | - | - |
394
+ | 0.0433 | 1500 | 0.538 | - | - | - |
395
+ | 0.0577 | 2000 | 0.5771 | - | - | - |
396
+ | 0.0721 | 2500 | 0.5345 | - | - | - |
397
+ | 0.0866 | 3000 | 0.4394 | - | - | - |
398
+ | 0.1010 | 3500 | 0.4607 | - | - | - |
399
+ | 0.1154 | 4000 | 0.3866 | 0.8685 (+0.1025) | 0.5469 (+0.0064) | 0.5469 (+0.0064) |
400
+ | 0.1299 | 4500 | 0.4222 | - | - | - |
401
+ | 0.1443 | 5000 | 0.3734 | - | - | - |
402
+ | 0.1587 | 5500 | 0.3558 | - | - | - |
403
+ | 0.1732 | 6000 | 0.3968 | - | - | - |
404
+ | 0.1876 | 6500 | 0.3203 | - | - | - |
405
+ | 0.2020 | 7000 | 0.3354 | - | - | - |
406
+ | 0.2164 | 7500 | 0.3579 | - | - | - |
407
+ | 0.2309 | 8000 | 0.3349 | 0.8765 (+0.1106) | 0.5529 (+0.0124) | 0.5529 (+0.0124) |
408
+
409
+
410
+ ### Framework Versions
411
+ - Python: 3.9.13
412
+ - Sentence Transformers: 4.1.0
413
+ - Transformers: 4.52.4
414
+ - PyTorch: 2.7.1+cu118
415
+ - Accelerate: 1.7.0
416
+ - Datasets: 3.6.0
417
+ - Tokenizers: 0.21.1
418
+
419
+ ## Citation
420
+
421
+ ### BibTeX
422
+
423
+ #### Sentence Transformers
424
+ ```bibtex
425
+ @inproceedings{reimers-2019-sentence-bert,
426
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
427
+ author = "Reimers, Nils and Gurevych, Iryna",
428
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
429
+ month = "11",
430
+ year = "2019",
431
+ publisher = "Association for Computational Linguistics",
432
+ url = "https://arxiv.org/abs/1908.10084",
433
+ }
434
+ ```
435
+
436
+ <!--
437
+ ## Glossary
438
+
439
+ *Clearly define terms in order to be accessible across audiences.*
440
+ -->
441
+
442
+ <!--
443
+ ## Model Card Authors
444
+
445
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
446
+ -->
447
+
448
+ <!--
449
+ ## Model Card Contact
450
+
451
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
452
  -->