	---
	language: []
	library_name: sentence-transformers
	tags:
	- sentence-transformers
	- sentence-similarity
	- feature-extraction
	- dataset_size:1M<n<10M
	- loss:CoSENTLoss
	metrics:
	- pearson_cosine
	- spearman_cosine
	- pearson_manhattan
	- spearman_manhattan
	- pearson_euclidean
	- spearman_euclidean
	- pearson_dot
	- spearman_dot
	- pearson_max
	- spearman_max
	base_model: distilbert/distilbert-base-uncased
	widget:
	- source_sentence: B C C_L CENTER TUNNEL VERT Other XXXX GENERIC G-S
	  sentences:
	  - T L ENG TO RAD SWITCH 90 Deg Front 2015 P552 VOLTS
	  - T RCM ENS 071 RCM ENS EFPR VOLT 90 Deg Front 2021 CX430 VOLTS
	  - T L ROCKER AT B PILLAR LONG 90 Deg Front 2020 V363N G-S
	- source_sentence: T L F DUMMY PELVIS LAT 90 Deg Front 2021 P702 G-S
	  sentences:
	  - T L F DUMMY PELVIS LAT 90 Deg Front 2021 CX727 G-S
	  - T FIXTURE BASE FRONT ACCEL VERT ACCEL Linear Test 2025 U717 G-S
	  - T R ROCKER AT B_PILLAR LONG 30 Deg Front Angular Right 2025 CX430 G-S
	- source_sentence: T L F DUMMY PELVIS LAT 90 Deg Front 2021 CX727 G-S
	  sentences:
	  - T R F DUMMY PELVIS LAT 90 Deg Front 2021 P702 G-S
	  - T L F DUMMY PELVIS LONG 30 Deg Front Angular Left 2020 P558 G-S
	  - T R F DUMMY L LOWER TIBIA MY LOAD 90 Deg Front 2022 U553 IN-LBS
	- source_sentence: T R F DUMMY CHEST VERT 90 Deg Front 2021 P702 G-S
	  sentences:
	  - T R F DUMMY CHEST VERT 90 Deg Front 2015 P552 G-S
	  - T L F DUMMY R LOWER TIBIA MX LOAD 90 Deg Front 2021 CX727 IN-LBS
	  - T REAR DIFFERENTIAL LONG 30 Deg Front Angular Left 2020 P558 G-S
	- source_sentence: T ENGINE TRANS TOP LAT 90 Deg Front 2025 U717 G-S
	  sentences:
	  - T R F ACTIVE VENT SQUIB VOLT 90 Deg Front 2021 P702 VOLTS
	  - T ENGINE TRANS TOP LAT 30 Deg Front Angular Left 2020 P558 G-S
	  - T R F DUMMY CHEST VERT 90 Deg Frontal Impact Simulation 2024 CX727 G-S
	pipeline_tag: sentence-similarity
	model-index:
	- name: SentenceTransformer based on distilbert/distilbert-base-uncased
	  results:
	  - task:
	      type: semantic-similarity
	      name: Semantic Similarity
	    dataset:
	      name: sts dev
	      type: sts-dev
	    metrics:
	    - type: pearson_cosine
	      value: 0.4517523751963131
	      name: Pearson Cosine
	    - type: spearman_cosine
	      value: 0.4761555869182568
	      name: Spearman Cosine
	    - type: pearson_manhattan
	      value: 0.42531457338882206
	      name: Pearson Manhattan
	    - type: spearman_manhattan
	      value: 0.46381946353811704
	      name: Spearman Manhattan
	    - type: pearson_euclidean
	      value: 0.4261708588640235
	      name: Pearson Euclidean
	    - type: spearman_euclidean
	      value: 0.4651666003446995
	      name: Spearman Euclidean
	    - type: pearson_dot
	      value: 0.3897944292190218
	      name: Pearson Dot
	    - type: spearman_dot
	      value: 0.37404050621023377
	      name: Spearman Dot
	    - type: pearson_max
	      value: 0.4517523751963131
	      name: Pearson Max
	    - type: spearman_max
	      value: 0.4761555869182568
	      name: Spearman Max
	    - type: pearson_cosine
	      value: 0.4412143708585779
	      name: Pearson Cosine
	    - type: spearman_cosine
	      value: 0.4670631031564122
	      name: Spearman Cosine
	    - type: pearson_manhattan
	      value: 0.4156386809751022
	      name: Pearson Manhattan
	    - type: spearman_manhattan
	      value: 0.4559676784726118
	      name: Spearman Manhattan
	    - type: pearson_euclidean
	      value: 0.41671687323124873
	      name: Pearson Euclidean
	    - type: spearman_euclidean
	      value: 0.45746069501329756
	      name: Spearman Euclidean
	    - type: pearson_dot
	      value: 0.37528926047569405
	      name: Pearson Dot
	    - type: spearman_dot
	      value: 0.36286227520562186
	      name: Spearman Dot
	    - type: pearson_max
	      value: 0.4412143708585779
	      name: Pearson Max
	    - type: spearman_max
	      value: 0.4670631031564122
	      name: Spearman Max
	---

	# SentenceTransformer based on distilbert/distilbert-base-uncased

	This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

	## Model Details

	### Model Description
	- Model Type: Sentence Transformer
	- Base model: [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) <!-- at revision 12040accade4e8a0f71eabdb258fecc2e7e948be -->
	- Maximum Sequence Length: 512 tokens
	- Output Dimensionality: 768 dimensions
	- Similarity Function: Cosine Similarity
	<!-- - Training Dataset: Unknown -->
	<!-- - Language: Unknown -->
	<!-- - License: Unknown -->

	### Model Sources

	- Documentation: [Sentence Transformers Documentation](https://sbert.net)
	- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
	- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

	### Full Model Architecture

	```
	SentenceTransformer(
	  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel
	  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	)
	```
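
The `Pooling` module above has only `pooling_mode_mean_tokens` enabled, so each sentence embedding is the average of its token embeddings, with padding positions left out. The following is a minimal pure-Python sketch of that idea, not the library's implementation: the function name `mean_pool` and the toy 2-dimensional vectors are assumptions made for illustration (the real model works with 768 dimensions).

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for i, value in enumerate(vector):
                totals[i] += value
    if kept == 0:
        raise ValueError("attention_mask selects no tokens")
    return [total / kept for total in totals]


# Three token vectors; the last one is padding and must not affect the result.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))
```

Masking matters because sentences in a batch are padded to a common length; without it, short inputs would be pulled toward the padding vectors.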

	## Usage

	### Direct Usage (Sentence Transformers)

	First install the Sentence Transformers library:

	```bash
	pip install -U sentence-transformers
	```

	Then you can load this model and run inference.
	```python
	from sentence_transformers import SentenceTransformer

	# Download from the 🤗 Hub
	model = SentenceTransformer("sentence_transformers_model_id")
	# Run inference
	sentences = [
	    'T ENGINE TRANS TOP LAT 90 Deg Front 2025 U717 G-S',
	    'T R F ACTIVE VENT SQUIB VOLT 90 Deg Front 2021 P702 VOLTS',
	    'T ENGINE TRANS TOP LAT 30 Deg Front Angular Left 2020 P558 G-S',
	]
	embeddings = model.encode(sentences)
	print(embeddings.shape)
	# (3, 768)

	# Get the similarity scores for the embeddings
	similarities = model.similarity(embeddings, embeddings)
	print(similarities.shape)
	# torch.Size([3, 3])
	```
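
The similarity function for this model is cosine similarity, so a common next step is to rank candidate channel descriptions against a query by their cosine score. The sketch below shows that ranking in pure Python. The helper names `cosine_similarity` and `rank_candidates` are hypothetical, and the 3-dimensional vectors are made-up stand-ins for real embeddings, not model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(query, candidates):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real 768-dimensional embeddings.
query = [1.0, 0.0, 1.0]
candidates = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
ranking = rank_candidates(query, candidates)
print([index for index, _ in ranking])
```

Cosine similarity ignores vector length, which is why the second candidate (a scaled copy of the query) ranks first.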

	<!--
	### Direct Usage (Transformers)

	<details><summary>Click to see the direct usage in Transformers</summary>

	</details>
	-->

	<!--
	### Downstream Usage (Sentence Transformers)

	You can finetune this model on your own dataset.

	<details><summary>Click to expand</summary>

	</details>
	-->

	<!--
	### Out-of-Scope Use

	List how the model may foreseeably be misused and address what users ought not to do with the model.
	-->

	## Evaluation

	### Metrics

	#### Semantic Similarity
	* Dataset: `sts-dev`
	* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

	| Metric | Value |
	|:--------------------|:-----------|
	| pearson_cosine | 0.4518 |
	| spearman_cosine | 0.4762 |
	| pearson_manhattan | 0.4253 |
	| spearman_manhattan | 0.4638 |
	| pearson_euclidean | 0.4262 |
	| spearman_euclidean | 0.4652 |
	| pearson_dot | 0.3898 |
	| spearman_dot | 0.374 |
	| pearson_max | 0.4518 |
	| spearman_max | 0.4762 |

	#### Semantic Similarity
	* Dataset: `sts-dev`
	* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

	| Metric | Value |
	|:--------------------|:-----------|
	| pearson_cosine | 0.4412 |
	| spearman_cosine | 0.4671 |
	| pearson_manhattan | 0.4156 |
	| spearman_manhattan | 0.456 |
	| pearson_euclidean | 0.4167 |
	| spearman_euclidean | 0.4575 |
	| pearson_dot | 0.3753 |
	| spearman_dot | 0.3629 |
	| pearson_max | 0.4412 |
	| spearman_max | 0.4671 |
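
Each `pearson_*` and `spearman_*` value above correlates the model's similarity scores with the gold `score` labels: Pearson measures linear agreement, while Spearman is the Pearson correlation of the ranks and so only rewards getting the ordering right. In both tables the `*_max` rows equal the largest of the four corresponding values. The sketch below computes both coefficients in pure Python; the function names and the four toy score pairs are hypothetical and are not taken from the evaluation data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical gold labels and predicted cosine similarities for four pairs.
gold = [0.1, 0.4, 0.6, 0.9]
predicted = [0.20, 0.25, 0.90, 0.95]
print(round(pearson(gold, predicted), 4), round(spearman(gold, predicted), 4))
```

In this toy case the predictions preserve the gold ordering exactly, so Spearman is 1 even though the scores are not linearly related and Pearson is lower.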

	<!--
	## Bias, Risks and Limitations

	What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.
	-->

	<!--
	### Recommendations

	What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.
	-->

	## Training Details

	### Training Dataset

	#### Unnamed Dataset


	* Size: 8,081,275 training samples
	* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
	* Approximate statistics based on the first 1000 samples:
	  |         | sentence1 | sentence2 | score |
	  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
	  | type    | string | string | float |
	  | details | <ul><li>min: 23 tokens</li><li>mean: 31.48 tokens</li><li>max: 40 tokens</li></ul> | <ul><li>min: 16 tokens</li><li>mean: 30.06 tokens</li><li>max: 55 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.44</li><li>max: 1.0</li></ul> |
	* Samples:
	  | sentence1 | sentence2 | score |
	  |:--------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------|:---------------------------------|
	  | <code>T L F DUMMY PELVIS VERT Dynamic Seat Sled Test 2025 U718 G-S</code> | <code>T SCS R2 HY REF 059 R C PLR REF Y SM LAT 90 Deg / Left Side Decel-4g 2020 CX483 G-S</code> | <code>0.21129386503072142</code> |
	  | <code>T L F DUMMY PELVIS VERT Dynamic Seat Sled Test 2025 U718 G-S</code> | <code>T R F DUMMY PELVIS VERT 75 Deg Oblique Right Side 10 in. Pole 2015 P552 G-S</code> | <code>0.4972955033248179</code> |
	  | <code>T L F DUMMY PELVIS VERT Dynamic Seat Sled Test 2025 U718 G-S</code> | <code>T SCS L1 HY REF 053 L B PLR REF Y SM LAT 90 Deg Front Bumper Override 2021 CX727 G-S</code> | <code>0.5701051768787058</code> |
	* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
	```json
	{
	    "scale": 20.0,
	    "similarity_fct": "pairwise_cos_sim"
	}
	```
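
To make the two parameters concrete: for a batch of pairs with predicted cosine similarities s_i and gold scores y_i, the CoSENT objective is log(1 + Σ exp(scale · (s_i − s_j))), summed over all i, j with y_i < y_j. Every case where a lower-scored pair receives a higher predicted similarity adds a large term. The sketch below follows that published formulation in pure Python; the function name `cosent_loss` and the toy numbers are assumptions for illustration, and this is not a copy of the library code.

```python
import math


def cosent_loss(similarities, labels, scale=20.0):
    """log(1 + sum of exp(scale * (s_i - s_j)) over all i, j with labels[i] < labels[j])."""
    terms = [
        scale * (s_i - s_j)
        for s_i, y_i in zip(similarities, labels)
        for s_j, y_j in zip(similarities, labels)
        if y_i < y_j
    ]
    # Numerically stable log-sum-exp; the leading 0.0 contributes the "1 +" term.
    terms = [0.0] + terms
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


labels = [0.2, 0.5, 0.9]
well_ordered = cosent_loss([0.1, 0.5, 0.9], labels)  # similarities follow the labels
mis_ordered = cosent_loss([0.9, 0.5, 0.1], labels)   # similarities reversed
print(well_ordered < mis_ordered)
```

A useful reference point: when all predicted similarities are equal, every term is exp(0) = 1, so the loss is log(1 + number of ordered label pairs). For 32 pairs with distinct labels (assuming the loss is computed over one device's batch of 32) that is log(1 + 32·31/2) = log(497) ≈ 6.21.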

	### Evaluation Dataset

	#### Unnamed Dataset


	* Size: 1,726,581 evaluation samples
	* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
	* Approximate statistics based on the first 1000 samples:
	  |         | sentence1 | sentence2 | score |
	  |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
	  | type    | string | string | float |
	  | details | <ul><li>min: 22 tokens</li><li>mean: 25.0 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 16 tokens</li><li>mean: 31.04 tokens</li><li>max: 53 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.44</li><li>max: 1.0</li></ul> |
	* Samples:
	  | sentence1 | sentence2 | score |
	  |:-------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------|:---------------------------------|
	  | <code>T R F ADAPTIVE TETHER VENT SQUIB VOLT 30 Deg Front Angular Right 20xx GENERIC VOLTS</code> | <code>T L F DUMMY T12 LONG 27 Deg Crabbed Left Side NHTSA 214 MDB to vehicle 2015 P552 G-S</code> | <code>0.6835618484879796</code> |
	  | <code>T R F ADAPTIVE TETHER VENT SQUIB VOLT 30 Deg Front Angular Right 20xx GENERIC VOLTS</code> | <code>T L F DUMMY R FEMUR LONG 90 Deg Front 2022 U553 G-S</code> | <code>0.666531064739</code> |
	  | <code>T R F ADAPTIVE TETHER VENT SQUIB VOLT 30 Deg Front Angular Right 20xx GENERIC VOLTS</code> | <code>T R F DUMMY NECK UPPER MZ LOAD 90 Deg Front 2019 P375ICA IN-LBS</code> | <code>0.46391834212079874</code> |
	* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
	```json
	{
	    "scale": 20.0,
	    "similarity_fct": "pairwise_cos_sim"
	}
	```

	### Training Hyperparameters
	#### Non-Default Hyperparameters

	- `per_device_train_batch_size`: 32
	- `per_device_eval_batch_size`: 32
	- `learning_rate`: 3e-05
	- `num_train_epochs`: 4
	- `warmup_ratio`: 0.1
	- `fp16`: True
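
These settings determine the number of optimizer steps and the shape of the learning-rate schedule. The sketch below reproduces the step counts that appear in the training logs (31,567 steps per epoch, 126,268 in total) under one loudly labeled assumption: that training ran on 8 devices. That figure is inferred from the dataset size and the logged steps and is not stated in this card. The `linear_lr` helper is a hypothetical illustration of a linear warmup followed by linear decay, and rounding the warmup step count up is likewise an assumption.

```python
import math

TRAIN_SAMPLES = 8_081_275      # training set size from this card
PER_DEVICE_BATCH = 32          # per_device_train_batch_size
NUM_DEVICES = 8                # ASSUMPTION: inferred from the logged step counts
EPOCHS = 4                     # num_train_epochs
WARMUP_RATIO = 0.1             # warmup_ratio
PEAK_LR = 3e-05                # learning_rate

# dataloader_drop_last is True, so the incomplete final batch is discarded.
steps_per_epoch = TRAIN_SAMPLES // (PER_DEVICE_BATCH * NUM_DEVICES)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step):
    """Linear warmup from 0 to PEAK_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


print(steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions the learning rate peaks roughly 40% of the way through the first epoch and then decays for the remainder of training.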

	#### All Hyperparameters
	<details><summary>Click to expand</summary>

	- `overwrite_output_dir`: False
	- `do_predict`: False
	- `prediction_loss_only`: True
	- `per_device_train_batch_size`: 32
	- `per_device_eval_batch_size`: 32
	- `per_gpu_train_batch_size`: None
	- `per_gpu_eval_batch_size`: None
	- `gradient_accumulation_steps`: 1
	- `eval_accumulation_steps`: None
	- `learning_rate`: 3e-05
	- `weight_decay`: 0.0
	- `adam_beta1`: 0.9
	- `adam_beta2`: 0.999
	- `adam_epsilon`: 1e-08
	- `max_grad_norm`: 1.0
	- `num_train_epochs`: 4
	- `max_steps`: -1
	- `lr_scheduler_type`: linear
	- `warmup_ratio`: 0.1
	- `warmup_steps`: 0
	- `log_level`: passive
	- `log_level_replica`: warning
	- `log_on_each_node`: True
	- `logging_nan_inf_filter`: True
	- `save_safetensors`: True
	- `save_on_each_node`: False
	- `no_cuda`: False
	- `use_cpu`: False
	- `use_mps_device`: False
	- `seed`: 42
	- `data_seed`: None
	- `jit_mode_eval`: False
	- `use_ipex`: False
	- `bf16`: False
	- `fp16`: True
	- `fp16_opt_level`: O1
	- `half_precision_backend`: auto
	- `bf16_full_eval`: False
	- `fp16_full_eval`: False
	- `tf32`: None
	- `local_rank`: 4
	- `ddp_backend`: None
	- `tpu_num_cores`: None
	- `tpu_metrics_debug`: False
	- `debug`: []
	- `dataloader_drop_last`: True
	- `dataloader_num_workers`: 0
	- `past_index`: -1
	- `disable_tqdm`: False
	- `remove_unused_columns`: True
	- `label_names`: None
	- `load_best_model_at_end`: False
	- `ignore_data_skip`: False
	- `fsdp`: []
	- `fsdp_min_num_params`: 0
	- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False}
	- `fsdp_transformer_layer_cls_to_wrap`: None
	- `deepspeed`: None
	- `label_smoothing_factor`: 0.0
	- `optim`: adamw_torch
	- `optim_args`: None
	- `adafactor`: False
	- `group_by_length`: False
	- `length_column_name`: length
	- `ddp_find_unused_parameters`: None
	- `ddp_bucket_cap_mb`: None
	- `ddp_broadcast_buffers`: False
	- `dataloader_pin_memory`: True
	- `skip_memory_metrics`: True
	- `use_legacy_prediction_loop`: False
	- `push_to_hub`: False
	- `resume_from_checkpoint`: None
	- `hub_model_id`: None
	- `hub_strategy`: every_save
	- `hub_private_repo`: False
	- `hub_always_push`: False
	- `gradient_checkpointing`: False
	- `gradient_checkpointing_kwargs`: None
	- `include_inputs_for_metrics`: False
	- `fp16_backend`: auto
	- `push_to_hub_model_id`: None
	- `push_to_hub_organization`: None
	- `mp_parameters`:
	- `auto_find_batch_size`: False
	- `full_determinism`: False
	- `torchdynamo`: None
	- `ray_scope`: last
	- `ddp_timeout`: 1800
	- `torch_compile`: False
	- `torch_compile_backend`: None
	- `torch_compile_mode`: None
	- `dispatch_batches`: None
	- `split_batches`: False
	- `include_tokens_per_second`: False
	- `neftune_noise_alpha`: None
	- `batch_sampler`: batch_sampler
	- `multi_dataset_batch_sampler`: proportional

	</details>

	### Training Logs
	<details><summary>Click to expand</summary>

	| Epoch | Step | Training Loss | Validation Loss | sts-dev_spearman_cosine |
	|:------:|:------:|:-------------:|:------:|:-----------------------:|
	| 0.0317 | 1000 | 6.3069 | - | - |
	| 0.0634 | 2000 | 6.1793 | - | - |
	| 0.0950 | 3000 | 6.1607 | - | - |
	| 0.1267 | 4000 | 6.1512 | - | - |
	| 0.1584 | 5000 | 6.1456 | - | - |
	| 0.1901 | 6000 | 6.1419 | - | - |
	| 0.2218 | 7000 | 6.1398 | - | - |
	| 0.2534 | 8000 | 6.1377 | - | - |
	| 0.2851 | 9000 | 6.1352 | - | - |
	| 0.3168 | 10000 | 6.1338 | - | - |
	| 0.3485 | 11000 | 6.1332 | - | - |
	| 0.3801 | 12000 | 6.1309 | - | - |
	| 0.4118 | 13000 | 6.1315 | - | - |
	| 0.4435 | 14000 | 6.1283 | - | - |
	| 0.4752 | 15000 | 6.129 | - | - |
	| 0.5069 | 16000 | 6.1271 | - | - |
	| 0.5385 | 17000 | 6.1265 | - | - |
	| 0.5702 | 18000 | 6.1238 | - | - |
	| 0.6019 | 19000 | 6.1234 | - | - |
	| 0.6336 | 20000 | 6.1225 | - | - |
	| 0.6653 | 21000 | 6.1216 | - | - |
	| 0.6969 | 22000 | 6.1196 | - | - |
	| 0.7286 | 23000 | 6.1198 | - | - |
	| 0.7603 | 24000 | 6.1178 | - | - |
	| 0.7920 | 25000 | 6.117 | - | - |
	| 0.8236 | 26000 | 6.1167 | - | - |
	| 0.8553 | 27000 | 6.1165 | - | - |
	| 0.8870 | 28000 | 6.1149 | - | - |
	| 0.9187 | 29000 | 6.1146 | - | - |
	| 0.9504 | 30000 | 6.113 | - | - |
	| 0.9820 | 31000 | 6.1143 | - | - |
	| 1.0 | 31567 | - | 6.1150 | 0.4829 |
	| 1.0137 | 32000 | 6.1115 | - | - |
	| 1.0454 | 33000 | 6.111 | - | - |
	| 1.0771 | 34000 | 6.1091 | - | - |
	| 1.1088 | 35000 | 6.1094 | - | - |
	| 1.1404 | 36000 | 6.1078 | - | - |
	| 1.1721 | 37000 | 6.1095 | - | - |
	| 1.2038 | 38000 | 6.106 | - | - |
	| 1.2355 | 39000 | 6.1071 | - | - |
	| 1.2671 | 40000 | 6.1073 | - | - |
	| 1.2988 | 41000 | 6.1064 | - | - |
	| 1.3305 | 42000 | 6.1047 | - | - |
	| 1.3622 | 43000 | 6.1054 | - | - |
	| 1.3939 | 44000 | 6.1048 | - | - |
	| 1.4255 | 45000 | 6.1053 | - | - |
	| 1.4572 | 46000 | 6.1058 | - | - |
	| 1.4889 | 47000 | 6.1037 | - | - |
	| 1.5206 | 48000 | 6.1041 | - | - |
	| 1.5523 | 49000 | 6.1023 | - | - |
	| 1.5839 | 50000 | 6.1018 | - | - |
	| 1.6156 | 51000 | 6.104 | - | - |
	| 1.6473 | 52000 | 6.1004 | - | - |
	| 1.6790 | 53000 | 6.1027 | - | - |
	| 1.7106 | 54000 | 6.1017 | - | - |
	| 1.7423 | 55000 | 6.1011 | - | - |
	| 1.7740 | 56000 | 6.1002 | - | - |
	| 1.8057 | 57000 | 6.0994 | - | - |
	| 1.8374 | 58000 | 6.0985 | - | - |
	| 1.8690 | 59000 | 6.0986 | - | - |
	| 1.9007 | 60000 | 6.1006 | - | - |
	| 1.9324 | 61000 | 6.0983 | - | - |
	| 1.9641 | 62000 | 6.0983 | - | - |
	| 1.9958 | 63000 | 6.0973 | - | - |
	| 2.0 | 63134 | - | 6.1193 | 0.4828 |
	| 2.0274 | 64000 | 6.0943 | - | - |
	| 2.0591 | 65000 | 6.0941 | - | - |
	| 2.0908 | 66000 | 6.0936 | - | - |
	| 2.1225 | 67000 | 6.0909 | - | - |
	| 2.1541 | 68000 | 6.0925 | - | - |
	| 2.1858 | 69000 | 6.0932 | - | - |
	| 2.2175 | 70000 | 6.0939 | - | - |
	| 2.2492 | 71000 | 6.0919 | - | - |
	| 2.2809 | 72000 | 6.0932 | - | - |
	| 2.3125 | 73000 | 6.0916 | - | - |
	| 2.3442 | 74000 | 6.0919 | - | - |
	| 2.3759 | 75000 | 6.0919 | - | - |
	| 2.4076 | 76000 | 6.0911 | - | - |
	| 2.4393 | 77000 | 6.0924 | - | - |
	| 2.4709 | 78000 | 6.0911 | - | - |
	| 2.5026 | 79000 | 6.0922 | - | - |
	| 2.5343 | 80000 | 6.0926 | - | - |
	| 2.5660 | 81000 | 6.0911 | - | - |
	| 2.5976 | 82000 | 6.0897 | - | - |
	| 2.6293 | 83000 | 6.0922 | - | - |
	| 2.6610 | 84000 | 6.0908 | - | - |
	| 2.6927 | 85000 | 6.0884 | - | - |
	| 2.7244 | 86000 | 6.0907 | - | - |
	| 2.7560 | 87000 | 6.0904 | - | - |
	| 2.7877 | 88000 | 6.0881 | - | - |
	| 2.8194 | 89000 | 6.0902 | - | - |
	| 2.8511 | 90000 | 6.088 | - | - |
	| 2.8828 | 91000 | 6.0888 | - | - |
	| 2.9144 | 92000 | 6.0884 | - | - |
	| 2.9461 | 93000 | 6.0881 | - | - |
	| 2.9778 | 94000 | 6.0896 | - | - |
	| 3.0 | 94701 | - | 6.1225 | 0.4788 |
	| 3.0095 | 95000 | 6.0857 | - | - |
	| 3.0412 | 96000 | 6.0838 | - | - |
	| 3.0728 | 97000 | 6.0843 | - | - |
	| 3.1045 | 98000 | 6.0865 | - | - |
	| 3.1362 | 99000 | 6.0827 | - | - |
	| 3.1679 | 100000 | 6.0836 | - | - |
	| 3.1995 | 101000 | 6.0837 | - | - |
	| 3.2312 | 102000 | 6.0836 | - | - |
	| 3.2629 | 103000 | 6.0837 | - | - |
	| 3.2946 | 104000 | 6.084 | - | - |
	| 3.3263 | 105000 | 6.0836 | - | - |
	| 3.3579 | 106000 | 6.0808 | - | - |
	| 3.3896 | 107000 | 6.0821 | - | - |
	| 3.4213 | 108000 | 6.0817 | - | - |
	| 3.4530 | 109000 | 6.082 | - | - |
	| 3.4847 | 110000 | 6.083 | - | - |
	| 3.5163 | 111000 | 6.0829 | - | - |
	| 3.5480 | 112000 | 6.0832 | - | - |
	| 3.5797 | 113000 | 6.0829 | - | - |
	| 3.6114 | 114000 | 6.0837 | - | - |
	| 3.6430 | 115000 | 6.082 | - | - |
	| 3.6747 | 116000 | 6.0823 | - | - |
	| 3.7064 | 117000 | 6.082 | - | - |
	| 3.7381 | 118000 | 6.0833 | - | - |
	| 3.7698 | 119000 | 6.0831 | - | - |
	| 3.8014 | 120000 | 6.0814 | - | - |
	| 3.8331 | 121000 | 6.0813 | - | - |
	| 3.8648 | 122000 | 6.0797 | - | - |
	| 3.8965 | 123000 | 6.0793 | - | - |
	| 3.9282 | 124000 | 6.0818 | - | - |
	| 3.9598 | 125000 | 6.0806 | - | - |
	| 3.9915 | 126000 | 6.08 | - | - |
	| 4.0 | 126268 | - | 6.1266 | 0.4671 |

	</details>
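
The four end-of-epoch evaluation rows in the log can be compared directly to see which epoch scored best on `sts-dev`. The sketch below copies those rows from the table above; the variable names are illustrative only.

```python
# (epoch, step, evaluation loss, sts-dev Spearman cosine), copied from the log above.
eval_rows = [
    (1.0, 31567, 6.1150, 0.4829),
    (2.0, 63134, 6.1193, 0.4828),
    (3.0, 94701, 6.1225, 0.4788),
    (4.0, 126268, 6.1266, 0.4671),
]

best = max(eval_rows, key=lambda row: row[3])
final = eval_rows[-1]
print(f"best epoch: {best[0]}, final epoch: {final[0]}, gap: {best[3] - final[3]:.4f}")
```

The evaluation loss rises at every epoch while the training loss keeps falling, which is consistent with mild overfitting. Because `load_best_model_at_end` is False, the published weights appear to correspond to the final step rather than to the best-scoring epoch.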

	### Framework Versions
	- Python: 3.10.6
	- Sentence Transformers: 3.0.0
	- Transformers: 4.35.0
	- PyTorch: 2.1.0a0+4136153
	- Accelerate: 0.30.1
	- Datasets: 2.14.1
	- Tokenizers: 0.14.1

	## Citation

	### BibTeX

	#### Sentence Transformers
	```bibtex
	@inproceedings{reimers-2019-sentence-bert,
	    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
	    author = "Reimers, Nils and Gurevych, Iryna",
	    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
	    month = "11",
	    year = "2019",
	    publisher = "Association for Computational Linguistics",
	    url = "https://arxiv.org/abs/1908.10084",
	}
	```

	#### CoSENTLoss
	```bibtex
	@online{kexuefm-8847,
	    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
	    author={Su Jianlin},
	    year={2022},
	    month={Jan},
	    url={https://kexue.fm/archives/8847},
	}
	```

	<!--
	## Glossary

	Clearly define terms in order to be accessible across audiences.
	-->

	<!--
	## Model Card Authors

	Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.
	-->

	<!--
	## Model Card Contact

	Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.
	-->