Add new SentenceTransformer model

Files changed:
- README.md: +152 -143
- model.safetensors: +1 -1

README.md (CHANGED):
@@ -7,7 +7,7 @@ tags:
 - feature-extraction
 - dense
 - generated_from_trainer
-- dataset_size:
 - loss:AnglELoss
 - loss:CoSENTLoss
 - loss:CachedMultipleNegativesRankingLoss
@@ -37,39 +37,41 @@ widget:
   \ pediatrician, or paediatrician. The word pediatrics and its cognates mean healer\
   \ of children; they derive from two Greek words: παῖς (pais child)\
   \ and ἰατρός (iatros doctor, healer)."
-- source_sentence:
   sentences:
-    songs are rarely performed by folk musicians .
-- After May 4 , 2012 , Gordon M. Snow was replaced by Joseph M. Demarest and then
-    Michael S. Welch with limited formal announcement .
-- source_sentence: A woman is playing the flute.
   sentences:
-  - A
-  - A
-  - A
-- source_sentence:
   sentences:
-  - The
   sentences:
-  - '
 datasets:
 - google-research-datasets/paws
 - nyu-mll/glue
@@ -151,12 +153,12 @@ from sentence_transformers import SentenceTransformer
 model = SentenceTransformer("tasksource/ettin-32m-embed")
 # Run inference
 queries = [
-    "
 ]
 documents = [
-    '
-    '
 ]
 query_embeddings = model.encode_query(queries)
 document_embeddings = model.encode_document(documents)
@@ -166,7 +168,7 @@ print(query_embeddings.shape, document_embeddings.shape)
 # Get the similarity scores for the embeddings
 similarities = model.similarity(query_embeddings, document_embeddings)
 print(similarities)
-# tensor([[ 0.
 ```

 <!--
@@ -213,19 +215,19 @@ You can finetune this model on your own dataset.
 #### paws/labeled_final

 * Dataset: [paws/labeled_final](https://huggingface.co/datasets/paws) at [161ece9](https://huggingface.co/datasets/paws/tree/161ece9501cf0a11f3e48bd356eaa82de46d6a09)
-* Size:
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
-  | | sentence1 | sentence2
-  | type | string | string
-  | details | <ul><li>min:
 * Samples:
-  | sentence1
-  | <code>
-  | <code>
-  | <code>
 * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
   ```json
   {
@@ -239,19 +241,19 @@ You can finetune this model on your own dataset.
 #### glue/mrpc

 * Dataset: [glue/mrpc](https://huggingface.co/datasets/glue) at [bcdcba7](https://huggingface.co/datasets/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
-* Size:
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
   | | sentence1 | sentence2 | label |
   |:--------|:---------|:---------|:------|
   | type | string | string | int |
-  | details | <ul><li>min:
 * Samples:
-  | sentence1
-  | <code>
-  | <code>
-  | <code>
 * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
   ```json
   {
@@ -265,19 +267,19 @@ You can finetune this model on your own dataset.
 #### fever-evidence-related

 * Dataset: [fever-evidence-related](https://huggingface.co/datasets/mwong/fever-evidence-related) at [14aba00](https://huggingface.co/datasets/mwong/fever-evidence-related/tree/14aba009b5fcd97b1a9ee6f3e3b0da0e308cf7cb)
-* Size:
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
   | | sentence1 | sentence2 | label |
   |:--------|:---------|:---------|:------|
   | type | string | string | int |
-  | details | <ul><li>min:
 * Samples:
-  | sentence1
-  | <code>
-  | <code>
-  | <code>
 * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
   ```json
   {
@@ -291,19 +293,19 @@ You can finetune this model on your own dataset.
 #### parade

 * Dataset: [parade](https://huggingface.co/datasets/tasksource/parade) at [466978f](https://huggingface.co/datasets/tasksource/parade/tree/466978f31aebf4d052287f32ea3ae393f178f386)
-* Size:
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
   | | sentence1 | sentence2 | label |
   |:--------|:---------|:---------|:------|
   | type | string | string | int |
-  | details | <ul><li>min: 6 tokens</li><li>mean: 21
 * Samples:
-  | sentence1
-  | <code>
-  | <code>
-  | <code>
 * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
   ```json
   {
@@ -317,19 +319,19 @@ You can finetune this model on your own dataset.
 #### apt

 * Dataset: [apt](https://huggingface.co/datasets/tasksource/apt) at [f6c07f6](https://huggingface.co/datasets/tasksource/apt/tree/f6c07f66d3eccebd36418885ce10aff295d436dd)
-* Size:
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
   | | sentence1 | sentence2 | label |
   |:--------|:---------|:---------|:------|
   | type | string | string | int |
-  | details | <ul><li>min: 4 tokens</li><li>mean: 17.
 * Samples:
-  | sentence1
-  | <code>
-  | <code>
-  | <code>
 * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
   ```json
   {
@@ -343,19 +345,19 @@ You can finetune this model on your own dataset.
 #### glue/stsb

 * Dataset: [glue/stsb](https://huggingface.co/datasets/glue) at [bcdcba7](https://huggingface.co/datasets/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
-* Size:
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
   | | sentence1 | sentence2 | label |
   |:--------|:---------|:---------|:------|
   | type | string | string | float |
-  | details | <ul><li>min: 6 tokens</li><li>mean:
 * Samples:
-  | sentence1
-  | <code>
-  | <code>
-  | <code>
 * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
   ```json
   {
@@ -369,19 +371,19 @@ You can finetune this model on your own dataset.
 #### sick/relatedness

 * Dataset: sick/relatedness
-* Size:
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
   | | sentence1 | sentence2 | label |
   |:--------|:---------|:---------|:------|
   | type | string | string | float |
-  | details | <ul><li>min: 6 tokens</li><li>mean: 12.
 * Samples:
-  | sentence1
-  | <code>A
-  | <code>
-  | <code>
 * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
   ```json
   {
@@ -395,19 +397,19 @@ You can finetune this model on your own dataset.
 #### sts-companion

 * Dataset: [sts-companion](https://huggingface.co/datasets/tasksource/sts-companion) at [fd8beff](https://huggingface.co/datasets/tasksource/sts-companion/tree/fd8beffb788df5f6673bc688e6dcbe3690a3acc6)
-* Size:
 * Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
 * Approximate statistics based on the first 1000 samples:
-  | | label | sentence1 | sentence2
-  | type | float | string | string
-  | details | <ul><li>min: 0.0</li><li>mean: 3.
 * Samples:
-  | label | sentence1
-  | <code>
-  | <code>3.
-  | <code>
 * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
   ```json
   {
@@ -421,19 +423,19 @@ You can finetune this model on your own dataset.
 #### zero-shot-label-nli

 * Dataset: [zero-shot-label-nli](https://huggingface.co/datasets/tasksource/zero-shot-label-nli) at [ee693db](https://huggingface.co/datasets/tasksource/zero-shot-label-nli/tree/ee693dba923b5d5484aa9232b7357c5e45dd39b8)
-* Size:
 * Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
 * Approximate statistics based on the first 1000 samples:
   | | label | sentence1 | sentence2 |
   |:--------|:------|:----------|:----------|
   | type | int | string | string |
-  | details | <ul><li>0: ~
 * Samples:
-  | label | sentence1
-  | <code>0</code> | <code>
-  | <code>
-  | <code>0</code> | <code>
 * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
   ```json
   {
@@ -726,11 +728,11 @@ You can finetune this model on your own dataset.
 ### Training Hyperparameters
 #### Non-Default Hyperparameters

-- `per_device_train_batch_size`:
-- `learning_rate`:
-- `weight_decay`:
 - `num_train_epochs`: 1
-- `warmup_ratio`: 0.
 - `fp16`: True
 - `gradient_checkpointing`: True
 - `torch_compile`: True
@@ -743,15 +745,15 @@ You can finetune this model on your own dataset.
 - `do_predict`: False
 - `eval_strategy`: no
 - `prediction_loss_only`: True
-- `per_device_train_batch_size`:
 - `per_device_eval_batch_size`: 8
 - `per_gpu_train_batch_size`: None
 - `per_gpu_eval_batch_size`: None
 - `gradient_accumulation_steps`: 1
 - `eval_accumulation_steps`: None
 - `torch_empty_cache_steps`: None
-- `learning_rate`:
-- `weight_decay`:
 - `adam_beta1`: 0.9
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
|
|
| 760 |
- `max_steps`: -1
|
| 761 |
- `lr_scheduler_type`: linear
|
| 762 |
- `lr_scheduler_kwargs`: {}
|
| 763 |
-
- `warmup_ratio`: 0.
|
| 764 |
- `warmup_steps`: 0
|
| 765 |
- `log_level`: passive
|
| 766 |
- `log_level_replica`: warning
|
|
@@ -864,38 +866,45 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch | Step | Training Loss |
 |:------:|:-----:|:-------------:|


 ### Framework Versions
 - feature-extraction
 - dense
 - generated_from_trainer
+- dataset_size:7176192
 - loss:AnglELoss
 - loss:CoSENTLoss
 - loss:CachedMultipleNegativesRankingLoss
   \ pediatrician, or paediatrician. The word pediatrics and its cognates mean healer\
   \ of children; they derive from two Greek words: παῖς (pais child)\
   \ and ἰατρός (iatros doctor, healer)."
+- source_sentence: Creek Township borders Elsinboro Township , Pennsville Township
+    and Salem .
   sentences:
+  - Today , Galesburg-Augusta Community Schools consists of a primary school and a
+    high school in Galesburg and a middle school in Augusta .
+  - Elsinboro Township borders with the Lower Alloways Creek Township , Pennsville
+    Township and Salem .
+  - In 1953 , he married the actress Gilda Neeltje , sister of the actress Diane Holland
     .
+- source_sentence: A man is riding on one wheel on a motorcycle.
   sentences:
+  - A person is performing tricks on a motorcycle.
+  - A boy jumping in the air on the beach.
+  - A woman is pouring ingredients into a frying pan.
+- source_sentence: '''Why don''t you find out?'
   sentences:
+  - He is suggesting that the lack of effort focusing on the concept is making it
+    seem unrealistic.
+  - The military stated that the 244th Engineer Battalion has been handling the construction
+    of playgrounds, cleaning up the rubble and restoring irrigation services in Iraq.
+  - Why you haven't find out?.
+- source_sentence: what are the three subatomic particles called?
   sentences:
+  - Subatomic particles include electrons, the negatively charged, almost massless
+    particles that nevertheless account for most of the size of the atom, and they
+    include the heavier building blocks of the small but very dense nucleus of the
+    atom, the positively charged protons and the electrically neutral neutrons.
+  - Your body needs cholesterol to build healthy cells, but high levels of cholesterol
+    can increase your risk of heart disease. With high cholesterol, you can develop
+    fatty deposits in your blood vessels. Eventually, these deposits grow, making
+    it difficult for enough blood to flow through your arteries.
+  - 'If you experience any of the following symptoms, stop taking ibuprofen and call
+    your doctor: stomach pain, heartburn, vomit that is bloody or looks like coffee
+    grounds, blood in the stool, or black and tarry stools. Keep all appointments
+    with your doctor and the laboratory.'
 datasets:
 - google-research-datasets/paws
 - nyu-mll/glue
 model = SentenceTransformer("tasksource/ettin-32m-embed")
 # Run inference
 queries = [
+    "what are the three subatomic particles called?",
 ]
 documents = [
+    'Subatomic particles include electrons, the negatively charged, almost massless particles that nevertheless account for most of the size of the atom, and they include the heavier building blocks of the small but very dense nucleus of the atom, the positively charged protons and the electrically neutral neutrons.',
+    'Your body needs cholesterol to build healthy cells, but high levels of cholesterol can increase your risk of heart disease. With high cholesterol, you can develop fatty deposits in your blood vessels. Eventually, these deposits grow, making it difficult for enough blood to flow through your arteries.',
+    'If you experience any of the following symptoms, stop taking ibuprofen and call your doctor: stomach pain, heartburn, vomit that is bloody or looks like coffee grounds, blood in the stool, or black and tarry stools. Keep all appointments with your doctor and the laboratory.',
 ]
 query_embeddings = model.encode_query(queries)
 document_embeddings = model.encode_document(documents)

 # Get the similarity scores for the embeddings
 similarities = model.similarity(query_embeddings, document_embeddings)
 print(similarities)
+# tensor([[ 0.6600, -0.0148, 0.0229]])
 ```
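The `model.similarity` call in the snippet above computes, by default, a cosine-similarity matrix between query and document embeddings. A minimal sketch of that computation, using tiny hand-made vectors as stand-ins for the real encoder outputs (the actual embeddings are much higher-dimensional):

```python
import math

def cosine_similarity(queries, documents):
    # Cosine-similarity matrix: one row per query, one column per document,
    # mirroring what SentenceTransformer.similarity computes by default.
    def norm(v):
        return math.sqrt(sum(x * x for x in v))
    return [
        [sum(a * b for a, b in zip(q, d)) / (norm(q) * norm(d)) for d in documents]
        for q in queries
    ]

# Toy 3-dimensional vectors as stand-ins for encode_query/encode_document output:
query_embeddings = [[1.0, 0.0, 0.0]]
document_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
scores = cosine_similarity(query_embeddings, document_embeddings)
# The first document is identical to the query, so it scores 1.0;
# the orthogonal second document scores 0.0.
print(scores)
```

The highest-scoring column per row identifies the best-matching document, which is how the `tensor([[ 0.6600, -0.0148, 0.0229]])` output above picks out the subatomic-particles passage.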
<!--
 #### paws/labeled_final

 * Dataset: [paws/labeled_final](https://huggingface.co/datasets/paws) at [161ece9](https://huggingface.co/datasets/paws/tree/161ece9501cf0a11f3e48bd356eaa82de46d6a09)
+* Size: 148,203 training samples
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
+  | | sentence1 | sentence2 | label |
+  |:--------|:---------|:---------|:------|
+  | type | string | string | int |
+  | details | <ul><li>min: 11 tokens</li><li>mean: 27.65 tokens</li><li>max: 57 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 27.73 tokens</li><li>max: 57 tokens</li></ul> | <ul><li>0: ~57.50%</li><li>1: ~42.50%</li></ul> |
 * Samples:
+  | sentence1 | sentence2 | label |
+  |:----------|:----------|:------|
+  | <code>Ceremonial music ( `` rokon fada '' ) is listed as a status symbol , and musicians are generally chosen for political reasons as opposed to musical ones .</code> | <code>Ceremonial music ( `` rokon fada '' ) is performed as a status symbol , and musicians are generally chosen for musical reasons as opposed to political ones .</code> | <code>0</code> |
+  | <code>In 1989 he travelled to South Africa , Johannesburg and Angola , Mozambique on a peace-seeking mission .</code> | <code>In 1989 , he traveled to Mozambique , Johannesburg , and Angola , South Africa on a peace-seeking mission .</code> | <code>1</code> |
+  | <code>In this way , the Nestorian faith was established in the East under tragic signs .</code> | <code>In this way , under Nestorian auspices , the tragic faith was established in the East .</code> | <code>0</code> |
 * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
   ```json
   {
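The "Approximate statistics" rows above summarize each column over the first 1000 samples as min/mean/max token counts. Those numbers come from the model's own tokenizer; as a minimal sketch of the computation, using whitespace splitting as a stand-in tokenizer:

```python
def token_stats(sentences):
    # Min/mean/max token counts over a sample of sentences.
    # Whitespace splitting is an approximation: the card's figures use the
    # model's subword tokenizer, so real counts will differ.
    counts = [len(s.split()) for s in sentences]
    return {"min": min(counts), "mean": sum(counts) / len(counts), "max": max(counts)}

sample = [
    "A woman is playing the flute.",
    "A man is riding on one wheel on a motorcycle.",
]
print(token_stats(sample))  # {'min': 6, 'mean': 8.0, 'max': 10}
```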
 #### glue/mrpc

 * Dataset: [glue/mrpc](https://huggingface.co/datasets/glue) at [bcdcba7](https://huggingface.co/datasets/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
+* Size: 11,004 training samples
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
   | | sentence1 | sentence2 | label |
   |:--------|:---------|:---------|:------|
   | type | string | string | int |
+  | details | <ul><li>min: 11 tokens</li><li>mean: 27.23 tokens</li><li>max: 52 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 27.29 tokens</li><li>max: 53 tokens</li></ul> | <ul><li>0: ~33.10%</li><li>1: ~66.90%</li></ul> |
 * Samples:
+  | sentence1 | sentence2 | label |
+  |:----------|:----------|:------|
+  | <code>Tony Blair has taken a hardline stance arguing nothing should be done to lessen the pressure on Mugabe at the gathering in the capital Abuja .</code> | <code>The Prime Minister has taken a hardline stance arguing nothing should be done to lessen the pressure on Mugabe .</code> | <code>0</code> |
+  | <code>The identical rovers will act as robotic geologists , searching for evidence of past water .</code> | <code>The rovers act as robotic geologists , moving on six wheels .</code> | <code>0</code> |
+  | <code>" We make no apologies for finding every legal way possible to protect the American public from further terrorist attack , " Barbara Comstock said .</code> | <code>" We make no apologies for finding every legal way possible to protect the American public from further terrorist attacks , " said Barbara Comstock , Ashcroft 's press secretary .</code> | <code>1</code> |
 * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
   ```json
   {
 #### fever-evidence-related

 * Dataset: [fever-evidence-related](https://huggingface.co/datasets/mwong/fever-evidence-related) at [14aba00](https://huggingface.co/datasets/mwong/fever-evidence-related/tree/14aba009b5fcd97b1a9ee6f3e3b0da0e308cf7cb)
+* Size: 800,000 training samples
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
   | | sentence1 | sentence2 | label |
   |:--------|:---------|:---------|:------|
   | type | string | string | int |
+  | details | <ul><li>min: 7 tokens</li><li>mean: 13.65 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 28 tokens</li><li>mean: 318.06 tokens</li><li>max: 1024 tokens</li></ul> | <ul><li>0: ~30.20%</li><li>1: ~69.80%</li></ul> |
 * Samples:
+  | sentence1 | sentence2 | label |
+  |:----------|:----------|:------|
+  | <code>Batman: The Killing Joke features characters.</code> | <code>notice. Cantonese Pinyin -LRB- , also known as 教院式拼音方案 -RRB- is a romanization system for Cantonese developed by Rev. Yu Ping Chiu in 1971 , and subsequently modified by the Education Department -LRB- merged into the Education and Manpower Bureau since 2003 -RRB- of Hong Kong and Prof. Zhan Bohui of the Chinese Dialects Research Centre of the Jinan University , Guangdong , PRC , and honorary professor of the School of Chinese , University of Hong Kong .. romanization. romanization. Cantonese. Cantonese. Education and Manpower Bureau. Education and Manpower Bureau. Zhan Bohui. Zhan Bohui. It is the only romanization system accepted by Education and Manpower Bureau of Hong Kong and Hong Kong Examinations and Assessment Authority .. romanization. romanization. Education and Manpower Bureau. Education and Manpower Bureau. Hong Kong Examinations and Assessment Authority. Hong Kong Examinations and Assessment Authority. The formal and short forms of the system 's Chinese names mean respectiv...</code> | <code>1</code> |
+  | <code>Jon Snow is played by a person.</code> | <code>Cao'an is a temple in Jinjiang , Fujian .. Originally constructed by Chinese Manicheans , it was viewed by later worshipers as a Buddhist temple .. Manicheans. Manichaeism. This `` Manichean temple in Buddhist disguise ''. is seen by modern experts on Manichaeism as `` the only extant Manichean temple in China '' , or `` the only Manichean building which has survived intact '' .</code> | <code>1</code> |
+  | <code>Scotland includes islands.</code> | <code>Scotland -LRB- -LSB- ˈskɒt.lənd -RSB- Scots : -LSB- - scoˈskɔt.lənd -RSB- Alba -LSB- ˈalˠapə -RSB- -RRB- is a country that is part of the United Kingdom and covers the northern third of the island of Great Britain .. Scots. Scots language. Scotland. Scots Law. Alba. Alba. country. country. part. Countries of the United Kingdom. United Kingdom. United Kingdom. Great Britain. Great Britain. It shares a border with England to the south , and is otherwise surrounded by the Atlantic Ocean , with the North Sea to the east and the North Channel and Irish Sea to the south-west .. England. England. Atlantic Ocean. Atlantic Ocean. North Sea. North Sea. North Channel. North Channel ( British Isles ). Irish Sea. Irish Sea. In addition to the mainland , the country is made up of more than 790 islands , including the Northern Isles and the Hebrides .. country. country. Northern Isles. Northern Isles. Hebrides. Hebrides. The Kingdom of Scotland emerged as an independent sovereign state in the Early ...</code> | <code>0</code> |
 * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
   ```json
   {
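The label columns in the statistics tables above are summarized as percentages over the sampled rows (e.g. `0: ~30.20%`, `1: ~69.80%`). A small helper reproducing that summary:

```python
from collections import Counter

def label_distribution(labels):
    # Share of each label over the sample, rounded to two decimals,
    # matching the card's "0: ~30.20%" style.
    counts = Counter(labels)
    return {label: round(100 * n / len(labels), 2) for label, n in sorted(counts.items())}

print(label_distribution([0, 1, 1, 1, 0]))  # {0: 40.0, 1: 60.0}
```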
 #### parade

 * Dataset: [parade](https://huggingface.co/datasets/tasksource/parade) at [466978f](https://huggingface.co/datasets/tasksource/parade/tree/466978f31aebf4d052287f32ea3ae393f178f386)
+* Size: 22,650 training samples
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
   | | sentence1 | sentence2 | label |
   |:--------|:---------|:---------|:------|
   | type | string | string | int |
+  | details | <ul><li>min: 6 tokens</li><li>mean: 22.21 tokens</li><li>max: 61 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 21.48 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>0: ~54.80%</li><li>1: ~45.20%</li></ul> |
 * Samples:
+  | sentence1 | sentence2 | label |
+  |:----------|:----------|:------|
+  | <code>access to device itself application specific data (network services, dns, html, http, etc)</code> | <code>(upper layer data)facilitates communication between such programs and lower-layer network services. high-level apis, including resource sharing, remote file access.</code> | <code>0</code> |
+  | <code>an important element of information management, but it is just one part of a larger whole</code> | <code>converting facts and figures into useful information</code> | <code>0</code> |
+  | <code>web site that has a field for you to type in a search query, as it will search the internet for you using your search criteria.</code> | <code>web-based search tool that locates a web page using a keyword</code> | <code>1</code> |
 * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
   ```json
   {
 #### apt

 * Dataset: [apt](https://huggingface.co/datasets/tasksource/apt) at [f6c07f6](https://huggingface.co/datasets/tasksource/apt/tree/f6c07f66d3eccebd36418885ce10aff295d436dd)
+* Size: 10,047 training samples
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
   | | sentence1 | sentence2 | label |
   |:--------|:---------|:---------|:------|
   | type | string | string | int |
+  | details | <ul><li>min: 4 tokens</li><li>mean: 17.32 tokens</li><li>max: 213 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 16.46 tokens</li><li>max: 121 tokens</li></ul> | <ul><li>0: ~35.80%</li><li>1: ~64.20%</li></ul> |
 * Samples:
+  | sentence1 | sentence2 | label |
+  |:----------|:----------|:------|
+  | <code>Watch out.</code> | <code>U.S. Bank</code> | <code>0</code> |
+  | <code>Oh! we spent all night, used all the fancy machines.</code> | <code>We spent all night using the luxurious equipment.</code> | <code>1</code> |
+  | <code>I'm willing to give you all this information...</code> | <code>This information, all of it, I'm inclined to provide you...</code> | <code>1</code> |
 * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
   ```json
   {
#### glue/stsb

* Dataset: [glue/stsb](https://huggingface.co/datasets/glue) at [bcdcba7](https://huggingface.co/datasets/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
* Size: 17,247 training samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type    | string | string | float |
  | details | <ul><li>min: 6 tokens</li><li>mean: 14.68 tokens</li><li>max: 57 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 14.84 tokens</li><li>max: 68 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 2.64</li><li>max: 5.0</li></ul> |
* Samples:
  | sentence1 | sentence2 | label |
  |:----------|:----------|:------|
  | <code>Mandela's condition has 'improved'</code> | <code>Mandela's condition has 'worsened over past 48 hours'</code> | <code>1.0</code> |
  | <code>the cfe is very important for european security.</code> | <code>the cfe is a cornerstone of european security.</code> | <code>5.0</code> |
  | <code>The Nasdaq fell about 1.3% for the month, snapping a seven-month winning streak.</code> | <code>The Nasdaq is down roughly 0.4 percent for the month, on track to snap a 7-month streak of gains.</code> | <code>2.4000000953674316</code> |
* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "pairwise_cos_sim"
  }
  ```
#### sick/relatedness

* Dataset: sick/relatedness
* Size: 13,317 training samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type    | string | string | float |
  | details | <ul><li>min: 6 tokens</li><li>mean: 12.25 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 12.11 tokens</li><li>max: 38 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 3.51</li><li>max: 5.0</li></ul> |
* Samples:
  | sentence1 | sentence2 | label |
  |:----------|:----------|:------|
  | <code>A cold cyclist is celebrating</code> | <code>A bike is being held over his head by a bicyclist in a group of people</code> | <code>2.299999952316284</code> |
  | <code>Nobody is cutting a capsicum into pieces</code> | <code>The person is slicing a clove of garlic into pieces</code> | <code>3.0999999046325684</code> |
  | <code>A woman is not cutting shrimps</code> | <code>A man is chopping butter into a container</code> | <code>1.7999999523162842</code> |
* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "pairwise_cos_sim"
  }
  ```
#### sts-companion

* Dataset: [sts-companion](https://huggingface.co/datasets/tasksource/sts-companion) at [fd8beff](https://huggingface.co/datasets/tasksource/sts-companion/tree/fd8beffb788df5f6673bc688e6dcbe3690a3acc6)
* Size: 14,280 training samples
* Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
* Approximate statistics based on the first 1000 samples:
  |         | label | sentence1 | sentence2 |
  |:--------|:------|:----------|:----------|
  | type    | float | string | string |
  | details | <ul><li>min: 0.0</li><li>mean: 3.13</li><li>max: 5.0</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 18.95 tokens</li><li>max: 91 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 17.55 tokens</li><li>max: 269 tokens</li></ul> |
* Samples:
  | label | sentence1 | sentence2 |
  |:------|:----------|:----------|
  | <code>4.2</code> | <code>I am calling BS!!! NYTimes: Morsi Says His Slurs of Jews Were Taken Out of Context</code> | <code>Morsi Says Slurs of Jews Were Taken Out of Context</code> |
  | <code>3.0</code> | <code>The driver of the coach tried to avoid it by swerving hard, but still grazed the right side of the lorry.</code> | <code>The driver of the last to try to avoid it through a sudden move, but he fell short by his right side.</code> |
  | <code>5.0</code> | <code>create a mess or disorder</code> | <code>make a mess of or create disorder in.</code> |
* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "pairwise_cos_sim"
  }
  ```
#### zero-shot-label-nli

* Dataset: [zero-shot-label-nli](https://huggingface.co/datasets/tasksource/zero-shot-label-nli) at [ee693db](https://huggingface.co/datasets/tasksource/zero-shot-label-nli/tree/ee693dba923b5d5484aa9232b7357c5e45dd39b8)
* Size: 1,090,333 training samples
* Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
* Approximate statistics based on the first 1000 samples:
  |         | label | sentence1 | sentence2 |
  |:--------|:------|:----------|:----------|
  | type    | int | string | string |
  | details | <ul><li>0: ~50.70%</li><li>1: ~49.30%</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 68.51 tokens</li><li>max: 1024 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 7.95 tokens</li><li>max: 17 tokens</li></ul> |
* Samples:
  | label | sentence1 | sentence2 |
  |:------|:----------|:----------|
  | <code>0</code> | <code>Amrozi accused his brother , whom he called " the witness " , of deliberately distorting his evidence .<br>Referring to him as only " the witness " , Amrozi accused his brother of deliberately distorting his evidence .</code> | <code>This example is not_equivalent.</code> |
  | <code>1</code> | <code>Do science and religion conflict with each other?<br>Does science conflict with the Bible?</code> | <code>This example is not_duplicate.</code> |
  | <code>0</code> | <code>do iran and afghanistan speak the same language</code> | <code>This example is False.</code> |
* Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "pairwise_angle_sim"
  }
  ```
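The zero-shot-label-nli samples pair an input text with hypothesis sentences of the form "This example is &lt;label&gt;." — training on such pairs lets the embeddings score arbitrary candidate labels at inference time. A minimal sketch of that usage pattern, with a hypothetical word-overlap function standing in for the model's actual embedding similarity:

```python
def label_hypotheses(labels, template="This example is {}."):
    """Turn candidate class labels into hypothesis sentences in the
    zero-shot-label-nli format (e.g. "This example is positive.")."""
    return [template.format(label) for label in labels]

def classify(premise, labels, similarity):
    """Pick the label whose hypothesis scores highest against the premise.

    `similarity` is any callable scoring a (premise, hypothesis) pair --
    in practice, cosine similarity of this model's embeddings.
    """
    hyps = label_hypotheses(labels)
    scores = [similarity(premise, h) for h in hyps]
    return labels[max(range(len(labels)), key=scores.__getitem__)]

# Stand-in similarity (word overlap), just to exercise the plumbing.
def overlap(a, b):
    wa = set(a.lower().replace(".", "").split())
    wb = set(b.lower().replace(".", "").split())
    return len(wa & wb)

pred = classify("This movie was positive and great.",
                ["positive", "negative"], overlap)
```

The template string and helper names here are illustrative, not part of the dataset or library API.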
### Training Hyperparameters
#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 360
- `learning_rate`: 8e-05
- `weight_decay`: 5e-05
- `num_train_epochs`: 1
- `warmup_ratio`: 0.03
- `fp16`: True
- `gradient_checkpointing`: True
- `torch_compile`: True
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 360
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 8e-05
- `weight_decay`: 5e-05
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.03
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
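Since `warmup_steps` is 0, the scheduler derives the warmup length from `warmup_ratio: 0.03` and the total optimizer step count. A rough sketch of that arithmetic, assuming the roughly 19,900 single-epoch steps implied by the training log below and the usual `transformers` ceiling rule (both are inferences, not values reported by this card):

```python
import math

def warmup_steps(total_steps, warmup_ratio=0.03):
    # transformers derives warmup from the ratio when warmup_steps == 0,
    # rounding up to a whole optimizer step.
    return math.ceil(warmup_ratio * total_steps)

# The log ends near step 19,500 at epoch ~0.9778, so approximately
# 19,500 / 0.9778 ~= 19,943 optimizer steps in the single epoch.
total = round(19500 / 0.9778)
warmup = warmup_steps(total)  # ~600 warmup steps at ratio 0.03
```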
### Training Logs
| Epoch  | Step  | Training Loss |
|:------:|:-----:|:-------------:|
| 0.0251 | 500   | 5.0537        |
| 0.0501 | 1000  | 3.6206        |
| 0.0752 | 1500  | 3.249         |
| 0.1003 | 2000  | 3.5885        |
| 0.1254 | 2500  | 3.2479        |
| 0.1504 | 3000  | 3.2033        |
| 0.1755 | 3500  | 2.7123        |
| 0.2006 | 4000  | 2.8247        |
| 0.2257 | 4500  | 2.7694        |
| 0.2507 | 5000  | 3.0215        |
| 0.2758 | 5500  | 2.6723        |
| 0.3009 | 6000  | 2.8297        |
| 0.3259 | 6500  | 2.4046        |
| 0.3510 | 7000  | 2.2289        |
| 0.3761 | 7500  | 2.4628        |
| 0.4012 | 8000  | 2.4032        |
| 0.4262 | 8500  | 2.5024        |
| 0.4513 | 9000  | 2.0948        |
| 0.4764 | 9500  | 2.4389        |
| 0.5015 | 10000 | 2.4771        |
| 0.5265 | 10500 | 2.6465        |
| 0.5516 | 11000 | 2.5892        |
| 0.5767 | 11500 | 2.3557        |
| 0.6017 | 12000 | 2.2359        |
| 0.6268 | 12500 | 2.5839        |
| 0.6519 | 13000 | 2.4216        |
| 0.6770 | 13500 | 2.3211        |
| 0.7020 | 14000 | 2.1171        |
| 0.7271 | 14500 | 2.1206        |
| 0.7522 | 15000 | 2.2557        |
| 0.7773 | 15500 | 2.2815        |
| 0.8023 | 16000 | 2.0951        |
| 0.8274 | 16500 | 2.3415        |
| 0.8525 | 17000 | 2.2792        |
| 0.8775 | 17500 | 2.3113        |
| 0.9026 | 18000 | 2.1932        |
| 0.9277 | 18500 | 2.1134        |
| 0.9528 | 19000 | 1.9995        |
| 0.9778 | 19500 | 1.8916        |
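Read as data, the logged training loss falls from 5.05 at step 500 to 1.89 at step 19,500, with some noise in between from the mixed multi-dataset batches. A small sketch of summarizing that trend from the first, middle, and last rows of the log:

```python
# (epoch, step, training loss) rows copied from the training log.
rows = [
    (0.0251, 500, 5.0537),
    (0.5015, 10000, 2.4771),
    (0.9778, 19500, 1.8916),
]
start, mid, end = (loss for _, _, loss in rows)
# Relative reduction in training loss over the epoch (~63%).
drop = (start - end) / start
```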
### Framework Versions
model.safetensors CHANGED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:3d9729ed5a375cb33fdfe9941bf4032235f8e37c6b27fa88b752ff736b85616b
 size 127538496
```
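The `model.safetensors` change above is a Git LFS pointer update: the repository stores a three-line pointer (spec version, SHA-256 of the real file, byte size) rather than the ~127 MB weights themselves. A minimal sketch of how such a pointer text is derived from a file's contents:

```python
import hashlib

def lfs_pointer(data: bytes) -> str:
    """Build the Git LFS pointer text stored in-repo for a large file,
    in the same three-line format as the model.safetensors pointer."""
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )

# Toy payload; a real pointer would hash the full weights file.
pointer = lfs_pointer(b"example weights")
```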