thebajajra committed (verified)
Commit be384f8 · 1 Parent(s): 1c7a511

Upload folder using huggingface_hub
README.md CHANGED
@@ -4,62 +4,140 @@ tags:
4
  - sentence-similarity
5
  - feature-extraction
6
  - dense
7
- - ecommerce
8
- - e-commerce
9
- - retail
10
- - marketplace
11
- - shopping
12
- - amazon
13
- - ebay
14
- - alibaba
15
- - google
16
- - rakuten
17
- - bestbuy
18
- - walmart
19
- - flipkart
20
- - wayfair
21
- - shein
22
- - target
23
- - etsy
24
- - shopify
25
- - taobao
26
- - asos
27
- - carrefour
28
- - costco
29
- - overstock
30
- - pretraining
31
- - encoder
32
- - language-modeling
33
- - foundation-model
 
34
  pipeline_tag: sentence-similarity
35
  library_name: sentence-transformers
36
  ---
37
 
38
- # RexBERT-base-embed-pf-v0.3
39
 
40
  ## Model Details
41
 
42
  ### Model Description
43
  - **Model Type:** Sentence Transformer
44
- <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
45
- - **Maximum Sequence Length:** 2048 tokens
46
  - **Output Dimensionality:** 768 dimensions
47
  - **Similarity Function:** Cosine Similarity
48
- <!-- - **Training Dataset:** Unknown -->
 
49
  <!-- - **Language:** Unknown -->
50
  <!-- - **License:** Unknown -->
51
 
52
  ### Model Sources
53
 
54
  - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
55
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
56
  - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
57
 
58
  ### Full Model Architecture
59
 
60
  ```
61
  SentenceTransformer(
62
- (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
63
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
64
  )
65
  ```
@@ -82,9 +160,9 @@ from sentence_transformers import SentenceTransformer
82
  model = SentenceTransformer("sentence_transformers_model_id")
83
  # Run inference
84
  sentences = [
85
- 'The weather is lovely today.',
86
- "It's so sunny outside!",
87
- 'He drove to the stadium.',
88
  ]
89
  embeddings = model.encode(sentences)
90
  print(embeddings.shape)
@@ -93,9 +171,9 @@ print(embeddings.shape)
93
  # Get the similarity scores for the embeddings
94
  similarities = model.similarity(embeddings, embeddings)
95
  print(similarities)
96
- # tensor([[0.9961, 0.8477, 0.8750],
97
- # [0.8477, 0.9961, 0.8047],
98
- # [0.8750, 0.8047, 1.0078]], dtype=torch.bfloat16)
99
  ```
100
 
101
  <!--
@@ -136,19 +214,767 @@ You can finetune this model on your own dataset.
136
 
137
  ## Training Details
138
 
139
  ### Framework Versions
140
- - Python: 3.12.8
141
- - Sentence Transformers: 5.1.1
142
- - Transformers: 4.53.3
143
- - PyTorch: 2.7.0
144
- - Accelerate: 1.10.0
145
- - Datasets: 3.6.0
146
- - Tokenizers: 0.21.4
147
 
148
  ## Citation
149
 
150
  ### BibTeX
151
 
152
  <!--
153
  ## Glossary
154
 
 
4
  - sentence-similarity
5
  - feature-extraction
6
  - dense
7
+ - generated_from_trainer
8
+ - dataset_size:157352076
9
+ - loss:MultipleNegativesRankingLoss
10
+ base_model: thebajajra/RexBERT-base-embed-pf-v0.2
11
+ widget:
12
+ - source_sentence: Two men perform repairs on an orange elevator.
13
+ sentences:
14
+ - Morale, Welfare & Recreation Services. MWR is a military acronym that stands for
15
+ Morale, Welfare and Recreation. The term is given to a complete range of community
16
+ support and quality of life programs for members of the Armed Forces, their families,
17
+ and retirees at more than 2,000 facilities on U.S. military bases throughout the
18
+ world.
19
+ - The people are sleeping in the snow.
20
+ - The men repair the elevator.
21
+ - source_sentence: the average of the qual record of all rows is 141.562 .
22
+ sentences:
23
+ - Confidence votes 133. There are codes in some areas for minimum railing height,
24
+ but it can vary according to your location. Generally, most areas require at least
25
+ 36 height above the deck. Generally, most railings are from 38-42 high. Many builders
26
+ prefer 42 heights and up to 52 for deck levels that are a greater distance from
27
+ the ground.
28
+ - "Article: Monday: Here I am, in the middle of nowhere. This camping trip idea\
29
+ \ is not getting off to a very good start. It's raining and the tent leaks .\
30
+ \ The hiking seemed to take forever, and I still can't understand how it could\
31
+ \ all have been up hill! How did I ever let my brother persuade me into doing\
32
+ \ this? When we get home--if we ever get home--he's going to have to do something\
33
+ \ great to get back on my good side. Maybe he should sponsor a shopping spree\
34
+ \ at the mall! Tuesday: Things are looking up. The sun came out today, so we were\
35
+ \ able to leave the tents and dry out. We're camped at the edge of a small lake\
36
+ \ that I couldn't see before because of the rain and fog. The mountains are all\
37
+ \ around us, and the forest is absolutely beautiful. We spent most of the day\
38
+ \ dragging out everything out of our backpacks or tents and putting it where the\
39
+ \ sun could dry it out. Later in the afternoon we tried to catch the fish for\
40
+ \ dinner, but the fish were smarter than we were. At night we built a fire and\
41
+ \ sang songs happily. Wednesday: We hiked to the far side of the lake and climbed\
42
+ \ to the top of a small peak. From there we could see how high the other mountains\
43
+ \ were and how far the forest spread around us. On the way up we passed through\
44
+ \ a snowfield! Thursday: I caught my first fish! We followed the stream that fed\
45
+ \ the lake. After about two miles, we came to a section that Carol said looked\
46
+ \ \"fishy\". She had a pack rod , which can be carried in a backpack. I asked\
47
+ \ to cast it, and I caught a fish on my first try. Carol caught a few more.\
48
+ \ But they were just too pretty to eat for lunch, so we put them back in the stream.\
49
+ \ Friday: I can't believe we are going home already. It will be nice to get a\
50
+ \ hot shower, sleep in a real bed, and eat junk food, but the trip has been wonderful.\
51
+ \ We're already talking about another camping adventure next year where we canoe\
52
+ \ down a river. It's hard to believe, but I think this city girl has a little\
53
+ \ country blood in her veins. \n Answer: she was tired of staying home."
54
+ - the average of the total record of all rows is 145.5 .
55
+ - source_sentence: what state is delaware located
56
+ sentences:
57
+ - '1 Delaware State Tree The state tree of Delaware is the American holly. 2 At
58
+ one time, the tree grew in great abundance in the state and therefore, it was
59
+ adopted as the state tree. 3 As you prepare y…. 4 Europe on a Budget: 21 Free
60
+ Walking Tours in Europe Walking tours can be a great way to get to know a new
61
+ city.'
62
+ - 'They are identical to Tables 1, 2 and 3 '
63
+ - Location of state of Delaware within United States. Delaware is a state found
64
+ in the nation of United States. Home to 897,934 people, it is the 46th largest
65
+ division in United States in terms of population. Delaware gained its current
66
+ status as a state in the year 1787.
67
+ - source_sentence: what is the largest navy base in the us
68
+ sentences:
69
+ - The hope is that former-Soviet bloc host countries will be more amenable to U.S.
70
+ bases than other hosts in old Europe and be less likely to block their use in
71
+ a time of conflict. (U.S. Navy) Diego Garcia, British Indian Ocean Territory.
72
+ - 'As of June 2015, the largest military bases on American soil are: Fort Hood,
73
+ TX, Camp Lejeune, NC, Camp Pendleton, CA, Fort Lewis-McCord, WA, Fort Dix-McGuire,
74
+ NJ, Fort Campbell, KY, Norfolk Navy Base, VA, Eglin AFB, FL, Fort Bragg, NC. Fort
75
+ Benning, GA.'
76
+ - 'Debian Debian ( -LSB- ˈdɛbiən -RSB- ) is a Unix-like computer operating system
77
+ that is composed entirely of free software , most of which is under the GNU General
78
+ Public License and packaged by a group of individuals participating in the Debian
79
+ Project . The Debian Project was first announced in 1993 by Ian Murdock , Debian
80
+ 0.01 was released on September 15 , 1993 , and the first stable release was made
81
+ in 1996 . The Debian stable release branch is one of the most popular for personal
82
+ computers and network servers , and has been used as a base for many other distributions
83
+ . The project ''s work is carried out over the Internet by a team of volunteers
84
+ guided by the Debian Project Leader and three foundational documents : the Debian
85
+ Social Contract , the Debian Constitution , and the Debian Free Software Guidelines
86
+ . New distributions are updated continually , and the next candidate is released
87
+ after a time-based freeze . As one of the earliest operating systems based on
88
+ the Linux kernel , it was decided that Debian was to be developed openly and freely
89
+ distributed in the spirit of the GNU Project . This decision drew the attention
90
+ and support of the Free Software Foundation , which sponsored the project for
91
+ one year from November 1994 to November 1995 . Upon the ending of the sponsorship
92
+ , the Debian Project formed the non-profit organisation Software in the Public
93
+ Interest . While Debian ''s main port , Debian GNU/Linux , uses the Linux kernel
94
+ and GNU programs , other ports exist based on BSD kernels and the GNU HURD microkernel
95
+ . All use the GNU userland and the GNU C library ( glibc ) .'
96
+ - source_sentence: weather. in long beach
97
+ sentences:
98
+ - 'Long Beach, CA - Weather forecast from Theweather.com. Weather conditions with
99
+ updates on temperature, humidity, wind speed, snow, pressure, etc. for Long Beach,
100
+ California Today: Sunny intervals, with a maximum temperature of 57° and a minimum
101
+ temperature of 46°.'
102
+ - The church resembles those built in the Roman style and is bright inside.
103
+ - The unemployment rate in Long Beach, California, is 5.70%, with job growth of
104
+ 1.37%. Future job growth over the next ten years is predicted to be 37.05%. Long
105
+ Beach, California Taxes. Long Beach, California,sales tax rate is 9.00%. Income
106
+ tax is 8.00%.
107
+ datasets:
108
+ - thebajajra/hard-negative-triplets
109
  pipeline_tag: sentence-similarity
110
  library_name: sentence-transformers
111
  ---
112
 
113
+ # SentenceTransformer based on thebajajra/RexBERT-base-embed-pf-v0.2
114
+
115
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [thebajajra/RexBERT-base-embed-pf-v0.2](https://huggingface.co/thebajajra/RexBERT-base-embed-pf-v0.2) on the [hard-negative-triplets](https://huggingface.co/datasets/thebajajra/hard-negative-triplets) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
116
 
117
  ## Model Details
118
 
119
  ### Model Description
120
  - **Model Type:** Sentence Transformer
121
+ - **Base model:** [thebajajra/RexBERT-base-embed-pf-v0.2](https://huggingface.co/thebajajra/RexBERT-base-embed-pf-v0.2) <!-- at revision 29e288a2c1f32a4de9604882b486f837d7e15a38 -->
122
+ - **Maximum Sequence Length:** 1024 tokens
123
  - **Output Dimensionality:** 768 dimensions
124
  - **Similarity Function:** Cosine Similarity
125
+ - **Training Dataset:**
126
+ - [hard-negative-triplets](https://huggingface.co/datasets/thebajajra/hard-negative-triplets)
127
  <!-- - **Language:** Unknown -->
128
  <!-- - **License:** Unknown -->
129
 
130
  ### Model Sources
131
 
132
  - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
133
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
134
  - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
135
 
136
  ### Full Model Architecture
137
 
138
  ```
139
  SentenceTransformer(
140
+ (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
141
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
142
  )
143
  ```
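For readers who want to see what the pooling layer does, the mean pooling above can be reproduced with plain `transformers`; this is an illustrative sketch only (the repo id is an assumption), and the `sentence-transformers` snippet that follows remains the supported path.

```python
# Hedged sketch: reproducing module (1)'s attention-masked mean pooling with plain
# transformers. The repo id is assumed; adjust to the checkpoint you are using.
import torch
from transformers import AutoModel, AutoTokenizer

repo_id = "thebajajra/RexBERT-base-embed-pf-v0.3"  # assumption
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModel.from_pretrained(repo_id)

batch = tokenizer(
    ["heavy duty picture hangers without nails"],
    padding=True, truncation=True, max_length=1024, return_tensors="pt",
)
with torch.no_grad():
    token_embeddings = model(**batch).last_hidden_state  # (batch, seq_len, 768)

# Mean over real tokens only, matching pooling_mode_mean_tokens=True above.
mask = batch["attention_mask"].unsqueeze(-1).to(token_embeddings.dtype)
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embeddings.shape)  # torch.Size([1, 768])
```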
 
160
  model = SentenceTransformer("sentence_transformers_model_id")
161
  # Run inference
162
  sentences = [
163
+ 'weather. in long beach',
164
+ 'Long Beach, CA - Weather forecast from Theweather.com. Weather conditions with updates on temperature, humidity, wind speed, snow, pressure, etc. for Long Beach, California Today: Sunny intervals, with a maximum temperature of 57° and a minimum temperature of 46°.',
165
+ 'The unemployment rate in Long Beach, California, is 5.70%, with job growth of 1.37%. Future job growth over the next ten years is predicted to be 37.05%. Long Beach, California Taxes. Long Beach, California,sales tax rate is 9.00%. Income tax is 8.00%.',
166
  ]
167
  embeddings = model.encode(sentences)
168
  print(embeddings.shape)
 
171
  # Get the similarity scores for the embeddings
172
  similarities = model.similarity(embeddings, embeddings)
173
  print(similarities)
174
+ # tensor([[1.0000, 0.7565, 0.3017],
175
+ # [0.7565, 1.0000, 0.3962],
176
+ # [0.3017, 0.3962, 1.0000]])
177
  ```
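The card lists semantic search among the intended uses; the sketch below shows one way that could look with `sentence_transformers.util.semantic_search` (the corpus strings are invented for illustration and the model id is the same placeholder as above).

```python
# Hedged sketch: using the embeddings for semantic search over a small corpus.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence_transformers_model_id")

corpus = [
    "Command wire-back picture hangers, damage-free removal",
    "Heavy duty picture hanging hooks with nails for drywall",
    "Stainless steel kitchen shears with bottle opener",
]  # illustrative documents, not drawn from the training data
query = "heavy duty picture hangers without nails"

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Top-k nearest corpus entries by cosine similarity.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {corpus[hit['corpus_id']]}")
```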
178
 
179
  <!--
 
214
 
215
  ## Training Details
216
 
217
+ ### Training Dataset
218
+
219
+ #### hard-negative-triplets
220
+
221
+ * Dataset: [hard-negative-triplets](https://huggingface.co/datasets/thebajajra/hard-negative-triplets) at [934c74e](https://huggingface.co/datasets/thebajajra/hard-negative-triplets/tree/934c74e2332109929b7ff3cd66f323eed65a0495)
222
+ * Size: 157,352,076 training samples
223
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
224
+ * Approximate statistics based on the first 1000 samples:
225
+ | | anchor | positive | negative |
226
+ |:--------|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
227
+ | type | string | string | string |
228
+ | details | <ul><li>min: 4 tokens</li><li>mean: 22.43 tokens</li><li>max: 1024 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 111.89 tokens</li><li>max: 1024 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 81.15 tokens</li><li>max: 1024 tokens</li></ul> |
229
+ * Samples:
230
+ | anchor | positive | negative |
231
+ |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------|
232
+ | <code>The authority to spend offsetting collections is a form of budget authority.</code> | <code>Budget authority includes the authority to spend offsetting collections.</code> | <code>Two emergency response unit workers are examining train tracks.</code> |
233
+ | <code>heavy duty picture hangers without nails</code> | <code>Command Wire-Back Picture Hangers, Indoor Use, 3-Hangers, 6-Strips, Decorate Damage-Free</code> | <code>ANCIRS 12 Pack 50lbs Heavy Duty Picture Hangers, Picture Hanging Hooks for Plaster Wall & Drywall</code> |
234
+ | <code>As it has since the '60s, edgy rock coexists with more easygoing In the mid-'60s, the best-selling albums included Herb Albert and the Tijuana Brass ' Whipped Cream and Other Delights.</code> | <code>Edgy rock co-existed in the '60s with easygoing music.</code> | <code>A sleeping baby in a pink striped outfit.</code> |
235
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
236
+ ```json
237
+ {
238
+ "scale": 20.0,
239
+ "similarity_fct": "cos_sim",
240
+ "gather_across_devices": false
241
+ }
242
+ ```
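To inspect the triplet columns described above without pulling all ~157M rows, a streaming sketch (the default dataset configuration and the `train` split name are assumptions):

```python
# Hedged sketch: peek at the (anchor, positive, negative) columns of the dataset.
from datasets import load_dataset

triplets = load_dataset("thebajajra/hard-negative-triplets", split="train", streaming=True)
example = next(iter(triplets))
print(example["anchor"], example["positive"], example["negative"], sep="\n")
```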
243
+
244
+ ### Evaluation Dataset
245
+
246
+ #### hard-negative-triplets
247
+
248
+ * Dataset: [hard-negative-triplets](https://huggingface.co/datasets/thebajajra/hard-negative-triplets) at [934c74e](https://huggingface.co/datasets/thebajajra/hard-negative-triplets/tree/934c74e2332109929b7ff3cd66f323eed65a0495)
249
+ * Size: 790,720 evaluation samples
250
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
251
+ * Approximate statistics based on the first 1000 samples:
252
+ | | anchor | positive | negative |
253
+ |:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
254
+ | type | string | string | string |
255
+ | details | <ul><li>min: 4 tokens</li><li>mean: 17.29 tokens</li><li>max: 643 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 102.06 tokens</li><li>max: 1009 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 80.7 tokens</li><li>max: 1024 tokens</li></ul> |
256
+ * Samples:
257
+ | anchor | positive | negative |
258
+ |:-------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
259
+ | <code>Man standing in the rain carrying an umbrella.</code> | <code>The man has an umbrella.</code> | <code>The strike's outcome was influence by the heard of cattle. </code> |
260
+ | <code>A Queen of France was Marie Antoinette.</code> | <code>Marie Antoinette Marie Antoinette ( -LSB- ˈmæriˌæntwəˈnɛt -RSB- , -LSB- ˌɑ̃ːntwə - -RSB- , -LSB- ˌɑ̃ːtwə - -RSB- , -LSB- məˈriː - -RSB- -LSB- maʁi ɑ̃twanɛt -RSB- ; born Maria Antonia Josepha Johanna ( 2 November 1755 -- 16 October 1793 ) was the last Queen of France and Navarre before the French Revolution . She was born an Archduchess of Austria , and was the fifteenth and second youngest child of Empress Maria Theresa and Francis I , Holy Roman Emperor . In April 1770 , upon her marriage to Louis-Auguste , heir apparent to the French throne , she became Dauphine of France . On 10 May 1774 , when her husband ascended the throne as Louis XVI , she became Queen of France and Navarre , a title she held until September 1791 , when , as the French Revolution proceeded , she became Queen of the French , a title she held until 21 September 1792 . After eight years of marriage , Marie Antoinette gave birth to a daughter , Marie-Thérèse Charlotte , the first of her four children . Despite ...</code> | <code>Women in France The roles of women in France have changed throughout history .</code> |
261
+ | <code>The genus Omphalodes and the genus Gelsemium are both examples of what?</code> | <code>Gelsemium Gelsemium is an Asian and North American genus of flowering plants belonging to family Gelsemiaceae. The genus contains three species of shrubs to straggling or twining climbers. Two species are native to North America, and one to China and Southeast Asia.</code> | <code>Omphalodes verna Omphalodes verna (common names creeping navelwort or blue-eyed-Mary) is an herbaceous perennial rhizomatous plant of the genus "Omphalodes" belonging to the family Boraginaceae.</code> |
262
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
263
+ ```json
264
+ {
265
+ "scale": 20.0,
266
+ "similarity_fct": "cos_sim",
267
+ "gather_across_devices": false
268
+ }
269
+ ```
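For reference, the loss parameters listed for both splits correspond roughly to the following constructor call; a sketch only, with a placeholder model id.

```python
# Hedged sketch: the reported loss configuration expressed with the Sentence
# Transformers API ("gather_across_devices" is left at its default, False).
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.util import cos_sim

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder id
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim)
```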
270
+
271
+ ### Training Hyperparameters
272
+ #### Non-Default Hyperparameters
273
+
274
+ - `eval_strategy`: steps
275
+ - `per_device_train_batch_size`: 384
276
+ - `per_device_eval_batch_size`: 128
277
+ - `learning_rate`: 0.0001
278
+ - `num_train_epochs`: 10
279
+ - `warmup_steps`: 1000
280
+ - `bf16`: True
281
+ - `dataloader_num_workers`: 20
282
+ - `dataloader_prefetch_factor`: 4
283
+ - `ddp_find_unused_parameters`: False
284
+
285
+ #### All Hyperparameters
286
+ <details><summary>Click to expand</summary>
287
+
288
+ - `overwrite_output_dir`: False
289
+ - `do_predict`: False
290
+ - `eval_strategy`: steps
291
+ - `prediction_loss_only`: True
292
+ - `per_device_train_batch_size`: 384
293
+ - `per_device_eval_batch_size`: 128
294
+ - `per_gpu_train_batch_size`: None
295
+ - `per_gpu_eval_batch_size`: None
296
+ - `gradient_accumulation_steps`: 1
297
+ - `eval_accumulation_steps`: None
298
+ - `torch_empty_cache_steps`: None
299
+ - `learning_rate`: 0.0001
300
+ - `weight_decay`: 0.0
301
+ - `adam_beta1`: 0.9
302
+ - `adam_beta2`: 0.999
303
+ - `adam_epsilon`: 1e-08
304
+ - `max_grad_norm`: 1.0
305
+ - `num_train_epochs`: 10
306
+ - `max_steps`: -1
307
+ - `lr_scheduler_type`: linear
308
+ - `lr_scheduler_kwargs`: {}
309
+ - `warmup_ratio`: 0.0
310
+ - `warmup_steps`: 1000
311
+ - `log_level`: passive
312
+ - `log_level_replica`: warning
313
+ - `log_on_each_node`: True
314
+ - `logging_nan_inf_filter`: True
315
+ - `save_safetensors`: True
316
+ - `save_on_each_node`: False
317
+ - `save_only_model`: False
318
+ - `restore_callback_states_from_checkpoint`: False
319
+ - `no_cuda`: False
320
+ - `use_cpu`: False
321
+ - `use_mps_device`: False
322
+ - `seed`: 42
323
+ - `data_seed`: None
324
+ - `jit_mode_eval`: False
325
+ - `bf16`: True
326
+ - `fp16`: False
327
+ - `fp16_opt_level`: O1
328
+ - `half_precision_backend`: auto
329
+ - `bf16_full_eval`: False
330
+ - `fp16_full_eval`: False
331
+ - `tf32`: None
332
+ - `local_rank`: 0
333
+ - `ddp_backend`: None
334
+ - `tpu_num_cores`: None
335
+ - `tpu_metrics_debug`: False
336
+ - `debug`: []
337
+ - `dataloader_drop_last`: True
338
+ - `dataloader_num_workers`: 20
339
+ - `dataloader_prefetch_factor`: 4
340
+ - `past_index`: -1
341
+ - `disable_tqdm`: False
342
+ - `remove_unused_columns`: True
343
+ - `label_names`: None
344
+ - `load_best_model_at_end`: False
345
+ - `ignore_data_skip`: False
346
+ - `fsdp`: []
347
+ - `fsdp_min_num_params`: 0
348
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
349
+ - `fsdp_transformer_layer_cls_to_wrap`: None
350
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
351
+ - `parallelism_config`: None
352
+ - `deepspeed`: None
353
+ - `label_smoothing_factor`: 0.0
354
+ - `optim`: adamw_torch_fused
355
+ - `optim_args`: None
356
+ - `adafactor`: False
357
+ - `group_by_length`: False
358
+ - `length_column_name`: length
359
+ - `project`: huggingface
360
+ - `trackio_space_id`: trackio
361
+ - `ddp_find_unused_parameters`: False
362
+ - `ddp_bucket_cap_mb`: None
363
+ - `ddp_broadcast_buffers`: False
364
+ - `dataloader_pin_memory`: True
365
+ - `dataloader_persistent_workers`: False
366
+ - `skip_memory_metrics`: True
367
+ - `use_legacy_prediction_loop`: False
368
+ - `push_to_hub`: False
369
+ - `resume_from_checkpoint`: None
370
+ - `hub_model_id`: None
371
+ - `hub_strategy`: every_save
372
+ - `hub_private_repo`: None
373
+ - `hub_always_push`: False
374
+ - `hub_revision`: None
375
+ - `gradient_checkpointing`: False
376
+ - `gradient_checkpointing_kwargs`: None
377
+ - `include_inputs_for_metrics`: False
378
+ - `include_for_metrics`: []
379
+ - `eval_do_concat_batches`: True
380
+ - `fp16_backend`: auto
381
+ - `push_to_hub_model_id`: None
382
+ - `push_to_hub_organization`: None
383
+ - `mp_parameters`:
384
+ - `auto_find_batch_size`: False
385
+ - `full_determinism`: False
386
+ - `torchdynamo`: None
387
+ - `ray_scope`: last
388
+ - `ddp_timeout`: 1800
389
+ - `torch_compile`: False
390
+ - `torch_compile_backend`: None
391
+ - `torch_compile_mode`: None
392
+ - `include_tokens_per_second`: False
393
+ - `include_num_input_tokens_seen`: no
394
+ - `neftune_noise_alpha`: None
395
+ - `optim_target_modules`: None
396
+ - `batch_eval_metrics`: False
397
+ - `eval_on_start`: False
398
+ - `use_liger_kernel`: False
399
+ - `liger_kernel_config`: None
400
+ - `eval_use_gather_object`: False
401
+ - `average_tokens_across_devices`: True
402
+ - `prompts`: None
403
+ - `batch_sampler`: batch_sampler
404
+ - `multi_dataset_batch_sampler`: proportional
405
+ - `router_mapping`: {}
406
+ - `learning_rate_mapping`: {}
407
+
408
+ </details>
409
+
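A minimal sketch of how the non-default hyperparameters above might be passed to the Sentence Transformers trainer; the output directory and the small held-out eval split are illustration-only assumptions (the actual evaluation split used for this card had ~790k samples).

```python
# Hedged sketch: wiring the reported non-default hyperparameters into the trainer.
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("thebajajra/RexBERT-base-embed-pf-v0.2")

# Full train split is ~157M triplets; split off a small eval set for illustration.
splits = load_dataset("thebajajra/hard-negative-triplets", split="train").train_test_split(
    test_size=10_000, seed=42
)

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",                # assumption: any local path
    eval_strategy="steps",
    per_device_train_batch_size=384,
    per_device_eval_batch_size=128,
    learning_rate=1e-4,
    num_train_epochs=10,
    warmup_steps=1000,
    bf16=True,
    dataloader_num_workers=20,
    dataloader_prefetch_factor=4,
    ddp_find_unused_parameters=False,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    loss=MultipleNegativesRankingLoss(model),
)
# trainer.train()  # not run here; training at this scale needs multi-GPU hardware
```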
410
+ ### Training Logs
411
+ <details><summary>Click to expand</summary>
412
+
413
+ | Epoch | Step | Training Loss | Validation Loss |
414
+ |:------:|:-----:|:-------------:|:---------------:|
415
+ | 0.0020 | 100 | 0.5828 | - |
416
+ | 0.0039 | 200 | 0.3733 | - |
417
+ | 0.0059 | 300 | 0.3213 | - |
418
+ | 0.0078 | 400 | 0.2873 | - |
419
+ | 0.0098 | 500 | 0.2666 | - |
420
+ | 0.0117 | 600 | 0.2474 | - |
421
+ | 0.0137 | 700 | 0.2335 | - |
422
+ | 0.0156 | 800 | 0.2189 | - |
423
+ | 0.0176 | 900 | 0.2091 | - |
424
+ | 0.0195 | 1000 | 0.1981 | - |
425
+ | 0.0215 | 1100 | 0.1946 | - |
426
+ | 0.0234 | 1200 | 0.1842 | - |
427
+ | 0.0254 | 1300 | 0.1748 | - |
428
+ | 0.0273 | 1400 | 0.1646 | - |
429
+ | 0.0293 | 1500 | 0.1603 | - |
430
+ | 0.0312 | 1600 | 0.1503 | - |
431
+ | 0.0332 | 1700 | 0.1445 | - |
432
+ | 0.0351 | 1800 | 0.1383 | - |
433
+ | 0.0371 | 1900 | 0.1329 | - |
434
+ | 0.0390 | 2000 | 0.1277 | - |
435
+ | 0.0410 | 2100 | 0.1235 | - |
436
+ | 0.0430 | 2200 | 0.1198 | - |
437
+ | 0.0449 | 2300 | 0.1142 | - |
438
+ | 0.0469 | 2400 | 0.1111 | - |
439
+ | 0.0488 | 2500 | 0.1057 | - |
440
+ | 0.0508 | 2600 | 0.1031 | - |
441
+ | 0.0527 | 2700 | 0.1001 | - |
442
+ | 0.0547 | 2800 | 0.0981 | - |
443
+ | 0.0566 | 2900 | 0.0959 | - |
444
+ | 0.0586 | 3000 | 0.0921 | - |
445
+ | 0.0605 | 3100 | 0.0905 | - |
446
+ | 0.0625 | 3200 | 0.086 | - |
447
+ | 0.0644 | 3300 | 0.0859 | - |
448
+ | 0.0664 | 3400 | 0.083 | - |
449
+ | 0.0683 | 3500 | 0.0818 | - |
450
+ | 0.0703 | 3600 | 0.0802 | - |
451
+ | 0.0722 | 3700 | 0.0779 | - |
452
+ | 0.0742 | 3800 | 0.0776 | - |
453
+ | 0.0761 | 3900 | 0.0764 | - |
454
+ | 0.0781 | 4000 | 0.0749 | - |
455
+ | 0.0800 | 4100 | 0.0748 | - |
456
+ | 0.0820 | 4200 | 0.0719 | - |
457
+ | 0.0839 | 4300 | 0.0705 | - |
458
+ | 0.0859 | 4400 | 0.0689 | - |
459
+ | 0.0879 | 4500 | 0.0684 | - |
460
+ | 0.0898 | 4600 | 0.0672 | - |
461
+ | 0.0918 | 4700 | 0.0649 | - |
462
+ | 0.0937 | 4800 | 0.0641 | - |
463
+ | 0.0957 | 4900 | 0.0618 | - |
464
+ | 0.0976 | 5000 | 0.0611 | - |
465
+ | 0.0996 | 5100 | 0.0614 | - |
466
+ | 0.1000 | 5123 | - | 0.0438 |
467
+ | 0.1015 | 5200 | 0.0603 | - |
468
+ | 0.1035 | 5300 | 0.0596 | - |
469
+ | 0.1054 | 5400 | 0.0589 | - |
470
+ | 0.1074 | 5500 | 0.0567 | - |
471
+ | 0.1093 | 5600 | 0.0583 | - |
472
+ | 0.1113 | 5700 | 0.0554 | - |
473
+ | 0.1132 | 5800 | 0.0547 | - |
474
+ | 0.1152 | 5900 | 0.0537 | - |
475
+ | 0.1171 | 6000 | 0.0537 | - |
476
+ | 0.1191 | 6100 | 0.0521 | - |
477
+ | 0.1210 | 6200 | 0.0515 | - |
478
+ | 0.1230 | 6300 | 0.0512 | - |
479
+ | 0.1249 | 6400 | 0.0505 | - |
480
+ | 0.1269 | 6500 | 0.0494 | - |
481
+ | 0.1289 | 6600 | 0.0497 | - |
482
+ | 0.1308 | 6700 | 0.0481 | - |
483
+ | 0.1328 | 6800 | 0.0468 | - |
484
+ | 0.1347 | 6900 | 0.0467 | - |
485
+ | 0.1367 | 7000 | 0.0471 | - |
486
+ | 0.1386 | 7100 | 0.0457 | - |
487
+ | 0.1406 | 7200 | 0.0451 | - |
488
+ | 0.1425 | 7300 | 0.0444 | - |
489
+ | 0.1445 | 7400 | 0.0443 | - |
490
+ | 0.1464 | 7500 | 0.0441 | - |
491
+ | 0.1484 | 7600 | 0.0437 | - |
492
+ | 0.1503 | 7700 | 0.0426 | - |
493
+ | 0.1523 | 7800 | 0.0423 | - |
494
+ | 0.1542 | 7900 | 0.0413 | - |
495
+ | 0.1562 | 8000 | 0.042 | - |
496
+ | 0.1581 | 8100 | 0.0408 | - |
497
+ | 0.1601 | 8200 | 0.0401 | - |
498
+ | 0.1620 | 8300 | 0.0397 | - |
499
+ | 0.1640 | 8400 | 0.0394 | - |
500
+ | 0.1659 | 8500 | 0.0392 | - |
501
+ | 0.1679 | 8600 | 0.0387 | - |
502
+ | 0.1699 | 8700 | 0.0382 | - |
503
+ | 0.1718 | 8800 | 0.0386 | - |
504
+ | 0.1738 | 8900 | 0.0373 | - |
505
+ | 0.1757 | 9000 | 0.0383 | - |
506
+ | 0.1777 | 9100 | 0.0365 | - |
507
+ | 0.1796 | 9200 | 0.0364 | - |
508
+ | 0.1816 | 9300 | 0.0361 | - |
509
+ | 0.1835 | 9400 | 0.0361 | - |
510
+ | 0.1855 | 9500 | 0.0359 | - |
511
+ | 0.1874 | 9600 | 0.0359 | - |
512
+ | 0.1894 | 9700 | 0.035 | - |
513
+ | 0.1913 | 9800 | 0.0348 | - |
514
+ | 0.1933 | 9900 | 0.0344 | - |
515
+ | 0.1952 | 10000 | 0.0345 | - |
516
+ | 0.1972 | 10100 | 0.0337 | - |
517
+ | 0.1991 | 10200 | 0.0335 | - |
518
+ | 0.2000 | 10246 | - | 0.0235 |
519
+ | 0.2011 | 10300 | 0.0329 | - |
520
+ | 0.2030 | 10400 | 0.0325 | - |
521
+ | 0.2050 | 10500 | 0.0326 | - |
522
+ | 0.2069 | 10600 | 0.0324 | - |
523
+ | 0.2089 | 10700 | 0.0325 | - |
524
+ | 0.2109 | 10800 | 0.0324 | - |
525
+ | 0.2128 | 10900 | 0.0321 | - |
526
+ | 0.2148 | 11000 | 0.0317 | - |
527
+ | 0.2167 | 11100 | 0.0309 | - |
528
+ | 0.2187 | 11200 | 0.0306 | - |
529
+ | 0.2206 | 11300 | 0.0307 | - |
530
+ | 0.2226 | 11400 | 0.0305 | - |
531
+ | 0.2245 | 11500 | 0.0314 | - |
532
+ | 0.2265 | 11600 | 0.0301 | - |
533
+ | 0.2284 | 11700 | 0.0307 | - |
534
+ | 0.2304 | 11800 | 0.0296 | - |
535
+ | 0.2323 | 11900 | 0.0294 | - |
536
+ | 0.2343 | 12000 | 0.0292 | - |
537
+ | 0.2362 | 12100 | 0.03 | - |
538
+ | 0.2382 | 12200 | 0.0298 | - |
539
+ | 0.2401 | 12300 | 0.0292 | - |
540
+ | 0.2421 | 12400 | 0.0294 | - |
541
+ | 0.2440 | 12500 | 0.0295 | - |
542
+ | 0.2460 | 12600 | 0.0285 | - |
543
+ | 0.2479 | 12700 | 0.0281 | - |
544
+ | 0.2499 | 12800 | 0.0287 | - |
545
+ | 0.2518 | 12900 | 0.0285 | - |
546
+ | 0.2538 | 13000 | 0.0285 | - |
547
+ | 0.2558 | 13100 | 0.0281 | - |
548
+ | 0.2577 | 13200 | 0.0277 | - |
549
+ | 0.2597 | 13300 | 0.0277 | - |
550
+ | 0.2616 | 13400 | 0.0282 | - |
551
+ | 0.2636 | 13500 | 0.0279 | - |
552
+ | 0.2655 | 13600 | 0.0269 | - |
553
+ | 0.2675 | 13700 | 0.0271 | - |
554
+ | 0.2694 | 13800 | 0.0269 | - |
555
+ | 0.2714 | 13900 | 0.0271 | - |
556
+ | 0.2733 | 14000 | 0.0266 | - |
557
+ | 0.2753 | 14100 | 0.0264 | - |
558
+ | 0.2772 | 14200 | 0.0268 | - |
559
+ | 0.2792 | 14300 | 0.0271 | - |
560
+ | 0.2811 | 14400 | 0.0266 | - |
561
+ | 0.2831 | 14500 | 0.0265 | - |
562
+ | 0.2850 | 14600 | 0.0261 | - |
563
+ | 0.2870 | 14700 | 0.0254 | - |
564
+ | 0.2889 | 14800 | 0.0255 | - |
565
+ | 0.2909 | 14900 | 0.0255 | - |
566
+ | 0.2928 | 15000 | 0.0258 | - |
567
+ | 0.2948 | 15100 | 0.0254 | - |
568
+ | 0.2968 | 15200 | 0.0253 | - |
569
+ | 0.2987 | 15300 | 0.0252 | - |
570
+ | 0.3001 | 15369 | - | 0.0176 |
571
+ | 0.3007 | 15400 | 0.0246 | - |
572
+ | 0.3026 | 15500 | 0.0249 | - |
573
+ | 0.3046 | 15600 | 0.0246 | - |
574
+ | 0.3065 | 15700 | 0.0246 | - |
575
+ | 0.3085 | 15800 | 0.0249 | - |
576
+ | 0.3104 | 15900 | 0.0246 | - |
577
+ | 0.3124 | 16000 | 0.0243 | - |
578
+ | 0.3143 | 16100 | 0.0245 | - |
579
+ | 0.3163 | 16200 | 0.0239 | - |
580
+ | 0.3182 | 16300 | 0.0238 | - |
581
+ | 0.3202 | 16400 | 0.0244 | - |
582
+ | 0.3221 | 16500 | 0.0236 | - |
583
+ | 0.3241 | 16600 | 0.0241 | - |
584
+ | 0.3260 | 16700 | 0.0234 | - |
585
+ | 0.3280 | 16800 | 0.0234 | - |
586
+ | 0.3299 | 16900 | 0.0237 | - |
587
+ | 0.3319 | 17000 | 0.0234 | - |
588
+ | 0.3338 | 17100 | 0.0231 | - |
589
+ | 0.3358 | 17200 | 0.0226 | - |
590
+ | 0.3378 | 17300 | 0.0229 | - |
591
+ | 0.3397 | 17400 | 0.0226 | - |
592
+ | 0.3417 | 17500 | 0.0229 | - |
593
+ | 0.3436 | 17600 | 0.0223 | - |
594
+ | 0.3456 | 17700 | 0.0229 | - |
595
+ | 0.3475 | 17800 | 0.0222 | - |
596
+ | 0.3495 | 17900 | 0.0222 | - |
597
+ | 0.3514 | 18000 | 0.0224 | - |
598
+ | 0.3534 | 18100 | 0.0221 | - |
599
+ | 0.3553 | 18200 | 0.0221 | - |
600
+ | 0.3573 | 18300 | 0.0221 | - |
601
+ | 0.3592 | 18400 | 0.0223 | - |
602
+ | 0.3612 | 18500 | 0.0217 | - |
603
+ | 0.3631 | 18600 | 0.0219 | - |
604
+ | 0.3651 | 18700 | 0.0216 | - |
605
+ | 0.3670 | 18800 | 0.0211 | - |
606
+ | 0.3690 | 18900 | 0.0209 | - |
607
+ | 0.3709 | 19000 | 0.0214 | - |
608
+ | 0.3729 | 19100 | 0.0211 | - |
609
+ | 0.3748 | 19200 | 0.0214 | - |
610
+ | 0.3768 | 19300 | 0.0208 | - |
611
+ | 0.3788 | 19400 | 0.0209 | - |
612
+ | 0.3807 | 19500 | 0.0208 | - |
613
+ | 0.3827 | 19600 | 0.0205 | - |
614
+ | 0.3846 | 19700 | 0.0211 | - |
615
+ | 0.3866 | 19800 | 0.0208 | - |
616
+ | 0.3885 | 19900 | 0.0208 | - |
617
+ | 0.3905 | 20000 | 0.0209 | - |
618
+ | 0.3924 | 20100 | 0.0205 | - |
619
+ | 0.3944 | 20200 | 0.0206 | - |
620
+ | 0.3963 | 20300 | 0.0207 | - |
621
+ | 0.3983 | 20400 | 0.0202 | - |
622
+ | 0.4001 | 20492 | - | 0.0143 |
623
+ | 0.4002 | 20500 | 0.0203 | - |
624
+ | 0.4022 | 20600 | 0.0202 | - |
625
+ | 0.4041 | 20700 | 0.0201 | - |
626
+ | 0.4061 | 20800 | 0.0201 | - |
627
+ | 0.4080 | 20900 | 0.0198 | - |
628
+ | 0.4100 | 21000 | 0.0202 | - |
629
+ | 0.4119 | 21100 | 0.0199 | - |
630
+ | 0.4139 | 21200 | 0.0202 | - |
631
+ | 0.4158 | 21300 | 0.0197 | - |
632
+ | 0.4178 | 21400 | 0.0191 | - |
633
+ | 0.4197 | 21500 | 0.0194 | - |
634
+ | 0.4217 | 21600 | 0.0195 | - |
635
+ | 0.4237 | 21700 | 0.0193 | - |
636
+ | 0.4256 | 21800 | 0.0196 | - |
637
+ | 0.4276 | 21900 | 0.0195 | - |
638
+ | 0.4295 | 22000 | 0.0192 | - |
639
+ | 0.4315 | 22100 | 0.0188 | - |
640
+ | 0.4334 | 22200 | 0.0197 | - |
641
+ | 0.4354 | 22300 | 0.0191 | - |
642
+ | 0.4373 | 22400 | 0.0189 | - |
643
+ | 0.4393 | 22500 | 0.0195 | - |
644
+ | 0.4412 | 22600 | 0.0189 | - |
645
+ | 0.4432 | 22700 | 0.0189 | - |
646
+ | 0.4451 | 22800 | 0.0187 | - |
647
+ | 0.4471 | 22900 | 0.0188 | - |
648
+ | 0.4490 | 23000 | 0.0191 | - |
649
+ | 0.4510 | 23100 | 0.0187 | - |
650
+ | 0.4529 | 23200 | 0.0185 | - |
651
+ | 0.4549 | 23300 | 0.0188 | - |
652
+ | 0.4568 | 23400 | 0.0185 | - |
653
+ | 0.4588 | 23500 | 0.019 | - |
654
+ | 0.4607 | 23600 | 0.0184 | - |
655
+ | 0.4627 | 23700 | 0.0187 | - |
656
+ | 0.4647 | 23800 | 0.0183 | - |
657
+ | 0.4666 | 23900 | 0.0182 | - |
658
+ | 0.4686 | 24000 | 0.0183 | - |
659
+ | 0.4705 | 24100 | 0.0181 | - |
660
+ | 0.4725 | 24200 | 0.0181 | - |
661
+ | 0.4744 | 24300 | 0.0179 | - |
662
+ | 0.4764 | 24400 | 0.0175 | - |
663
+ | 0.4783 | 24500 | 0.0181 | - |
664
+ | 0.4803 | 24600 | 0.0179 | - |
665
+ | 0.4822 | 24700 | 0.0179 | - |
666
+ | 0.4842 | 24800 | 0.0181 | - |
667
+ | 0.4861 | 24900 | 0.0181 | - |
668
+ | 0.4881 | 25000 | 0.0182 | - |
669
+ | 0.4900 | 25100 | 0.0177 | - |
670
+ | 0.4920 | 25200 | 0.0177 | - |
671
+ | 0.4939 | 25300 | 0.0179 | - |
672
+ | 0.4959 | 25400 | 0.0172 | - |
673
+ | 0.4978 | 25500 | 0.0177 | - |
674
+ | 0.4998 | 25600 | 0.018 | - |
675
+ | 0.5001 | 25615 | - | 0.0125 |
676
+ | 0.5017 | 25700 | 0.0173 | - |
677
+ | 0.5037 | 25800 | 0.0176 | - |
678
+ | 0.5057 | 25900 | 0.0175 | - |
679
+ | 0.5076 | 26000 | 0.0173 | - |
680
+ | 0.5096 | 26100 | 0.018 | - |
681
+ | 0.5115 | 26200 | 0.0177 | - |
682
+ | 0.5135 | 26300 | 0.0172 | - |
683
+ | 0.5154 | 26400 | 0.0175 | - |
684
+ | 0.5174 | 26500 | 0.0174 | - |
685
+ | 0.5193 | 26600 | 0.0167 | - |
686
+ | 0.5213 | 26700 | 0.0169 | - |
687
+ | 0.5232 | 26800 | 0.0172 | - |
688
+ | 0.5252 | 26900 | 0.0171 | - |
689
+ | 0.5271 | 27000 | 0.0173 | - |
690
+ | 0.5291 | 27100 | 0.0175 | - |
691
+ | 0.5310 | 27200 | 0.0168 | - |
692
+ | 0.5330 | 27300 | 0.017 | - |
693
+ | 0.5349 | 27400 | 0.0167 | - |
694
+ | 0.5369 | 27500 | 0.0174 | - |
695
+ | 0.5388 | 27600 | 0.0169 | - |
696
+ | 0.5408 | 27700 | 0.0171 | - |
697
+ | 0.5427 | 27800 | 0.0166 | - |
698
+ | 0.5447 | 27900 | 0.0167 | - |
699
+ | 0.5467 | 28000 | 0.0166 | - |
700
+ | 0.5486 | 28100 | 0.0168 | - |
701
+ | 0.5506 | 28200 | 0.0168 | - |
702
+ | 0.5525 | 28300 | 0.0166 | - |
703
+ | 0.5545 | 28400 | 0.0167 | - |
704
+ | 0.5564 | 28500 | 0.0167 | - |
705
+ | 0.5584 | 28600 | 0.0166 | - |
706
+ | 0.5603 | 28700 | 0.0167 | - |
707
+ | 0.5623 | 28800 | 0.0166 | - |
708
+ | 0.5642 | 28900 | 0.0169 | - |
709
+ | 0.5662 | 29000 | 0.0163 | - |
710
+ | 0.5681 | 29100 | 0.0168 | - |
711
+ | 0.5701 | 29200 | 0.0164 | - |
712
+ | 0.5720 | 29300 | 0.0166 | - |
713
+ | 0.5740 | 29400 | 0.0163 | - |
714
+ | 0.5759 | 29500 | 0.016 | - |
715
+ | 0.5779 | 29600 | 0.0164 | - |
716
+ | 0.5798 | 29700 | 0.0163 | - |
717
+ | 0.5818 | 29800 | 0.0162 | - |
718
+ | 0.5837 | 29900 | 0.0162 | - |
719
+ | 0.5857 | 30000 | 0.0159 | - |
720
+ | 0.5876 | 30100 | 0.0163 | - |
721
+ | 0.5896 | 30200 | 0.0159 | - |
722
+ | 0.5916 | 30300 | 0.016 | - |
723
+ | 0.5935 | 30400 | 0.016 | - |
724
+ | 0.5955 | 30500 | 0.0157 | - |
725
+ | 0.5974 | 30600 | 0.0163 | - |
726
+ | 0.5994 | 30700 | 0.0155 | - |
727
+ | 0.6001 | 30738 | - | 0.0112 |
728
+ | 0.6013 | 30800 | 0.0156 | - |
729
+ | 0.6033 | 30900 | 0.0157 | - |
730
+ | 0.6052 | 31000 | 0.0158 | - |
731
+ | 0.6072 | 31100 | 0.0159 | - |
732
+ | 0.6091 | 31200 | 0.0157 | - |
733
+ | 0.6111 | 31300 | 0.016 | - |
734
+ | 0.6130 | 31400 | 0.0154 | - |
735
+ | 0.6150 | 31500 | 0.0156 | - |
736
+ | 0.6169 | 31600 | 0.0159 | - |
737
+ | 0.6189 | 31700 | 0.0158 | - |
738
+ | 0.6208 | 31800 | 0.0154 | - |
739
+ | 0.6228 | 31900 | 0.0157 | - |
740
+ | 0.6247 | 32000 | 0.0155 | - |
741
+ | 0.6267 | 32100 | 0.0154 | - |
742
+ | 0.6286 | 32200 | 0.0158 | - |
743
+ | 0.6306 | 32300 | 0.0154 | - |
744
+ | 0.6326 | 32400 | 0.0156 | - |
745
+ | 0.6345 | 32500 | 0.0158 | - |
746
+ | 0.6365 | 32600 | 0.0155 | - |
747
+ | 0.6384 | 32700 | 0.0156 | - |
748
+ | 0.6404 | 32800 | 0.0154 | - |
749
+ | 0.6423 | 32900 | 0.0154 | - |
750
+ | 0.6443 | 33000 | 0.0153 | - |
751
+ | 0.6462 | 33100 | 0.0153 | - |
752
+ | 0.6482 | 33200 | 0.0151 | - |
753
+ | 0.6501 | 33300 | 0.0155 | - |
754
+ | 0.6521 | 33400 | 0.0156 | - |
755
+ | 0.6540 | 33500 | 0.0153 | - |
756
+ | 0.6560 | 33600 | 0.0152 | - |
757
+ | 0.6579 | 33700 | 0.0153 | - |
758
+ | 0.6599 | 33800 | 0.015 | - |
759
+ | 0.6618 | 33900 | 0.0151 | - |
760
+ | 0.6638 | 34000 | 0.0148 | - |
761
+ | 0.6657 | 34100 | 0.0149 | - |
762
+ | 0.6677 | 34200 | 0.0154 | - |
763
+ | 0.6696 | 34300 | 0.0152 | - |
764
+ | 0.6716 | 34400 | 0.0154 | - |
765
+ | 0.6736 | 34500 | 0.0149 | - |
766
+ | 0.6755 | 34600 | 0.0148 | - |
767
+ | 0.6775 | 34700 | 0.0149 | - |
768
+ | 0.6794 | 34800 | 0.015 | - |
769
+ | 0.6814 | 34900 | 0.0148 | - |
770
+ | 0.6833 | 35000 | 0.0145 | - |
771
+ | 0.6853 | 35100 | 0.0149 | - |
772
+ | 0.6872 | 35200 | 0.015 | - |
773
+ | 0.6892 | 35300 | 0.0146 | - |
774
+ | 0.6911 | 35400 | 0.0147 | - |
775
+ | 0.6931 | 35500 | 0.0146 | - |
776
+ | 0.6950 | 35600 | 0.0148 | - |
777
+ | 0.6970 | 35700 | 0.0146 | - |
778
+ | 0.6989 | 35800 | 0.0147 | - |
779
+ | 0.7001 | 35861 | - | 0.0104 |
780
+ | 0.7009 | 35900 | 0.0142 | - |
781
+ | 0.7028 | 36000 | 0.0148 | - |
782
+ | 0.7048 | 36100 | 0.0145 | - |
783
+ | 0.7067 | 36200 | 0.0145 | - |
784
+ | 0.7087 | 36300 | 0.0142 | - |
785
+ | 0.7106 | 36400 | 0.0143 | - |
786
+ | 0.7126 | 36500 | 0.0145 | - |
787
+ | 0.7146 | 36600 | 0.0144 | - |
788
+ | 0.7165 | 36700 | 0.0144 | - |
789
+ | 0.7185 | 36800 | 0.0143 | - |
790
+ | 0.7204 | 36900 | 0.0146 | - |
791
+ | 0.7224 | 37000 | 0.0142 | - |
792
+ | 0.7243 | 37100 | 0.014 | - |
793
+ | 0.7263 | 37200 | 0.0142 | - |
794
+ | 0.7282 | 37300 | 0.0142 | - |
795
+ | 0.7302 | 37400 | 0.0147 | - |
796
+ | 0.7321 | 37500 | 0.0143 | - |
797
+ | 0.7341 | 37600 | 0.0143 | - |
798
+ | 0.7360 | 37700 | 0.014 | - |
799
+ | 0.7380 | 37800 | 0.0146 | - |
800
+ | 0.7399 | 37900 | 0.0143 | - |
801
+ | 0.7419 | 38000 | 0.0145 | - |
802
+ | 0.7438 | 38100 | 0.0141 | - |
803
+ | 0.7458 | 38200 | 0.0142 | - |
804
+ | 0.7477 | 38300 | 0.0145 | - |
805
+ | 0.7497 | 38400 | 0.014 | - |
806
+ | 0.7516 | 38500 | 0.0139 | - |
807
+ | 0.7536 | 38600 | 0.0143 | - |
808
+ | 0.7555 | 38700 | 0.0142 | - |
809
+ | 0.7575 | 38800 | 0.0142 | - |
810
+ | 0.7595 | 38900 | 0.0141 | - |
811
+ | 0.7614 | 39000 | 0.0137 | - |
812
+ | 0.7634 | 39100 | 0.0141 | - |
813
+ | 0.7653 | 39200 | 0.0143 | - |
814
+ | 0.7673 | 39300 | 0.0145 | - |
815
+ | 0.7692 | 39400 | 0.0144 | - |
816
+ | 0.7712 | 39500 | 0.0142 | - |
817
+ | 0.7731 | 39600 | 0.0144 | - |
818
+ | 0.7751 | 39700 | 0.0139 | - |
819
+ | 0.7770 | 39800 | 0.0142 | - |
820
+ | 0.7790 | 39900 | 0.0139 | - |
821
+ | 0.7809 | 40000 | 0.0137 | - |
822
+ | 0.7829 | 40100 | 0.0137 | - |
823
+ | 0.7848 | 40200 | 0.014 | - |
824
+ | 0.7868 | 40300 | 0.014 | - |
825
+ | 0.7887 | 40400 | 0.0137 | - |
826
+ | 0.7907 | 40500 | 0.0143 | - |
827
+ | 0.7926 | 40600 | 0.0141 | - |
828
+ | 0.7946 | 40700 | 0.0138 | - |
829
+ | 0.7965 | 40800 | 0.0139 | - |
830
+ | 0.7985 | 40900 | 0.014 | - |
831
+ | 0.8001 | 40984 | - | 0.0098 |
832
+ | 0.8005 | 41000 | 0.0135 | - |
833
+ | 0.8024 | 41100 | 0.0138 | - |
834
+ | 0.8044 | 41200 | 0.0139 | - |
835
+ | 0.8063 | 41300 | 0.0137 | - |
836
+ | 0.8083 | 41400 | 0.0136 | - |
837
+ | 0.8102 | 41500 | 0.0139 | - |
838
+ | 0.8122 | 41600 | 0.0137 | - |
839
+ | 0.8141 | 41700 | 0.0139 | - |
840
+ | 0.8161 | 41800 | 0.014 | - |
841
+ | 0.8180 | 41900 | 0.0138 | - |
842
+ | 0.8200 | 42000 | 0.0134 | - |
843
+ | 0.8219 | 42100 | 0.0137 | - |
844
+ | 0.8239 | 42200 | 0.0136 | - |
845
+ | 0.8258 | 42300 | 0.0137 | - |
846
+ | 0.8278 | 42400 | 0.0139 | - |
847
+ | 0.8297 | 42500 | 0.0138 | - |
848
+ | 0.8317 | 42600 | 0.0137 | - |
849
+ | 0.8336 | 42700 | 0.0139 | - |
850
+ | 0.8356 | 42800 | 0.0134 | - |
851
+ | 0.8375 | 42900 | 0.0133 | - |
852
+ | 0.8395 | 43000 | 0.0134 | - |
853
+ | 0.8415 | 43100 | 0.0135 | - |
854
+ | 0.8434 | 43200 | 0.0134 | - |
855
+ | 0.8454 | 43300 | 0.0136 | - |
856
+ | 0.8473 | 43400 | 0.0138 | - |
857
+ | 0.8493 | 43500 | 0.0136 | - |
858
+ | 0.8512 | 43600 | 0.0131 | - |
859
+ | 0.8532 | 43700 | 0.0137 | - |
860
+ | 0.8551 | 43800 | 0.0134 | - |
861
+ | 0.8571 | 43900 | 0.0128 | - |
862
+ | 0.8590 | 44000 | 0.0134 | - |
863
+ | 0.8610 | 44100 | 0.0131 | - |
864
+ | 0.8629 | 44200 | 0.0133 | - |
865
+ | 0.8649 | 44300 | 0.0132 | - |
866
+ | 0.8668 | 44400 | 0.0135 | - |
867
+ | 0.8688 | 44500 | 0.013 | - |
868
+ | 0.8707 | 44600 | 0.0135 | - |
869
+ | 0.8727 | 44700 | 0.0131 | - |
870
+ | 0.8746 | 44800 | 0.0131 | - |
871
+ | 0.8766 | 44900 | 0.013 | - |
872
+ | 0.8785 | 45000 | 0.0129 | - |
873
+ | 0.8805 | 45100 | 0.0133 | - |
874
+ | 0.8825 | 45200 | 0.0133 | - |
875
+ | 0.8844 | 45300 | 0.0134 | - |
876
+ | 0.8864 | 45400 | 0.0135 | - |
877
+ | 0.8883 | 45500 | 0.0131 | - |
878
+ | 0.8903 | 45600 | 0.0134 | - |
879
+ | 0.8922 | 45700 | 0.0133 | - |
880
+ | 0.8942 | 45800 | 0.0132 | - |
881
+ | 0.8961 | 45900 | 0.0129 | - |
882
+ | 0.8981 | 46000 | 0.0131 | - |
883
+ | 0.9000 | 46100 | 0.013 | - |
884
+ | 0.9002 | 46107 | - | 0.0091 |
885
+ | 0.9020 | 46200 | 0.013 | - |
886
+ | 0.9039 | 46300 | 0.0129 | - |
887
+ | 0.9059 | 46400 | 0.0131 | - |
888
+ | 0.9078 | 46500 | 0.0132 | - |
889
+ | 0.9098 | 46600 | 0.0131 | - |
890
+ | 0.9117 | 46700 | 0.0131 | - |
891
+ | 0.9137 | 46800 | 0.0131 | - |
892
+ | 0.9156 | 46900 | 0.0132 | - |
893
+ | 0.9176 | 47000 | 0.0128 | - |
894
+ | 0.9195 | 47100 | 0.0126 | - |
895
+ | 0.9215 | 47200 | 0.0128 | - |
896
+ | 0.9234 | 47300 | 0.0131 | - |
897
+ | 0.9254 | 47400 | 0.0129 | - |
898
+ | 0.9274 | 47500 | 0.0127 | - |
899
+ | 0.9293 | 47600 | 0.0132 | - |
900
+ | 0.9313 | 47700 | 0.013 | - |
901
+ | 0.9332 | 47800 | 0.0128 | - |
902
+ | 0.9352 | 47900 | 0.0127 | - |
903
+ | 0.9371 | 48000 | 0.0126 | - |
904
+ | 0.9391 | 48100 | 0.0125 | - |
905
+ | 0.9410 | 48200 | 0.013 | - |
906
+ | 0.9430 | 48300 | 0.0127 | - |
907
+ | 0.9449 | 48400 | 0.0126 | - |
908
+ | 0.9469 | 48500 | 0.0127 | - |
909
+ | 0.9488 | 48600 | 0.0133 | - |
910
+ | 0.9508 | 48700 | 0.0125 | - |
911
+ | 0.9527 | 48800 | 0.0126 | - |
912
+ | 0.9547 | 48900 | 0.0128 | - |
913
+ | 0.9566 | 49000 | 0.0128 | - |
914
+ | 0.9586 | 49100 | 0.0129 | - |
915
+ | 0.9605 | 49200 | 0.0129 | - |
916
+ | 0.9625 | 49300 | 0.0127 | - |
917
+ | 0.9644 | 49400 | 0.0125 | - |
918
+ | 0.9664 | 49500 | 0.0128 | - |
919
+ | 0.9684 | 49600 | 0.0128 | - |
920
+ | 0.9703 | 49700 | 0.0125 | - |
921
+ | 0.9723 | 49800 | 0.0127 | - |
922
+ | 0.9742 | 49900 | 0.0129 | - |
923
+ | 0.9762 | 50000 | 0.013 | - |
924
+ | 0.9781 | 50100 | 0.0129 | - |
925
+ | 0.9801 | 50200 | 0.0128 | - |
926
+ | 0.9820 | 50300 | 0.0125 | - |
927
+ | 0.9840 | 50400 | 0.0127 | - |
928
+ | 0.9859 | 50500 | 0.0125 | - |
929
+ | 0.9879 | 50600 | 0.0128 | - |
930
+ | 0.9898 | 50700 | 0.0123 | - |
931
+ | 0.9918 | 50800 | 0.0125 | - |
932
+ | 0.9937 | 50900 | 0.0125 | - |
933
+ | 0.9957 | 51000 | 0.0125 | - |
934
+ | 0.9976 | 51100 | 0.0124 | - |
935
+ | 0.9996 | 51200 | 0.0128 | - |
936
+ | 1.0002 | 51230 | - | 0.0088 |
937
+
938
+ </details>
939
+
940
  ### Framework Versions
941
+ - Python: 3.11.13
942
+ - Sentence Transformers: 5.1.2
943
+ - Transformers: 4.57.1
944
+ - PyTorch: 2.8.0+cu129
945
+ - Accelerate: 1.11.0
946
+ - Datasets: 4.3.0
947
+ - Tokenizers: 0.22.1
948
 
949
  ## Citation
950
 
951
  ### BibTeX
952
 
953
+ #### Sentence Transformers
954
+ ```bibtex
955
+ @inproceedings{reimers-2019-sentence-bert,
956
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
957
+ author = "Reimers, Nils and Gurevych, Iryna",
958
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
959
+ month = "11",
960
+ year = "2019",
961
+ publisher = "Association for Computational Linguistics",
962
+ url = "https://arxiv.org/abs/1908.10084",
963
+ }
964
+ ```
965
+
966
+ #### MultipleNegativesRankingLoss
967
+ ```bibtex
968
+ @misc{henderson2017efficient,
969
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
970
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
971
+ year={2017},
972
+ eprint={1705.00652},
973
+ archivePrefix={arXiv},
974
+ primaryClass={cs.CL}
975
+ }
976
+ ```
977
+
978
  <!--
979
  ## Glossary
980
 
config.json CHANGED
@@ -40,7 +40,6 @@
40
  "sep_token_id": 50282,
41
  "sparse_pred_ignore_index": -100,
42
  "sparse_prediction": false,
43
- "torch_dtype": "bfloat16",
44
- "transformers_version": "4.53.3",
45
  "vocab_size": 50368
46
  }
 
40
  "sep_token_id": 50282,
41
  "sparse_pred_ignore_index": -100,
42
  "sparse_prediction": false,
43
+ "transformers_version": "4.57.1",
 
44
  "vocab_size": 50368
45
  }
config_sentence_transformers.json CHANGED
@@ -1,9 +1,9 @@
1
  {
2
  "model_type": "SentenceTransformer",
3
  "__version__": {
4
- "sentence_transformers": "5.1.1",
5
- "transformers": "4.53.3",
6
- "pytorch": "2.7.0"
7
  },
8
  "prompts": {
9
  "query": "",
 
1
  {
2
  "model_type": "SentenceTransformer",
3
  "__version__": {
4
+ "sentence_transformers": "5.1.2",
5
+ "transformers": "4.57.1",
6
+ "pytorch": "2.8.0+cu129"
7
  },
8
  "prompts": {
9
  "query": "",
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:4a6cc1e16f006f9d27bae9ef27abd50c047a3e9bce5dc24425065dfbb9792d9f
3
- size 298041696
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3f681d69f06a4b118e6b05a3b3727b7ef154af28c91ce7941f26036dc6c96eaa
3
+ size 596070136
sentence_bert_config.json CHANGED
@@ -1,4 +1,4 @@
1
  {
2
- "max_seq_length": 2048,
3
  "do_lower_case": false
4
  }
 
1
  {
2
+ "max_seq_length": 1024,
3
  "do_lower_case": false
4
  }
tokenizer.json CHANGED
@@ -2,7 +2,7 @@
2
  "version": "1.0",
3
  "truncation": {
4
  "direction": "Right",
5
- "max_length": 2048,
6
  "strategy": "LongestFirst",
7
  "stride": 0
8
  },
 
2
  "version": "1.0",
3
  "truncation": {
4
  "direction": "Right",
5
+ "max_length": 1024,
6
  "strategy": "LongestFirst",
7
  "stride": 0
8
  },
tokenizer_config.json CHANGED
@@ -933,12 +933,12 @@
933
  "cls_token": "[CLS]",
934
  "extra_special_tokens": {},
935
  "mask_token": "[MASK]",
936
- "max_length": 2048,
937
  "model_input_names": [
938
  "input_ids",
939
  "attention_mask"
940
  ],
941
- "model_max_length": 2048,
942
  "pad_to_multiple_of": null,
943
  "pad_token": "[PAD]",
944
  "pad_token_type_id": 0,
 
933
  "cls_token": "[CLS]",
934
  "extra_special_tokens": {},
935
  "mask_token": "[MASK]",
936
+ "max_length": 1024,
937
  "model_input_names": [
938
  "input_ids",
939
  "attention_mask"
940
  ],
941
+ "model_max_length": 1024,
942
  "pad_to_multiple_of": null,
943
  "pad_token": "[PAD]",
944
  "pad_token_type_id": 0,