thebajajra committed on
Commit 30c9769 · verified · 1 Parent(s): 92ae162

Upload folder using huggingface_hub

.DS_Store ADDED
Binary file (6.15 kB).
 
README.md CHANGED
@@ -1,65 +1,246 @@
  ---
  tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
- - ecommerce
- - e-commerce
- - retail
- - marketplace
- - shopping
- - amazon
- - ebay
- - alibaba
- - google
- - rakuten
- - bestbuy
- - walmart
- - flipkart
- - wayfair
- - shein
- - target
- - etsy
- - shopify
- - taobao
- - asos
- - carrefour
- - costco
- - overstock
- - pretraining
- - encoder
- - language-modeling
- - foundation-model
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  ---

- # RexBERT-base-embed-pf-v0.1

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
- <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
- - **Maximum Sequence Length:** 2048 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
- <!-- - **Training Dataset:** Unknown -->
- <!-- - **Language:** Unknown -->
  <!-- - **License:** Unknown -->

  ### Model Sources

  - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
  - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

  ### Full Model Architecture

  ```
  SentenceTransformer(
-   (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
    (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  )
  ```
@@ -81,21 +262,23 @@ from sentence_transformers import SentenceTransformer
  # Download from the 🤗 Hub
  model = SentenceTransformer("sentence_transformers_model_id")
  # Run inference
- sentences = [
-     'The weather is lovely today.',
-     "It's so sunny outside!",
-     'He drove to the stadium.',
  ]
- embeddings = model.encode(sentences)
- print(embeddings.shape)
- # [3, 768]

  # Get the similarity scores for the embeddings
- similarities = model.similarity(embeddings, embeddings)
  print(similarities)
- # tensor([[1.0000, 0.5898, 0.6172],
- #         [0.5898, 0.9961, 0.3457],
- #         [0.6172, 0.3457, 1.0000]], dtype=torch.bfloat16)
  ```

  <!--
@@ -136,19 +319,679 @@ You can finetune this model on your own dataset.

  ## Training Details

  ### Framework Versions
- - Python: 3.12.8
- - Sentence Transformers: 5.1.1
- - Transformers: 4.53.3
- - PyTorch: 2.7.0
- - Accelerate: 1.10.0
- - Datasets: 3.6.0
- - Tokenizers: 0.21.4

  ## Citation

  ### BibTeX

  <!--
  ## Glossary

  ---
+ language:
+ - en
  tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
+ - generated_from_trainer
+ - dataset_size:222490215
+ - loss:MultipleNegativesRankingLoss
+ base_model: thebajajra/RexBERT-base
+ widget:
+ - source_sentence: Can I bring Katana (Samurai Sword) from Japan to Malaysia?
+   sentences:
+   - I've seen the j hook method and binder method on here, but I was looking for something
+     a little cheaper. I need to hang 100 empty record sleeves on a wall for a photoshoot
+     and couldn't think of anything other than command strips. I'd use magic tape but
+     I hear that it rips paper. I also need to hang them on cement
+   - "Hi all, \n\nWith the success of GoT, and with the upcoming QoT, LoTR and The\
+     \ Witcher series, I was wondering which fantasy books you thought would translate\
+     \ well to TV. \n\nI think The Traitor Baru Cormorant would be great, as well as\
+     \ Farseer. My heart also wants me to believe Malazan would be good, but the CGI\
+     \ budget would likely need to be ridiculous."
+   - Hi everyone, currently I'm at Japan and thinking of buying a Katana (Samurai Sword)
+     and bring it back to Malaysia. How do you guys/girls reckon? Will i pass through
+     japanese and malaysian customs without a problem?
+ - source_sentence: What one book would you recommend schools add to the list of books
+     to teach?
+   sentences:
+   - 'Hi everyone. I''m in the middle of teaching a college algebra course and a large
+     portion of my students are working toward a nursing career (this is a required
+     course for them).
+
+
+     Since my background is in math and a variety of physical/engineering sciences,
+     I have no problem emphasizing the utility of the course material to the few students
+     aiming for computer science, physics, and accounting. However, I feel really lost
+     as to how I can answer the classical "when am I going to use this class" for the
+     nursing types, and I''d really like to do everything I can to motivate them. My
+     general thoughts on mathematics as a whole is that the most important thing to
+     learn from it is the logical, systematic, and scientific thought process for problem
+     solving, but I also feel that this is built up over several deeper math and science
+     courses rather than just one introductory math course.
+
+
+     So I''m hoping some people can offer some insight as to the long-term reality:
+     does anyone here feel like they benefited from an algebra or other introductory
+     math course in the long run? If so, how? And if possible, are there any examples
+     from the actual job that can be related in some way to a basic algebra class?
+     Or do you feel that this is a complete waste of time and shouldn''t be required?
+
+
+     Any input is appreciated.'
+   - "Hello all, \n\nI need some serious help with a spot in my hard. \n\n\nIt's\
+     \ between a hundred foot tall oak tree, the shade of the carport, and sits in\
+     \ front of the bay windows on the front of my house. \n\n\nWe're so tired if sitting\
+     \ here and staring at this, \"dead zone. \" \n\n\nIt's shaded basically all\
+     \ the time and stays pretty damp. \n\n\nWe've planted periwinkle here, and it\
+     \ blooms, but won't spread. I think we spent about $700 planting it 3 years\
+     \ ago.. . But it hasn't done anything. \n\n\nSince the spot is in front of the\
+     \ windows, we can't plant anything tall. Also, the yard is sloped, so I\
+     \ can't put a patio there (without terracing). About 5 years ago, we tried\
+     \ to turn it into a rock garden.... but the oak tree drops so many leaves in the\
+     \ autumn that the rock garden was covered and it was a nightmare to get the leaves\
+     \ out of the pea gravel. \n\n\nI live in central Alabama. Any help would\
+     \ be greatly appreciated. Please save me."
+   - 'I had a conversation a while back, and we noticed that schools (specifically
+     high schools) don''t really teach any contemporary books. I''m curious what books,
+     new or old, you would add the the required high school reading.
+
+
+     The first books that comes to my mind are *Zen and the Art of Motorcycle maintenance*,
+     *Zen Flesh Zen Bones*, and maybe even *the Bible* (+ other old religious scriptures:
+     Tao Te Jing etc) if for nothing else the shear number of literary allusions reading
+     it reveals.'
+ - source_sentence: Ram allocation setting keeps resetting to xmx1g in vanilla launcher,
+     not using Curse or Twitch launcher, wtf?
+   sentences:
+   - 'Is there some way we can reward people for having a good win differential (i.e.
+     they''ve won a lot more games than they''ve lost). I''m not opposed to the current
+     degree/rank system, but maybe we could add a flair for highest win differential
+     in a day or in a month or something like that. I feel like we should reward people
+     who are winning 70+% of their games as opposed to only rewarding people with the
+     highest volume of wins.
+
+
+     Thoughts?'
+   - 'So every time I allocate 6gb of ram to minecraft, I can start up the game and
+     use 6gb just fine, but if I close the game and re-open it, the java argument is
+     reset back to the default ram value. I''ve researched everywhere and I can''t
+     find anything for the vanilla launcher as this problem seems to only effect people
+     using the Curse or Twitch launcher which I do not have or have ever used. Razer
+     Synapse seems to be a factor as well, but I don''t have Razer Synapse either.
+
+
+     Not a major problem, I just don''t feel like adding the argument before I launch
+     the game every time.
+
+
+     EDIT: Can''t edit post title, but I meant to type xmx2g instead of 1g.
+
+
+     EDIT 2: The problem fixed itself after setting the argument a few times, dunno
+     what was goin on.'
+   - 'timestamp
+
+
+     closeup of cables
+
+
+     &amp;#x200B;
+
+
+     &amp;#x200B;
+
+
+     # OUTDATED, PLEASE SEE MY NEWEST POST
+
+
+     # OUTDATED, PLEASE SEE MY NEWEST POST
+
+
+     # OUTDATED, PLEASE SEE MY NEWEST POST
+
+
+     &amp;#x200B;
+
+
+     &amp;#x200B;
+
+
+     |ITEM|NOTE|PRICE|
+
+     |:-|:-|:-|
+
+     |SA Arcane base kit|never mounted|130 EUR **SOLD**|
+
+     |coiled LEMO cable rose|1.5m with 15cm coil on device side. USB-A to USB-C. "Rose"
+     paracord with white techflex double sleeving. white heatshrink|~~105 EUR~~ **SOLD**|
+
+     |coiled LEMO cable purple|1.5m with 15cm coil on device side. USB-A to USB-C.
+     "Neon Pink" paracord with purple techflex double sleeving. blue heatshrink. (GMK
+     Laser themed) - rest of cable not pictured but will be included obviously|105
+     EUR **SOLD**|
+
+     |Unholy Panda (linear) x70|Halo housing + Trash Panda stem (you can choose between
+     Halo True and Halo Clear spring)|18 EUR **SOLD**|
+
+     |Unholy Panda (tactile) x 100|Halo housing + purple Trash Panda stem (you can
+     choose between Halo True and Halo Clear spring)|23 EUR|
+
+     |Unholy Panda (tactile) x 100|Halo housing + purple Trash Panda stem (you can
+     choose between Halo True and Halo Clear spring)|23 EUR|
+
+     |\--|\--|\--|
+
+     |ADD-ON ONLY: Nutcracker V1 Pro switch opener|for Cherry style housings, silver.
+     will only sell bundled with other items|15 EUR|
+
+
+     Cables were made by PexonPCs and use genuine LEMO connectors'
+ - source_sentence: Struggling with BPD for a little over a year, SO just told me something
+     that hurt my feelings. Is this the BP clinginess talking or am I correct in my
+     gut instincts?
+   sentences:
+   - Since demon's souls is getting a remake i wanted to ask how strong do you guys
+     think the slayer of demon's from the original version is. I personally have him
+     at planet level
+   - 'My SO and I were on the couch watching TV. I reached up and softly touched his
+     face for a second. He smiled and rubbed my leg. I asked him, "Does me touching
+     you annoy you?" He thought for a moment and said, "Not all the time." I pressed
+     for an explanation and he said, "Sometimes during the week, when I''m exhausted
+     from work, I don''t want you to be so needy."
+
+
+     Is this a red flag or is this just my constant need for validation due to the
+     BPD?'
+   - So I'm on a trip right now and I may have found a sweet deal on a Mohawk solo
+     canoe. The problem of course (just my luck) is that I also have my kayak with
+     me. I'm ready to pull the trigger on this canoe if it's in good shape bit I don't
+     have a way of getting it home at the moment. There's an rei close by and I was
+     thinking about getting a set of the y carriers. Has anyone ever tried putting
+     a solo canoe in one of these? Could I fit one boat in a single y carrier and then
+     the other on the roof racks? Another option is to build a addition to the roof
+     rack but I don't have any tools on hand so I would like to avoid that if possible.
+ - source_sentence: Where do you guys go to find used camper shells?
+   sentences:
+   - 'Hey guys what is the most optimal tool for pulling long staples out from hardwood
+     flooring? I''m trying to find the most optimal way to do it because I have thousands
+     to pull! Fence pliers did not work too well on account the pointy tip was too
+     thick get in and roll them out and when i tried the gripping/cutting part it broke
+     the staples.
+
+
+     I''m thinking round nose vice grips or a car gasket puller?
+
+
+     Thanks'
+   - 'I''ve got a newly acquired 1st gen 2005 silvee Toyota tundra trd and am looking
+     for an used camper shell. Craigslist hasnt been very useful....where do you guys
+     go?
+
+
+     Thanks!'
+   - I work at a convenience store and the number of Newports I sell a day is insane.
+     Considering buying a couple cartons of em and maybe some parliament menthols if
+     the FDA goes through with this. Should be able to throw em up on craigslist or
+     ebay a week or two later and it'll be like steaks in a piranha pond
+ datasets:
+ - nomic-ai/nomic-embed-unsupervised-data
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  ---

+ # SentenceTransformer based on thebajajra/RexBERT-base
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [thebajajra/RexBERT-base](https://huggingface.co/thebajajra/RexBERT-base) on the [nomic-embed-unsupervised-data](https://huggingface.co/datasets/nomic-ai/nomic-embed-unsupervised-data) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
+ - **Base model:** [thebajajra/RexBERT-base](https://huggingface.co/thebajajra/RexBERT-base) <!-- at revision 4f66d2977864414371770084b681e00698b98457 -->
+ - **Maximum Sequence Length:** 1024 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
+ - **Training Dataset:**
+     - [nomic-embed-unsupervised-data](https://huggingface.co/datasets/nomic-ai/nomic-embed-unsupervised-data)
+ - **Language:** en
  <!-- - **License:** Unknown -->

  ### Model Sources

  - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
  - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

  ### Full Model Architecture

  ```
  SentenceTransformer(
+   (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
    (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  )
  ```
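The `Pooling` module above is plain mean pooling over the final token embeddings. As a rough sketch of what the two modules compute, assuming only the `transformers` API and the same placeholder repository id used in the snippet below, the equivalent by hand would be:

```python
import torch
from transformers import AutoModel, AutoTokenizer

repo = "sentence_transformers_model_id"  # placeholder id, as in the snippet below

tokenizer = AutoTokenizer.from_pretrained(repo)
encoder = AutoModel.from_pretrained(repo)  # the underlying ModernBertModel

enc = tokenizer(["used camper shells"], return_tensors="pt", truncation=True)
with torch.no_grad():
    token_embeddings = encoder(**enc).last_hidden_state   # (1, seq_len, 768)

# Mean pooling: average the token embeddings, ignoring padding positions
mask = enc["attention_mask"].unsqueeze(-1)                # (1, seq_len, 1)
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embedding.shape)                           # torch.Size([1, 768])
```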
 
  # Download from the 🤗 Hub
  model = SentenceTransformer("sentence_transformers_model_id")
  # Run inference
+ queries = [
+     "Where do you guys go to find used camper shells?",
+ ]
+ documents = [
+     "I've got a newly acquired 1st gen 2005 silvee Toyota tundra trd and am looking for an used camper shell. Craigslist hasnt been very useful....where do you guys go?\n\nThanks!",
+     "I work at a convenience store and the number of Newports I sell a day is insane. Considering buying a couple cartons of em and maybe some parliament menthols if the FDA goes through with this. Should be able to throw em up on craigslist or ebay a week or two later and it'll be like steaks in a piranha pond",
+     "Hey guys what is the most optimal tool for pulling long staples out from hardwood flooring? I'm trying to find the most optimal way to do it because I have thousands to pull! Fence pliers did not work too well on account the pointy tip was too thick get in and roll them out and when i tried the gripping/cutting part it broke the staples.\n\nI'm thinking round nose vice grips or a car gasket puller?\n\nThanks",
  ]
+ query_embeddings = model.encode_query(queries)
+ document_embeddings = model.encode_document(documents)
+ print(query_embeddings.shape, document_embeddings.shape)
+ # [1, 768] [3, 768]

  # Get the similarity scores for the embeddings
+ similarities = model.similarity(query_embeddings, document_embeddings)
  print(similarities)
+ # tensor([[0.8108, 0.2481, 0.1200]])
  ```
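For retrieval over a larger corpus, the same embeddings plug into `sentence_transformers.util.semantic_search`; a minimal sketch, reusing the placeholder model id from the snippet above and illustrative corpus texts:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder id

corpus = [
    "Used camper shell for a 2005 Toyota Tundra, good condition.",
    "Selling cartons of Newports and parliament menthols.",
]
# encode_document/encode_query apply the model's document/query prompts, if any
corpus_embeddings = model.encode_document(corpus, convert_to_tensor=True)
query_embeddings = model.encode_query(
    ["Where do you guys go to find used camper shells?"], convert_to_tensor=True
)

# Top-k nearest documents by cosine similarity (the default score function)
hits = util.semantic_search(query_embeddings, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(f"{hit['score']:.4f}  {corpus[hit['corpus_id']]}")
```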

  <!--
 

  ## Training Details

+ ### Training Dataset
+
+ #### nomic-embed-unsupervised-data
+
+ * Dataset: [nomic-embed-unsupervised-data](https://huggingface.co/datasets/nomic-ai/nomic-embed-unsupervised-data) at [917bae6](https://huggingface.co/datasets/nomic-ai/nomic-embed-unsupervised-data/tree/917bae6ed30ebc80fc8c81ba8e3e34558205d6bb)
+ * Size: 222,490,215 training samples
+ * Columns: <code>query</code> and <code>document</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | query | document |
+   |:--------|:------|:---------|
+   | type    | string | string |
+   | details | <ul><li>min: 6 tokens</li><li>mean: 16.83 tokens</li><li>max: 62 tokens</li></ul> | <ul><li>min: 12 tokens</li><li>mean: 162.25 tokens</li><li>max: 1024 tokens</li></ul> |
+ * Samples:
+   | query | document |
+   |:------|:---------|
+   | <code>I became a US citizen early this year and this is going to be my first 4th of July as an American!</code> | <code>Because of the current situation, my citizen oath ceremony felt more like a pick up order... Got my certificate, and no guests allowed, so I couldn’t bring anybody to join my ceremony, also no pictures. <br><br>Anyway... I want to celebrate big time this 4th of July, and I’m already planning it! (Any ideas are super welcome!). I say big time but I just really want to do something fun at home with my family. 😊</code> |
+   | <code>"The Kingdom of God for Jesus"; I know you guys know how to answer this overrated question.</code> | <code>Basically what we're talking about is that the "kingdom" of god according to jesus are:<br><br>* "the kingdom as good news (where the kingdom is on earth, whereas by living a beautiful, meaningful life on earth is the meaning of salvation)" <br>* "the kingdom is offered to all"<br>* etc.<br><br>and finally, the question goes like this: "The Kingdom Does Not Ask for Performance; It is a gift, an offer. We can only inherit it. So, ***what is the point of being good***?"</code> |
+   | <code>So I made a "size" chart to go with my weight infograph, all based off that "Relative champ weight/height" thread.</code> | <code>Here's the weight chart I did the other day<br><br><br><br>And here's the size chart I did today. <br><br><br><br>*Anivia, Skarner and Shyvanna (dragon form) are "Dimensions" instead of an actual "height", but I think you can get the jist.<br><br>The original thread this is based off of is located via the link below. I am using these numbers (and my own conversions), so I'm not always sure where they got the numbers!<br><br></code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "gather_across_devices": false
+   }
+   ```
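The parameters above describe the standard in-batch-negatives setup; as a sketch, this loss would be instantiated like so (the base-model id is taken from the card above, the rest is illustrative):

```python
from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("thebajajra/RexBERT-base")

# Every other document in a batch serves as a negative for a given query;
# scale=20.0 and cosine similarity match the JSON parameters above
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)
```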
+
+ ### Evaluation Dataset
+
+ #### nomic-embed-unsupervised-data
+
+ * Dataset: [nomic-embed-unsupervised-data](https://huggingface.co/datasets/nomic-ai/nomic-embed-unsupervised-data) at [917bae6](https://huggingface.co/datasets/nomic-ai/nomic-embed-unsupervised-data/tree/917bae6ed30ebc80fc8c81ba8e3e34558205d6bb)
+ * Size: 222,727 evaluation samples
+ * Columns: <code>query</code> and <code>document</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | query | document |
+   |:--------|:------|:---------|
+   | type    | string | string |
+   | details | <ul><li>min: 6 tokens</li><li>mean: 16.41 tokens</li><li>max: 66 tokens</li></ul> | <ul><li>min: 15 tokens</li><li>mean: 164.47 tokens</li><li>max: 1024 tokens</li></ul> |
+ * Samples:
+   | query | document |
+   |:------|:---------|
+   | <code>Do you subscribe to any horror magazines?</code> | <code>I get most of my horror news from blogs and websites and such, but i do subscribe to a bunch of horror mags. With everything being so digital these days, something about flipping through a magazine and reading articles about both classic and upcoming horror movies is refreshing. I get a lot of great recommendations from them, and theres a lot of interesting interviews and behind the scenes stuff that i dont see on the popular websites.</code> |
+   | <code>Missing PDS Laundry Card :(</code> | <code>This is an absolute long shot but I must've accidentally left my laundry card in the dryer card slot because I cant find it anywhere. If someone found a card in there, please DM me. I've already bought a card but I'd like to have my original card back :(</code> |
+   | <code>Talking Bad will be terrible</code> | <code>Talking Dead is horrible and this will be to. Chris Hardwick and the cast of random no name celebrities offer nothing new to the discussion. The only good thing about Breaking Bad ending is that Talking Bad will end soon as well.</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "gather_across_devices": false
+   }
+   ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 256
+ - `per_device_eval_batch_size`: 128
+ - `learning_rate`: 2e-06
+ - `num_train_epochs`: 4
+ - `warmup_ratio`: 0.1
+ - `bf16`: True
+ - `batch_sampler`: no_duplicates
+
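As a sketch, these non-default values map onto `SentenceTransformerTrainingArguments` roughly as follows (`output_dir` is a placeholder):

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder
    eval_strategy="steps",
    per_device_train_batch_size=256,
    per_device_eval_batch_size=128,
    learning_rate=2e-6,
    num_train_epochs=4,
    warmup_ratio=0.1,
    bf16=True,
    # no_duplicates keeps repeated texts out of a batch, where they would act
    # as false negatives for MultipleNegativesRankingLoss
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```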
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 256
+ - `per_device_eval_batch_size`: 128
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 2e-06
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 4
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: True
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `parallelism_config`: None
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `project`: huggingface
+ - `trackio_space_id`: trackio
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: no
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `liger_kernel_config`: None
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: True
+ - `prompts`: None
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+
+ </details>
+
+ ### Training Logs
+ <details><summary>Click to expand</summary>
+
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:------:|:-----:|:-------------:|:---------------:|
+ | 0.0009 | 100 | 4.4714 | - |
+ | 0.0018 | 200 | 4.4457 | - |
+ | 0.0028 | 300 | 4.4007 | - |
+ | 0.0037 | 400 | 4.336 | - |
+ | 0.0046 | 500 | 4.2476 | - |
+ | 0.0055 | 600 | 4.1406 | - |
+ | 0.0064 | 700 | 4.0049 | - |
+ | 0.0074 | 800 | 3.8434 | - |
+ | 0.0083 | 900 | 3.6393 | - |
+ | 0.0092 | 1000 | 3.3763 | - |
+ | 0.0101 | 1100 | 3.0541 | - |
+ | 0.0110 | 1200 | 2.6362 | - |
+ | 0.0120 | 1300 | 2.1226 | - |
+ | 0.0129 | 1400 | 1.6113 | - |
+ | 0.0138 | 1500 | 1.2565 | - |
+ | 0.0147 | 1600 | 1.029 | - |
+ | 0.0156 | 1700 | 0.846 | - |
+ | 0.0166 | 1800 | 0.7111 | - |
+ | 0.0175 | 1900 | 0.5967 | - |
+ | 0.0184 | 2000 | 0.488 | - |
+ | 0.0193 | 2100 | 0.4138 | - |
+ | 0.0203 | 2200 | 0.3565 | - |
+ | 0.0212 | 2300 | 0.3129 | - |
+ | 0.0221 | 2400 | 0.2827 | - |
+ | 0.0230 | 2500 | 0.2557 | - |
+ | 0.0239 | 2600 | 0.2379 | - |
+ | 0.0249 | 2700 | 0.2234 | - |
+ | 0.0258 | 2800 | 0.2055 | - |
+ | 0.0267 | 2900 | 0.1926 | - |
+ | 0.0276 | 3000 | 0.1843 | - |
+ | 0.0285 | 3100 | 0.175 | - |
+ | 0.0295 | 3200 | 0.1647 | - |
+ | 0.0304 | 3300 | 0.157 | - |
+ | 0.0313 | 3400 | 0.1512 | - |
+ | 0.0322 | 3500 | 0.146 | - |
+ | 0.0331 | 3600 | 0.1412 | - |
+ | 0.0341 | 3700 | 0.1352 | - |
+ | 0.0350 | 3800 | 0.1295 | - |
+ | 0.0359 | 3900 | 0.1261 | - |
+ | 0.0368 | 4000 | 0.122 | - |
+ | 0.0377 | 4100 | 0.1171 | - |
+ | 0.0387 | 4200 | 0.1147 | - |
+ | 0.0396 | 4300 | 0.1103 | - |
+ | 0.0405 | 4400 | 0.1073 | - |
+ | 0.0414 | 4500 | 0.1053 | - |
+ | 0.0423 | 4600 | 0.1016 | - |
+ | 0.0433 | 4700 | 0.0991 | - |
+ | 0.0442 | 4800 | 0.0981 | - |
+ | 0.0451 | 4900 | 0.0935 | - |
+ | 0.0460 | 5000 | 0.0928 | - |
+ | 0.0469 | 5100 | 0.0895 | - |
+ | 0.0479 | 5200 | 0.0877 | - |
+ | 0.0488 | 5300 | 0.0853 | - |
+ | 0.0497 | 5400 | 0.0829 | - |
+ | 0.0506 | 5500 | 0.0818 | - |
+ | 0.0515 | 5600 | 0.0805 | - |
+ | 0.0525 | 5700 | 0.0785 | - |
+ | 0.0534 | 5800 | 0.0769 | - |
+ | 0.0543 | 5900 | 0.0746 | - |
+ | 0.0552 | 6000 | 0.0754 | - |
+ | 0.0562 | 6100 | 0.0715 | - |
+ | 0.0571 | 6200 | 0.0707 | - |
+ | 0.0580 | 6300 | 0.0699 | - |
+ | 0.0589 | 6400 | 0.0678 | - |
+ | 0.0598 | 6500 | 0.0659 | - |
+ | 0.0608 | 6600 | 0.0659 | - |
+ | 0.0617 | 6700 | 0.0646 | - |
+ | 0.0626 | 6800 | 0.0627 | - |
+ | 0.0635 | 6900 | 0.0627 | - |
+ | 0.0644 | 7000 | 0.0604 | - |
+ | 0.0654 | 7100 | 0.0592 | - |
+ | 0.0663 | 7200 | 0.059 | - |
+ | 0.0672 | 7300 | 0.0577 | - |
+ | 0.0681 | 7400 | 0.0568 | - |
+ | 0.0690 | 7500 | 0.0558 | - |
+ | 0.0700 | 7600 | 0.0552 | - |
+ | 0.0709 | 7700 | 0.0542 | - |
+ | 0.0718 | 7800 | 0.0531 | - |
+ | 0.0727 | 7900 | 0.0528 | - |
+ | 0.0736 | 8000 | 0.0526 | - |
+ | 0.0746 | 8100 | 0.0509 | - |
+ | 0.0755 | 8200 | 0.05 | - |
+ | 0.0764 | 8300 | 0.0495 | - |
+ | 0.0773 | 8400 | 0.0486 | - |
+ | 0.0782 | 8500 | 0.0482 | - |
+ | 0.0792 | 8600 | 0.048 | - |
+ | 0.0801 | 8700 | 0.0468 | - |
+ | 0.0810 | 8800 | 0.0461 | - |
+ | 0.0819 | 8900 | 0.0459 | - |
+ | 0.0828 | 9000 | 0.0453 | - |
+ | 0.0838 | 9100 | 0.0442 | - |
+ | 0.0847 | 9200 | 0.0443 | - |
+ | 0.0856 | 9300 | 0.0437 | - |
+ | 0.0865 | 9400 | 0.0435 | - |
+ | 0.0874 | 9500 | 0.0426 | - |
+ | 0.0884 | 9600 | 0.042 | - |
+ | 0.0893 | 9700 | 0.0423 | - |
+ | 0.0902 | 9800 | 0.0406 | - |
+ | 0.0911 | 9900 | 0.0405 | - |
+ | 0.0920 | 10000 | 0.0397 | - |
+ | 0.0930 | 10100 | 0.0401 | - |
+ | 0.0939 | 10200 | 0.0392 | - |
+ | 0.0948 | 10300 | 0.0396 | - |
+ | 0.0957 | 10400 | 0.0391 | - |
+ | 0.0967 | 10500 | 0.0384 | - |
+ | 0.0976 | 10600 | 0.0377 | - |
+ | 0.0985 | 10700 | 0.0379 | - |
+ | 0.0994 | 10800 | 0.0372 | - |
+ | 0.1003 | 10900 | 0.0364 | - |
+ | 0.1013 | 11000 | 0.0367 | - |
+ | 0.1022 | 11100 | 0.0359 | - |
+ | 0.1031 | 11200 | 0.0355 | - |
+ | 0.1040 | 11300 | 0.0358 | - |
+ | 0.1049 | 11400 | 0.035 | - |
+ | 0.1059 | 11500 | 0.0353 | - |
+ | 0.1068 | 11600 | 0.0341 | - |
+ | 0.1077 | 11700 | 0.0343 | - |
+ | 0.1086 | 11800 | 0.034 | - |
+ | 0.1095 | 11900 | 0.0334 | - |
+ | 0.1105 | 12000 | 0.0337 | - |
+ | 0.1114 | 12100 | 0.0332 | - |
+ | 0.1123 | 12200 | 0.0323 | - |
+ | 0.1132 | 12300 | 0.0323 | - |
+ | 0.1141 | 12400 | 0.0322 | - |
+ | 0.1151 | 12500 | 0.0312 | - |
+ | 0.1160 | 12600 | 0.0307 | - |
+ | 0.1169 | 12700 | 0.0314 | - |
+ | 0.1178 | 12800 | 0.0309 | - |
+ | 0.1187 | 12900 | 0.0313 | - |
+ | 0.1197 | 13000 | 0.0306 | - |
+ | 0.1206 | 13100 | 0.0303 | - |
+ | 0.1215 | 13200 | 0.0301 | - |
+ | 0.1224 | 13300 | 0.0302 | - |
+ | 0.1233 | 13400 | 0.0296 | - |
+ | 0.1243 | 13500 | 0.029 | - |
+ | 0.1252 | 13600 | 0.0288 | - |
+ | 0.1261 | 13700 | 0.0286 | - |
+ | 0.1270 | 13800 | 0.0291 | - |
+ | 0.1279 | 13900 | 0.0287 | - |
+ | 0.1289 | 14000 | 0.0284 | - |
+ | 0.1298 | 14100 | 0.0276 | - |
+ | 0.1307 | 14200 | 0.028 | - |
+ | 0.1316 | 14300 | 0.0275 | - |
+ | 0.1326 | 14400 | 0.0269 | - |
+ | 0.1335 | 14500 | 0.027 | - |
+ | 0.1344 | 14600 | 0.0273 | - |
+ | 0.1353 | 14700 | 0.0267 | - |
+ | 0.1362 | 14800 | 0.0263 | - |
+ | 0.1372 | 14900 | 0.0264 | - |
+ | 0.1381 | 15000 | 0.0263 | - |
+ | 0.1390 | 15100 | 0.0262 | - |
+ | 0.1399 | 15200 | 0.0256 | - |
+ | 0.1408 | 15300 | 0.0254 | - |
+ | 0.1418 | 15400 | 0.0257 | - |
+ | 0.1427 | 15500 | 0.0251 | - |
+ | 0.1436 | 15600 | 0.0253 | - |
+ | 0.1445 | 15700 | 0.0247 | - |
+ | 0.1454 | 15800 | 0.0251 | - |
+ | 0.1464 | 15900 | 0.0245 | - |
+ | 0.1473 | 16000 | 0.0246 | - |
+ | 0.1482 | 16100 | 0.024 | - |
+ | 0.1491 | 16200 | 0.0241 | - |
+ | 0.1500 | 16300 | 0.0243 | - |
+ | 0.1510 | 16400 | 0.0235 | - |
+ | 0.1519 | 16500 | 0.024 | - |
+ | 0.1528 | 16600 | 0.0236 | - |
+ | 0.1537 | 16700 | 0.0233 | - |
+ | 0.1546 | 16800 | 0.0237 | - |
+ | 0.1556 | 16900 | 0.023 | - |
+ | 0.1565 | 17000 | 0.0233 | - |
+ | 0.1574 | 17100 | 0.0229 | - |
+ | 0.1583 | 17200 | 0.0227 | - |
+ | 0.1592 | 17300 | 0.023 | - |
+ | 0.1602 | 17400 | 0.0232 | - |
+ | 0.1611 | 17500 | 0.0221 | - |
+ | 0.1620 | 17600 | 0.0217 | - |
+ | 0.1629 | 17700 | 0.0224 | - |
+ | 0.1638 | 17800 | 0.0217 | - |
+ | 0.1648 | 17900 | 0.0219 | - |
+ | 0.1657 | 18000 | 0.0216 | - |
+ | 0.1666 | 18100 | 0.0214 | - |
+ | 0.1675 | 18200 | 0.0213 | - |
+ | 0.1685 | 18300 | 0.0215 | - |
+ | 0.1694 | 18400 | 0.0211 | - |
+ | 0.1703 | 18500 | 0.0213 | - |
+ | 0.1712 | 18600 | 0.0211 | - |
+ | 0.1721 | 18700 | 0.0212 | - |
+ | 0.1731 | 18800 | 0.0204 | - |
+ | 0.1740 | 18900 | 0.0206 | - |
+ | 0.1749 | 19000 | 0.021 | - |
+ | 0.1758 | 19100 | 0.0208 | - |
+ | 0.1767 | 19200 | 0.0202 | - |
+ | 0.1777 | 19300 | 0.0199 | - |
+ | 0.1786 | 19400 | 0.0204 | - |
+ | 0.1795 | 19500 | 0.0199 | - |
+ | 0.1804 | 19600 | 0.0196 | - |
+ | 0.1813 | 19700 | 0.0198 | - |
+ | 0.1823 | 19800 | 0.0199 | - |
+ | 0.1832 | 19900 | 0.0194 | - |
+ | 0.1841 | 20000 | 0.0191 | - |
+ | 0.1850 | 20100 | 0.0193 | - |
+ | 0.1859 | 20200 | 0.0193 | - |
+ | 0.1869 | 20300 | 0.0192 | - |
+ | 0.1878 | 20400 | 0.0192 | - |
+ | 0.1887 | 20500 | 0.0188 | - |
+ | 0.1896 | 20600 | 0.0183 | - |
+ | 0.1905 | 20700 | 0.0186 | - |
+ | 0.1915 | 20800 | 0.0182 | - |
+ | 0.1924 | 20900 | 0.0184 | - |
+ | 0.1933 | 21000 | 0.0187 | - |
+ | 0.1942 | 21100 | 0.0184 | - |
+ | 0.1951 | 21200 | 0.0183 | - |
+ | 0.1961 | 21300 | 0.0181 | - |
+ | 0.1970 | 21400 | 0.0178 | - |
+ | 0.1979 | 21500 | 0.0179 | - |
+ | 0.1988 | 21600 | 0.018 | - |
+ | 0.1997 | 21700 | 0.0185 | - |
+ | 0.2000 | 21728 | - | 0.0098 |
+ | 0.2007 | 21800 | 0.0176 | - |
+ | 0.2016 | 21900 | 0.0183 | - |
+ | 0.2025 | 22000 | 0.0174 | - |
+ | 0.2034 | 22100 | 0.0179 | - |
+ | 0.2044 | 22200 | 0.0175 | - |
+ | 0.2053 | 22300 | 0.0175 | - |
+ | 0.2062 | 22400 | 0.0172 | - |
+ | 0.2071 | 22500 | 0.0173 | - |
+ | 0.2080 | 22600 | 0.017 | - |
+ | 0.2090 | 22700 | 0.0167 | - |
+ | 0.2099 | 22800 | 0.0164 | - |
+ | 0.2108 | 22900 | 0.0167 | - |
+ | 0.2117 | 23000 | 0.0165 | - |
+ | 0.2126 | 23100 | 0.0171 | - |
+ | 0.2136 | 23200 | 0.0169 | - |
+ | 0.2145 | 23300 | 0.0164 | - |
+ | 0.2154 | 23400 | 0.0162 | - |
+ | 0.2163 | 23500 | 0.0164 | - |
+ | 0.2172 | 23600 | 0.0164 | - |
+ | 0.2182 | 23700 | 0.0166 | - |
+ | 0.2191 | 23800 | 0.0163 | - |
+ | 0.2200 | 23900 | 0.0164 | - |
+ | 0.2209 | 24000 | 0.0165 | - |
+ | 0.2218 | 24100 | 0.0163 | - |
+ | 0.2228 | 24200 | 0.0162 | - |
+ | 0.2237 | 24300 | 0.0163 | - |
+ | 0.2246 | 24400 | 0.0157 | - |
+ | 0.2255 | 24500 | 0.0157 | - |
+ | 0.2264 | 24600 | 0.0158 | - |
+ | 0.2274 | 24700 | 0.0153 | - |
+ | 0.2283 | 24800 | 0.0156 | - |
+ | 0.2292 | 24900 | 0.0155 | - |
+ | 0.2301 | 25000 | 0.0156 | - |
+ | 0.2310 | 25100 | 0.0154 | - |
+ | 0.2320 | 25200 | 0.0151 | - |
+ | 0.2329 | 25300 | 0.0153 | - |
+ | 0.2338 | 25400 | 0.015 | - |
+ | 0.2347 | 25500 | 0.0153 | - |
+ | 0.2356 | 25600 | 0.015 | - |
+ | 0.2366 | 25700 | 0.0152 | - |
+ | 0.2375 | 25800 | 0.0147 | - |
+ | 0.2384 | 25900 | 0.0148 | - |
+ | 0.2393 | 26000 | 0.0148 | - |
+ | 0.2402 | 26100 | 0.0144 | - |
+ | 0.2412 | 26200 | 0.0146 | - |
+ | 0.2421 | 26300 | 0.0143 | - |
+ | 0.2430 | 26400 | 0.0143 | - |
+ | 0.2439 | 26500 | 0.0145 | - |
+ | 0.2449 | 26600 | 0.0142 | - |
+ | 0.2458 | 26700 | 0.0142 | - |
+ | 0.2467 | 26800 | 0.0143 | - |
+ | 0.2476 | 26900 | 0.0139 | - |
+ | 0.2485 | 27000 | 0.0141 | - |
+ | 0.2495 | 27100 | 0.0141 | - |
+ | 0.2504 | 27200 | 0.0143 | - |
+ | 0.2513 | 27300 | 0.0141 | - |
+ | 0.2522 | 27400 | 0.014 | - |
+ | 0.2531 | 27500 | 0.0137 | - |
+ | 0.2541 | 27600 | 0.014 | - |
+ | 0.2550 | 27700 | 0.0139 | - |
+ | 0.2559 | 27800 | 0.0138 | - |
+ | 0.2568 | 27900 | 0.0141 | - |
+ | 0.2577 | 28000 | 0.0138 | - |
+ | 0.2587 | 28100 | 0.0138 | - |
+ | 0.2596 | 28200 | 0.0134 | - |
+ | 0.2605 | 28300 | 0.0135 | - |
+ | 0.2614 | 28400 | 0.0131 | - |
+ | 0.2623 | 28500 | 0.0133 | - |
+ | 0.2633 | 28600 | 0.0132 | - |
+ | 0.2642 | 28700 | 0.0133 | - |
+ | 0.2651 | 28800 | 0.0131 | - |
+ | 0.2660 | 28900 | 0.013 | - |
+ | 0.2669 | 29000 | 0.0131 | - |
+ | 0.2679 | 29100 | 0.013 | - |
+ | 0.2688 | 29200 | 0.0135 | - |
+ | 0.2697 | 29300 | 0.0131 | - |
+ | 0.2706 | 29400 | 0.0134 | - |
+ | 0.2715 | 29500 | 0.0131 | - |
+ | 0.2725 | 29600 | 0.0129 | - |
+ | 0.2734 | 29700 | 0.0127 | - |
+ | 0.2743 | 29800 | 0.0128 | - |
+ | 0.2752 | 29900 | 0.0125 | - |
+ | 0.2761 | 30000 | 0.0127 | - |
+ | 0.2771 | 30100 | 0.0126 | - |
+ | 0.2780 | 30200 | 0.0124 | - |
+ | 0.2789 | 30300 | 0.0126 | - |
+ | 0.2798 | 30400 | 0.0126 | - |
+ | 0.2808 | 30500 | 0.0122 | - |
+ | 0.2817 | 30600 | 0.0124 | - |
+ | 0.2826 | 30700 | 0.0123 | - |
+ | 0.2835 | 30800 | 0.0126 | - |
+ | 0.2844 | 30900 | 0.0123 | - |
+ | 0.2854 | 31000 | 0.012 | - |
+ | 0.2863 | 31100 | 0.012 | - |
+ | 0.2872 | 31200 | 0.0123 | - |
+ | 0.2881 | 31300 | 0.0122 | - |
+ | 0.2890 | 31400 | 0.0121 | - |
+ | 0.2900 | 31500 | 0.0124 | - |
+ | 0.2909 | 31600 | 0.0117 | - |
+ | 0.2918 | 31700 | 0.0118 | - |
+ | 0.2927 | 31800 | 0.0121 | - |
+ | 0.2936 | 31900 | 0.0119 | - |
+ | 0.2946 | 32000 | 0.0115 | - |
+ | 0.2955 | 32100 | 0.0117 | - |
+ | 0.2964 | 32200 | 0.012 | - |
+ | 0.2973 | 32300 | 0.0118 | - |
+ | 0.2982 | 32400 | 0.0117 | - |
+ | 0.2992 | 32500 | 0.0119 | - |
+ | 0.3001 | 32600 | 0.0118 | - |
+ | 0.3010 | 32700 | 0.0115 | - |
+ | 0.3019 | 32800 | 0.012 | - |
+ | 0.3028 | 32900 | 0.0119 | - |
+ | 0.3038 | 33000 | 0.0113 | - |
+ | 0.3047 | 33100 | 0.0117 | - |
+ | 0.3056 | 33200 | 0.0117 | - |
+ | 0.3065 | 33300 | 0.0113 | - |
+ | 0.3074 | 33400 | 0.0113 | - |
+ | 0.3084 | 33500 | 0.0113 | - |
+ | 0.3093 | 33600 | 0.0117 | - |
+ | 0.3102 | 33700 | 0.0111 | - |
+ | 0.3111 | 33800 | 0.0112 | - |
+ | 0.3120 | 33900 | 0.0113 | - |
+ | 0.3130 | 34000 | 0.0111 | - |
+ | 0.3139 | 34100 | 0.0113 | - |
+ | 0.3148 | 34200 | 0.0115 | - |
+ | 0.3157 | 34300 | 0.0114 | - |
+ | 0.3167 | 34400 | 0.0109 | - |
+ | 0.3176 | 34500 | 0.0112 | - |
+ | 0.3185 | 34600 | 0.0109 | - |
+ | 0.3194 | 34700 | 0.011 | - |
+ | 0.3203 | 34800 | 0.0108 | - |
+ | 0.3213 | 34900 | 0.0108 | - |
+ | 0.3222 | 35000 | 0.0107 | - |
+ | 0.3231 | 35100 | 0.0109 | - |
+ | 0.3240 | 35200 | 0.0108 | - |
+ | 0.3249 | 35300 | 0.0108 | - |
+ | 0.3259 | 35400 | 0.0108 | - |
+ | 0.3268 | 35500 | 0.0105 | - |
+ | 0.3277 | 35600 | 0.0106 | - |
+ | 0.3286 | 35700 | 0.0105 | - |
+ | 0.3295 | 35800 | 0.0104 | - |
+ | 0.3305 | 35900 | 0.0107 | - |
+ | 0.3314 | 36000 | 0.0105 | - |
+ | 0.3323 | 36100 | 0.0103 | - |
+ | 0.3332 | 36200 | 0.0105 | - |
+ | 0.3341 | 36300 | 0.0103 | - |
+ | 0.3351 | 36400 | 0.0107 | - |
+ | 0.3360 | 36500 | 0.0101 | - |
+ | 0.3369 | 36600 | 0.0102 | - |
+ | 0.3378 | 36700 | 0.0102 | - |
+ | 0.3387 | 36800 | 0.0102 | - |
+ | 0.3397 | 36900 | 0.01 | - |
+ | 0.3406 | 37000 | 0.0103 | - |
+ | 0.3415 | 37100 | 0.0103 | - |
+ | 0.3424 | 37200 | 0.01 | - |
+ | 0.3433 | 37300 | 0.0103 | - |
+ | 0.3443 | 37400 | 0.0103 | - |
+ | 0.3452 | 37500 | 0.0104 | - |
+ | 0.3461 | 37600 | 0.0098 | - |
+ | 0.3470 | 37700 | 0.0099 | - |
+ | 0.3479 | 37800 | 0.0102 | - |
+ | 0.3489 | 37900 | 0.0102 | - |
+ | 0.3498 | 38000 | 0.01 | - |
+ | 0.3507 | 38100 | 0.0101 | - |
+ | 0.3516 | 38200 | 0.01 | - |
+ | 0.3526 | 38300 | 0.0098 | - |
+ | 0.3535 | 38400 | 0.0097 | - |
+ | 0.3544 | 38500 | 0.0096 | - |
+ | 0.3553 | 38600 | 0.01 | - |
+ | 0.3562 | 38700 | 0.0097 | - |
+ | 0.3572 | 38800 | 0.0101 | - |
+ | 0.3581 | 38900 | 0.0099 | - |
+ | 0.3590 | 39000 | 0.0099 | - |
+ | 0.3599 | 39100 | 0.01 | - |
+ | 0.3608 | 39200 | 0.0094 | - |
+ | 0.3618 | 39300 | 0.0096 | - |
+ | 0.3627 | 39400 | 0.0095 | - |
+ | 0.3636 | 39500 | 0.0094 | - |
+ | 0.3645 | 39600 | 0.0094 | - |
+ | 0.3654 | 39700 | 0.0094 | - |
+ | 0.3664 | 39800 | 0.0096 | - |
+ | 0.3673 | 39900 | 0.0095 | - |
+ | 0.3682 | 40000 | 0.0096 | - |
+ | 0.3691 | 40100 | 0.0096 | - |
+ | 0.3700 | 40200 | 0.0094 | - |
+ | 0.3710 | 40300 | 0.0093 | - |
+ | 0.3719 | 40400 | 0.0092 | - |
+ | 0.3728 | 40500 | 0.0095 | - |
+ | 0.3737 | 40600 | 0.0091 | - |
+ | 0.3746 | 40700 | 0.0098 | - |
+ | 0.3756 | 40800 | 0.0094 | - |
+ | 0.3765 | 40900 | 0.0092 | - |
+ | 0.3774 | 41000 | 0.0094 | - |
+ | 0.3783 | 41100 | 0.0092 | - |
+ | 0.3792 | 41200 | 0.0093 | - |
+ | 0.3802 | 41300 | 0.0092 | - |
+ | 0.3811 | 41400 | 0.0095 | - |
+ | 0.3820 | 41500 | 0.0094 | - |
+ | 0.3829 | 41600 | 0.0089 | - |
+ | 0.3838 | 41700 | 0.009 | - |
+ | 0.3848 | 41800 | 0.0092 | - |
+ | 0.3857 | 41900 | 0.009 | - |
+ | 0.3866 | 42000 | 0.0089 | - |
+ | 0.3875 | 42100 | 0.0091 | - |
+ | 0.3884 | 42200 | 0.0087 | - |
+ | 0.3894 | 42300 | 0.0091 | - |
+ | 0.3903 | 42400 | 0.0089 | - |
+ | 0.3912 | 42500 | 0.0089 | - |
+ | 0.3921 | 42600 | 0.0089 | - |
+ | 0.3931 | 42700 | 0.0087 | - |
+ | 0.3940 | 42800 | 0.009 | - |
+ | 0.3949 | 42900 | 0.0087 | - |
+ | 0.3958 | 43000 | 0.0089 | - |
+ | 0.3967 | 43100 | 0.0088 | - |
+ | 0.3977 | 43200 | 0.0088 | - |
+ | 0.3986 | 43300 | 0.0089 | - |
+ | 0.3995 | 43400 | 0.0088 | - |
+ | 0.4000 | 43456 | - | 0.0047 |
+
+ </details>
+
  ### Framework Versions
+ - Python: 3.11.10
+ - Sentence Transformers: 5.1.2
+ - Transformers: 4.57.1
+ - PyTorch: 2.4.1+cu121
+ - Accelerate: 1.11.0
+ - Datasets: 4.3.0
+ - Tokenizers: 0.22.1

  ## Citation

  ### BibTeX

+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
  <!--
  ## Glossary

config.json CHANGED
@@ -40,7 +40,7 @@
    "sep_token_id": 50282,
    "sparse_pred_ignore_index": -100,
    "sparse_prediction": false,
-   "torch_dtype": "bfloat16",
+   "torch_dtype": "float32",
    "transformers_version": "4.53.3",
    "vocab_size": 50368
  }
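With the checkpoint now serialized in float32, callers who prefer the previous bfloat16 memory footprint can still pick the dtype at load time. A hedged sketch using the standard `model_kwargs` pass-through (the model id is a placeholder, as in the card's snippets):

```python
import torch
from sentence_transformers import SentenceTransformer

# Load the float32 weights but run the encoder in bfloat16 to roughly halve memory use
model = SentenceTransformer(
    "sentence_transformers_model_id",  # placeholder id
    model_kwargs={"torch_dtype": torch.bfloat16},
)
```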
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:fc7a79dbf02221230a7f7753547e0bf0cbcb28969f6e16bca7ba94253d417065
- size 298041696
+ oid sha256:c39c43f445292f7d8511f7e046391472b822d804bec90a7baa1dac64cf5bf246
+ size 596070136
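The file size roughly doubling is consistent with the dtype change recorded in `config.json`: bfloat16 stores 2 bytes per parameter, float32 stores 4. A quick sanity check:

```python
old_size = 298_041_696  # bytes, bfloat16 checkpoint
new_size = 596_070_136  # bytes, float32 checkpoint

print(old_size / 2)  # ≈ 149M parameters at 2 bytes each
print(new_size / 4)  # ≈ 149M parameters at 4 bytes each; the small gap is header metadata
```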
sentence_bert_config.json CHANGED
@@ -1,4 +1,4 @@
  {
-   "max_seq_length": 2048,
+   "max_seq_length": 1024,
    "do_lower_case": false
  }
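`max_seq_length` here is what `SentenceTransformer` reads at load time; it can be inspected or lowered on the loaded model without touching this file. A small sketch (placeholder id):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder id
print(model.max_seq_length)  # 1024, from sentence_bert_config.json

# Longer inputs are truncated; the limit can be lowered (e.g. to speed up
# encoding of short texts) but should not be raised past what the model supports
model.max_seq_length = 512
```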
tokenizer.json CHANGED
@@ -2,7 +2,7 @@
    "version": "1.0",
    "truncation": {
      "direction": "Right",
-     "max_length": 2048,
+     "max_length": 1024,
      "strategy": "LongestFirst",
      "stride": 0
    },
tokenizer_config.json CHANGED
@@ -933,12 +933,12 @@
    "cls_token": "[CLS]",
    "extra_special_tokens": {},
    "mask_token": "[MASK]",
-   "max_length": 2048,
+   "max_length": 1024,
    "model_input_names": [
      "input_ids",
      "attention_mask"
    ],
-   "model_max_length": 2048,
+   "model_max_length": 1024,
    "pad_to_multiple_of": null,
    "pad_token": "[PAD]",
    "pad_token_type_id": 0,