hotchpotch committed on
Commit 5764e57 · verified · 1 Parent(s): 518bcc9

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
    "word_embedding_dimension": 768,
    "pooling_mode_cls_token": false,
    "pooling_mode_mean_tokens": true,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
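The config above enables only `pooling_mode_mean_tokens`, i.e. the 768-dimensional sentence embedding is the average of the token embeddings at positions where the attention mask is 1 (padding is ignored). As a minimal pure-Python sketch of that operation — the function name and list-based types here are illustrative, not part of the sentence-transformers API:

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the vectors of non-padding tokens (mask == 1) into one sentence vector.

    token_embeddings: list of per-token vectors (each a list of floats)
    attention_mask:   list of 0/1 ints, same length as token_embeddings
    """
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, v in enumerate(vec):
                summed[i] += v
    # max(count, 1) guards against an all-padding input
    return [s / max(count, 1) for s in summed]

# Toy example with dim=2: two real token vectors, one padding vector that is ignored.
tokens = [[1.0, 3.0], [3.0, 5.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 4.0]
```

In the actual model each vector has `word_embedding_dimension` = 768 entries, and the sentence-transformers `Pooling` module performs the equivalent masked mean on batched tensors.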
README.md ADDED
@@ -0,0 +1,1378 @@
---
language:
- en
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:4314846
- loss:CachedMultipleNegativesRankingLoss
base_model: answerdotai/ModernBERT-base
widget:
- source_sentence: what is grade 7 gcse equivalent to?
  sentences:
  - Unlike the Google Home Mini (First Gen), the Nest Mini (Second Gen) can be used
    to actually enjoy music in every room of the house. While the Google Home Mini
    (First Gen) is a decent way to get music in every room of your home for cheap,
    the sound quality that comes from the speaker reflects the price of the product.
  - In general, a grade 7-9 is roughly equivalent to A-A* under the old system, while
    a grade 4 and above is roughly equivalent to a C and above. Fewer students will
    receive a grade 9 than would have received an A* under the old grading system.
  - '[''Pulling at a wet or dirty diaper.'', ''Hiding to pee or poop.'', "Interest
    in others'' use of the potty, or copying their behavior.", ''Having a dry diaper
    for a longer-than-usual time.'', ''Awakening dry from a nap.'', "Telling you that
    they''re about to go, are going or have just gone in their diaper."]'
- source_sentence: Desire For Sex Drops As You Age, But You Can Still Have A Satisfactory
    Sex Life
  sentences:
  - 'ADVERTISEMENT

    Those who have been in long-term relationships know that sex can start to fall
    by the wayside the longer you''re together.

    Whether you have children, a busy career, an active social life, a job that takes
    you away from home often, or a chronic illness, there are plenty of reasons why
    couples have less sex compared to when they first started dating.

    And it''s not just stuff like that that''s keeping you away from fun between the
    sheets; according to research from the Kinsey Institute, age plays a factor in
    your sex drive, for both men and women.

    Unsurprisingly, younger people are having the most sex compared to other age groups.

    Those aged 18 to 29 years old are having sex an average of 112 times a year (about
    every three days), and, as Indy100 notes, most people lose their virginity when
    they''re teenagers, with men having sex for the first time around 16.8 years,
    and women losing theirs at 17.2 years.

    By comparison, 30 to 39-year-olds have sex on average 86 times a year, which is
    around 1.6 times per week.

    The study notes that this drop-off coincides with the age people choose to start
    having children, which, as parents know, can really kill the mood, especially
    if there''s a baby crying at the exact same time you feel like getting it on.
    (Which is most likely in the morning.)

    And it only lessens the older you get. Those who are in their 40s have sex an
    average of 69 times a year, due to factors such as family obligations, day-to-day
    stresses, and possible illnesses.

    "The basic storyline that has emerged from these studies is that, as we get older,
    our odds of developing chronic health conditions increases and this, in turn,
    negatively impacts the frequency and quality of sexual activity," notes Dr. Justin
    Lehmiller of the Kinsey Institute.

    Unfortunately, the study didn''t look into the sex lives of those 50 and older,
    but there is other research out there. According to a study published in the Archives
    of Sexual Behavior, couples who have been married for more than 25 years have
    a 40 per cent chance of having sex two or three times a week, but that statistic
    drops to 35 per cent for couples who have been married for 50 or more years.

    Surprisingly, couples who have been together for 65 years are 42 per cent more
    likely to have sex a couple times a week.

    As we get older, our odds of developing chronic health conditions increases and
    this, in turn, negatively impacts the frequency and quality of sexual activity.

    According to a study published in the Journal of Sex Research, those who "feel
    their age" tended to have less sex, while those who remained in better health
    had more active and satisfying sex lives.

    "The younger people feel, the more likely they are to maintain high sexual satisfaction
    as they get older (or at least they''ll experience a much less noticeable change),"
    wrote Lehmiller.

    It''s worth noting that these study results come from a small sample of the population,
    and it shouldn''t be the standard for how much sex we should be having.

    However, there is plenty of research that backs up the claim that sex is great
    for one''s health, so the more you get busy, the better!

    Also on HuffPost:'
  - 'HONOLULU — A former Hawaii state worker who sent a false missile alert last month
    said Friday that he''s devastated for causing panic but was "100 per cent sure"
    at the time that the attack was real.

    The man in his 50s spoke to reporters on the condition that he not be identified
    because he fears for his safety after receiving threats.

    He says the on-duty call he received on Jan. 13 didn''t sound like a drill. However,
    state officials say other workers clearly heard the word "exercise" repeated several
    times.

    He said it felt like he had been hit with a "body blow" when he realized it was
    just a drill and he has had difficulty eating and sleeping since.

    The Hawaii Emergency Management Agency fired him.

    The man''s superiors said they knew for years that he had problems performing
    his job. The worker had mistakenly believed drills for tsunami and fire warnings
    were actual events, and colleagues were not comfortable working with him, the
    state said.

    His supervisors counselled him but kept him for a decade in a position that had
    to be renewed each year.

    The ex-worker disputed that, saying he wasn''t aware of any performance problems.

    While working at the state warning site in a former bunker in Honolulu''s Diamond
    Head crater on Jan. 13, the man said, he took a call that sounded like a real
    warning from U.S Pacific Command. He said he didn''t hear that it was a drill.

    But the problems at the agency went beyond the one employee.

    Federal and state reports say the agency had a vague checklist for missile alerts,
    allowing workers to interpret the steps they should follow differently. Managers
    didn''t require a second person to sign off on alerts before they were sent, and
    the agency lacked any preparation on how to correct a false warning.

    Those details emerged Tuesday in reports on investigations about how the agency
    mistakenly blasted cellphones and broadcast stations with the missile warning.

    It took nearly 40 minutes for the agency to figure out a way to retract the false
    alert on the same platforms it was sent to.

    "The protocols were not in place. It was a sense of urgency to put it in place
    as soon as possible. But those protocols were not developed to the point they
    should have," retired Brig. Gen. Bruce Oliveira, who wrote the report on Hawaii''s
    internal investigation, said at a news conference.

    Hawaii Emergency Management Agency Administrator Vern Miyagi resigned as the reports
    were released. Officials revealed that the employee who sent the alert was fired
    Jan. 26. The state did not name him.

    The agency''s executive officer, Toby Clairmont, said Wednesday that he stepped
    down because it was clear action would be taken against agency leaders after the
    alert.'
  - 'Pompeii’s Final Hours: New Evidence (C5)

    Rating:

    The Big Crash Diet Experiment (BBC1)

    Rating:

    With his rosy cheeks and nose, and a crown of laurel leaves drooping over one
    eye, former political journalist John Sergeant looked like jolly little Bacchus,
    the Roman god of wine, as he tucked into an ancient feast on Pompeii’s Final Hours:
    New Evidence (C5).

    A game soul, whether strutting the pasa doble on Strictly or bartering in a Naples
    marketplace, John munched fried sea urchins and braised moray eel — with plenty
    of red vino to slosh the taste away.

    He did blanch at the thought of bulls’ testicles stuffed with pepper and herbs.

    John Sergeant on an hour-long archaeological romp in Pompeii’s Final Hours: New
    Evidence

    Apparently this delicacy was a great favourite in Pompeii — but then, the decadent
    Romans drenched every meal in lashings of garum, a sauce made from rotting fish.
    Anything would taste better than that.

    Noble as Brutus, John held his nose and chewed a mouthful of cobbler. ‘I wouldn’t
    have it every night,’ he muttered.

    It’s an astonishing thought that Julius Caesar conquered most of the known world,
    when he must have been suffering from chronic indigestion.

    Imagine what the Romans might have done if they’d invented the pizza a couple
    of thousand years earlier.

    This hour-long archaeological romp was the first of three surveys of life in the
    shadow of Vesuvius, set to continue tonight and tomorrow.

    The ‘new evidence’ in the title came from computer X-ray scans of some of Pompeii’s
    famous casts.

    These detailed figurines were created by the 19th-century archaeologist Giuseppe
    Fiorelli, who injected liquid plaster into the cavities where Roman bodies had
    been buried by ash in the volcanic eruption in AD79.

    Fiorelli’s casts are the most moving and tragic death masks ever made. Every plaster
    corpse is writhing in agony, suffocated by poisonous gases.

    For 150 years, the victims’ skeletal remains have been locked in their cases.
    It is only now that the technology exists to examine the bones without destroying
    the casts.

    What the first CT scans revealed swept old theories away. One figure long believed
    to be a man appeared, in fact, to be female.

    Another, thought for decades to be a male gladiator in his prime, turned out to
    be a teenage boy.

    Presenters Bettany Hughes and Raksha Dave didn’t make enough of these dramatic
    finds. The CT results were held back to the end of the hour, so that the discoveries
    were inevitably rushed.

    Dr Javid Abdelmoneim in The Big Crash Diet Experiment challenges conventional
    wisdom on food and exercise

    Don’t blame John Sergeant, though. While the others were in the lab, he was still
    polishing off his meal of eels and urchins. Say what you like, this man believes
    in doing his research.

    After that, he’d probably welcome a few days of starvation. The powdered soups
    and shakes fed to four slimmers by Dr Javid Abdelmoneim in The Big Crash Diet
    Experiment (BBC1) looked worse than any classical culinary torture, though.

    To challenge conventional wisdom that brief bursts of intensive dieting rarely
    bring long-term results, Dr Javid had his guinea pigs living on 800 calories a
    day for nine weeks.

    All lost plenty of weight. But it was the switch to healthy-eating afterwards
    that seemed to bring the best results.

    The show had plenty of useful advice for dieters. Don’t pretend fast food is ‘addictive’
    — greasy take-aways are just a bad habit. Only eat in the dining room, never on
    the sofa . . . or in bed.

    Remember, burger bars are in the cynical business of selling you empty calories.

    Follow those rules, and you might not need the powdered shakes. Or the foul fish
    sauce.'
- source_sentence: Berlin startup offers a year with no money worries
  sentences:
  - 'Get daily updates directly to your inbox + Subscribe Thank you for subscribing!
    Could not subscribe, try again later Invalid Email

    Nuneaton''s hospital has been given the all-clear after a previously closed ward
    has now been re-opened.

    Bosses at the George Eliot Hospital were forced to close the Adam Bede ward due
    to an outbreak of Norovirus.

    It remained closed over the weekend but on Monday they said that ward had now
    been decontaminated and re-opened.

    Martina Morris, deputy director of nursing at George Eliot Hospital NHS Trust,
    said: “The patients on Adam Bede ward have been clear of symptoms for the last
    48 hours, and following a full decontamination, we have re-opened the ward.

    “Any patients in the hospital who continue to present with symptoms of norovirus
    have been isolated in side rooms.”

    “But they are keen to prevent any further outbreaks and are appealing to anyone
    from suffering from the sickness and diarrhoea to steer clear.

    “We ask that the public continue to avoid the hospital, if they have symptoms
    of diarrhoea and vomiting and do not visit until they have been symptom free for
    at least 48 hours,” the deputy director of nursing said.

    “Good hand hygiene is key to limiting the spread of these infections and it is
    important to wash your hands thoroughly with soap and warm water as using just
    an anti-bacterial hand gel is not sufficient.”'
  - 'Comedy cabaret team All That Malarkey are promising to end 2017 with a festive
    bang with their new show Camp as Christmas.

    They will be playing The Groundlings Theatre in Portsmouth on December 20 at 7.30pm
    (www.groundlings.co.uk) and Chichester’s St John’s Chapel on December 21, also
    at 7.30pm (07722 824696).

    Spokesman David Harrington said: “We spent a sizzling summer strutting our stuff
    at the Edinburgh Fringe Festival, where we performed to an international audience
    and gained excellent reviews.”

    Now they are back on the road for Christmas: “We’re excited to have dates including
    our London debut at the magnificent King’s Head Theatre, as well as other performances
    in Wales and the South, though we always finish at Chichester as that is where
    our journey began.

    “The four classically-trained singers of ATM are geared up and ready to sing their
    hearts out, fling themselves around the stage and present popular Christmas songs
    from pop to classics and carols, all musically arranged in unexpected ways that
    will surprise and entertain, accompanied and compered by yours truly at the keyboard.
    Known for our unique four-part harmony arrangements of family favourites, laced
    with fun, sparkle and tongue-in-cheek frivolity, our new programme will include
    wonderful new renditions of Do you ACTUALLY wish it could be Christmas everyday,
    Christmas No.1 Medley and We Need a Little Christmas.

    “Always drawing an amazing and welcoming crowd, our performance this year will
    be at St John’s Chapel, Chichester, hometown of the unmissable ginger-haired ATM
    soprano, Amy Fuller, and the city where ATM started four Christmases ago.

    “Promising to be an energetic and impossibly-festive evening, we’ll also be holding
    a collection for St Wilfrid’s Hospice at the end, particularly close to our hearts
    this year. Also in the diary for this tour is an appearance at my hometown of
    Portsmouth (Wednesday, December 20 at The Groundlings Theatre). Having gone to
    Padnell school and Oaklands Catholic school and sixth form, it will be a treat
    to bring our outrageous act to old friends and family, and show them what I do
    for a living…flick my hair around and make funny faces at the piano like a maniac.
    Amy Fuller had made herself a complete stranger to me by growing up in Chichester
    and going to Bishop Luffa and Parklands Primary, but we fortunately crossed paths
    when studying together.”'
  - 'Michael Bohmeyer, the founder of Mein Grundeinkommen (My Basic Income). Photo:
    DPA

    Miko from Berlin may only be five, but he already has €1,000 ($1,063) per month
    to live on -- not from hard graft, but as part of an experiment into universal
    basic income.

    He is one of 85 people, including around 10 children, chosen by startup Mein Grundeinkommen
    (My Basic Income) to receive the payments for a year since 2014.

    Founder Michael Bohmeyer has set out to prove to a sceptical public in Germany
    and further afield that the universal basic income (UBI) idea is workable.

    "Thanks to my first startup, I got a regular income, my life became more creative
    and healthy. So I wanted to launch a social experiment," 31-year-old Bohmeyer
    told AFP.

    And he wasn''t alone in wanting to test the idea, as some 55,000 donors have stumped
    up the cash for the payments in a "crowdfunding" model -- with the final recipients
    picked out in a "wheel of fortune" event livestreamed online.

    Mother Birgit Kaulfuss said little Miko "can''t really understand, but for the
    whole family it was exhilarating" when he was chosen -- offering a chance to live
    "in a more relaxed way" and take a first-ever family holiday.

    Trying things out

    "Everyone sleeps more soundly and no one become a layabout," Bohmeyer said of
    his beneficiaries.

    Recipients'' experiences range from a welcome spell without financial worries
    to major turning points in their lives.

    "Without day-to-day pressures, you can be more creative and try things out," Valerie
    Rupp told public broadcaster ARD in a recent interview.

    She was able both to take care of her baby and start a career as a decorator --
    even as her husband, newly arrived from Mali, was taking German

    lessons.

    Winners have left jobs that were doing little more for them than put bread on
    the table to become teachers, taken time out to address chronic illness, broken
    alcohol addiction, taken care of loved ones, or paid for children''s studies.

    "It''s at once a gift and a prompt" to make a change, explained Astrid Lobeyer,
    who used the money to give eulogies at funerals and studied the

    therapeutic Alexander technique, a method for relieving stress in the muscles.

    Bohmeyer''s experiment has fascinated social media and boosted discussion about
    a universal income in Germany.

    At the same time, Finland is testing the idea with 2,000 homeless recipients and
    the idea is a flagship policy for French Socialist presidential

    candidate Benoit Hamon.

    Reward for laziness?

    In 2009, the German parliament flatly rejected a petition from some 50,000 Germans
    demanding a universal income.

    Nevertheless, some 40 percent of the public still think it''s a good idea, according
    to a survey last June by pollsters Emnid.

    Supporters have formed a campaign group called "Buendnis Grundeinkommen" (Basic
    income federation) with their sights on September''s legislative elections, but
    so far no major party has taken up the cause.

    There are pockets of support among left-wingers, the right, Catholic organisations
    and even industry leaders, whose reasoning ranges from fighting poverty to simplifying
    bureaucracy or smoothing the transition into the

    digital era.

    Resistance to the idea is more focused, centering on how UBI would change people''s
    relationship to work.

    Right-wingers dismiss it as a "reward for laziness", while the Social Democratic
    Party (SPD) worried in 2006 about unemployed recipients being

    "labelled useless" rather than getting help to find jobs.

    Meanwhile, major unions like IG Metall and Verdi denounce the idea as a "liberal
    Trojan horse" that would "boost inequality" by paying millionaires and poor people
    alike.

    Thankless jobs

    Mein Grundeinkommen is "poorly thought out" as a response to broader social questions,
    University of Freiburg economist Alexander Spermann told AFP.

    The startup''s 20 employees eat up "60 percent of the budget", founder Michael
    Bohmeyer admits -- while the idea of basing the funding on curiosity or activism
    by thousands of donors is hardly applicable on a large scale.

    For Spermann, the Berliners'' experiment has only succeeded in answering the question
    "what would I do with a blank cheque if I got one for Christmas?"

    People''s choices in terms of qualifications or work if they were guaranteed the
    payments for life are the real mystery, the economist argues.

    "Who will take on the exhausting and sometimes less attractive tasks, like emptying
    bins or taking care of the elderly?" asked Werner Eichhorst of the Bonn Centre
    for the Future of Work (IZA) in 2013.

    UBI supporters argue such jobs would either be taken over by robots or find a
    new place of honour in society if the policy were enacted.

    "No machine will take over working for us and pay our taxes at the same time,"
    Eichhorst and opponents shoot back.'
- source_sentence: population of artesia
  sentences:
  - Meanwhile, bring 4 cups of water to a boil and add the barley. Simmer uncovered
    for 30 minutes, drain, and set aside. When the soup is ready, add the barley and
    cook the soup for another 15 or 20 minutes, until the barley is tender.
  - The 2016 Artesia, New Mexico, population is 12,036. There are 1,211 people per
    square mile (population density).
  - There are 30 calories in one cup of chopped green peppers and approximately 6
    calories in 1 ounce or 28g of green peppers.
- source_sentence: what is the best paying engineering job
  sentences:
  - The 20 highest-paying jobs for engineering majors. Engineering jobs pay well.
    To find out just how lucrative they really are, we turned to PayScale, the creator
    of the world's largest compensation database. To find the 20 highest-paying jobs
    for engineering majors, PayScale first identified the most common jobs for those
    with a bachelor's degree (and nothing more) who work full-time in the US. Chief
    architects and vice president's of business development topped the list, both
    earning an impressive $151,000 a year.
  - 'Depending on the thickness and size of the chop, it can take anywhere from eight
    to 30 minutes. Here’s a helpful cooking chart and some tips to achieve delicious
    pork chops every time. Pork chops are a crowd pleaser, especially once you master
    your grilling technique. For safe consumption, it’s recommended to cook pork until
    it reaches an internal temperature of 145°F or 65°C. Depending on the cut and
    thickness of your chop, the time it may take to reach this can vary. To make sure
    your chops are the right temperature, use a digital meat thermometer.'
  - Aviation is a combat arms branch which encompasses 80 percent of the commissioned
    officer operational flying positions within the Army (less those in Aviation Material
    Management and Medical Service Corps).
datasets:
- sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1
- sentence-transformers/natural-questions
- sentence-transformers/gooaq
- sentence-transformers/ccnews
- sentence-transformers/hotpotqa
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@10
- cosine_precision@10
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@10
model-index:
- name: SentenceTransformer based on answerdotai/ModernBERT-base
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: NanoClimateFEVER
      type: NanoClimateFEVER
    metrics:
    - type: cosine_accuracy@10
      value: 0.68
      name: Cosine Accuracy@10
    - type: cosine_precision@10
      value: 0.092
      name: Cosine Precision@10
    - type: cosine_recall@10
      value: 0.38066666666666665
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.3249416664911049
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.42102380952380947
      name: Cosine Mrr@10
    - type: cosine_map@10
      value: 0.24314219576719573
      name: Cosine Map@10
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: NanoDBPedia
      type: NanoDBPedia
    metrics:
    - type: cosine_accuracy@10
      value: 0.94
      name: Cosine Accuracy@10
    - type: cosine_precision@10
      value: 0.3960000000000001
      name: Cosine Precision@10
    - type: cosine_recall@10
      value: 0.2744430818213576
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.5073068162878996
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7423333333333335
      name: Cosine Mrr@10
    - type: cosine_map@10
      value: 0.38069007936507937
      name: Cosine Map@10
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: NanoFEVER
      type: NanoFEVER
    metrics:
    - type: cosine_accuracy@10
      value: 0.98
      name: Cosine Accuracy@10
    - type: cosine_precision@10
      value: 0.10199999999999998
      name: Cosine Precision@10
    - type: cosine_recall@10
      value: 0.9333333333333332
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8029023009379854
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7768571428571428
      name: Cosine Mrr@10
    - type: cosine_map@10
      value: 0.7485238095238094
      name: Cosine Map@10
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: NanoFiQA2018
      type: NanoFiQA2018
    metrics:
    - type: cosine_accuracy@10
      value: 0.74
      name: Cosine Accuracy@10
    - type: cosine_precision@10
      value: 0.11799999999999997
      name: Cosine Precision@10
    - type: cosine_recall@10
      value: 0.558484126984127
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.46456077633242976
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.529
      name: Cosine Mrr@10
    - type: cosine_map@10
      value: 0.3803263888888889
      name: Cosine Map@10
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: NanoHotpotQA
      type: NanoHotpotQA
    metrics:
    - type: cosine_accuracy@10
      value: 0.9
      name: Cosine Accuracy@10
    - type: cosine_precision@10
      value: 0.12799999999999997
      name: Cosine Precision@10
    - type: cosine_recall@10
      value: 0.64
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.6343076802331278
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.822857142857143
      name: Cosine Mrr@10
    - type: cosine_map@10
      value: 0.5439285714285714
      name: Cosine Map@10
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: NanoMSMARCO
      type: NanoMSMARCO
    metrics:
    - type: cosine_accuracy@10
      value: 0.82
      name: Cosine Accuracy@10
    - type: cosine_precision@10
      value: 0.08199999999999999
      name: Cosine Precision@10
    - type: cosine_recall@10
      value: 0.82
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.5554645505559797
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.47130158730158717
      name: Cosine Mrr@10
    - type: cosine_map@10
      value: 0.47130158730158733
      name: Cosine Map@10
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: NanoNFCorpus
      type: NanoNFCorpus
    metrics:
    - type: cosine_accuracy@10
      value: 0.64
      name: Cosine Accuracy@10
    - type: cosine_precision@10
      value: 0.26199999999999996
      name: Cosine Precision@10
    - type: cosine_recall@10
      value: 0.1304361785122358
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.3071432716086243
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.44966666666666666
      name: Cosine Mrr@10
    - type: cosine_map@10
      value: 0.22727896825396823
      name: Cosine Map@10
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: NanoNQ
      type: NanoNQ
    metrics:
    - type: cosine_accuracy@10
      value: 0.84
      name: Cosine Accuracy@10
    - type: cosine_precision@10
      value: 0.08999999999999998
      name: Cosine Precision@10
    - type: cosine_recall@10
      value: 0.8
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.6336493294508291
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.5863333333333334
      name: Cosine Mrr@10
    - type: cosine_map@10
      value: 0.5725238095238094
      name: Cosine Map@10
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: NanoQuoraRetrieval
      type: NanoQuoraRetrieval
    metrics:
    - type: cosine_accuracy@10
      value: 0.98
      name: Cosine Accuracy@10
    - type: cosine_precision@10
      value: 0.132
      name: Cosine Precision@10
    - type: cosine_recall@10
      value: 0.9693333333333334
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9390528024052875
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.94
      name: Cosine Mrr@10
    - type: cosine_map@10
      value: 0.9192037037037036
690
+ name: Cosine Map@10
691
+ - task:
692
+ type: information-retrieval
693
+ name: Information Retrieval
694
+ dataset:
695
+ name: NanoSCIDOCS
696
+ type: NanoSCIDOCS
697
+ metrics:
698
+ - type: cosine_accuracy@10
699
+ value: 0.82
700
+ name: Cosine Accuracy@10
701
+ - type: cosine_precision@10
702
+ value: 0.172
703
+ name: Cosine Precision@10
704
+ - type: cosine_recall@10
705
+ value: 0.3526666666666667
706
+ name: Cosine Recall@10
707
+ - type: cosine_ndcg@10
708
+ value: 0.3220405919077218
709
+ name: Cosine Ndcg@10
710
+ - type: cosine_mrr@10
711
+ value: 0.46624603174603174
712
+ name: Cosine Mrr@10
713
+ - type: cosine_map@10
714
+ value: 0.21359206349206347
715
+ name: Cosine Map@10
716
+ - task:
717
+ type: information-retrieval
718
+ name: Information Retrieval
719
+ dataset:
720
+ name: NanoArguAna
721
+ type: NanoArguAna
722
+ metrics:
723
+ - type: cosine_accuracy@10
724
+ value: 0.86
725
+ name: Cosine Accuracy@10
726
+ - type: cosine_precision@10
727
+ value: 0.08599999999999998
728
+ name: Cosine Precision@10
729
+ - type: cosine_recall@10
730
+ value: 0.86
731
+ name: Cosine Recall@10
732
+ - type: cosine_ndcg@10
733
+ value: 0.5327666709376693
734
+ name: Cosine Ndcg@10
735
+ - type: cosine_mrr@10
736
+ value: 0.428579365079365
737
+ name: Cosine Mrr@10
738
+ - type: cosine_map@10
739
+ value: 0.42857936507936506
740
+ name: Cosine Map@10
741
+ - task:
742
+ type: information-retrieval
743
+ name: Information Retrieval
744
+ dataset:
745
+ name: NanoSciFact
746
+ type: NanoSciFact
747
+ metrics:
748
+ - type: cosine_accuracy@10
749
+ value: 0.78
750
+ name: Cosine Accuracy@10
751
+ - type: cosine_precision@10
752
+ value: 0.088
753
+ name: Cosine Precision@10
754
+ - type: cosine_recall@10
755
+ value: 0.77
756
+ name: Cosine Recall@10
757
+ - type: cosine_ndcg@10
758
+ value: 0.6297303845024383
759
+ name: Cosine Ndcg@10
760
+ - type: cosine_mrr@10
761
+ value: 0.5945
762
+ name: Cosine Mrr@10
763
+ - type: cosine_map@10
764
+ value: 0.5785
765
+ name: Cosine Map@10
766
+ - task:
767
+ type: information-retrieval
768
+ name: Information Retrieval
769
+ dataset:
770
+ name: NanoTouche2020
771
+ type: NanoTouche2020
772
+ metrics:
773
+ - type: cosine_accuracy@10
774
+ value: 0.9591836734693877
775
+ name: Cosine Accuracy@10
776
+ - type: cosine_precision@10
777
+ value: 0.4142857142857143
778
+ name: Cosine Precision@10
779
+ - type: cosine_recall@10
780
+ value: 0.28398393842367586
781
+ name: Cosine Recall@10
782
+ - type: cosine_ndcg@10
783
+ value: 0.4754464568728222
784
+ name: Cosine Ndcg@10
785
+ - type: cosine_mrr@10
786
+ value: 0.7044703595724006
787
+ name: Cosine Mrr@10
788
+ - type: cosine_map@10
789
+ value: 0.32663292301047403
790
+ name: Cosine Map@10
791
+ - task:
792
+ type: nano-beir
793
+ name: Nano BEIR
794
+ dataset:
795
+ name: NanoBEIR mean
796
+ type: NanoBEIR_mean
797
+ metrics:
798
+ - type: cosine_accuracy@10
799
+ value: 0.8414756671899528
800
+ name: Cosine Accuracy@10
801
+ - type: cosine_precision@10
802
+ value: 0.16632967032967033
803
+ name: Cosine Precision@10
804
+ - type: cosine_recall@10
805
+ value: 0.5979497942877997
806
+ name: Cosine Recall@10
807
+ - type: cosine_ndcg@10
808
+ value: 0.5484087152710707
809
+ name: Cosine Ndcg@10
810
+ - type: cosine_mrr@10
811
+ value: 0.6102437517131395
812
+ name: Cosine Mrr@10
813
+ - type: cosine_map@10
814
+ value: 0.46417103579527047
815
+ name: Cosine Map@10
816
+ ---
817
+
818
+ # SentenceTransformer based on answerdotai/ModernBERT-base
819
+
820
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1), [natural_questions](https://huggingface.co/datasets/sentence-transformers/natural-questions), [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq), [ccnews](https://huggingface.co/datasets/sentence-transformers/ccnews) and [hotpotqa](https://huggingface.co/datasets/sentence-transformers/hotpotqa) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
821
+
822
+ ## Model Details
823
+
824
+ ### Model Description
825
+ - **Model Type:** Sentence Transformer
826
+ - **Base model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) <!-- at revision 8949b909ec900327062f0ebf497f51aef5e6f0c8 -->
827
+ - **Maximum Sequence Length:** 512 tokens
828
+ - **Output Dimensionality:** 768 dimensions
829
+ - **Similarity Function:** Cosine Similarity
830
+ - **Training Datasets:**
831
+ - [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1)
832
+ - [natural_questions](https://huggingface.co/datasets/sentence-transformers/natural-questions)
833
+ - [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq)
834
+ - [ccnews](https://huggingface.co/datasets/sentence-transformers/ccnews)
835
+ - [hotpotqa](https://huggingface.co/datasets/sentence-transformers/hotpotqa)
836
+ - **Language:** en
837
+ <!-- - **License:** Unknown -->
838
+
839
+ ### Model Sources
840
+
841
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
842
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
843
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
844
+
845
+ ### Full Model Architecture
846
+
847
+ ```
848
+ SentenceTransformer(
849
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
850
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
851
+ )
852
+ ```
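The Pooling module above uses attention-mask-aware mean pooling (`pooling_mode_mean_tokens: True`). As a minimal sketch of what that step does, here is a NumPy version of mask-aware mean pooling; the function name and toy tensors are illustrative, not part of the model:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, counting only non-padding positions."""
    # (batch, seq_len, dim) * (batch, seq_len, 1) zeroes out padding tokens
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts

# Toy example: batch of 1, three tokens, the last one is padding
tokens = np.array([[[1.0, 3.0], [3.0, 5.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
print(mean_pool(tokens, mask))  # [[2. 4.]] -- the padding token is ignored
```

The padded position contributes nothing to the sentence embedding, which is why variable-length inputs in one batch produce consistent vectors.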
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("hotchpotch/ModernBERT-embedding-CMNRL")
+ # Run inference
+ queries = [
+ "what is the best paying engineering job",
+ ]
+ documents = [
+ "The 20 highest-paying jobs for engineering majors. Engineering jobs pay well. To find out just how lucrative they really are, we turned to PayScale, the creator of the world's largest compensation database. To find the 20 highest-paying jobs for engineering majors, PayScale first identified the most common jobs for those with a bachelor's degree (and nothing more) who work full-time in the US. Chief architects and vice president's of business development topped the list, both earning an impressive $151,000 a year.",
+ 'Aviation is a combat arms branch which encompasses 80 percent of the commissioned officer operational flying positions within the Army (less those in Aviation Material Management and Medical Service Corps).',
+ 'Depending on the thickness and size of the chop, it can take anywhere from eight to 30 minutes. Here’s a helpful cooking chart and some tips to achieve delicious pork chops every time. Pork chops are a crowd pleaser, especially once you master your grilling technique. For safe consumption, it’s recommended to cook pork until it reaches an internal temperature of 145°F or 65°C. Depending on the cut and thickness of your chop, the time it may take to reach this can vary. To make sure your chops are the right temperature, use a digital meat thermometer.',
+ ]
+ query_embeddings = model.encode_query(queries)
+ document_embeddings = model.encode_document(documents)
+ print(query_embeddings.shape, document_embeddings.shape)
+ # [1, 768] [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(query_embeddings, document_embeddings)
+ print(similarities)
+ # tensor([[ 0.8588, 0.1637, -0.0107]])
+ ```
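`model.similarity` defaults to cosine similarity for this model. For reference, the same kind of scores can be reproduced from raw embedding matrices; this sketch uses toy vectors rather than real model output:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine similarity between every row of a and every row of b."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T  # (len(a), len(b)) matrix of similarities

query = np.array([[1.0, 0.0]])
docs = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
print(cosine_similarity(query, docs))  # [[ 1.  0. -1.]]
```

Because the vectors are L2-normalized first, scores always fall in [-1, 1], matching the range of the tensor printed above.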
+
890
+ <!--
891
+ ### Direct Usage (Transformers)
892
+
893
+ <details><summary>Click to see the direct usage in Transformers</summary>
894
+
895
+ </details>
896
+ -->
897
+
898
+ <!--
899
+ ### Downstream Usage (Sentence Transformers)
900
+
901
+ You can finetune this model on your own dataset.
902
+
903
+ <details><summary>Click to expand</summary>
904
+
905
+ </details>
906
+ -->
907
+
908
+ <!--
909
+ ### Out-of-Scope Use
910
+
911
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
912
+ -->
913
+
914
+ ## Evaluation
915
+
916
+ ### Metrics
917
+
918
+ #### Information Retrieval
919
+
920
+ * Datasets: `NanoClimateFEVER`, `NanoDBPedia`, `NanoFEVER`, `NanoFiQA2018`, `NanoHotpotQA`, `NanoMSMARCO`, `NanoNFCorpus`, `NanoNQ`, `NanoQuoraRetrieval`, `NanoSCIDOCS`, `NanoArguAna`, `NanoSciFact` and `NanoTouche2020`
921
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
922
+
923
+ | Metric | NanoClimateFEVER | NanoDBPedia | NanoFEVER | NanoFiQA2018 | NanoHotpotQA | NanoMSMARCO | NanoNFCorpus | NanoNQ | NanoQuoraRetrieval | NanoSCIDOCS | NanoArguAna | NanoSciFact | NanoTouche2020 |
+ |:--------------------|:-----------------|:------------|:-----------|:-------------|:-------------|:------------|:-------------|:-----------|:-------------------|:------------|:------------|:------------|:---------------|
+ | cosine_accuracy@10 | 0.68 | 0.94 | 0.98 | 0.74 | 0.9 | 0.82 | 0.64 | 0.84 | 0.98 | 0.82 | 0.86 | 0.78 | 0.9592 |
+ | cosine_precision@10 | 0.092 | 0.396 | 0.102 | 0.118 | 0.128 | 0.082 | 0.262 | 0.09 | 0.132 | 0.172 | 0.086 | 0.088 | 0.4143 |
+ | cosine_recall@10 | 0.3807 | 0.2744 | 0.9333 | 0.5585 | 0.64 | 0.82 | 0.1304 | 0.8 | 0.9693 | 0.3527 | 0.86 | 0.77 | 0.284 |
+ | **cosine_ndcg@10** | **0.3249** | **0.5073** | **0.8029** | **0.4646** | **0.6343** | **0.5555** | **0.3071** | **0.6336** | **0.9391** | **0.322** | **0.5328** | **0.6297** | **0.4754** |
+ | cosine_mrr@10 | 0.421 | 0.7423 | 0.7769 | 0.529 | 0.8229 | 0.4713 | 0.4497 | 0.5863 | 0.94 | 0.4662 | 0.4286 | 0.5945 | 0.7045 |
+ | cosine_map@10 | 0.2431 | 0.3807 | 0.7485 | 0.3803 | 0.5439 | 0.4713 | 0.2273 | 0.5725 | 0.9192 | 0.2136 | 0.4286 | 0.5785 | 0.3266 |
+
+ #### Nano BEIR
+
+ * Dataset: `NanoBEIR_mean`
+ * Evaluated with [<code>NanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.NanoBEIREvaluator) with these parameters:
+ ```json
+ {
+ "dataset_names": [
+ "climatefever",
+ "dbpedia",
+ "fever",
+ "fiqa2018",
+ "hotpotqa",
+ "msmarco",
+ "nfcorpus",
+ "nq",
+ "quoraretrieval",
+ "scidocs",
+ "arguana",
+ "scifact",
+ "touche2020"
+ ],
+ "dataset_id": "sentence-transformers/NanoBEIR-en"
+ }
+ ```
+
+ | Metric | Value |
+ |:--------------------|:-----------|
+ | cosine_accuracy@10 | 0.8415 |
+ | cosine_precision@10 | 0.1663 |
+ | cosine_recall@10 | 0.5979 |
+ | **cosine_ndcg@10** | **0.5484** |
+ | cosine_mrr@10 | 0.6102 |
+ | cosine_map@10 | 0.4642 |
+
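The `NanoBEIR_mean` figures are the unweighted means of the thirteen per-dataset scores. For example, averaging the rounded cosine_ndcg@10 values from the per-dataset table reproduces the reported 0.5484 (a sanity check, not part of the evaluation pipeline):

```python
# Rounded cosine_ndcg@10 values for the 13 NanoBEIR datasets, as reported above
ndcg_at_10 = {
    "NanoClimateFEVER": 0.3249, "NanoDBPedia": 0.5073, "NanoFEVER": 0.8029,
    "NanoFiQA2018": 0.4646, "NanoHotpotQA": 0.6343, "NanoMSMARCO": 0.5555,
    "NanoNFCorpus": 0.3071, "NanoNQ": 0.6336, "NanoQuoraRetrieval": 0.9391,
    "NanoSCIDOCS": 0.3220, "NanoArguAna": 0.5328, "NanoSciFact": 0.6297,
    "NanoTouche2020": 0.4754,
}
mean_ndcg = sum(ndcg_at_10.values()) / len(ndcg_at_10)
print(round(mean_ndcg, 4))  # 0.5484
```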
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Datasets
+ <details><summary>msmarco</summary>
+
+ #### msmarco
+
+ * Dataset: [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1) at [84ed2d3](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1/tree/84ed2d35626f617d890bd493b4d6db69a741e0e2)
+ * Size: 502,939 training samples
+ * Columns: <code>query</code> and <code>positive</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | query | positive |
+ |:--------|:---------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | <ul><li>min: 4 tokens</li><li>mean: 9.26 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 19 tokens</li><li>mean: 80.68 tokens</li><li>max: 230 tokens</li></ul> |
+ * Samples:
+ | query | positive |
+ |:-------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | <code>is cabinet refacing worth the cost?</code> | <code>Fans of refacing say this mini-makeover can give a kitchen a whole new look at a much lower cost than installing all-new cabinets. Cabinet refacing can save up to 50 percent compared to the cost of replacing, says Cheryl Catalano, owner of Kitchen Solvers, a cabinet refacing franchise in Napierville, Illinois. From.</code> |
+ | <code>is the fovea ethmoidalis a bone</code> | <code>Ethmoid bone/fovea ethmoidalis. The medial portion of the ethmoid bone is a cruciate membranous bone composed of the crista galli, cribriform plate, and perpendicular ethmoidal plate. The crista is a thick piece of bone, shaped like a “cock's comb,” that projects intracranially and attaches to the falx cerebri.</code> |
+ | <code>average pitches per inning</code> | <code>The likelihood of a pitcher completing nine innings if he throws an average of 14 pitches or less per inning is reinforced by the totals of the 89 games in which pitchers did actually complete nine innings of work.</code> |
+ * Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
+ ```json
+ {
+ "scale": 20.0,
+ "similarity_fct": "cos_sim",
+ "mini_batch_size": 128,
+ "gather_across_devices": false
+ }
+ ```
+ </details>
+ <details><summary>natural_questions</summary>
+
+ #### natural_questions
+
+ * Dataset: [natural_questions](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
+ * Size: 100,231 training samples
+ * Columns: <code>query</code> and <code>positive</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | query | positive |
+ |:--------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | <ul><li>min: 10 tokens</li><li>mean: 12.46 tokens</li><li>max: 22 tokens</li></ul> | <ul><li>min: 12 tokens</li><li>mean: 137.8 tokens</li><li>max: 512 tokens</li></ul> |
+ * Samples:
+ | query | positive |
+ |:------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | <code>difference between russian blue and british blue cat</code> | <code>Russian Blue The coat is known as a "double coat", with the undercoat being soft, downy and equal in length to the guard hairs, which are an even blue with silver tips. However, the tail may have a few very dull, almost unnoticeable stripes. The coat is described as thick, plush and soft to the touch. The feeling is softer than the softest silk. The silver tips give the coat a shimmering appearance. Its eyes are almost always a dark and vivid green. Any white patches of fur or yellow eyes in adulthood are seen as flaws in show cats.[3] Russian Blues should not be confused with British Blues (which are not a distinct breed, but rather a British Shorthair with a blue coat as the British Shorthair breed itself comes in a wide variety of colors and patterns), nor the Chartreux or Korat which are two other naturally occurring breeds of blue cats, although they have similar traits.</code> |
+ | <code>who played the little girl on mrs doubtfire</code> | <code>Mara Wilson Mara Elizabeth Wilson[2] (born July 24, 1987) is an American writer and former child actress. She is known for playing Natalie Hillard in Mrs. Doubtfire (1993), Susan Walker in Miracle on 34th Street (1994), Matilda Wormwood in Matilda (1996) and Lily Stone in Thomas and the Magic Railroad (2000). Since retiring from film acting, Wilson has focused on writing.</code> |
+ | <code>what year did the movie the sound of music come out</code> | <code>The Sound of Music (film) The film was released on March 2, 1965 in the United States, initially as a limited roadshow theatrical release. Although critical response to the film was widely mixed, the film was a major commercial success, becoming the number one box office movie after four weeks, and the highest-grossing film of 1965. By November 1966, The Sound of Music had become the highest-grossing film of all-time—surpassing Gone with the Wind—and held that distinction for five years. The film was just as popular throughout the world, breaking previous box-office records in twenty-nine countries. Following an initial theatrical release that lasted four and a half years, and two successful re-releases, the film sold 283 million admissions worldwide and earned a total worldwide gross of $286,000,000.</code> |
+ * Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
+ ```json
+ {
+ "scale": 20.0,
+ "similarity_fct": "cos_sim",
+ "mini_batch_size": 128,
+ "gather_across_devices": false
+ }
+ ```
+ </details>
+ <details><summary>gooaq</summary>
+
+ #### gooaq
+
+ * Dataset: [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
+ * Size: 3,012,496 training samples
+ * Columns: <code>query</code> and <code>positive</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | query | positive |
+ |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | <ul><li>min: 8 tokens</li><li>mean: 12.05 tokens</li><li>max: 21 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 59.08 tokens</li><li>max: 116 tokens</li></ul> |
+ * Samples:
+ | query | positive |
+ |:-----------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | <code>how do i program my directv remote with my tv?</code> | <code>['Press MENU on your remote.', 'Select Settings & Help > Settings > Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you wish to program. ... ', 'Follow the on-screen prompts to complete programming.']</code> |
+ | <code>are rodrigues fruit bats nocturnal?</code> | <code>Before its numbers were threatened by habitat destruction, storms, and hunting, some of those groups could number 500 or more members. Sunrise, sunset. Rodrigues fruit bats are most active at dawn, at dusk, and at night.</code> |
+ | <code>why does your heart rate increase during exercise bbc bitesize?</code> | <code>During exercise there is an increase in physical activity and muscle cells respire more than they do when the body is at rest. The heart rate increases during exercise. The rate and depth of breathing increases - this makes sure that more oxygen is absorbed into the blood, and more carbon dioxide is removed from it.</code> |
+ * Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
+ ```json
+ {
+ "scale": 20.0,
+ "similarity_fct": "cos_sim",
+ "mini_batch_size": 128,
+ "gather_across_devices": false
+ }
+ ```
+ </details>
+ <details><summary>ccnews</summary>
+
+ #### ccnews
+
+ * Dataset: [ccnews](https://huggingface.co/datasets/sentence-transformers/ccnews) at [6118cc0](https://huggingface.co/datasets/sentence-transformers/ccnews/tree/6118cc09daf7977d6dddef2c6e4b7a4c92db9f57)
+ * Size: 614,664 training samples
+ * Columns: <code>query</code> and <code>positive</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | query | positive |
+ |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | <ul><li>min: 7 tokens</li><li>mean: 16.71 tokens</li><li>max: 56 tokens</li></ul> | <ul><li>min: 18 tokens</li><li>mean: 349.3 tokens</li><li>max: 512 tokens</li></ul> |
+ * Samples:
+ | query | positive |
+ |:----------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | <code>Rupee rises for 2nd consecutive day, gains 8 paise against US dollar today</code> | <code>The rupee rose 8 paise to close at 64.37 apiece US dollar at the interbank foreign exchange market today.<br>The Indian rupee appreciated for the second consecutive day and gained over 8 paise against the US dollar on Monday. The domestic currency opened unchanged today, very quickly edged higher and extended the gains to hit a day’s high of 64.34. The rupee rose 8 paise to close at 64.37 apiece US dollar at the interbank foreign exchange market today. The Reserve Bank of India fixed the reference rate of the rupee at 64.3616 against the US dollar on Monday. The Indian rupee moved up 23 paise against the US dollar in just 2 days as Narendra Modi led BJP is most likely to conquer Gujarat for the fifth consecutive time in the state elections. Way back in March 2017, the rupee appreciated as much as 79 paise in a single day to close at a 16-month high against the US dollar after Bharatiya Janata Party’s landslide victory in Uttar Pradesh state elections.<br>Finance Minister Arun Jaitley is all ...</code> |
+ | <code>Microsoft pushes for ‘Digital Geneva Convention’ for cybercrimes</code> | <code>Technology companies, he added, need to preserve trust and stability online by pledging neutrality in cyber conflict. ( Image for representation, Source: Reuters) Technology companies, he added, need to preserve trust and stability online by pledging neutrality in cyber conflict. ( Image for representation, Source: Reuters)<br>Microsoft President Brad Smith on Tuesday pressed the world’s governments to form an international body to protect civilians from state-sponsored hacking, saying recent high-profile attacks showed a need for global norms to police government activity in cyberspace.<br>Countries need to develop and abide by global rules for cyber attacks similar to those established for armed conflict at the 1949 Geneva Convention that followed World War Two, Smith said. Technology companies, he added, need to preserve trust and stability online by pledging neutrality in cyber conflict.<br>Watch all our videos from Express Technology<br>“We need a Digital Geneva Convention that will commit go...</code> |
+ | <code>Prince Gets Purple Pantone Color ‘Love Symbol #2’</code> | <code>By Abby Hassler<br>Prince, also known as “The Purple One” is finally getting his very own Pantone color. Pantone and Prince’s Estate announced today (August 14) that the late singer has his own purple hue, “Love Symbol #2,” which is named after the iconic symbol the singer used as an emblem for his name.<br>Related: Wesley Snipes Beat Out Prince for His Role in Michael Jackson’s ‘Bad’<br>“The color purple was synonymous with who Prince was and will always be. This is an incredible way for his legacy to live on forever,” Troy Carter, entertainment adviser to Prince’s Estate, said.<br>“We are honored to have worked on the development of Love Symbol #2, a distinctive new purple shade created in memory of Prince, ‘the purple one,'” added Laurie Pressman, vice president of the Pantone Color Institute. “A musical icon known for his artistic brilliance, Love Symbol #2 is emblematic of Prince’s distinctive style. Long associated with the purple family, Love Symbol #2 enables Prince’s unique purple shade t...</code> |
+ * Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
+ ```json
+ {
+ "scale": 20.0,
+ "similarity_fct": "cos_sim",
+ "mini_batch_size": 128,
+ "gather_across_devices": false
+ }
+ ```
+ </details>
+ <details><summary>hotpotqa</summary>
+
+ #### hotpotqa
+
+ * Dataset: [hotpotqa](https://huggingface.co/datasets/sentence-transformers/hotpotqa) at [f07d3cd](https://huggingface.co/datasets/sentence-transformers/hotpotqa/tree/f07d3cd2d290ea2e83ed35e33d67d6a4658b8786)
+ * Size: 84,516 training samples
+ * Columns: <code>query</code> and <code>positive</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | query | positive |
+ |:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | <ul><li>min: 8 tokens</li><li>mean: 25.82 tokens</li><li>max: 140 tokens</li></ul> | <ul><li>min: 18 tokens</li><li>mean: 103.34 tokens</li><li>max: 350 tokens</li></ul> |
+ * Samples:
+ | query | positive |
+ |:------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
1108
+ | <code>Which magazine covers a wider range of topics, Decibel or Paper?</code> | <code>Decibel (magazine) Decibel is a monthly heavy metal magazine published by the Philadelphia-based Red Flag Media since October 2004. Its sections include Upfront, Features, Reviews, Guest Columns and the Decibel Hall of Fame. The magazine's tag-line is currently "Extremely Extreme" (previously "The New Noise"); the editor-in-chief is Albert Mudrian.</code> |
1109
+ | <code>what bbc drama features such actors as Sian Reeves and Ben Daniels?</code> | <code>Siân Reeves Siân Reeves (born Siân Rivers on May 9, 1966 in West Bromwich) is a British actress, most famous for playing the role of Sydney Henshall in the BBC drama "Cutting It", and for playing villain Sally Spode in "Emmerdale".</code> |
1110
+ | <code>What size population does the County Connection public transit in Concord, California service?</code> | <code>County Connection The County Connection (officially, the Central Contra Costa Transit Authority, CCCTA) is a Concord-based public transit agency operating fixed-route bus and ADA paratransit (County Connection LINK) service in and around central Contra Costa County in the San Francisco Bay Area. Established in 1980 as a joint powers authority, CCCTA assumed control of public bus service within central Contra Costa first begun by Oakland-based AC Transit as it expanded into suburban Contra Costa County in the mid-1970s (especially after the opening of BART).</code> |
1111
+ * Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
1112
+ ```json
1113
+ {
1114
+ "scale": 20.0,
1115
+ "similarity_fct": "cos_sim",
1116
+ "mini_batch_size": 128,
1117
+ "gather_across_devices": false
1118
+ }
1119
+ ```
1120
+ </details>
1121
+
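Both training datasets above use CachedMultipleNegativesRankingLoss with `scale: 20.0` and `similarity_fct: cos_sim`. As a rough illustration (a toy pure-Python sketch of the underlying in-batch-negatives objective, not the library's cached implementation), each query is scored against every positive in the batch, cosine similarities are multiplied by the scale, and cross-entropy pushes probability onto the matching (diagonal) pair:

```python
import math

def cos_sim(a, b):
    # Cosine similarity between two plain-list vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mnrl_loss(queries, positives, scale=20.0):
    # In-batch negatives: for query i, positives[i] is the target and
    # every other positives[j] in the batch acts as a negative.
    total = 0.0
    for i, q in enumerate(queries):
        logits = [scale * cos_sim(q, p) for p in positives]
        log_z = math.log(sum(math.exp(l) for l in logits))
        total += log_z - logits[i]  # -log softmax at the diagonal entry
    return total / len(queries)

queries = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
loss = mnrl_loss(queries, positives, scale=20.0)  # near zero: pairs already aligned
```

The caching in the actual loss only changes how gradients are accumulated (allowing the large 8192 batch with `mini_batch_size: 128`), not the objective itself.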
1122
+ ### Training Hyperparameters
1123
+ #### Non-Default Hyperparameters
1124
+
1125
+ - `per_device_train_batch_size`: 8192
1126
+ - `per_device_eval_batch_size`: 512
1127
+ - `learning_rate`: 0.0001
1128
+ - `weight_decay`: 0.01
1129
+ - `num_train_epochs`: 1
1130
+ - `lr_scheduler_type`: cosine
1131
+ - `warmup_ratio`: 0.1
1132
+ - `seed`: 12
1133
+ - `bf16`: True
1134
+ - `dataloader_drop_last`: True
1135
+ - `dataloader_num_workers`: 12
1136
+ - `dataloader_prefetch_factor`: 2
1137
+ - `remove_unused_columns`: False
1138
+ - `optim`: adamw_torch
1139
+ - `batch_sampler`: no_duplicates
1140
+
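With `lr_scheduler_type: cosine` and `warmup_ratio: 0.1` over the roughly 521 optimizer steps shown in the training log, the learning rate ramps up linearly for the first ~52 steps and then decays along a cosine curve. A schematic sketch of that shape (illustrative only, not the exact Transformers scheduler code):

```python
import math

def lr_at(step, total_steps=521, warmup_ratio=0.1, base_lr=1e-4):
    # Linear warmup for the first warmup_ratio fraction of steps,
    # then cosine decay from base_lr toward zero.
    warmup_steps = int(total_steps * warmup_ratio)  # 52 steps here
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

peak = lr_at(52)                      # end of warmup: the full 1e-4
mid = lr_at(52 + (521 - 52) // 2)     # mid-decay: roughly half of base_lr
end = lr_at(521)                      # fully decayed to ~0
```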
1141
+ #### All Hyperparameters
1142
+ <details><summary>Click to expand</summary>
1143
+
1144
+ - `overwrite_output_dir`: False
1145
+ - `do_predict`: False
1146
+ - `eval_strategy`: no
1147
+ - `prediction_loss_only`: True
1148
+ - `per_device_train_batch_size`: 8192
1149
+ - `per_device_eval_batch_size`: 512
1150
+ - `per_gpu_train_batch_size`: None
1151
+ - `per_gpu_eval_batch_size`: None
1152
+ - `gradient_accumulation_steps`: 1
1153
+ - `eval_accumulation_steps`: None
1154
+ - `torch_empty_cache_steps`: None
1155
+ - `learning_rate`: 0.0001
1156
+ - `weight_decay`: 0.01
1157
+ - `adam_beta1`: 0.9
1158
+ - `adam_beta2`: 0.999
1159
+ - `adam_epsilon`: 1e-08
1160
+ - `max_grad_norm`: 1.0
1161
+ - `num_train_epochs`: 1
1162
+ - `max_steps`: -1
1163
+ - `lr_scheduler_type`: cosine
1164
+ - `lr_scheduler_kwargs`: {}
1165
+ - `warmup_ratio`: 0.1
1166
+ - `warmup_steps`: 0
1167
+ - `log_level`: passive
1168
+ - `log_level_replica`: warning
1169
+ - `log_on_each_node`: True
1170
+ - `logging_nan_inf_filter`: True
1171
+ - `save_safetensors`: True
1172
+ - `save_on_each_node`: False
1173
+ - `save_only_model`: False
1174
+ - `restore_callback_states_from_checkpoint`: False
1175
+ - `no_cuda`: False
1176
+ - `use_cpu`: False
1177
+ - `use_mps_device`: False
1178
+ - `seed`: 12
1179
+ - `data_seed`: None
1180
+ - `jit_mode_eval`: False
1181
+ - `bf16`: True
1182
+ - `fp16`: False
1183
+ - `fp16_opt_level`: O1
1184
+ - `half_precision_backend`: auto
1185
+ - `bf16_full_eval`: False
1186
+ - `fp16_full_eval`: False
1187
+ - `tf32`: None
1188
+ - `local_rank`: 0
1189
+ - `ddp_backend`: None
1190
+ - `tpu_num_cores`: None
1191
+ - `tpu_metrics_debug`: False
1192
+ - `debug`: []
1193
+ - `dataloader_drop_last`: True
1194
+ - `dataloader_num_workers`: 12
1195
+ - `dataloader_prefetch_factor`: 2
1196
+ - `past_index`: -1
1197
+ - `disable_tqdm`: False
1198
+ - `remove_unused_columns`: False
1199
+ - `label_names`: None
1200
+ - `load_best_model_at_end`: False
1201
+ - `ignore_data_skip`: False
1202
+ - `fsdp`: []
1203
+ - `fsdp_min_num_params`: 0
1204
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
1205
+ - `fsdp_transformer_layer_cls_to_wrap`: None
1206
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
1207
+ - `parallelism_config`: None
1208
+ - `deepspeed`: None
1209
+ - `label_smoothing_factor`: 0.0
1210
+ - `optim`: adamw_torch
1211
+ - `optim_args`: None
1212
+ - `adafactor`: False
1213
+ - `group_by_length`: False
1214
+ - `length_column_name`: length
1215
+ - `project`: huggingface
1216
+ - `trackio_space_id`: trackio
1217
+ - `ddp_find_unused_parameters`: None
1218
+ - `ddp_bucket_cap_mb`: None
1219
+ - `ddp_broadcast_buffers`: False
1220
+ - `dataloader_pin_memory`: True
1221
+ - `dataloader_persistent_workers`: False
1222
+ - `skip_memory_metrics`: True
1223
+ - `use_legacy_prediction_loop`: False
1224
+ - `push_to_hub`: False
1225
+ - `resume_from_checkpoint`: None
1226
+ - `hub_model_id`: None
1227
+ - `hub_strategy`: every_save
1228
+ - `hub_private_repo`: None
1229
+ - `hub_always_push`: False
1230
+ - `hub_revision`: None
1231
+ - `gradient_checkpointing`: False
1232
+ - `gradient_checkpointing_kwargs`: None
1233
+ - `include_inputs_for_metrics`: False
1234
+ - `include_for_metrics`: []
1235
+ - `eval_do_concat_batches`: True
1236
+ - `fp16_backend`: auto
1237
+ - `push_to_hub_model_id`: None
1238
+ - `push_to_hub_organization`: None
1239
+ - `mp_parameters`:
1240
+ - `auto_find_batch_size`: False
1241
+ - `full_determinism`: False
1242
+ - `torchdynamo`: None
1243
+ - `ray_scope`: last
1244
+ - `ddp_timeout`: 1800
1245
+ - `torch_compile`: False
1246
+ - `torch_compile_backend`: None
1247
+ - `torch_compile_mode`: None
1248
+ - `include_tokens_per_second`: False
1249
+ - `include_num_input_tokens_seen`: no
1250
+ - `neftune_noise_alpha`: None
1251
+ - `optim_target_modules`: None
1252
+ - `batch_eval_metrics`: False
1253
+ - `eval_on_start`: False
1254
+ - `use_liger_kernel`: False
1255
+ - `liger_kernel_config`: None
1256
+ - `eval_use_gather_object`: False
1257
+ - `average_tokens_across_devices`: True
1258
+ - `prompts`: None
1259
+ - `batch_sampler`: no_duplicates
1260
+ - `multi_dataset_batch_sampler`: proportional
1261
+ - `router_mapping`: {}
1262
+ - `learning_rate_mapping`: {}
1263
+
1264
+ </details>
1265
+
1266
+ ### Training Logs
1267
+ | Epoch | Step | Training Loss | NanoClimateFEVER_cosine_ndcg@10 | NanoDBPedia_cosine_ndcg@10 | NanoFEVER_cosine_ndcg@10 | NanoFiQA2018_cosine_ndcg@10 | NanoHotpotQA_cosine_ndcg@10 | NanoMSMARCO_cosine_ndcg@10 | NanoNFCorpus_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoQuoraRetrieval_cosine_ndcg@10 | NanoSCIDOCS_cosine_ndcg@10 | NanoArguAna_cosine_ndcg@10 | NanoSciFact_cosine_ndcg@10 | NanoTouche2020_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
1268
+ |:------:|:----:|:-------------:|:-------------------------------:|:--------------------------:|:------------------------:|:---------------------------:|:---------------------------:|:--------------------------:|:---------------------------:|:---------------------:|:---------------------------------:|:--------------------------:|:--------------------------:|:--------------------------:|:-----------------------------:|:----------------------------:|
1269
+ | 0.0190 | 10 | 8.226 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1270
+ | 0.0381 | 20 | 5.503 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1271
+ | 0.0571 | 30 | 3.4245 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1272
+ | 0.0762 | 40 | 1.907 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1273
+ | 0.0952 | 50 | 1.3564 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1274
+ | 0.1143 | 60 | 1.1161 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1275
+ | 0.1333 | 70 | 1.0269 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1276
+ | 0.1524 | 80 | 0.804 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1277
+ | 0.1714 | 90 | 0.7459 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1278
+ | 0.1905 | 100 | 0.6271 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1279
+ | 0.2095 | 110 | 0.8254 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1280
+ | 0.2286 | 120 | 0.7112 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1281
+ | 0.2476 | 130 | 0.6292 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1282
+ | 0.2667 | 140 | 0.6022 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1283
+ | 0.2857 | 150 | 0.782 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1284
+ | 0.3048 | 160 | 0.5896 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1285
+ | 0.3238 | 170 | 0.6357 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1286
+ | 0.3429 | 180 | 0.6329 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1287
+ | 0.3619 | 190 | 0.7885 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1288
+ | 0.3810 | 200 | 0.484 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1289
+ | 0.4 | 210 | 0.5834 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1290
+ | 0.4190 | 220 | 0.5229 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1291
+ | 0.4381 | 230 | 0.5112 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1292
+ | 0.4571 | 240 | 0.4973 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1293
+ | 0.4762 | 250 | 0.5582 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1294
+ | 0.4952 | 260 | 0.437 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1295
+ | 0.5143 | 270 | 0.5495 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1296
+ | 0.5333 | 280 | 0.5378 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1297
+ | 0.5524 | 290 | 0.4802 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1298
+ | 0.5714 | 300 | 0.5221 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1299
+ | 0.5905 | 310 | 0.5243 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1300
+ | 0.6095 | 320 | 0.4762 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1301
+ | 0.6286 | 330 | 0.571 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1302
+ | 0.6476 | 340 | 0.465 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1303
+ | 0.6667 | 350 | 0.5644 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1304
+ | 0.6857 | 360 | 0.5494 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1305
+ | 0.7048 | 370 | 0.5148 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1306
+ | 0.7238 | 380 | 0.5109 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1307
+ | 0.7429 | 390 | 0.5357 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1308
+ | 0.7619 | 400 | 0.4638 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1309
+ | 0.7810 | 410 | 0.403 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1310
+ | 0.8 | 420 | 0.5423 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1311
+ | 0.8190 | 430 | 0.4469 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1312
+ | 0.8381 | 440 | 0.5935 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1313
+ | 0.8571 | 450 | 0.3879 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1314
+ | 0.8762 | 460 | 0.5288 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1315
+ | 0.8952 | 470 | 0.5372 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1316
+ | 0.9143 | 480 | 0.4814 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1317
+ | 0.9333 | 490 | 0.4817 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1318
+ | 0.9524 | 500 | 0.3893 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1319
+ | 0.9714 | 510 | 0.434 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1320
+ | 0.9905 | 520 | 0.3894 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1321
+ | 0 | 521 | - | 0.3249 | 0.5073 | 0.8029 | 0.4646 | 0.6343 | 0.5555 | 0.3071 | 0.6336 | 0.9391 | 0.3220 | 0.5328 | 0.6297 | 0.4754 | 0.5484 |
1322
+
1323
+
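The `NanoBEIR_mean_cosine_ndcg@10` column is the unweighted mean of the 13 per-dataset NDCG@10 scores; the final evaluation row can be reproduced directly:

```python
# Final-step NDCG@10 values from the table above, in column order.
scores = [0.3249, 0.5073, 0.8029, 0.4646, 0.6343, 0.5555, 0.3071,
          0.6336, 0.9391, 0.3220, 0.5328, 0.6297, 0.4754]
mean = sum(scores) / len(scores)
print(round(mean, 4))  # 0.5484, matching NanoBEIR_mean_cosine_ndcg@10
```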
1324
+ ### Framework Versions
1325
+ - Python: 3.11.14
1326
+ - Sentence Transformers: 5.3.0.dev0
1327
+ - Transformers: 4.57.1
1328
+ - PyTorch: 2.8.0+cu129
1329
+ - Accelerate: 1.12.0
1330
+ - Datasets: 4.4.1
1331
+ - Tokenizers: 0.22.1
1332
+
1333
+ ## Citation
1334
+
1335
+ ### BibTeX
1336
+
1337
+ #### Sentence Transformers
1338
+ ```bibtex
1339
+ @inproceedings{reimers-2019-sentence-bert,
1340
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
1341
+ author = "Reimers, Nils and Gurevych, Iryna",
1342
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
1343
+ month = "11",
1344
+ year = "2019",
1345
+ publisher = "Association for Computational Linguistics",
1346
+ url = "https://arxiv.org/abs/1908.10084",
1347
+ }
1348
+ ```
1349
+
1350
+ #### CachedMultipleNegativesRankingLoss
1351
+ ```bibtex
1352
+ @misc{gao2021scaling,
1353
+ title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
1354
+ author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
1355
+ year={2021},
1356
+ eprint={2101.06983},
1357
+ archivePrefix={arXiv},
1358
+ primaryClass={cs.LG}
1359
+ }
1360
+ ```
1361
+
1362
+ <!--
1363
+ ## Glossary
1364
+
1365
+ *Clearly define terms in order to be accessible across audiences.*
1366
+ -->
1367
+
1368
+ <!--
1369
+ ## Model Card Authors
1370
+
1371
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
1372
+ -->
1373
+
1374
+ <!--
1375
+ ## Model Card Contact
1376
+
1377
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
1378
+ -->
config.json ADDED
@@ -0,0 +1,45 @@
1
+ {
2
+ "architectures": [
3
+ "ModernBertModel"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "bos_token_id": 50281,
8
+ "classifier_activation": "gelu",
9
+ "classifier_bias": false,
10
+ "classifier_dropout": 0.0,
11
+ "classifier_pooling": "mean",
12
+ "cls_token_id": 50281,
13
+ "decoder_bias": true,
14
+ "deterministic_flash_attn": false,
15
+ "dtype": "float32",
16
+ "embedding_dropout": 0.0,
17
+ "eos_token_id": 50282,
18
+ "global_attn_every_n_layers": 3,
19
+ "global_rope_theta": 160000.0,
20
+ "gradient_checkpointing": false,
21
+ "hidden_activation": "gelu",
22
+ "hidden_size": 768,
23
+ "initializer_cutoff_factor": 2.0,
24
+ "initializer_range": 0.02,
25
+ "intermediate_size": 1152,
26
+ "layer_norm_eps": 1e-05,
27
+ "local_attention": 128,
28
+ "local_rope_theta": 10000.0,
29
+ "max_position_embeddings": 8192,
30
+ "mlp_bias": false,
31
+ "mlp_dropout": 0.0,
32
+ "model_type": "modernbert",
33
+ "norm_bias": false,
34
+ "norm_eps": 1e-05,
35
+ "num_attention_heads": 12,
36
+ "num_hidden_layers": 22,
37
+ "pad_token_id": 50283,
38
+ "position_embedding_type": "absolute",
39
+ "repad_logits_with_grad": false,
40
+ "sep_token_id": 50282,
41
+ "sparse_pred_ignore_index": -100,
42
+ "sparse_prediction": false,
43
+ "transformers_version": "4.57.1",
44
+ "vocab_size": 50368
45
+ }
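In this config, `global_attn_every_n_layers: 3` across `num_hidden_layers: 22` means (per the ModernBERT design) that every third layer attends globally while the remaining layers use the 128-token local window from `local_attention`. A quick sketch of that layout, assuming the usual indexing convention that a layer is global when `layer_id % 3 == 0`:

```python
num_hidden_layers = 22
global_every = 3      # "global_attn_every_n_layers" in the config
local_window = 128    # "local_attention" in the config

attention = ["global" if i % global_every == 0 else f"local-{local_window}"
             for i in range(num_hidden_layers)]
global_layers = attention.count("global")  # 8 of the 22 layers are global
```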
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
1
+ {
2
+ "model_type": "SentenceTransformer",
3
+ "__version__": {
4
+ "sentence_transformers": "5.3.0.dev0",
5
+ "transformers": "4.57.1",
6
+ "pytorch": "2.8.0+cu129"
7
+ },
8
+ "prompts": {
9
+ "query": "",
10
+ "document": ""
11
+ },
12
+ "default_prompt_name": null,
13
+ "similarity_fn_name": "cosine"
14
+ }
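The `prompts` entry maps prompt names to strings that are prepended to the raw input text before encoding; here both the `query` and `document` prompts are empty, so queries and documents are encoded identically. A toy sketch of the prepending behavior (illustrative helper, not the library implementation):

```python
prompts = {"query": "", "document": ""}

def apply_prompt(text, prompt_name, prompts):
    # Sentence Transformers prepends the named prompt string to the input.
    return prompts[prompt_name] + text

q = apply_prompt("what is ModernBERT?", "query", prompts)
d = apply_prompt("ModernBERT is an encoder model.", "document", prompts)
# With empty prompts, both strings pass through unchanged.
```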
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:712018e1134a0d5c0ddb1f98ea964b3d4ac6ac536fc819c8e54a7d954ae4eca0
3
+ size 596070136
modules.json ADDED
@@ -0,0 +1,14 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ }
14
+ ]
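`modules.json` chains a Transformer module into a Pooling module; per this model's pooling config, the sentence embedding is the mean of the token embeddings over non-padding positions. A minimal sketch with toy token vectors and an attention mask (plain Python, assuming the standard masked-mean formulation):

```python
def mean_pool(token_embeddings, attention_mask):
    # Average token vectors where attention_mask == 1, ignoring padding.
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    count = 0
    for vec, m in zip(token_embeddings, attention_mask):
        if m:
            count += 1
            for j in range(dim):
                sums[j] += vec[j]
    return [s / count for s in sums]

tokens = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]  # last row is padding
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)  # [2.0, 3.0]
```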
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 512,
3
+ "do_lower_case": false
4
+ }
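Note that `max_seq_length: 512` caps encoder inputs at 512 tokens even though the underlying ModernBERT config allows up to 8192 positions; longer inputs are truncated. A trivial sketch of that cap:

```python
max_seq_length = 512

def truncate(token_ids, max_seq_length):
    # Keep at most max_seq_length tokens; anything beyond is dropped.
    return token_ids[:max_seq_length]

long_ids = list(range(600))
short_ids = list(range(10))
truncated = truncate(long_ids, max_seq_length)   # 512 tokens survive
untouched = truncate(short_ids, max_seq_length)  # short inputs unchanged
```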
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": true,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,952 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "|||IP_ADDRESS|||",
5
+ "lstrip": false,
6
+ "normalized": true,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "1": {
12
+ "content": "<|padding|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "50254": {
20
+ "content": " ",
21
+ "lstrip": false,
22
+ "normalized": true,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": false
26
+ },
27
+ "50255": {
28
+ "content": " ",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": false
34
+ },
35
+ "50256": {
36
+ "content": " ",
37
+ "lstrip": false,
38
+ "normalized": true,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "50257": {
44
+ "content": " ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "50258": {
52
+ "content": " ",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "50259": {
60
+ "content": " ",
61
+ "lstrip": false,
62
+ "normalized": true,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ },
67
+ "50260": {
68
+ "content": " ",
69
+ "lstrip": false,
70
+ "normalized": true,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": false
74
+ },
75
+ "50261": {
76
+ "content": " ",
77
+ "lstrip": false,
78
+ "normalized": true,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": false
82
+ },
83
+ "50262": {
84
+ "content": " ",
85
+ "lstrip": false,
86
+ "normalized": true,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": false
90
+ },
91
+ "50263": {
92
+ "content": " ",
93
+ "lstrip": false,
94
+ "normalized": true,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": false
98
+ },
99
+ "50264": {
100
+ "content": " ",
101
+ "lstrip": false,
102
+ "normalized": true,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": false
106
+ },
107
+ "50265": {
108
+ "content": " ",
109
+ "lstrip": false,
110
+ "normalized": true,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": false
114
+ },
115
+ "50266": {
116
+ "content": " ",
117
+ "lstrip": false,
118
+ "normalized": true,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": false
122
+ },
123
+ "50267": {
124
+ "content": " ",
125
+ "lstrip": false,
126
+ "normalized": true,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": false
130
+ },
131
+ "50268": {
132
+ "content": " ",
133
+ "lstrip": false,
134
+ "normalized": true,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": false
138
+ },
139
+ "50269": {
140
+ "content": " ",
141
+ "lstrip": false,
142
+ "normalized": true,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": false
146
+ },
147
+ "50270": {
148
+ "content": " ",
149
+ "lstrip": false,
150
+ "normalized": true,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": false
154
+ },
155
+ "50271": {
156
+ "content": " ",
157
+ "lstrip": false,
158
+ "normalized": true,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": false
162
+ },
163
+ "50272": {
164
+ "content": " ",
165
+ "lstrip": false,
166
+ "normalized": true,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": false
170
+ },
171
+ "50273": {
172
+ "content": " ",
173
+ "lstrip": false,
174
+ "normalized": true,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "50274": {
180
+ "content": " ",
181
+ "lstrip": false,
182
+ "normalized": true,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "50275": {
188
+ "content": " ",
189
+ "lstrip": false,
190
+ "normalized": true,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "50276": {
196
+ "content": " ",
197
+ "lstrip": false,
198
+ "normalized": true,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
203
+ "50277": {
204
+ "content": "|||EMAIL_ADDRESS|||",
205
+ "lstrip": false,
206
+ "normalized": true,
207
+ "rstrip": false,
208
+ "single_word": false,
209
+ "special": false
210
+ },
211
+ "50278": {
212
+ "content": "|||PHONE_NUMBER|||",
213
+ "lstrip": false,
214
+ "normalized": true,
215
+ "rstrip": false,
216
+ "single_word": false,
217
+ "special": false
218
+ },
219
+ "50279": {
220
+ "content": "<|endoftext|>",
221
+ "lstrip": false,
222
+ "normalized": false,
223
+ "rstrip": false,
224
+ "single_word": false,
225
+ "special": true
226
+ },
227
+ "50280": {
228
+ "content": "[UNK]",
229
+ "lstrip": false,
230
+ "normalized": false,
231
+ "rstrip": false,
232
+ "single_word": false,
233
+ "special": true
234
+ },
235
+ "50281": {
236
+ "content": "[CLS]",
237
+ "lstrip": false,
238
+ "normalized": false,
239
+ "rstrip": false,
240
+ "single_word": false,
241
+ "special": true
242
+ },
243
+ "50282": {
244
+ "content": "[SEP]",
245
+ "lstrip": false,
246
+ "normalized": false,
247
+ "rstrip": false,
248
+ "single_word": false,
249
+ "special": true
250
+ },
251
+ "50283": {
252
+ "content": "[PAD]",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "50284": {
260
+ "content": "[MASK]",
261
+ "lstrip": true,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": true
266
+ },
267
+ "50285": {
268
+ "content": "[unused0]",
269
+ "lstrip": false,
270
+ "normalized": true,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "50286": {
276
+ "content": "[unused1]",
277
+ "lstrip": false,
278
+ "normalized": true,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "50287": {
284
+ "content": "[unused2]",
285
+ "lstrip": false,
286
+ "normalized": true,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ },
291
+ "50288": {
292
+ "content": "[unused3]",
293
+ "lstrip": false,
294
+ "normalized": true,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": false
298
+ },
299
+ "50289": {
300
+ "content": "[unused4]",
301
+ "lstrip": false,
302
+ "normalized": true,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": false
306
+ },
307
+ "50290": {
308
+ "content": "[unused5]",
309
+ "lstrip": false,
310
+ "normalized": true,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": false
314
+ },
315
+ "50291": {
316
+ "content": "[unused6]",
317
+ "lstrip": false,
318
+ "normalized": true,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": false
322
+ },
323
+ "50292": {
324
+ "content": "[unused7]",
325
+ "lstrip": false,
326
+ "normalized": true,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": false
330
+ },
331
+ "50293": {
332
+ "content": "[unused8]",
333
+ "lstrip": false,
334
+ "normalized": true,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": false
338
+ },
339
+ "50294": {
340
+ "content": "[unused9]",
341
+ "lstrip": false,
342
+ "normalized": true,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": false
346
+ },
347
+ "50295": {
348
+ "content": "[unused10]",
349
+ "lstrip": false,
350
+ "normalized": true,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": false
354
+ },
355
+ "50296": {
356
+ "content": "[unused11]",
357
+ "lstrip": false,
358
+ "normalized": true,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": false
362
+ },
363
+ "50297": {
364
+ "content": "[unused12]",
365
+ "lstrip": false,
366
+ "normalized": true,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": false
370
+ },
371
+ "50298": {
372
+ "content": "[unused13]",
373
+ "lstrip": false,
374
+ "normalized": true,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": false
378
+ },
379
+ "50299": {
380
+ "content": "[unused14]",
381
+ "lstrip": false,
382
+ "normalized": true,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": false
386
+ },
387
+ "50300": {
388
+ "content": "[unused15]",
389
+ "lstrip": false,
390
+ "normalized": true,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": false
394
+ },
395
+ "50301": {
396
+ "content": "[unused16]",
397
+ "lstrip": false,
398
+ "normalized": true,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": false
402
+ },
403
+ "50302": {
404
+ "content": "[unused17]",
405
+ "lstrip": false,
406
+ "normalized": true,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": false
410
+ },
411
+ "50303": {
412
+ "content": "[unused18]",
413
+ "lstrip": false,
414
+ "normalized": true,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": false
418
+ },
419
+ "50304": {
420
+ "content": "[unused19]",
421
+ "lstrip": false,
422
+ "normalized": true,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": false
426
+ },
427
+ "50305": {
428
+ "content": "[unused20]",
429
+ "lstrip": false,
430
+ "normalized": true,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": false
434
+ },
435
+ "50306": {
436
+ "content": "[unused21]",
437
+ "lstrip": false,
438
+ "normalized": true,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": false
442
+ },
443
+ "50307": {
444
+ "content": "[unused22]",
445
+ "lstrip": false,
446
+ "normalized": true,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": false
450
+ },
451
+ "50308": {
452
+ "content": "[unused23]",
453
+ "lstrip": false,
454
+ "normalized": true,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": false
458
+ },
459
+ "50309": {
460
+ "content": "[unused24]",
461
+ "lstrip": false,
462
+ "normalized": true,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": false
466
+ },
+ "50310": {
+ "content": "[unused25]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50311": {
+ "content": "[unused26]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50312": {
+ "content": "[unused27]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50313": {
+ "content": "[unused28]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50314": {
+ "content": "[unused29]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50315": {
+ "content": "[unused30]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50316": {
+ "content": "[unused31]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50317": {
+ "content": "[unused32]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50318": {
+ "content": "[unused33]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50319": {
+ "content": "[unused34]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50320": {
+ "content": "[unused35]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50321": {
+ "content": "[unused36]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50322": {
+ "content": "[unused37]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50323": {
+ "content": "[unused38]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50324": {
+ "content": "[unused39]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50325": {
+ "content": "[unused40]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50326": {
+ "content": "[unused41]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50327": {
+ "content": "[unused42]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50328": {
+ "content": "[unused43]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50329": {
+ "content": "[unused44]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50330": {
+ "content": "[unused45]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50331": {
+ "content": "[unused46]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50332": {
+ "content": "[unused47]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50333": {
+ "content": "[unused48]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50334": {
+ "content": "[unused49]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50335": {
+ "content": "[unused50]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50336": {
+ "content": "[unused51]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50337": {
+ "content": "[unused52]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50338": {
+ "content": "[unused53]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50339": {
+ "content": "[unused54]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50340": {
+ "content": "[unused55]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50341": {
+ "content": "[unused56]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50342": {
+ "content": "[unused57]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50343": {
+ "content": "[unused58]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50344": {
+ "content": "[unused59]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50345": {
+ "content": "[unused60]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50346": {
+ "content": "[unused61]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50347": {
+ "content": "[unused62]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50348": {
+ "content": "[unused63]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50349": {
+ "content": "[unused64]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50350": {
+ "content": "[unused65]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50351": {
+ "content": "[unused66]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50352": {
+ "content": "[unused67]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50353": {
+ "content": "[unused68]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50354": {
+ "content": "[unused69]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50355": {
+ "content": "[unused70]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50356": {
+ "content": "[unused71]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50357": {
+ "content": "[unused72]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50358": {
+ "content": "[unused73]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50359": {
+ "content": "[unused74]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50360": {
+ "content": "[unused75]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50361": {
+ "content": "[unused76]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50362": {
+ "content": "[unused77]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50363": {
+ "content": "[unused78]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50364": {
+ "content": "[unused79]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50365": {
+ "content": "[unused80]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50366": {
+ "content": "[unused81]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50367": {
+ "content": "[unused82]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ }
+ },
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "[CLS]",
+ "extra_special_tokens": {},
+ "mask_token": "[MASK]",
+ "max_length": 512,
+ "model_input_names": [
+ "input_ids",
+ "attention_mask"
+ ],
+ "model_max_length": 512,
+ "pad_to_multiple_of": null,
+ "pad_token": "[PAD]",
+ "pad_token_type_id": 0,
+ "padding_side": "right",
+ "sep_token": "[SEP]",
+ "stride": 0,
+ "tokenizer_class": "PreTrainedTokenizerFast",
+ "truncation_side": "right",
+ "truncation_strategy": "longest_first",
+ "unk_token": "[UNK]"
+ }
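The added-token entries and top-level settings above can be inspected programmatically once the file is downloaded. A minimal sketch using only the standard library, parsing a small fragment reproduced from this diff (the `added_tokens_decoder` key name is the standard location for these per-token entries in a Hugging Face `tokenizer_config.json`; in practice you would load the full file rather than an inline string):

```python
import json

# Fragment of the tokenizer_config.json added in this commit, reproduced
# inline for illustration; values match the diff above.
config_fragment = json.loads("""
{
  "added_tokens_decoder": {
    "50310": {
      "content": "[unused25]",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": false
    }
  },
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "tokenizer_class": "PreTrainedTokenizerFast",
  "unk_token": "[UNK]"
}
""")

# The [unusedNN] slots are plain (non-special) tokens: they are normalized
# like ordinary vocabulary entries and do not strip surrounding whitespace.
unused25 = config_fragment["added_tokens_decoder"]["50310"]
print(unused25["content"], unused25["special"])  # [unused25] False
print(config_fragment["model_max_length"])       # 512
```

Because `special` is `false` for every `[unusedNN]` entry, these slots behave as reserved ordinary tokens that can later be repurposed for domain-specific vocabulary without resizing the embedding matrix.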