frankwong2001 committed on
Commit
cd663dc
·
verified ·
1 Parent(s): 26ca5f3

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+     "word_embedding_dimension": 768,
+     "pooling_mode_cls_token": false,
+     "pooling_mode_mean_tokens": true,
+     "pooling_mode_max_tokens": false,
+     "pooling_mode_mean_sqrt_len_tokens": false,
+     "pooling_mode_weightedmean_tokens": false,
+     "pooling_mode_lasttoken": false,
+     "include_prompt": true
+ }
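The flags above enable only `pooling_mode_mean_tokens`, i.e. the sentence vector is the average of the token vectors. A minimal sketch of that computation, assuming NumPy and a hypothetical helper named `mean_pool` (an illustration under those assumptions, not the Sentence Transformers implementation):

```python
import numpy as np


def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sequence, ignoring padded positions.

    token_embeddings: (batch, seq_len, dim) array
    attention_mask:   (batch, seq_len) array, 1 for real tokens and 0 for padding
    """
    # Broadcast the mask over the embedding dimension.
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for fully padded rows.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


# Toy batch: 2 sequences, 3 token slots, 4 dimensions (the config above uses 768).
tokens = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
mask = np.array([[1, 1, 0], [1, 1, 1]])
pooled = mean_pool(tokens, mask)
print(pooled.shape)
```

The mask matters because padded positions would otherwise pull the average toward whatever values the padding slots hold.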
README.md ADDED
@@ -0,0 +1,605 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dense
+ - generated_from_trainer
+ - dataset_size:10556
+ - loss:MultipleNegativesRankingLoss
+ base_model: nomic-ai/modernbert-embed-base
+ widget:
+ - source_sentence: The Risk Analytics Analyst/Compliance Analytics Analyst is responsible
+     for the development, implementation and/or utilisation of quantitative models
+     and data analysis to support day-to-day risk and compliance functions. He/She
+     supports independent research required for the development of risk and compliance
+     quantitative models and data analytics methodologies, along with testing and validation
+     to ensure their suitability for business requirements. He enables the deployment
+     of models and guides others in the use of analytics to support business needs.
+     He is also involved in the conduct of analysis and modelling, and compiles findings
+     to draw insights and create reports. The Risk Analytics Analyst/Compliance Analytics
+     Analyst is technically proficient with numerical, quantitative and data analysis
+     approaches to meet business requirements. He is highly analytical, conceptual
+     and able to communicate complex ideas in simple and easy to understand terms.
+     He is able to draw connections between numerical data and contexts within risk
+     and/or compliance functions to provide evidence and insights to influence decision-making.
+   sentences:
+   - "The Risk Analyst is responsible for overseeing the financial auditing processes\
+     \ within the organization. \n\nThe Compliance Manager is tasked with leading a\
+     \ team of junior analysts to ensure adherence to industry regulations and standards.\n\
+     \nThe Data Science Associate focuses on developing machine learning algorithms\
+     \ for product recommendations in the retail sector.\n\nThe Risk Management Specialist\
+     \ works in a global market, ensuring compliance with international standards and\
+     \ regulations across multiple countries.\n\nThe Business Intelligence Analyst\
+     \ combines responsibilities from data analysis and project management, overseeing\
+     \ digital marketing strategies while managing risk assessments."
+   - 'The Multimedia Specialist is tasked with the technical execution of multimedia
+     content design strategies. This role encompasses the planning, setup, and maintenance
+     of systems including servers and visual playback devices, as well as the processing
+     and distribution of video signals to visual output devices such as projectors
+     and LED displays. A strong understanding of video capture technology, including
+     cameras, is essential, along with expertise in designing, deploying, and configuring
+     network infrastructure to achieve the desired performance effects.
+
+
+     Additionally, Multimedia Specialists utilize video systems to enhance video content,
+     executing techniques like content layering, applying visual effects, and mapping
+     projections onto various surfaces. Depending on their qualifications and experience,
+     they may also specialize as Network Engineers. These professionals can work in
+     diverse environments such as venues, rental firms, production houses, or directly
+     within production teams, either on a full-time or casual basis.'
+   - The Risk Analytics Analyst is tasked with the design, execution, and application
+     of quantitative models and data analysis to enhance daily risk and compliance
+     operations. This role involves conducting independent research to develop robust
+     risk and compliance models and data analytics methodologies, alongside performing
+     testing and validation to ensure alignment with business objectives. The analyst
+     facilitates the implementation of these models and provides guidance on analytics
+     usage to meet business requirements. Additionally, he/she engages in comprehensive
+     analysis and modeling, synthesizing findings to generate insights and produce
+     detailed reports. The Risk Analytics Analyst demonstrates strong technical proficiency
+     in numerical and quantitative analysis techniques, showcasing exceptional analytical
+     and conceptual skills, with the ability to convey complex concepts in a clear
+     and accessible manner. He/she effectively connects numerical data with risk and
+     compliance contexts to deliver evidence-based insights that support informed decision-making.
+ - source_sentence: 'The Optimisation Engineer supports cross-functional clean energy
+     areas and is responsible for maximising the efficiency, reliability, and performance
+     of smart grid systems and battery energy storage technologies. He/She analyses
+     system data to identify inefficiencies, designing and implementing optimisation
+     strategies to enhance system performance. He also leads innovative initiatives
+     to improve energy storage systems and the smart grid performance, working with
+     research and development teams to develop and implement new technologies or software.
+
+
+     He possesses strong problem-solving skills, systems thinking, and data analysis
+     proficiency. He must also be adept at innovation and collaboration, working effectively
+     with stakeholders to achieve target outcomes.'
+   sentences:
+   - 'The Optimisation Engineer plays a crucial role in supporting various clean energy
+     initiatives and is tasked with enhancing the efficiency, reliability, and performance
+     of smart grid systems and battery storage technologies. He/She evaluates system
+     data to pinpoint inefficiencies and formulates optimisation strategies to boost
+     system performance. Additionally, he/she spearheads innovative projects aimed
+     at advancing energy storage solutions and smart grid capabilities, collaborating
+     with research and development teams to create and implement new technologies or
+     software.
+
+
+     He possesses excellent problem-solving abilities, systems thinking, and expertise
+     in data analysis. Furthermore, he must excel in innovation and teamwork, effectively
+     engaging with stakeholders to achieve desired results.'
+   - The Chef prepares a wide range of dishes, focusing on culinary excellence and
+     presentation. He/She manages kitchen operations, ensuring that meals are prepared
+     to the highest standards and served promptly. Additionally, he/she experiments
+     with new recipes and flavors, collaborating with the kitchen staff to create a
+     unique dining experience for guests.
+   - The Wardrobe Supervisor is tasked with managing the execution of costume designs
+     for a production according to established designs and plans. This includes overseeing
+     the acquisition or creation of costumes, fitting and adjusting apparel for individual
+     cast members, managing costume operations during performances, and ensuring proper
+     maintenance and repair of costumes, which encompasses laundry, ironing, and post-production
+     storage. They effectively adhere to costume plans and technical specifications
+     while also offering recommendations and creative or technical insights. Wardrobe
+     Supervisors are generally found in larger venues, productions, and organizations
+     where the scale necessitates a dedicated focus on costumes. In smaller venues
+     and productions, other personnel often assume the responsibilities associated
+     with this role.
+ - source_sentence: The Threat Analysis Manager plans out strategies to pre-empt potential
+     threats in an organisation's cyber related systems. He/She is responsible for
+     identifying the IT assets that are prone to cyber threats and attacks. He proactively
+     monitors the open web and identifies potential threats and groups or individuals
+     capable of attempting cyber-attacks. He runs tests and analyses different areas
+     of the IT assets to ensure they are safe from cyber-attacks. He is familiar with
+     cyber security standards, protocols and frameworks. He is knowledgeable in using
+     various cyber security analysis tools and techniques to monitor and identify potential
+     incidents. The Threat Analysis Manager is alert and vigilant in performing monitoring
+     activities, and is able to analyse and identify potential security-related issues,
+     which may have critical impact on security and operational systems. He communicates
+     clearly in his interactions with others and coordinates effectively with his team
+     to perform security operations.
+   sentences:
+   - The E-Commerce Manager spearheads the expansion of the online retail sector by
+     improving customer interactions and overseeing the process of order fulfillment,
+     alongside technological and infrastructure strategies. This role also entails
+     producing data-informed commercial insights and nurturing partnerships essential
+     to the business. Operating within a dynamic and digital-focused setting, he/she
+     manages the complete commercialization and operation of the e-commerce platform.
+     As a motivated, cooperative, and results-driven leader, he/she demonstrates technological
+     proficiency and strong business insight, effectively juggling various projects.
+   - The Event Coordinator organizes and manages logistics for corporate events and
+     functions. He/She is responsible for coordinating with vendors, managing budgets,
+     and ensuring that all event details are executed smoothly. The coordinator works
+     closely with clients to understand their needs and preferences, and creates memorable
+     experiences through effective planning and execution. He/She is adept at problem-solving
+     and handles unexpected challenges with ease, ensuring successful events that align
+     with client expectations.
+   - The Threat Analysis Manager is responsible for developing strategies to proactively
+     address potential threats to an organization’s cyber systems. He/She identifies
+     IT assets vulnerable to cyber threats and continuously monitors the open web for
+     potential dangers and groups or individuals that may attempt cyber-attacks. The
+     manager conducts tests and analyzes various aspects of IT assets to ensure their
+     safety from cyber threats. With a strong understanding of cyber security standards,
+     protocols, and frameworks, he/she utilizes a variety of cyber security analysis
+     tools and techniques to detect and respond to potential incidents. The Threat
+     Analysis Manager remains alert and vigilant during monitoring activities, effectively
+     analyzing and identifying security-related issues that could significantly impact
+     security and operational systems. He/She communicates clearly and collaborates
+     efficiently with team members to execute security operations.
+ - source_sentence: The Lead Designer manages all aspects of the design process, from
+     research and ideation to creative conceptualisation and design. He/She collaborates
+     with stakeholders to research and develop cohesive design plans, concepts and
+     prototypes. As a team lead, he initiates research activities to be performed and
+     provides on-the-job training to enhance the core competence of his team members.
+     He also works with a diverse group of internal and external stakeholders to ensure
+     final design output meet the needs of the organisation or customers. The ability
+     to delegate and lead project teams towards successful adoption of new design ideas
+     is essential for the Lead Designer. He possesses a strong mastery of design fundamentals
+     in and can generate creative work that meets the requirements of stakeholders.
+     He is able to work on multiple projects concurrently and deliver on expectations
+     within tight deadlines. He may specialise as an Architect, Landscape Architect/Landscape
+     Designer, Interior Designer, Fashion Designer, Product Designer, Furniture Designer,
+     Graphic Designer and/or Interaction Designer, etc.
+   sentences:
+   - The Lead Designer oversees every facet of the design workflow, ranging from research
+     and idea generation to innovative conceptualization and execution. This role involves
+     working closely with stakeholders to formulate comprehensive design strategies,
+     concepts, and prototypes. Acting as a team leader, he/she initiates research efforts
+     and provides mentorship to bolster the skill set of team members. Collaborating
+     with a varied group of internal and external partners, he/she ensures that the
+     final design outputs align with the requirements of the organization and its customers.
+     The ability to guide and motivate project teams towards the successful implementation
+     of new design concepts is crucial for the Lead Designer. With a robust understanding
+     of design principles, he/she consistently produces creative outputs that satisfy
+     stakeholder expectations. The role demands the capacity to manage multiple projects
+     simultaneously and meet deadlines effectively. Specializations may include Architecture,
+     Landscape Architecture, Interior Design, Fashion Design, Product Design, Furniture
+     Design, Graphic Design, or Interaction Design.
+   - The Project Executive is tasked with collecting requirements from both internal
+     and external stakeholders, as well as planning and executing project logistics
+     for the storage and transportation of complex and heavy cargo. Additionally, he/she
+     manages contractors and vendors to ensure adherence to the project lifecycle and
+     compliance with project specifications. With strong analytical and systematic
+     skills, he/she is expected to explore alternative solutions and evaluate the feasibility
+     of plans. He/she must also work closely with stakeholders to implement new processes
+     and technologies that provide innovative solutions for customers.
+   - "The Lead Designer evaluates the performance of the marketing team, focusing on\
+     \ campaign execution and brand messaging. \n\nThe Lead Designer directs a junior\
+     \ team of graphic artists, requiring minimal experience and limited decision-making\
+     \ authority.\n\nThe Lead Designer analyzes compliance regulations in the healthcare\
+     \ sector, applying design skills to ensure adherence to legal standards.\n\nThe\
+     \ Lead Designer is responsible for design projects in a different country, adapting\
+     \ to unique cultural and regulatory challenges while managing cross-border teams.\n\
+     \nThe Lead Designer combines responsibilities of a project manager and a software\
+     \ engineer, overseeing both design initiatives and technical development without\
+     \ clear delineation of roles."
+ - source_sentence: The Assistant Pastry Cook/Assistant Baker/Kitchen Assistant is
+     responsible for the production of pastry and baked goods. He/She prepares the
+     baking equipment and ingredients, and applies finishing touches in post-production
+     of pastries and baked goods. He follows hygiene, safety and other standards, and
+     carries out food and beverage operational tasks. He may suggest areas for continuous
+     improvement within his own workstation. Attentive and meticulous, he possesses
+     good time management skills and is able to multi-task, while performing physical
+     tasks in a high-volume production environment. He is able to work under high temperatures,
+     and in a flexible schedule, including weekends, evenings, and public holidays.
+   sentences:
+   - The Manager (Airside Operations) is responsible for crafting emergency response
+     strategies and implementing policies for Foreign Object Debris (FOD) clearance.
+     This role includes reviewing and refining Standard Operating Procedures (SOPs)
+     to enhance the allocation of aircraft stands and streamline operational planning.
+     The manager ensures compliance with safety and performance benchmarks at the airside
+     while developing systems to ensure adherence to safety and security protocols.
+     As a mentor, they assess the developmental needs of their team members and provide
+     guidance to help them reach their full potential. Additionally, the Manager (Airside
+     Operations) leads change management efforts within the organization. A deep understanding
+     of airport operations and industry standards is essential, along with familiarity
+     with aerodrome safety SOPs. Staying updated on international regulations and developments
+     in airside operations is crucial. Strong management and stakeholder engagement
+     skills are necessary to lead the team effectively and represent the organization
+     in external interactions.
+   - The Assistant Chef/Line Cook/Kitchen Staff is responsible for the preparation
+     of savory dishes and entrees. He/She sets up the cooking appliances and ingredients,
+     and applies garnishes during the serving of dishes. He follows cleanliness, safety,
+     and other protocols, and performs kitchen and dining operational duties. He may
+     recommend changes for better efficiency within his own cooking area. Attentive
+     and thorough, he exhibits strong organizational skills and is adept at prioritizing
+     tasks, while executing routine duties in a low-volume service environment. He
+     is able to work in cold conditions, and adheres to a rigid schedule, which excludes
+     weekends, evenings, and public holidays.
+   - The Assistant Pastry Cook/Assistant Baker/Kitchen Assistant is tasked with producing
+     a variety of pastries and baked goods. He/She prepares the necessary baking equipment
+     and ingredients, and adds finishing touches during the post-production phase of
+     the items. He adheres to hygiene, safety, and other relevant standards, while
+     executing food and beverage operational duties. He may also identify opportunities
+     for continuous improvement in his workstation. Detail-oriented and diligent, he
+     demonstrates strong time management skills and the ability to multi-task in a
+     fast-paced production setting. He is capable of working in high temperatures and
+     maintains a flexible schedule, which may include weekends, evenings, and public
+     holidays.
+ datasets:
+ - frankwong2001/ssf-train-valid-combi-v1v2v3
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+
+ # SentenceTransformer based on nomic-ai/modernbert-embed-base
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) on the [ssf-train-valid-combi-v1v2v3](https://huggingface.co/datasets/frankwong2001/ssf-train-valid-combi-v1v2v3) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) <!-- at revision d556a88e332558790b210f7bdbe87da2fa94a8d8 -->
+ - **Maximum Sequence Length:** 8192 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ - **Training Dataset:**
+     - [ssf-train-valid-combi-v1v2v3](https://huggingface.co/datasets/frankwong2001/ssf-train-valid-combi-v1v2v3)
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
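Because the last module in the architecture above is `Normalize()`, every output vector has unit L2 norm, so the dot product of two embeddings equals their cosine similarity. A small sketch of that property, assuming NumPy; `l2_normalize` is a hypothetical helper used for illustration, not a library function:

```python
import numpy as np


def l2_normalize(vectors: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each row to unit L2 norm (eps avoids division by zero)."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.maximum(norms, eps)


# Two toy 2-dimensional "embeddings"; the model above produces 768 dimensions.
raw = np.array([[3.0, 4.0], [1.0, 0.0]])
unit = l2_normalize(raw)

# Dot products of the normalized rows.
dot = unit @ unit.T

# Cosine similarity computed directly from the unnormalized rows.
row_norms = np.linalg.norm(raw, axis=1)
cosine = (raw @ raw.T) / (row_norms[:, None] * row_norms[None, :])

print(np.allclose(dot, cosine))
```

In practice this means a plain matrix product over stored embeddings is enough to rank by cosine similarity.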
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("frankwong2001/4_modernbert-embed-base")
+ # Run inference
+ sentences = [
+     'The Assistant Pastry Cook/Assistant Baker/Kitchen Assistant is responsible for the production of pastry and baked goods. He/She prepares the baking equipment and ingredients, and applies finishing touches in post-production of pastries and baked goods. He follows hygiene, safety and other standards, and carries out food and beverage operational tasks. He may suggest areas for continuous improvement within his own workstation. Attentive and meticulous, he possesses good time management skills and is able to multi-task, while performing physical tasks in a high-volume production environment. He is able to work under high temperatures, and in a flexible schedule, including weekends, evenings, and public holidays.',
+     'The Assistant Pastry Cook/Assistant Baker/Kitchen Assistant is tasked with producing a variety of pastries and baked goods. He/She prepares the necessary baking equipment and ingredients, and adds finishing touches during the post-production phase of the items. He adheres to hygiene, safety, and other relevant standards, while executing food and beverage operational duties. He may also identify opportunities for continuous improvement in his workstation. Detail-oriented and diligent, he demonstrates strong time management skills and the ability to multi-task in a fast-paced production setting. He is capable of working in high temperatures and maintains a flexible schedule, which may include weekends, evenings, and public holidays.',
+     'The Assistant Chef/Line Cook/Kitchen Staff is responsible for the preparation of savory dishes and entrees. He/She sets up the cooking appliances and ingredients, and applies garnishes during the serving of dishes. He follows cleanliness, safety, and other protocols, and performs kitchen and dining operational duties. He may recommend changes for better efficiency within his own cooking area. Attentive and thorough, he exhibits strong organizational skills and is adept at prioritizing tasks, while executing routine duties in a low-volume service environment. He is able to work in cold conditions, and adheres to a rigid schedule, which excludes weekends, evenings, and public holidays.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities)
+ # tensor([[1.0000, 0.9323, 0.2374],
+ #         [0.9323, 1.0000, 0.2841],
+ #         [0.2374, 0.2841, 1.0000]])
+ ```
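The matrix printed above can be read row by row: entry (i, j) is the cosine similarity between sentence i and sentence j, and the diagonal is each sentence compared with itself. As a sketch, assuming NumPy and reusing the scores from the comment above (the helper name `best_match` is hypothetical), the following picks the closest other sentence for each row:

```python
import numpy as np

# Similarity scores copied from the example output in the README.
scores = np.array([
    [1.0000, 0.9323, 0.2374],
    [0.9323, 1.0000, 0.2841],
    [0.2374, 0.2841, 1.0000],
])


def best_match(similarities: np.ndarray) -> np.ndarray:
    """Return, for each row, the index of the most similar other row."""
    masked = similarities.copy()
    # Exclude self-matches, which always score 1.0 on the diagonal.
    np.fill_diagonal(masked, -np.inf)
    return masked.argmax(axis=1)


matches = best_match(scores)
print(matches.tolist())
```

Here the first two sentences (the pastry-cook description and its paraphrase) select each other, while the third sentence has only weak matches to either.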
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### ssf-train-valid-combi-v1v2v3
+
+ * Dataset: [ssf-train-valid-combi-v1v2v3](https://huggingface.co/datasets/frankwong2001/ssf-train-valid-combi-v1v2v3) at [cf63f9b](https://huggingface.co/datasets/frankwong2001/ssf-train-valid-combi-v1v2v3/tree/cf63f9b57bcbd4071c5ea589a19d78efa3548516)
+ * Size: 10,556 training samples
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | anchor | positive | negative |
+ |:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
+ | type | string | string | string |
+ | details | <ul><li>min: 63 tokens</li><li>mean: 168.29 tokens</li><li>max: 355 tokens</li></ul> | <ul><li>min: 66 tokens</li><li>mean: 161.05 tokens</li><li>max: 317 tokens</li></ul> | <ul><li>min: 35 tokens</li><li>mean: 175.65 tokens</li><li>max: 1455 tokens</li></ul> |
+ * Samples:
+ | anchor | positive | negative |
+ |:-------|:---------|:---------|
+ | <code>Lighting Designers are responsible for crafting lighting designs in line with the creative vision of the production. They are responsible for designing all aspects of lighting; from creating plans and specifying equipment placement to cue development, including brightness, colour and transitions. They are also responsible for all paperwork related to lighting, such as hook-ups, schedules and colour lists. They create lighting plots specifying the placement and configuration of all instruments and oversee lighting during production runs. Lighting Designers lead the lighting team and coordinate the development, installation and operation of the lighting design and any other special electrical effects. They decide on the lighting equipment to use from existing inventory or on the lighting rental package for venues with cold rig. They work in tandem with the creative leadership, other production designers and the lighting team to ensure the lighting complements all creative elements of the...</code> | <code>The Lighting Designer is tasked with developing innovative lighting concepts that align with the artistic direction of the performance. This role involves designing every detail of the lighting setup, from drafting plans and determining equipment locations to developing cues that encompass brightness, color, and transitions. Additionally, the Lighting Designer manages all documentation related to lighting, including connection details, schedules, and color charts. They create comprehensive lighting plots that outline the arrangement and configuration of all lighting fixtures and supervise the lighting during live performances. Leading the lighting team, the Lighting Designer coordinates the design, setup, and operation of the lighting scheme, as well as any unique electrical effects. They make informed decisions regarding the lighting gear to be utilized from the available inventory or select rental packages for venues with limited equipment. Collaborating closely with the creative tea...</code> | <code>The Lighting Technician is responsible for maintaining the sound systems in line with the overall production requirements. They handle all aspects of audio setup, from developing sound plans and specifying equipment placement to cue development, including volume, effects, and transitions. They also manage all paperwork related to sound, such as connection diagrams, schedules, and effect lists. The Lighting Technician creates sound plots specifying the placement and configuration of all audio instruments and oversees sound during rehearsal runs. They lead the sound team and coordinate the development, installation, and operation of the audio design and any other special sound effects. They decide on the audio equipment to use from existing inventory or on the sound rental package for venues with limited resources. They work closely with the technical team to ensure the sound complements all technical elements of the production.</code> |
364
+ | <code>The Associate Director (Facilities Management) is responsible for driving strategies to improve facility operations. He/She builds strategic relationships with stakeholders and drives service excellence. He formulates organisational Workplace Safety and Health (WSH) practices as well as the green building strategies to fulfil environmental sustainability regulations. He is in charge of approving tender specifications, awarding works to selected bidders and endorsing contracts. He oversees the teams' development and recruitment and is responsible for the departments' financial planning and risk management. He is a subject matter expert and possesses excellent negotiation and people management skills. He is able to influence and communicate effectively with internal and external stakeholders.</code> | <code>The Associate Director (Facilities Management) plays a pivotal role in enhancing the efficiency of facility operations. This position involves cultivating strong partnerships with key stakeholders and championing exceptional service delivery. The individual is tasked with developing organizational practices for Workplace Safety and Health (WSH) alongside implementing sustainable green building initiatives to meet environmental regulations. Responsibilities also include approving tender proposals, selecting contractors for projects, and endorsing contractual agreements. Additionally, the Associate Director oversees team growth and recruitment efforts, manages the department's financial strategies, and addresses risk management. A recognized expert in the field, the individual demonstrates outstanding negotiation and interpersonal skills, effectively influencing and engaging with both internal and external parties.</code> | <code>The Associate Director (Event Management) coordinates various events and activities, focusing on logistics and attendee engagement. <br><br>The Associate Director (Facilities Management) is responsible for overseeing entry-level staff and managing basic office supplies without strategic oversight.<br><br>The Associate Director (Quality Assurance) ensures compliance with industry standards in a manufacturing environment, focusing on product testing and regulatory reporting.<br><br>The Associate Director (Facilities Management) manages operations in a multicultural context, aligning with international regulations and cross-border practices.<br><br>The Associate Director (Project Management) combines project planning and client relations, tasked with both strategic oversight and detailed operational execution across multiple projects.</code> |
365
+ | <code>The Derivatives Trading Manager/Senior Derivatives Trader assumes responsibilities of quantifiable derivative trading portfolios and their supporting activities. He/She plans derivative trading activities that support his portfolio objectives and take leads in structured products' deal making. He is expected to develop derivative portfolio strategies to guide positions during various market trends or economic conditions. Armed with strong numerical and business acumen, he possesses a good understanding of market conditions as well.</code> | <code>The Derivatives Trading Manager is responsible for managing quantifiable derivative trading portfolios and their associated activities. He/She plans trading initiatives that align with portfolio objectives and takes the lead in structuring product deals. He is expected to formulate derivative portfolio strategies to navigate positions during diverse market trends or economic circumstances. Equipped with strong numerical skills and business insight, he has a solid grasp of market dynamics as well.</code> | <code>The Derivatives Trading Assistant manages non-quantifiable trading activities and their unrelated tasks. He/She organizes trading schedules that do not align with portfolio goals and takes a supportive role in unstructured product negotiations. He is expected to assist in developing non-derivative strategies to navigate decisions during unrelated market fluctuations or economic situations. Lacking strong numerical skills, he has minimal understanding of market dynamics as well.</code> |
366
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
367
+ ```json
368
+ {
369
+ "scale": 20.0,
370
+ "similarity_fct": "cos_sim",
371
+ "gather_across_devices": false
372
+ }
373
+ ```
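To make these loss parameters concrete, here is a minimal pure-Python sketch of an in-batch ranking loss over (anchor, positive, negative) triplets. It is an illustration, not the Sentence Transformers implementation: the helper names `cos_sim` and `mnr_loss` and the toy 2-D vectors are made up for this example. It assumes each anchor is scored against every positive and every negative in the batch, that scores are cosine similarities multiplied by `scale` (20.0 above), and that the loss is cross-entropy with the anchor's own positive as the target.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i must rank positive i first.

    Candidates for every anchor are all positives followed by all
    negatives in the batch, so other rows act as extra negatives.
    """
    candidates = positives + negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum_exp - logits[i]  # -log softmax at the target index
    return total / len(anchors)

# Toy batch (hypothetical 2-D "embeddings").
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[-1.0, 0.2], [0.3, -1.0]]

aligned = mnr_loss(anchors, positives, negatives)
# Swapping the positives pairs each anchor with the wrong text.
misaligned = mnr_loss(anchors, list(reversed(positives)), negatives)
print(aligned, misaligned)
```

With `scale` at 20.0, small differences in cosine similarity become large differences in logits, which is why the aligned batch yields a loss near zero while the misaligned one is heavily penalised.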
374
+
375
+ ### Evaluation Dataset
376
+
377
+ #### ssf-train-valid-combi-v1v2v3
378
+
379
+ * Dataset: [ssf-train-valid-combi-v1v2v3](https://huggingface.co/datasets/frankwong2001/ssf-train-valid-combi-v1v2v3) at [cf63f9b](https://huggingface.co/datasets/frankwong2001/ssf-train-valid-combi-v1v2v3/tree/cf63f9b57bcbd4071c5ea589a19d78efa3548516)
380
+ * Size: 2,639 evaluation samples
381
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
382
+ * Approximate statistics based on the first 1000 samples:
383
+ | | anchor | positive | negative |
384
+ |:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
385
+ | type | string | string | string |
386
+ | details | <ul><li>min: 57 tokens</li><li>mean: 170.84 tokens</li><li>max: 352 tokens</li></ul> | <ul><li>min: 55 tokens</li><li>mean: 163.57 tokens</li><li>max: 319 tokens</li></ul> | <ul><li>min: 20 tokens</li><li>mean: 189.74 tokens</li><li>max: 1282 tokens</li></ul> |
387
+ * Samples:
388
+ | anchor | positive | negative |
389
+ |:-------|:---------|:---------|
390
+ | <code>The General Manager/Managing Director/Vice President (Aircraft Maintenance) is responsible for defining the long-term strategic direction to grow the business in line with the organisations overall vision, mission and values. He/She promotes strategic aircraft maintenance programmes for business competitiveness and sets direction for leading aerospace maintenance practices in the organisation. He represents the organisation with customers, investors, and business partners, and holds responsibility for promoting organisational compliance with airworthiness and legislative requirements, fostering a culture of workplace safety and health, and championing leading practices and quality and risk management. He inspires the organisation towards achieving business goals by striving for continuous improvement, driving digital innovation and evaluating the organisation's approach towards a lean and sustainable enterprise. He demonstrates excellent leadership capabilities and builds strategic par...</code> | <code>The General Manager/Managing Director/Vice President (Aircraft Maintenance) is tasked with shaping the long-term strategic vision to enhance business growth in alignment with the organization's core values and mission. He/She advocates for strategic aircraft maintenance initiatives to ensure business competitiveness and provides direction for leading aerospace maintenance practices within the organization. He represents the organization to customers, investors, and business partners, while also ensuring compliance with airworthiness and legislative standards, promoting a culture of workplace safety and health, and championing best practices in quality and risk management. He motivates the organization to reach its business objectives through continuous improvement, driving digital innovation, and assessing the organization's commitment to a lean and sustainable enterprise. He showcases strong leadership skills and cultivates strategic partnerships with stakeholders to advance business ...</code> | <code>The Software Engineer develops innovative applications and ensures software quality through rigorous testing and debugging processes. He/She collaborates with design teams to create user-friendly interfaces and integrates feedback to enhance functionality and performance. He is responsible for maintaining code repositories and documenting software processes while staying updated with the latest technological advancements. He also participates in team meetings to discuss project progress and brainstorm new ideas for product improvement.</code> |
391
+ | <code>The Associate User Interface Designer performs requirements analysis for the design of user interfaces (UIs) and drafts technical specifications for the design of UIs. He/She assists in the development and programming of intuitive and responsive UIs for each screen or page with which a user interacts. He assists in developing prototypes for UIs, conducts usability testing for validation, and supports the evaluation of the effectiveness of the UI. He prepares reports on UI design performance indicators, proposes, modifications in the design of user interface based on user feedback, as well as solutions to address design issues. He works in a team and is familiar with programming languages used by the organisation to design and develop UIs. He is familiar with graphic designing tools, and is also knowledgeable of Universal Principles of Design as well as commonly used design methods. The Associate UI Designer adopts a broad perspective to user interface design concepts, and is open to ex...</code> | <code>The Associate User Interface Designer is responsible for conducting requirements analysis to create user interfaces (UIs) and drafting technical specifications for UI design. He/She contributes to the development and programming of intuitive and responsive UIs for each user interaction screen or page. He assists in creating prototypes for UIs, performs usability testing for validation, and supports the assessment of UI effectiveness. He prepares reports on UI design performance metrics, suggests modifications based on user feedback, and offers solutions to design challenges. He collaborates within a team and is proficient in the programming languages utilized by the organization for UI design and development. He is knowledgeable in graphic design tools and well-versed in Universal Principles of Design and common design methodologies. The Associate UI Designer takes a comprehensive view of user interface design concepts and is eager to explore innovative options in the development of so...</code> | <code>The Associate Software Engineer conducts requirements analysis for the development of software applications and drafts technical specifications for application functionality. He/She assists in the coding and debugging of efficient and scalable applications for various platforms. He aids in creating application prototypes, performs functionality testing for validation, and supports the analysis of application performance metrics. He prepares reports on application development indicators, suggests enhancements based on user input, and provides solutions to functionality issues. He works within a team and is familiar with programming languages employed by the organization for software development. He is experienced with database management tools and knowledgeable of Software Development Life Cycle methodologies as well as commonly used programming practices. The Associate Software Engineer adopts a narrow focus on application development concepts and is hesitant to explore new approaches ...</code> |
392
+ | <code>The Senior Project Engineer is responsible for executing project management plans from start to finish, to ensure project completions on time, and within budget. He/She typically comes from an engineering background with work experience in production and/or design, and is able to develop project schedules, budgets and manage project staff and subcontractors. He has good communication and negotiation skills for engaging internal and external parties to secure specialised resources and contributions for projects, and managing ongoing relationships with sub-contractors. He oversees sub-contractors schedules, performance, and payments, and has the responsibility to reschedule and coordinate work to ensure compliance with applicable project schedules.</code> | <code>The Senior Project Engineer plays a pivotal role in implementing comprehensive project management strategies from inception to completion, ensuring that projects are delivered on time and within financial constraints. This position typically requires a background in engineering, along with relevant experience in production and design. The Senior Project Engineer is adept at creating project timelines and budgets while effectively managing project teams and subcontractors. Strong communication and negotiation skills are essential for collaborating with both internal and external stakeholders to secure specialized resources and maintain productive relationships with subcontractors. Additionally, this role involves monitoring subcontractor schedules, performance metrics, and payment processes, along with the responsibility to adjust and coordinate work to align with project timelines.</code> | <code>The Senior Project Engineer is tasked with overseeing the financial audits of various departments, ensuring compliance with internal policies and external regulations. <br><br>The Senior Project Engineer is responsible for mentoring junior engineers while managing large-scale engineering projects, requiring at least ten years of experience in a senior leadership role.<br><br>The Senior Project Engineer focuses on compliance checks within the healthcare sector, utilizing analytical skills to assess regulatory frameworks and implement necessary changes.<br><br>The Senior Project Engineer is involved in developing marketing strategies for consumer products, requiring expertise in market analysis and brand management across international markets.<br><br>The Senior Project Engineer combines responsibilities of a project manager and a quality assurance officer, overseeing project execution while also conducting product testing and compliance evaluations.</code> |
393
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
394
+ ```json
395
+ {
396
+ "scale": 20.0,
397
+ "similarity_fct": "cos_sim",
398
+ "gather_across_devices": false
399
+ }
400
+ ```
401
+
402
+ ### Training Hyperparameters
403
+ #### Non-Default Hyperparameters
404
+
405
+ - `eval_strategy`: epoch
406
+ - `per_device_train_batch_size`: 16
407
+ - `gradient_accumulation_steps`: 8
408
+ - `learning_rate`: 2e-05
409
+ - `num_train_epochs`: 5
410
+ - `lr_scheduler_type`: cosine
411
+ - `warmup_ratio`: 0.1
412
+ - `bf16`: True
413
+ - `tf32`: False
414
+ - `load_best_model_at_end`: True
415
+ - `batch_sampler`: no_duplicates
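The non-default hyperparameters above imply a step count that can be checked by hand. The sketch below works through that arithmetic using the `dataset_size:10556` tag from the model card metadata. It assumes a single training device (the card does not state the device count), that partial batches and partial accumulation groups are rounded up, and that warmup steps are the ceiling of `warmup_ratio` times the total steps. The `lr_at` helper is hypothetical and only approximates a linear-warmup, cosine-decay schedule.

```python
import math

dataset_size = 10556                 # from the "dataset_size:10556" tag
per_device_train_batch_size = 16
gradient_accumulation_steps = 8
num_train_epochs = 5
warmup_ratio = 0.1
learning_rate = 2e-05
num_devices = 1                      # assumption: not stated in the card

# Samples consumed per optimizer update.
effective_batch = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Dataloader batches per epoch, then optimizer updates per epoch.
batches_per_epoch = math.ceil(
    dataset_size / (per_device_train_batch_size * num_devices)
)
steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

def lr_at(step):
    """Approximate learning rate at an optimizer step (hypothetical helper)."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps  # linear warmup
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions the per-epoch and total step counts agree with the Step column of the Training Logs table in this card (83 steps at epoch 1.0 and 415 at epoch 5.0), which suggests the single-device reading is plausible.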
416
+
417
+ #### All Hyperparameters
418
+ <details><summary>Click to expand</summary>
419
+
420
+ - `overwrite_output_dir`: False
421
+ - `do_predict`: False
422
+ - `eval_strategy`: epoch
423
+ - `prediction_loss_only`: True
424
+ - `per_device_train_batch_size`: 16
425
+ - `per_device_eval_batch_size`: 8
426
+ - `per_gpu_train_batch_size`: None
427
+ - `per_gpu_eval_batch_size`: None
428
+ - `gradient_accumulation_steps`: 8
429
+ - `eval_accumulation_steps`: None
430
+ - `torch_empty_cache_steps`: None
431
+ - `learning_rate`: 2e-05
432
+ - `weight_decay`: 0.0
433
+ - `adam_beta1`: 0.9
434
+ - `adam_beta2`: 0.999
435
+ - `adam_epsilon`: 1e-08
436
+ - `max_grad_norm`: 1.0
437
+ - `num_train_epochs`: 5
438
+ - `max_steps`: -1
439
+ - `lr_scheduler_type`: cosine
440
+ - `lr_scheduler_kwargs`: {}
441
+ - `warmup_ratio`: 0.1
442
+ - `warmup_steps`: 0
443
+ - `log_level`: passive
444
+ - `log_level_replica`: warning
445
+ - `log_on_each_node`: True
446
+ - `logging_nan_inf_filter`: True
447
+ - `save_safetensors`: True
448
+ - `save_on_each_node`: False
449
+ - `save_only_model`: False
450
+ - `restore_callback_states_from_checkpoint`: False
451
+ - `no_cuda`: False
452
+ - `use_cpu`: False
453
+ - `use_mps_device`: False
454
+ - `seed`: 42
455
+ - `data_seed`: None
456
+ - `jit_mode_eval`: False
457
+ - `use_ipex`: False
458
+ - `bf16`: True
459
+ - `fp16`: False
460
+ - `fp16_opt_level`: O1
461
+ - `half_precision_backend`: auto
462
+ - `bf16_full_eval`: False
463
+ - `fp16_full_eval`: False
464
+ - `tf32`: False
465
+ - `local_rank`: 0
466
+ - `ddp_backend`: None
467
+ - `tpu_num_cores`: None
468
+ - `tpu_metrics_debug`: False
469
+ - `debug`: []
470
+ - `dataloader_drop_last`: False
471
+ - `dataloader_num_workers`: 0
472
+ - `dataloader_prefetch_factor`: None
473
+ - `past_index`: -1
474
+ - `disable_tqdm`: False
475
+ - `remove_unused_columns`: True
476
+ - `label_names`: None
477
+ - `load_best_model_at_end`: True
478
+ - `ignore_data_skip`: False
479
+ - `fsdp`: []
480
+ - `fsdp_min_num_params`: 0
481
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
482
+ - `fsdp_transformer_layer_cls_to_wrap`: None
483
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
484
+ - `deepspeed`: None
485
+ - `label_smoothing_factor`: 0.0
486
+ - `optim`: adamw_torch_fused
487
+ - `optim_args`: None
488
+ - `adafactor`: False
489
+ - `group_by_length`: False
490
+ - `length_column_name`: length
491
+ - `ddp_find_unused_parameters`: None
492
+ - `ddp_bucket_cap_mb`: None
493
+ - `ddp_broadcast_buffers`: False
494
+ - `dataloader_pin_memory`: True
495
+ - `dataloader_persistent_workers`: False
496
+ - `skip_memory_metrics`: True
497
+ - `use_legacy_prediction_loop`: False
498
+ - `push_to_hub`: False
499
+ - `resume_from_checkpoint`: None
500
+ - `hub_model_id`: None
501
+ - `hub_strategy`: every_save
502
+ - `hub_private_repo`: None
503
+ - `hub_always_push`: False
504
+ - `hub_revision`: None
505
+ - `gradient_checkpointing`: False
506
+ - `gradient_checkpointing_kwargs`: None
507
+ - `include_inputs_for_metrics`: False
508
+ - `include_for_metrics`: []
509
+ - `eval_do_concat_batches`: True
510
+ - `fp16_backend`: auto
511
+ - `push_to_hub_model_id`: None
512
+ - `push_to_hub_organization`: None
513
+ - `mp_parameters`:
514
+ - `auto_find_batch_size`: False
515
+ - `full_determinism`: False
516
+ - `torchdynamo`: None
517
+ - `ray_scope`: last
518
+ - `ddp_timeout`: 1800
519
+ - `torch_compile`: False
520
+ - `torch_compile_backend`: None
521
+ - `torch_compile_mode`: None
522
+ - `include_tokens_per_second`: False
523
+ - `include_num_input_tokens_seen`: False
524
+ - `neftune_noise_alpha`: None
525
+ - `optim_target_modules`: None
526
+ - `batch_eval_metrics`: False
527
+ - `eval_on_start`: False
528
+ - `use_liger_kernel`: False
529
+ - `liger_kernel_config`: None
530
+ - `eval_use_gather_object`: False
531
+ - `average_tokens_across_devices`: False
532
+ - `prompts`: None
533
+ - `batch_sampler`: no_duplicates
534
+ - `multi_dataset_batch_sampler`: proportional
535
+ - `router_mapping`: {}
536
+ - `learning_rate_mapping`: {}
537
+
538
+ </details>
539
+
540
+ ### Training Logs
541
+ | Epoch | Step | Training Loss | Validation Loss |
542
+ |:-------:|:-------:|:-------------:|:---------------:|
543
+ | 1.0 | 83 | 0.0317 | 0.0009 |
544
+ | 2.0 | 166 | 0.0009 | 0.0006 |
545
+ | 3.0 | 249 | 0.0006 | 0.0006 |
546
+ | 4.0 | 332 | 0.0006 | 0.0005 |
547
+ | **5.0** | **415** | **0.0005** | **0.0005** |
548
+
549
+ * The bold row denotes the saved checkpoint.
550
+
551
+ ### Framework Versions
552
+ - Python: 3.12.11
553
+ - Sentence Transformers: 5.1.0
554
+ - Transformers: 4.55.0
555
+ - PyTorch: 2.8.0+cu128
556
+ - Accelerate: 1.10.0
557
+ - Datasets: 4.0.0
558
+ - Tokenizers: 0.21.4
559
+
560
+ ## Citation
561
+
562
+ ### BibTeX
563
+
564
+ #### Sentence Transformers
565
+ ```bibtex
566
+ @inproceedings{reimers-2019-sentence-bert,
567
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
568
+ author = "Reimers, Nils and Gurevych, Iryna",
569
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
570
+ month = "11",
571
+ year = "2019",
572
+ publisher = "Association for Computational Linguistics",
573
+ url = "https://arxiv.org/abs/1908.10084",
574
+ }
575
+ ```
576
+
577
+ #### MultipleNegativesRankingLoss
578
+ ```bibtex
579
+ @misc{henderson2017efficient,
580
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
581
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
582
+ year={2017},
583
+ eprint={1705.00652},
584
+ archivePrefix={arXiv},
585
+ primaryClass={cs.CL}
586
+ }
587
+ ```
588
+
589
+ <!--
590
+ ## Glossary
591
+
592
+ *Clearly define terms in order to be accessible across audiences.*
593
+ -->
594
+
595
+ <!--
596
+ ## Model Card Authors
597
+
598
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
599
+ -->
600
+
601
+ <!--
602
+ ## Model Card Contact
603
+
604
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
605
+ -->
config.json ADDED
@@ -0,0 +1,45 @@
1
+ {
2
+ "architectures": [
3
+ "ModernBertModel"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "bos_token_id": 50281,
8
+ "classifier_activation": "gelu",
9
+ "classifier_bias": false,
10
+ "classifier_dropout": 0.0,
11
+ "classifier_pooling": "mean",
12
+ "cls_token_id": 50281,
13
+ "decoder_bias": true,
14
+ "deterministic_flash_attn": false,
15
+ "embedding_dropout": 0.0,
16
+ "eos_token_id": 50282,
17
+ "global_attn_every_n_layers": 3,
18
+ "global_rope_theta": 160000.0,
19
+ "gradient_checkpointing": false,
20
+ "hidden_activation": "gelu",
21
+ "hidden_size": 768,
22
+ "initializer_cutoff_factor": 2.0,
23
+ "initializer_range": 0.02,
24
+ "intermediate_size": 1152,
25
+ "layer_norm_eps": 1e-05,
26
+ "local_attention": 128,
27
+ "local_rope_theta": 10000.0,
28
+ "max_position_embeddings": 8192,
29
+ "mlp_bias": false,
30
+ "mlp_dropout": 0.0,
31
+ "model_type": "modernbert",
32
+ "norm_bias": false,
33
+ "norm_eps": 1e-05,
34
+ "num_attention_heads": 12,
35
+ "num_hidden_layers": 22,
36
+ "pad_token_id": 50283,
37
+ "position_embedding_type": "absolute",
38
+ "repad_logits_with_grad": false,
39
+ "sep_token_id": 50282,
40
+ "sparse_pred_ignore_index": -100,
41
+ "sparse_prediction": false,
42
+ "torch_dtype": "float32",
43
+ "transformers_version": "4.55.0",
44
+ "vocab_size": 50368
45
+ }
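A few derived quantities follow from the `config.json` values above. The sketch below computes the per-head dimension and a plausible split between global and local attention layers. The head dimension is plain arithmetic. The layer split is an assumption about how `global_attn_every_n_layers` is interpreted (layer `i` uses global attention when `i` is a multiple of that value, and the local window is split evenly on each side); check the ModernBERT modeling code before relying on it.

```python
# Values copied from config.json in this commit.
hidden_size = 768
num_attention_heads = 12
num_hidden_layers = 22
global_attn_every_n_layers = 3
local_attention = 128

# Each attention head operates on an equal slice of the hidden state.
head_dim = hidden_size // num_attention_heads

# Assumption: layer i is global when i % global_attn_every_n_layers == 0;
# all other layers use sliding-window (local) attention.
global_layers = [
    i for i in range(num_hidden_layers) if i % global_attn_every_n_layers == 0
]
local_layers = [i for i in range(num_hidden_layers) if i not in global_layers]

# Assumption: the local window is split evenly to the left and right.
half_window = local_attention // 2

print(head_dim, global_layers, len(local_layers), half_window)
```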
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "5.1.0",
4
+ "transformers": "4.55.0",
5
+ "pytorch": "2.8.0+cu128"
6
+ },
7
+ "prompts": {
8
+ "query": "",
9
+ "document": ""
10
+ },
11
+ "default_prompt_name": null,
12
+ "similarity_fn_name": "cosine",
13
+ "model_type": "SentenceTransformer"
14
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:66d186703673dbc144fffc368e5c5da662280b1d0ae23ed5a386c42c3b49f263
3
+ size 596070136
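The Git LFS pointer above records only the file size, but combined with `"torch_dtype": "float32"` from `config.json` it gives a rough parameter count. The sketch below is a back-of-the-envelope estimate: it assumes every stored tensor is float32 (4 bytes per value) and ignores the small JSON header that a safetensors file carries, so the true count is slightly lower.

```python
size_bytes = 596070136      # "size" field of the Git LFS pointer
bytes_per_param = 4         # float32, per "torch_dtype" in config.json

# Upper bound on the parameter count (the file header is not subtracted).
approx_params = size_bytes // bytes_per_param
approx_millions = approx_params / 1e6

print(f"about {approx_millions:.0f}M parameters")
```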
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
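The three modules listed in `modules.json` run in order: the Transformer produces one vector per token, Pooling collapses them into a single vector, and Normalize rescales it to unit length. The sketch below illustrates the last two stages in pure Python, assuming mean pooling over non-padding tokens (as set by `pooling_mode_mean_tokens: true` in `1_Pooling/config.json` from this commit) followed by L2 normalisation. The helper names and toy 2-D token vectors are hypothetical, and this is not the library's implementation.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Toy token embeddings for two "sentences"; the last token of A is padding.
tokens_a = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask_a = [1, 1, 0]
tokens_b = [[2.0, 1.0], [4.0, 3.0]]
mask_b = [1, 1]

pooled_a = mean_pool(tokens_a, mask_a)
emb_a = l2_normalize(pooled_a)
emb_b = l2_normalize(mean_pool(tokens_b, mask_b))

# For unit-length vectors the dot product equals the cosine similarity.
similarity = dot(emb_a, emb_b)
print(pooled_a, similarity)
```

Because the final module normalises every embedding, a dot product between two outputs gives the same value as the cosine similarity named in `config_sentence_transformers.json`.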
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 8192,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": true,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "|||IP_ADDRESS|||",
5
+ "lstrip": false,
6
+ "normalized": true,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "1": {
12
+ "content": "<|padding|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "50254": {
20
+ "content": " ",
21
+ "lstrip": false,
22
+ "normalized": true,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": false
26
+ },
27
+ "50255": {
28
+ "content": " ",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": false
34
+ },
35
+ "50256": {
36
+ "content": " ",
37
+ "lstrip": false,
38
+ "normalized": true,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "50257": {
44
+ "content": " ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "50258": {
52
+ "content": " ",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "50259": {
60
+ "content": " ",
61
+ "lstrip": false,
62
+ "normalized": true,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ },
67
+ "50260": {
68
+ "content": " ",
69
+ "lstrip": false,
70
+ "normalized": true,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": false
74
+ },
75
+ "50261": {
76
+ "content": " ",
77
+ "lstrip": false,
78
+ "normalized": true,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": false
82
+ },
83
+ "50262": {
84
+ "content": " ",
85
+ "lstrip": false,
86
+ "normalized": true,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": false
90
+ },
91
+ "50263": {
92
+ "content": " ",
93
+ "lstrip": false,
94
+ "normalized": true,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": false
98
+ },
99
+ "50264": {
100
+ "content": " ",
101
+ "lstrip": false,
102
+ "normalized": true,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": false
106
+ },
107
+ "50265": {
108
+ "content": " ",
109
+ "lstrip": false,
110
+ "normalized": true,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": false
114
+ },
115
+ "50266": {
116
+ "content": " ",
117
+ "lstrip": false,
118
+ "normalized": true,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": false
122
+ },
123
+ "50267": {
124
+ "content": " ",
125
+ "lstrip": false,
126
+ "normalized": true,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": false
130
+ },
131
+ "50268": {
132
+ "content": " ",
133
+ "lstrip": false,
134
+ "normalized": true,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": false
138
+ },
139
+ "50269": {
140
+ "content": " ",
141
+ "lstrip": false,
142
+ "normalized": true,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": false
146
+ },
147
+ "50270": {
148
+ "content": " ",
149
+ "lstrip": false,
150
+ "normalized": true,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": false
154
+ },
155
+ "50271": {
156
+ "content": " ",
157
+ "lstrip": false,
158
+ "normalized": true,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": false
162
+ },
163
+ "50272": {
164
+ "content": " ",
165
+ "lstrip": false,
166
+ "normalized": true,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": false
170
+ },
171
+ "50273": {
172
+ "content": " ",
173
+ "lstrip": false,
174
+ "normalized": true,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "50274": {
180
+ "content": " ",
181
+ "lstrip": false,
182
+ "normalized": true,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "50275": {
188
+ "content": " ",
189
+ "lstrip": false,
190
+ "normalized": true,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "50276": {
196
+ "content": " ",
197
+ "lstrip": false,
198
+ "normalized": true,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
+     "50277": {
+       "content": "|||EMAIL_ADDRESS|||",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50278": {
+       "content": "|||PHONE_NUMBER|||",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50279": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50280": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50281": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50282": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50283": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50284": {
+       "content": "[MASK]",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50285": {
+       "content": "[unused0]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50286": {
+       "content": "[unused1]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50287": {
+       "content": "[unused2]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50288": {
+       "content": "[unused3]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50289": {
+       "content": "[unused4]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50290": {
+       "content": "[unused5]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50291": {
+       "content": "[unused6]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50292": {
+       "content": "[unused7]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50293": {
+       "content": "[unused8]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50294": {
+       "content": "[unused9]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50295": {
+       "content": "[unused10]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50296": {
+       "content": "[unused11]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50297": {
+       "content": "[unused12]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50298": {
+       "content": "[unused13]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50299": {
+       "content": "[unused14]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50300": {
+       "content": "[unused15]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50301": {
+       "content": "[unused16]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50302": {
+       "content": "[unused17]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50303": {
+       "content": "[unused18]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50304": {
+       "content": "[unused19]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50305": {
+       "content": "[unused20]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50306": {
+       "content": "[unused21]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50307": {
+       "content": "[unused22]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50308": {
+       "content": "[unused23]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50309": {
+       "content": "[unused24]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50310": {
+       "content": "[unused25]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50311": {
+       "content": "[unused26]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50312": {
+       "content": "[unused27]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50313": {
+       "content": "[unused28]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50314": {
+       "content": "[unused29]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
507
+ "50315": {
508
+ "content": "[unused30]",
509
+ "lstrip": false,
510
+ "normalized": true,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": false
514
+ },
515
+ "50316": {
516
+ "content": "[unused31]",
517
+ "lstrip": false,
518
+ "normalized": true,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": false
522
+ },
523
+ "50317": {
524
+ "content": "[unused32]",
525
+ "lstrip": false,
526
+ "normalized": true,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": false
530
+ },
531
+ "50318": {
532
+ "content": "[unused33]",
533
+ "lstrip": false,
534
+ "normalized": true,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": false
538
+ },
539
+ "50319": {
540
+ "content": "[unused34]",
541
+ "lstrip": false,
542
+ "normalized": true,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": false
546
+ },
547
+ "50320": {
548
+ "content": "[unused35]",
549
+ "lstrip": false,
550
+ "normalized": true,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": false
554
+ },
555
+ "50321": {
556
+ "content": "[unused36]",
557
+ "lstrip": false,
558
+ "normalized": true,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": false
562
+ },
563
+ "50322": {
564
+ "content": "[unused37]",
565
+ "lstrip": false,
566
+ "normalized": true,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": false
570
+ },
571
+ "50323": {
572
+ "content": "[unused38]",
573
+ "lstrip": false,
574
+ "normalized": true,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": false
578
+ },
579
+ "50324": {
580
+ "content": "[unused39]",
581
+ "lstrip": false,
582
+ "normalized": true,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": false
586
+ },
587
+ "50325": {
588
+ "content": "[unused40]",
589
+ "lstrip": false,
590
+ "normalized": true,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": false
594
+ },
595
+ "50326": {
596
+ "content": "[unused41]",
597
+ "lstrip": false,
598
+ "normalized": true,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": false
602
+ },
603
+ "50327": {
604
+ "content": "[unused42]",
605
+ "lstrip": false,
606
+ "normalized": true,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": false
610
+ },
611
+ "50328": {
612
+ "content": "[unused43]",
613
+ "lstrip": false,
614
+ "normalized": true,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": false
618
+ },
619
+ "50329": {
620
+ "content": "[unused44]",
621
+ "lstrip": false,
622
+ "normalized": true,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": false
626
+ },
627
+ "50330": {
628
+ "content": "[unused45]",
629
+ "lstrip": false,
630
+ "normalized": true,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": false
634
+ },
635
+ "50331": {
636
+ "content": "[unused46]",
637
+ "lstrip": false,
638
+ "normalized": true,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": false
642
+ },
643
+ "50332": {
644
+ "content": "[unused47]",
645
+ "lstrip": false,
646
+ "normalized": true,
647
+ "rstrip": false,
648
+ "single_word": false,
649
+ "special": false
650
+ },
651
+ "50333": {
652
+ "content": "[unused48]",
653
+ "lstrip": false,
654
+ "normalized": true,
655
+ "rstrip": false,
656
+ "single_word": false,
657
+ "special": false
658
+ },
659
+ "50334": {
660
+ "content": "[unused49]",
661
+ "lstrip": false,
662
+ "normalized": true,
663
+ "rstrip": false,
664
+ "single_word": false,
665
+ "special": false
666
+ },
667
+ "50335": {
668
+ "content": "[unused50]",
669
+ "lstrip": false,
670
+ "normalized": true,
671
+ "rstrip": false,
672
+ "single_word": false,
673
+ "special": false
674
+ },
675
+ "50336": {
676
+ "content": "[unused51]",
677
+ "lstrip": false,
678
+ "normalized": true,
679
+ "rstrip": false,
680
+ "single_word": false,
681
+ "special": false
682
+ },
683
+ "50337": {
684
+ "content": "[unused52]",
685
+ "lstrip": false,
686
+ "normalized": true,
687
+ "rstrip": false,
688
+ "single_word": false,
689
+ "special": false
690
+ },
691
+ "50338": {
692
+ "content": "[unused53]",
693
+ "lstrip": false,
694
+ "normalized": true,
695
+ "rstrip": false,
696
+ "single_word": false,
697
+ "special": false
698
+ },
699
+ "50339": {
700
+ "content": "[unused54]",
701
+ "lstrip": false,
702
+ "normalized": true,
703
+ "rstrip": false,
704
+ "single_word": false,
705
+ "special": false
706
+ },
707
+ "50340": {
708
+ "content": "[unused55]",
709
+ "lstrip": false,
710
+ "normalized": true,
711
+ "rstrip": false,
712
+ "single_word": false,
713
+ "special": false
714
+ },
715
+ "50341": {
716
+ "content": "[unused56]",
717
+ "lstrip": false,
718
+ "normalized": true,
719
+ "rstrip": false,
720
+ "single_word": false,
721
+ "special": false
722
+ },
723
+ "50342": {
724
+ "content": "[unused57]",
725
+ "lstrip": false,
726
+ "normalized": true,
727
+ "rstrip": false,
728
+ "single_word": false,
729
+ "special": false
730
+ },
731
+ "50343": {
732
+ "content": "[unused58]",
733
+ "lstrip": false,
734
+ "normalized": true,
735
+ "rstrip": false,
736
+ "single_word": false,
737
+ "special": false
738
+ },
739
+ "50344": {
740
+ "content": "[unused59]",
741
+ "lstrip": false,
742
+ "normalized": true,
743
+ "rstrip": false,
744
+ "single_word": false,
745
+ "special": false
746
+ },
+     "50345": {
+       "content": "[unused60]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50346": {
+       "content": "[unused61]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50347": {
+       "content": "[unused62]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50348": {
+       "content": "[unused63]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50349": {
+       "content": "[unused64]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50350": {
+       "content": "[unused65]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50351": {
+       "content": "[unused66]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50352": {
+       "content": "[unused67]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50353": {
+       "content": "[unused68]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50354": {
+       "content": "[unused69]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50355": {
+       "content": "[unused70]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50356": {
+       "content": "[unused71]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50357": {
+       "content": "[unused72]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50358": {
+       "content": "[unused73]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50359": {
+       "content": "[unused74]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50360": {
+       "content": "[unused75]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50361": {
+       "content": "[unused76]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50362": {
+       "content": "[unused77]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50363": {
+       "content": "[unused78]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50364": {
+       "content": "[unused79]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50365": {
+       "content": "[unused80]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50366": {
+       "content": "[unused81]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "50367": {
+       "content": "[unused82]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_input_names": [
+     "input_ids",
+     "attention_mask"
+   ],
+   "model_max_length": 8192,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "tokenizer_class": "PreTrainedTokenizerFast",
+   "unk_token": "[UNK]"
+ }
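
The tokenizer config above links named fields such as `cls_token` and `pad_token` to integer ids through `added_tokens_decoder`. As a rough illustration of that mapping, the sketch below parses a hand-trimmed excerpt of the JSON with the standard library and resolves each named token to its id. The excerpt keeps only a few entries and fields, and the helper names `special_token_ids` and `named_token_ids` are hypothetical, introduced for this example only; nothing here is part of the commit or of any library API.

```python
import json

# Hand-trimmed excerpt of the tokenizer config shown above.
# Only the "content" and "special" fields of a few entries are kept.
CONFIG_EXCERPT = """
{
  "added_tokens_decoder": {
    "1": {"content": "<|padding|>", "special": true},
    "50279": {"content": "<|endoftext|>", "special": true},
    "50280": {"content": "[UNK]", "special": true},
    "50281": {"content": "[CLS]", "special": true},
    "50282": {"content": "[SEP]", "special": true},
    "50283": {"content": "[PAD]", "special": true},
    "50284": {"content": "[MASK]", "special": true},
    "50285": {"content": "[unused0]", "special": false}
  },
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]",
  "model_max_length": 8192
}
"""


def special_token_ids(config: dict) -> dict:
    """Map the content of every token flagged "special" to its integer id."""
    return {
        entry["content"]: int(token_id)
        for token_id, entry in config["added_tokens_decoder"].items()
        if entry["special"]
    }


def named_token_ids(config: dict) -> dict:
    """Resolve the string-valued *_token fields (cls_token, ...) to ids."""
    ids = special_token_ids(config)
    return {
        key: ids[value]
        for key, value in config.items()
        if key.endswith("_token") and isinstance(value, str)
    }


config = json.loads(CONFIG_EXCERPT)
for name, token_id in sorted(named_token_ids(config).items()):
    print(f"{name}: {token_id}")
```

Entries with `"special": false`, such as the `[unusedN]` placeholders, are left out of the lookup, so only the tokens the config marks as special are resolved.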