Fatini committed on
Commit f5960b4 · verified · 1 Parent(s): 7676ee5

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
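With `pooling_mode_mean_tokens` set to `true`, the sentence embedding is the average of the token embeddings. The following is a minimal sketch of that idea in plain Python, assuming padding positions are excluded through an attention mask. The `mean_pool` helper and the toy 2-dimensional vectors are illustrative assumptions, not the library's implementation (the real model works on 768-dimensional vectors).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of the tokens whose mask value is 1.

    Hypothetical helper for illustration only; it is not taken from
    sentence-transformers.
    """
    if not token_embeddings:
        raise ValueError("token_embeddings must not be empty")
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask value is required per token")

    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for i, value in enumerate(vector):
                totals[i] += value

    if kept == 0:
        raise ValueError("attention_mask excludes every token")
    return [total / kept for total in totals]


# Toy input: three token vectors, the last one being padding (mask 0).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
```

Because the padded position is masked out, it does not pull the average toward its values.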
README.md ADDED
@@ -0,0 +1,590 @@
1
+ ---
2
+ tags:
3
+ - sentence-transformers
4
+ - sentence-similarity
5
+ - feature-extraction
6
+ - dense
7
+ - generated_from_trainer
8
+ - dataset_size:3016
9
+ - loss:MultipleNegativesRankingLoss
10
+ base_model: nomic-ai/modernbert-embed-base
11
+ widget:
12
+ - source_sentence: The Market and Liquidity Risk Analyst is responsible for conducting
13
+ routine identification, measurement, mitigation, monitoring, and reporting of
14
+ market and liquidity risks. He/She also conducts independent analyses to provide
15
+ greater insight into risk exposures and mitigation efforts within market and liquidity
16
+ risk scenarios. He actively supports the compliance of regulatory requirements
17
+ relating to market and liquidity risk management in order to ensure the financial
18
+ health of the organisation. The Market and Liquidity Risk Analyst's duties may
19
+ require him to be contactable after office hours. He is well-versed with financial
20
+ risks relating to the organisation and products. He possesses strong analytical
21
+ and practical abilities in applying various methodologies to identify and analyse
22
+ risks. He is a strong communicator, works well in teams, and is self-motivated
23
+ in achieving individual and organisational goals.
24
+ sentences:
25
+ - The Underwriting Manager leads the underwriting team by providing strategic direction
26
+ and oversight. This role involves thoroughly evaluating insurance applications
27
+ from potential clients, analyzing associated risks, and making informed underwriting
28
+ decisions in line with company policies. The manager monitors team performance
29
+ to ensure timely case processing and high customer satisfaction. Collaborating
30
+ closely with customer service teams, the Underwriting Manager facilitates clear
31
+ communication regarding underwriting procedures and business rules. Responsibilities
32
+ also include managing relationships with intermediaries, in-house underwriters,
33
+ and external clients within a fast-paced environment. The ideal candidate demonstrates
34
+ strong analytical and numerical proficiency, is adept with risk evaluation tools,
35
+ and effectively communicates complex information to diverse stakeholders. Sound
36
+ judgment and logical thinking are essential for driving effective decision-making
37
+ in this role.
38
+ - The Market and Liquidity Risk Manager oversees the strategic direction and governance
39
+ of market and liquidity risk frameworks, leading a team responsible for risk identification,
40
+ measurement, and mitigation. This senior role involves setting policies, managing
41
+ escalations, and liaising with regulatory bodies to ensure compliance with evolving
42
+ risk management standards across multiple jurisdictions. The manager is accountable
43
+ for high-level decision making and resource allocation, requiring extensive experience
44
+ and leadership skills. The position demands proactive communication with executive
45
+ leadership and may involve international travel to coordinate risk activities
46
+ across global offices.
47
+ - The Market and Liquidity Risk Analyst is tasked with the ongoing identification,
48
+ assessment, control, and reporting of market and liquidity risks. This role involves
49
+ performing independent evaluations to deepen understanding of risk exposures and
50
+ the effectiveness of mitigation strategies within market and liquidity contexts.
51
+ The analyst ensures adherence to regulatory standards related to market and liquidity
52
+ risk management to safeguard the organisation’s financial stability. Availability
53
+ beyond standard working hours may be required. The ideal candidate has comprehensive
54
+ knowledge of financial risks associated with the company and its products, demonstrates
55
+ strong analytical skills in applying risk assessment techniques, communicates
56
+ effectively, collaborates efficiently within teams, and is driven to meet both
57
+ personal and corporate objectives.
58
+ - source_sentence: A Senior Pharmacy Technician Executive in the Patient Care Services
59
+ sub-track is responsible for co-managing dispensing errors, performing medication
60
+ management and providing patient education. S/He reviews day-to-day operations
61
+ of department-based functions to maximise service provision. S/He is required
62
+ to develop and review training curriculum, plans and materials. S/He works in
63
+ various settings such as hospitals, outpatient clinics, polyclinics and retail
64
+ pharmacies. S/He should be proactive and conscientious. S/He should possess effective
65
+ interpersonal, leadership and problem-solving skills.
66
+ sentences:
67
+ - The Senior Pharmacy Technician Executive plays a key role in overseeing medication
68
+ dispensing accuracy, managing pharmaceutical care, and educating patients within
69
+ the Patient Care Services domain. This position involves supervising daily departmental
70
+ activities to enhance service delivery and contributing to the design and evaluation
71
+ of training programs and materials. Operating across diverse healthcare environments
72
+ including hospitals, outpatient clinics, polyclinics, and retail pharmacies, the
73
+ role demands a proactive and diligent professional equipped with strong leadership,
74
+ communication, and analytical problem-solving skills.
75
+ - The Senior Pharmacy Technician Executive is responsible for managing the procurement
76
+ and inventory control of pharmaceutical supplies within hospital logistics. This
77
+ role focuses on coordinating supply chain activities, ensuring timely stock replenishment,
78
+ and maintaining vendor relationships rather than direct patient care or medication
79
+ dispensing. It requires expertise in inventory management systems, negotiation
80
+ skills, and operational planning, and is typically performed in warehouse or supply
81
+ management settings rather than clinical environments.
82
+ - The Senior Technician (Mechanical and Electrical) is responsible for conducting
83
+ both preventive and corrective maintenance on mechanical and electrical equipment.
84
+ This role demands strong technical expertise and hands-on skills in managing diverse
85
+ mechanical and electrical systems. Key responsibilities include diagnosing system
86
+ faults, delivering technical support and mentorship to junior technicians, and
87
+ overseeing contractors and external parties to ensure adherence to safety protocols
88
+ and operational standards. The technician operates in shifts and performs tasks
89
+ across multiple rail facilities, including workshops and train stations. Effective
90
+ teamwork and clear communication are essential to facilitate maintenance operations.
91
+ - source_sentence: The Engineer (Engineering Procurement) is responsible for conducting
92
+ procurement activities to support engineering projects. He/She is responsible
93
+ for developing sourcing proposals and conducting vendor pre-qualification and
94
+ assessment. He typically has an engineering background and is able to translate
95
+ project requirements into specifications for materials, equipment and services
96
+ to procure. He manages a team of officers and contributes to the improvement of
97
+ business operations. He is comfortable in engaging and interacting with vendors
98
+ and other external parties to fulfil his responsibilities in coordinating vendor
99
+ selection processes, maintaining vendor contract records and databases, and following
100
+ up on vendors' deliverables.
101
+ sentences:
102
+ - The Engineer (Quality Assurance) is tasked with developing and implementing quality
103
+ control protocols across engineering projects. This role requires expertise in
104
+ defining quality standards, conducting product inspections, and ensuring compliance
105
+ with industry regulations. The engineer supervises a team focused on testing procedures
106
+ and corrective actions, collaborating closely with production units to maintain
107
+ high-quality outputs. Vendor interactions are limited to ensuring product conformity
108
+ rather than procurement activities.
109
+ - The Executive - On-Demand Media Technology and Operations is responsible for managing
110
+ the organisation’s content distribution across various on-demand media platforms.
111
+ This role involves handling key processes such as content ingestion, encoding,
112
+ transcoding, and performing rigorous quality assurance to ensure adherence to
113
+ the company’s technical guidelines. The executive develops and implements media
114
+ software solutions tailored to streamline media workflows and satisfy customer
115
+ delivery expectations. Additionally, they architect, deploy, and manage content
116
+ delivery networks (CDNs) to guarantee seamless content access for viewers. Their
117
+ duties encompass the full cycle of on-demand media operations from design through
118
+ to maintenance. A strong focus on innovation, process optimization, and effective
119
+ collaboration with cross-functional teams is essential for success in this position.
120
+ - The Engineer (Engineering Procurement) oversees procurement processes essential
121
+ to engineering initiatives. This role involves creating sourcing strategies, evaluating
122
+ and pre-qualifying suppliers, and translating engineering needs into detailed
123
+ specifications for purchasing materials, equipment, and services. The engineer
124
+ leads a group of officers and actively supports operational enhancements. Strong
125
+ vendor relationship management and coordination of supplier selection, contract
126
+ documentation, and deliverable tracking are key aspects of this position.
127
+ - source_sentence: The Civil and Structural Engineer manages planning and development
128
+ of projects. He/She develops engineering designs based on project requirements,
129
+ from conceptual to schematic and detailed designs. He conducts project assessments
130
+ and is able to provide feasible and creative solutions based on the assessment
131
+ results. He participates in the tendering processes and monitors the work of contractors
132
+ and subcontractors. He plans the team's manpower allocation and provides on-the-job
133
+ coaching to junior staff. He is meticulous and highly detail-orientated. He is
134
+ well versed in civil and structural engineering practices. He is analytical, has
135
+ excellent problem-solving skills, and also possesses strong interpersonal skills
136
+ essential for engagement with internal and external stakeholders. He is required
137
+ to work both in office and at project sites.
138
+ sentences:
139
+ - The Civil and Structural Engineer leads the financial auditing processes within
140
+ a construction firm, focusing on budget compliance and cost-saving strategies.
141
+ They review accounting records, prepare financial reports, and ensure adherence
142
+ to corporate financial policies. The engineer supervises the auditing team, provides
143
+ training on auditing standards, and liaises with external auditors. Strong knowledge
144
+ of accounting principles, financial regulations, and audit software is essential.
145
+ This role demands meticulous attention to financial details, analytical skills
146
+ for detecting discrepancies, and the ability to communicate effectively with finance
147
+ and project management departments. Work is primarily conducted in an office environment
148
+ with occasional visits to project sites for financial inspections.
149
+ - The Design Project Manager oversees the planning and execution of design initiatives,
150
+ coordinating project schedules and activities with relevant stakeholders. They
151
+ manage scope adjustments, address challenges, and mitigate risks that could affect
152
+ project delivery. In their leadership role, they allocate personnel and resources
153
+ effectively across projects and mentor team members to enhance their capabilities.
154
+ The Design Project Manager directs a team towards achieving project goals, often
155
+ engaging in extensive stakeholder communication, reviewing deliverables, and offering
156
+ strategic guidance. Strong organizational skills and task prioritization are essential,
157
+ alongside a thorough understanding of quality assurance processes to ensure optimal
158
+ product performance.
159
+ - The Civil and Structural Engineer oversees the planning and execution of engineering
160
+ projects, creating designs that meet specific project criteria from initial concept
161
+ through detailed development. They perform thorough project evaluations and propose
162
+ innovative, practical solutions based on these analyses. The engineer is actively
163
+ involved in the tendering process and supervises contractors and subcontractors
164
+ to ensure compliance and quality. Additionally, they manage team resource allocation
165
+ and mentor junior engineers, demonstrating a strong attention to detail and in-depth
166
+ knowledge of civil and structural engineering principles. Excellent analytical
167
+ thinking, problem-solving abilities, and interpersonal communication skills are
168
+ vital for effective collaboration with both internal teams and external partners.
169
+ This role requires working in both office settings and on-site project locations.
170
+ - source_sentence: The Sales Account Manager acts as a key point of contact between
171
+ an organisation and its clients. He/She possesses thorough product knowledge and
172
+ oversees product and/or service sales. He works with customers to identify their
173
+ wants and prepares reports by collecting, analysing, and summarising sales information.
174
+ He contacts existing customers to discuss and give recommendations on how specific
175
+ products or services can meet their needs. He maintains customer relationships
176
+ to strategically place new products and drive sales for long-term growth. He works
177
+ in a fast-paced and dynamic environment, and travels frequently to clients' premises
178
+ for meetings. He is familiar with client relationship management and sales tools.
179
+ He is knowledgeable of the organisation's products and services, as well as trends,
180
+ developments and challenges of the industry domain. The Sales Account Manager
181
+ is a resourceful, people-focused and persistent individual, who takes rejection
182
+ as a personal challenge to succeed when given opportunity. He appreciates the
183
+ value of long lasting relationships and prioritises efforts to build trust with
184
+ existing and potential customers. He exhibits good listening skills and is able
185
+ to establish rapport with customers and team members alike easily.
186
+ sentences:
187
+ - The Centre Leader is responsible for shaping strategic initiatives and organizational
188
+ frameworks that promote a supportive and trust-based environment, encouraging
189
+ mentorship, teamwork, and ongoing professional growth within the Centre. This
190
+ role oversees the effective management and improvement of Centre functions in
191
+ compliance with relevant industry standards. The Centre Leader champions operational
192
+ and programmatic quality by establishing robust procedures across key domains
193
+ such as governance, stakeholder engagement, continuous learning, curriculum design,
194
+ and teaching methodologies. With excellent communication and influential leadership,
195
+ they build lasting partnerships and embody the Centre’s core mission, vision,
196
+ and values, while ensuring the welfare of all personnel.
197
+ - The Sales Account Manager serves as the primary liaison between the company and
198
+ its clientele, leveraging deep product expertise to drive sales of products and
199
+ services. This role involves collaborating closely with customers to understand
200
+ their requirements, preparing detailed sales reports through data collection and
201
+ analysis, and proactively reaching out to current clients to offer tailored recommendations.
202
+ The Sales Account Manager nurtures client relationships with the objective of
203
+ introducing new products and fostering sustainable revenue growth. Operating in
204
+ a dynamic and fast-moving setting, frequent travel to client locations is expected.
205
+ Proficiency in customer relationship management software and sales platforms is
206
+ essential. In addition to comprehensive knowledge of the company’s offerings,
207
+ the individual stays informed of industry trends and challenges. The Sales Account
208
+ Manager demonstrates resilience, strong interpersonal skills, and a commitment
209
+ to building trust through attentive listening and relationship cultivation.
210
+ - The Sales Account Manager leads a team responsible for developing marketing strategies
211
+ and overseeing brand promotion within the organisation. He/She designs campaigns
212
+ to increase product awareness, coordinates with advertising agencies, and manages
213
+ digital content creation to enhance market presence. The role requires supervising
214
+ junior marketing staff, setting performance targets, and analysing campaign metrics
215
+ to adjust marketing approaches. The Sales Account Manager operates primarily from
216
+ the office, with limited travel obligations, and collaborates closely with the
217
+ product development team to align messaging. Strong skills in marketing analytics,
218
+ campaign management tools, and creative communication are essential. This role
219
+ focuses on strategic branding and promotional activities rather than direct client
220
+ sales or relationship management.
221
+ datasets:
222
+ - dnth/ssf-train-valid-v3
223
+ pipeline_tag: sentence-similarity
224
+ library_name: sentence-transformers
225
+ ---
226
+
227
+ # SentenceTransformer based on nomic-ai/modernbert-embed-base
228
+
229
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) on the [ssf-train-valid-v3](https://huggingface.co/datasets/dnth/ssf-train-valid-v3) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
230
+
231
+ ## Model Details
232
+
233
+ ### Model Description
234
+ - **Model Type:** Sentence Transformer
235
+ - **Base model:** [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) <!-- at revision d556a88e332558790b210f7bdbe87da2fa94a8d8 -->
236
+ - **Maximum Sequence Length:** 8192 tokens
237
+ - **Output Dimensionality:** 768 dimensions
238
+ - **Similarity Function:** Cosine Similarity
239
+ - **Training Dataset:**
240
+ - [ssf-train-valid-v3](https://huggingface.co/datasets/dnth/ssf-train-valid-v3)
241
+ <!-- - **Language:** Unknown -->
242
+ <!-- - **License:** Unknown -->
243
+
244
+ ### Model Sources
245
+
246
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
247
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
248
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
249
+
250
+ ### Full Model Architecture
251
+
252
+ ```
253
+ SentenceTransformer(
254
+ (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
255
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
256
+ (2): Normalize()
257
+ )
258
+ ```
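The final `Normalize()` module scales each embedding to unit length, so the cosine similarity between two outputs equals their dot product. Below is a small sketch of that relationship in plain Python. The helper names and the 2-dimensional toy vectors are assumptions made for illustration.

```python
import math
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(vec, vec))
    if norm == 0.0:
        return list(vec)
    return [v / norm for v in vec]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity computed from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy vectors standing in for pooled embeddings.
a = [3.0, 4.0]
b = [4.0, 3.0]
a_unit = l2_normalize(a)
b_unit = l2_normalize(b)

# After normalization the dot product and the cosine similarity agree.
dot_of_units = dot(a_unit, b_unit)
cosine_of_raw = cosine(a, b)
```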
259
+
260
+ ## Usage
261
+
262
+ ### Direct Usage (Sentence Transformers)
263
+
264
+ First install the Sentence Transformers library:
265
+
266
+ ```bash
267
+ pip install -U sentence-transformers
268
+ ```
269
+
270
+ Then you can load this model and run inference.
271
+ ```python
272
+ from sentence_transformers import SentenceTransformer
273
+
274
+ # Download from the 🤗 Hub
275
+ model = SentenceTransformer("Fatini/ssf-retriever-modernbert-embed-base-attempt1")
276
+ # Run inference
277
+ sentences = [
278
+ "The Sales Account Manager acts as a key point of contact between an organisation and its clients. He/She possesses thorough product knowledge and oversees product and/or service sales. He works with customers to identify their wants and prepares reports by collecting, analysing, and summarising sales information. He contacts existing customers to discuss and give recommendations on how specific products or services can meet their needs. He maintains customer relationships to strategically place new products and drive sales for long-term growth. He works in a fast-paced and dynamic environment, and travels frequently to clients' premises for meetings. He is familiar with client relationship management and sales tools. He is knowledgeable of the organisation's products and services, as well as trends, developments and challenges of the industry domain. The Sales Account Manager is a resourceful, people-focused and persistent individual, who takes rejection as a personal challenge to succeed when given opportunity. He appreciates the value of long lasting relationships and prioritises efforts to build trust with existing and potential customers. He exhibits good listening skills and is able to establish rapport with customers and team members alike easily.",
279
+ 'The Sales Account Manager serves as the primary liaison between the company and its clientele, leveraging deep product expertise to drive sales of products and services. This role involves collaborating closely with customers to understand their requirements, preparing detailed sales reports through data collection and analysis, and proactively reaching out to current clients to offer tailored recommendations. The Sales Account Manager nurtures client relationships with the objective of introducing new products and fostering sustainable revenue growth. Operating in a dynamic and fast-moving setting, frequent travel to client locations is expected. Proficiency in customer relationship management software and sales platforms is essential. In addition to comprehensive knowledge of the company’s offerings, the individual stays informed of industry trends and challenges. The Sales Account Manager demonstrates resilience, strong interpersonal skills, and a commitment to building trust through attentive listening and relationship cultivation.',
280
+ 'The Sales Account Manager leads a team responsible for developing marketing strategies and overseeing brand promotion within the organisation. He/She designs campaigns to increase product awareness, coordinates with advertising agencies, and manages digital content creation to enhance market presence. The role requires supervising junior marketing staff, setting performance targets, and analysing campaign metrics to adjust marketing approaches. The Sales Account Manager operates primarily from the office, with limited travel obligations, and collaborates closely with the product development team to align messaging. Strong skills in marketing analytics, campaign management tools, and creative communication are essential. This role focuses on strategic branding and promotional activities rather than direct client sales or relationship management.',
281
+ ]
282
+ embeddings = model.encode(sentences)
283
+ print(embeddings.shape)
284
+ # (3, 768)
285
+
286
+ # Get the similarity scores for the embeddings
287
+ similarities = model.similarity(embeddings, embeddings)
288
+ print(similarities)
289
+ # tensor([[1.0000, 0.9011, 0.5820],
290
+ # [0.9011, 1.0000, 0.6250],
291
+ # [0.5820, 0.6250, 1.0000]])
292
+ ```
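For retrieval, one row of the similarity matrix printed above can be turned into a ranking of the other sentences. This sketch reuses the scores from that example output as plain Python lists, so it runs without downloading the model. The `rank_candidates` helper is a hypothetical name introduced here.

```python
from typing import List, Tuple


def rank_candidates(scores: List[float], query_index: int) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for every sentence except the query,
    sorted from most to least similar."""
    candidates = [(i, s) for i, s in enumerate(scores) if i != query_index]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


# Scores copied from the example output above.
similarities = [
    [1.0000, 0.9011, 0.5820],
    [0.9011, 1.0000, 0.6250],
    [0.5820, 0.6250, 1.0000],
]

# Sentence 0 is the anchor; sentences 1 and 2 are the candidates.
ranking = rank_candidates(similarities[0], query_index=0)
best_index, best_score = ranking[0]
```

Here the paraphrased job description (index 1) ranks ahead of the marketing-focused description (index 2), matching the anchor/positive/negative layout of the training data.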
293
+
294
+ <!--
295
+ ### Direct Usage (Transformers)
296
+
297
+ <details><summary>Click to see the direct usage in Transformers</summary>
298
+
299
+ </details>
300
+ -->
301
+
302
+ <!--
303
+ ### Downstream Usage (Sentence Transformers)
304
+
305
+ You can finetune this model on your own dataset.
306
+
307
+ <details><summary>Click to expand</summary>
308
+
309
+ </details>
310
+ -->
311
+
312
+ <!--
313
+ ### Out-of-Scope Use
314
+
315
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
316
+ -->
317
+
318
+ <!--
319
+ ## Bias, Risks and Limitations
320
+
321
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
322
+ -->
323
+
324
+ <!--
325
+ ### Recommendations
326
+
327
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
328
+ -->
329
+
330
+ ## Training Details
331
+
332
+ ### Training Dataset
333
+
334
+ #### ssf-train-valid-v3
335
+
336
+ * Dataset: [ssf-train-valid-v3](https://huggingface.co/datasets/dnth/ssf-train-valid-v3) at [f461fff](https://huggingface.co/datasets/dnth/ssf-train-valid-v3/tree/f461fffdfdab1358c6d34b5ed221db30c7444744)
337
+ * Size: 3,016 training samples
338
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
339
+ * Approximate statistics based on the first 1000 samples:
340
+ | | anchor | positive | negative |
341
+ |:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
342
+ | type | string | string | string |
343
+ | details | <ul><li>min: 57 tokens</li><li>mean: 169.03 tokens</li><li>max: 403 tokens</li></ul> | <ul><li>min: 55 tokens</li><li>mean: 138.39 tokens</li><li>max: 275 tokens</li></ul> | <ul><li>min: 37 tokens</li><li>mean: 109.4 tokens</li><li>max: 201 tokens</li></ul> |
344
+ * Samples:
345
+ | anchor | positive | negative |
346
+ |:-------|:---------|:---------|
347
+ | <code>The Marketing Director drives the organisations business strategy by establishing the organisation's integrated marketing communications (IMC) strategy, partnership marketing arrangements and advices on product development and enhancement. He/She provides senior management with marketing advise, develops budget and manpower plans; and focuses on executing the IMC and partnership marketing plans to achieve business results. He directs the research and data analytics to obtain market and client insights, translates client insights into products and product features with market interest or potential market demand. He operates in a rapidly transforming business environment and functions through his understanding of consumers insights, market trends and industry landscape to promote the organisation and increase market demand. He is a results-oriented, astute leader who is able to negotiate strategically. He possesses strong business acumen and broad understanding of consumer, market and in...</code> | <code>The Marketing Director spearheads the company’s business growth by formulating and implementing a comprehensive integrated marketing communications (IMC) strategy and fostering strategic partnerships. This role advises senior leadership on marketing initiatives, oversees budget allocation and manpower planning, and ensures the successful execution of IMC and partnership campaigns to meet business objectives. The Marketing Director leads market research and data analysis efforts to capture consumer and market insights, translating these findings into innovative product developments aligned with emerging market demands. Operating within a dynamic and evolving business landscape, the director leverages deep knowledge of consumer behavior, market trends, and the competitive environment to enhance brand presence and drive demand. A decisive and visionary leader, this individual excels in strategic negotiations, demonstrates strong commercial insight, and inspires teams with a customer-centr...</code> | <code>The Marketing Analyst directs the organisation’s market research functions by designing and implementing data collection frameworks, analysing consumer data, and supporting the development of targeted marketing campaigns. He/She collaborates with product teams to provide data-driven recommendations for product positioning and pricing strategies. Responsible for preparing detailed reports and presentations for marketing managers, the analyst works under close supervision and focuses on executing market segmentation and consumer behavior studies. This role operates in a structured environment, requiring proficiency in data analytics tools and an understanding of marketing metrics to support decision-making. The Marketing Analyst is a detail-oriented individual with strong quantitative skills and the ability to communicate insights to internal teams but does not hold strategic leadership responsibilities or negotiate external partnerships.</code> |
348
+ | <code>An Enrolled Nurse is responsible for providing basic nursing care and patient education under the supervision and direction of a registered nurse, in collaboration with the healthcare teams according to the established policies, procedures and guidelines. S/He attends relevant nursing training to ensure that her/his skills remain up-to-date to provide patients with quality nursing care. S/He coaches new enrolled nurses, students and support care staff. S/He operates in a wide variety of settings such as acute care, primary care, community hospitals, integrated care and long-term care facilities. S/He should be meticulous, accountable and a team player.</code> | <code>The Enrolled Nurse delivers fundamental nursing care and patient education while working under the guidance of a registered nurse and collaborating with multidisciplinary healthcare teams in accordance with established protocols and guidelines. This role involves participating in ongoing nursing training to maintain current competencies and ensure high-quality patient care. Additionally, the Enrolled Nurse mentors newly recruited enrolled nurses, nursing students, and support care personnel. The position operates across diverse healthcare environments, including acute hospitals, primary care centers, community hospitals, integrated care networks, and long-term care institutions. Candidates should demonstrate thoroughness, responsibility, and strong teamwork skills.</code> | <code>The Enrolled Nurse manages patient admission processes and medical records under the supervision of healthcare administrators, coordinating with hospital support services according to institutional policies and administrative guidelines. The role requires attending workshops on healthcare administration to keep skills current and providing training to new administrative staff and interns. 
This position functions primarily within hospital admissions, billing departments, outpatient clinics, and medical records offices. Attention to detail, accountability, and effective collaboration are essential for success.</code> |
349
+ | <code>The Depot Train Controller directs the movement of trains within the depot, including launching and withdrawing trains in accordance with train service standards and requirements, and facilitating the stabling of trains in the depot for service and maintenance works. He/She coordinates with relevant internal and external stakeholders to execute first-line recovery of trains during train service disruptions. He is organised, meticulous, and systematic in managing the movement and stabling of trains, and in coordinating track access, so as to ensure the highest safety standards for personnel and train movement are upheld in the depot premises.</code> | <code>The Depot Train Controller oversees the scheduling and movement of trains within the depot, managing the deployment and withdrawal of trains following operational standards and service protocols. This role involves collaborating with various internal teams and external partners to facilitate initial recovery actions during service interruptions. The controller must be detail-oriented, methodical, and well-organized when handling train positioning and track allocation to maintain strict safety compliance for both staff and train operations within the depot.</code> | <code>The Depot Maintenance Planner coordinates the scheduling of routine and emergency maintenance activities for trains across the network, ensuring maintenance tasks align with safety regulations and service availability. This role requires liaising with engineering teams and suppliers to optimize resource allocation and minimize downtime. The planner must be proactive, analytical, and adept at balancing maintenance priorities while adhering to compliance standards for depot facilities.</code> |
350
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
351
+ ```json
352
+ {
353
+ "scale": 20.0,
354
+ "similarity_fct": "cos_sim",
355
+ "gather_across_devices": false
356
+ }
357
+ ```
358
+
359
+ ### Evaluation Dataset
360
+
361
+ #### ssf-train-valid-v3
362
+
363
+ * Dataset: [ssf-train-valid-v3](https://huggingface.co/datasets/dnth/ssf-train-valid-v3) at [f461fff](https://huggingface.co/datasets/dnth/ssf-train-valid-v3/tree/f461fffdfdab1358c6d34b5ed221db30c7444744)
364
+ * Size: 754 evaluation samples
365
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
366
+ * Approximate statistics based on the first 754 samples:
367
+ | | anchor | positive | negative |
368
+ |:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
369
+ | type | string | string | string |
370
+ | details | <ul><li>min: 57 tokens</li><li>mean: 169.99 tokens</li><li>max: 352 tokens</li></ul> | <ul><li>min: 59 tokens</li><li>mean: 138.43 tokens</li><li>max: 285 tokens</li></ul> | <ul><li>min: 34 tokens</li><li>mean: 109.81 tokens</li><li>max: 274 tokens</li></ul> |
371
+ * Samples:
372
+ | anchor | positive | negative |
373
+ |:-------|:---------|:---------|
374
+ | <code>The Quality Control Laboratory Analyst/Chemist/Microbiologist monitors sampling, cleanliness and product quality testing activities, performs non-standard quality tests, and manages associated documentation and data. He/She identifies the operating criteria for the tools, equipment and materials to be used, and collaborates with the Engineering and Maintenance department to ensure that laboratory equipment and infrastructure function as required. In addition, he/ implements Standard Operating Procedures (SOPs) and workflow improvements in the laboratory. The Quality Control Laboratory Analyst/Chemist/Microbiologist works in a laboratory setting, primarily in a cleanroom environment, and may be required to work on a shift. He has to exercise critical and analytical thinking to review data and identify discrepancies against set criteria. He requires strong communication and teamwork to collaborate effectively with others in order to fulfil work objectives.</code> | <code>The Quality Control Laboratory Analyst/Chemist/Microbiologist is responsible for overseeing sampling processes, ensuring cleanliness standards, and conducting product quality assessments, including specialized testing procedures. This role involves determining optimal operating parameters for laboratory instruments, coordinating with Engineering and Maintenance teams to maintain equipment functionality, and enforcing Standard Operating Procedures (SOPs) alongside continuous workflow enhancements. Operating mainly within a cleanroom laboratory environment, often on rotational shifts, the analyst must apply strong analytical skills to evaluate data accuracy and detect inconsistencies. Effective communication and collaborative teamwork are essential to achieving quality assurance objectives.</code> | <code>The Quality Control Laboratory Technician coordinates sampling and cleanliness checks, but primarily focuses on maintaining inventory and ordering laboratory supplies. 
This position works closely with procurement and logistics teams rather than Engineering or Maintenance, ensuring that laboratory consumables are stocked and equipment calibration schedules are tracked. The technician operates in a standard laboratory space and follows established SOPs without involvement in workflow improvements or critical data analysis. While teamwork is necessary for administrative tasks, the role requires minimal analytical judgment and no shift work is typically expected.</code> |
375
+ | <code>The Assistant Keeper/Assistant Aquarist assists in the care and management of wildlife within the organisation/attractions sites. This includes supporting the preparation of food to the wildlife, caring for ill animals, checking enclosures and cages for signs of wear or damage for animal, staff and visitor safety, and giving educational talks/tours to the visitors. He/she also assists in maintaining animal training behaviours and promoting conservation awareness through animal presentations. Conscientious and responsible, he is attentive to the needs of the wildlife under his care, and leverages his strong observation skills to monitor and report the status of characteristics and behaviours of the wildlife under his care. He is able to work both independently and under direction. He is physically fit and works in a shift system encompassing weekends and public holidays. Outside the working hours, he may be on a rota for call-outs. He often stays outdoors for long periods of time even t...</code> | <code>The Assistant Keeper/Assistant Aquarist supports the daily care and management of animals at wildlife facilities or attraction sites. Responsibilities include assisting with feeding routines, providing care for sick animals, inspecting enclosures for safety hazards, and delivering educational tours to visitors. This role involves helping to maintain animal training programs and raising public awareness about conservation through animal demonstrations. The assistant is diligent and attentive, closely observing animal health and behavior to report any changes. Capable of working independently or under supervision, the incumbent must be physically resilient, work shifts including weekends and public holidays, and be prepared for on-call duties. 
The position often requires extended outdoor work in various weather conditions and may necessitate a valid driving licence or scuba-diving certification depending on the workplace environment.</code> | <code>The Animal Nutritionist develops dietary plans and nutritional programs for wildlife and captive animals within zoological and conservation organizations. This role focuses on formulating balanced diets, analyzing feed components, and collaborating with veterinary teams to optimize animal health through specialized nutrition. The Animal Nutritionist primarily works in laboratory and office settings, conducting research and evaluating the effects of various feed formulations. Unlike hands-on caretaking roles, this position requires advanced knowledge of animal physiology and dietetics, with minimal direct interaction with visitors or enclosure maintenance. This role typically operates during standard office hours and does not involve shift work or outdoor duties.</code> |
376
+ | <code>The Senior Manager works in the field of counselling management. He/She should be qualified and trained to monitor and manage the organisation's strategic initiative, resource management, collaboration and corporate governance to ensure operational efficiency. He oversees strategic implementation, budgets the use of capital and human resources, develops professional development programmes, and initiates professional relationships across agencies. He also ensures operational and governance efficiency through supervision of a multi-disciplinary staff performance evaluation, and policy implementation. He is an experienced management staff who is meticulous, committed and possesses good problem-solving skills.</code> | <code>The Senior Manager in counselling leadership is responsible for guiding and overseeing the organisation’s strategic priorities, resource allocation, and inter-agency partnerships to maximise operational effectiveness. This role entails supervising the execution of strategic plans, managing budgets for both financial and human capital, designing continuing professional development initiatives, and fostering collaborative networks across different agencies. The Senior Manager also ensures adherence to governance policies and evaluates the performance of a multidisciplinary team. This position requires an experienced, detail-oriented leader with strong commitment and advanced problem-solving capabilities.</code> | <code>The Senior Manager in community health administration leads initiatives in healthcare program management and service delivery. He/She coordinates healthcare resources, manages patient care budgets, and develops training programs for medical staff. The role involves overseeing clinical operations, ensuring compliance with health regulations, and supervising multidisciplinary healthcare teams. This professional must be highly experienced, organized, and skilled in resolving complex clinical and administrative issues.</code> |
377
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
378
+ ```json
379
+ {
380
+ "scale": 20.0,
381
+ "similarity_fct": "cos_sim",
382
+ "gather_across_devices": false
383
+ }
384
+ ```
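As a rough illustration of what the `scale` and `similarity_fct` parameters above control, here is a minimal pure-Python sketch of an in-batch ranking loss over (anchor, positive, negative) triplets. It is not the library's implementation: the helper names `cos_sim` and `mnr_loss` and the toy 2-dimensional vectors are assumptions made for this example only.

```python
import math

def cos_sim(a, b):
    # Cosine similarity between two plain-Python vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, negatives=(), scale=20.0):
    # Hypothetical helper: each anchor is scored against every positive and
    # every negative in the batch; the "correct class" for anchor i is
    # positive i. The loss is the mean cross-entropy over anchors.
    candidates = list(positives) + list(negatives)
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

# Toy batch of two triplets (made-up vectors, not model outputs).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[0.5, 0.5], [0.5, 0.5]]

aligned = mnr_loss(anchors, positives, negatives)
swapped = mnr_loss(anchors, positives[::-1], negatives)
print(aligned, swapped)
```

With the positives in the right order the loss is small; swapping them makes each anchor's "correct" candidate the least similar one, so the loss rises sharply. A larger `scale` sharpens the softmax and makes this gap bigger.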
385
+
386
+ ### Training Hyperparameters
387
+ #### Non-Default Hyperparameters
388
+
389
+ - `eval_strategy`: epoch
390
+ - `per_device_train_batch_size`: 32
391
+ - `per_device_eval_batch_size`: 16
392
+ - `gradient_accumulation_steps`: 16
393
+ - `learning_rate`: 2e-05
394
+ - `num_train_epochs`: 5
395
+ - `lr_scheduler_type`: cosine
396
+ - `warmup_ratio`: 0.1
397
+ - `bf16`: True
398
+ - `tf32`: False
399
+ - `load_best_model_at_end`: True
400
+ - `batch_sampler`: no_duplicates
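The batch-size and accumulation settings above determine how many optimizer steps each epoch takes. The arithmetic below is a sketch under stated assumptions: a single device, 3016 training samples (taken from the `dataset_size:3016` tag in the model card metadata), a final partial accumulation group that still counts as one step, and a `no_duplicates` sampler that does not change the number of batches.

```python
import math

num_samples = 3016        # from the dataset_size tag; assumed to be the train split
per_device_batch = 32     # per_device_train_batch_size
grad_accum = 16           # gradient_accumulation_steps
epochs = 5                # num_train_epochs

# Number of forward/backward passes per epoch on one device.
micro_batches = math.ceil(num_samples / per_device_batch)

# Optimizer updates per epoch, counting the last partial group as a step.
steps_per_epoch = math.ceil(micro_batches / grad_accum)

total_steps = steps_per_epoch * epochs
effective_batch = per_device_batch * grad_accum

print(micro_batches, steps_per_epoch, total_steps, effective_batch)
```

Under these assumptions the run performs 6 optimizer steps per epoch and 30 in total, which is consistent with the step counts (6, 12, 18, 24, 30) reported in the training logs. Note that gradient accumulation enlarges the batch used for each optimizer update, but the in-batch negatives seen by the loss still come from a single 32-sample micro-batch.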
401
+
402
+ #### All Hyperparameters
403
+ <details><summary>Click to expand</summary>
404
+
405
+ - `overwrite_output_dir`: False
406
+ - `do_predict`: False
407
+ - `eval_strategy`: epoch
408
+ - `prediction_loss_only`: True
409
+ - `per_device_train_batch_size`: 32
410
+ - `per_device_eval_batch_size`: 16
411
+ - `per_gpu_train_batch_size`: None
412
+ - `per_gpu_eval_batch_size`: None
413
+ - `gradient_accumulation_steps`: 16
414
+ - `eval_accumulation_steps`: None
415
+ - `torch_empty_cache_steps`: None
416
+ - `learning_rate`: 2e-05
417
+ - `weight_decay`: 0.0
418
+ - `adam_beta1`: 0.9
419
+ - `adam_beta2`: 0.999
420
+ - `adam_epsilon`: 1e-08
421
+ - `max_grad_norm`: 1.0
422
+ - `num_train_epochs`: 5
423
+ - `max_steps`: -1
424
+ - `lr_scheduler_type`: cosine
425
+ - `lr_scheduler_kwargs`: {}
426
+ - `warmup_ratio`: 0.1
427
+ - `warmup_steps`: 0
428
+ - `log_level`: passive
429
+ - `log_level_replica`: warning
430
+ - `log_on_each_node`: True
431
+ - `logging_nan_inf_filter`: True
432
+ - `save_safetensors`: True
433
+ - `save_on_each_node`: False
434
+ - `save_only_model`: False
435
+ - `restore_callback_states_from_checkpoint`: False
436
+ - `no_cuda`: False
437
+ - `use_cpu`: False
438
+ - `use_mps_device`: False
439
+ - `seed`: 42
440
+ - `data_seed`: None
441
+ - `jit_mode_eval`: False
442
+ - `use_ipex`: False
443
+ - `bf16`: True
444
+ - `fp16`: False
445
+ - `fp16_opt_level`: O1
446
+ - `half_precision_backend`: auto
447
+ - `bf16_full_eval`: False
448
+ - `fp16_full_eval`: False
449
+ - `tf32`: False
450
+ - `local_rank`: 0
451
+ - `ddp_backend`: None
452
+ - `tpu_num_cores`: None
453
+ - `tpu_metrics_debug`: False
454
+ - `debug`: []
455
+ - `dataloader_drop_last`: False
456
+ - `dataloader_num_workers`: 0
457
+ - `dataloader_prefetch_factor`: None
458
+ - `past_index`: -1
459
+ - `disable_tqdm`: False
460
+ - `remove_unused_columns`: True
461
+ - `label_names`: None
462
+ - `load_best_model_at_end`: True
463
+ - `ignore_data_skip`: False
464
+ - `fsdp`: []
465
+ - `fsdp_min_num_params`: 0
466
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
467
+ - `fsdp_transformer_layer_cls_to_wrap`: None
468
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
469
+ - `deepspeed`: None
470
+ - `label_smoothing_factor`: 0.0
471
+ - `optim`: adamw_torch_fused
472
+ - `optim_args`: None
473
+ - `adafactor`: False
474
+ - `group_by_length`: False
475
+ - `length_column_name`: length
476
+ - `ddp_find_unused_parameters`: None
477
+ - `ddp_bucket_cap_mb`: None
478
+ - `ddp_broadcast_buffers`: False
479
+ - `dataloader_pin_memory`: True
480
+ - `dataloader_persistent_workers`: False
481
+ - `skip_memory_metrics`: True
482
+ - `use_legacy_prediction_loop`: False
483
+ - `push_to_hub`: False
484
+ - `resume_from_checkpoint`: None
485
+ - `hub_model_id`: None
486
+ - `hub_strategy`: every_save
487
+ - `hub_private_repo`: None
488
+ - `hub_always_push`: False
489
+ - `hub_revision`: None
490
+ - `gradient_checkpointing`: False
491
+ - `gradient_checkpointing_kwargs`: None
492
+ - `include_inputs_for_metrics`: False
493
+ - `include_for_metrics`: []
494
+ - `eval_do_concat_batches`: True
495
+ - `fp16_backend`: auto
496
+ - `push_to_hub_model_id`: None
497
+ - `push_to_hub_organization`: None
498
+ - `mp_parameters`:
499
+ - `auto_find_batch_size`: False
500
+ - `full_determinism`: False
501
+ - `torchdynamo`: None
502
+ - `ray_scope`: last
503
+ - `ddp_timeout`: 1800
504
+ - `torch_compile`: False
505
+ - `torch_compile_backend`: None
506
+ - `torch_compile_mode`: None
507
+ - `include_tokens_per_second`: False
508
+ - `include_num_input_tokens_seen`: False
509
+ - `neftune_noise_alpha`: None
510
+ - `optim_target_modules`: None
511
+ - `batch_eval_metrics`: False
512
+ - `eval_on_start`: False
513
+ - `use_liger_kernel`: False
514
+ - `liger_kernel_config`: None
515
+ - `eval_use_gather_object`: False
516
+ - `average_tokens_across_devices`: False
517
+ - `prompts`: None
518
+ - `batch_sampler`: no_duplicates
519
+ - `multi_dataset_batch_sampler`: proportional
520
+ - `router_mapping`: {}
521
+ - `learning_rate_mapping`: {}
522
+
523
+ </details>
524
+
525
+ ### Training Logs
526
+ | Epoch | Step | Training Loss | Validation Loss |
527
+ |:-------:|:------:|:-------------:|:---------------:|
528
+ | 1.0 | 6 | 0.1655 | 0.0144 |
529
+ | 2.0 | 12 | 0.0099 | 0.0039 |
530
+ | 3.0 | 18 | 0.0055 | 0.0027 |
531
+ | 4.0 | 24 | 0.0046 | 0.0023 |
532
+ | **5.0** | **30** | **0.0042** | **0.0022** |
533
+
534
+ * The bold row denotes the saved checkpoint.
535
+
536
+ ### Framework Versions
537
+ - Python: 3.12.11
538
+ - Sentence Transformers: 5.1.0
539
+ - Transformers: 4.55.0
540
+ - PyTorch: 2.8.0+cu128
541
+ - Accelerate: 1.10.0
542
+ - Datasets: 4.0.0
543
+ - Tokenizers: 0.21.4
544
+
545
+ ## Citation
546
+
547
+ ### BibTeX
548
+
549
+ #### Sentence Transformers
550
+ ```bibtex
551
+ @inproceedings{reimers-2019-sentence-bert,
552
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
553
+ author = "Reimers, Nils and Gurevych, Iryna",
554
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
555
+ month = "11",
556
+ year = "2019",
557
+ publisher = "Association for Computational Linguistics",
558
+ url = "https://arxiv.org/abs/1908.10084",
559
+ }
560
+ ```
561
+
562
+ #### MultipleNegativesRankingLoss
563
+ ```bibtex
564
+ @misc{henderson2017efficient,
565
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
566
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
567
+ year={2017},
568
+ eprint={1705.00652},
569
+ archivePrefix={arXiv},
570
+ primaryClass={cs.CL}
571
+ }
572
+ ```
573
+
574
+ <!--
575
+ ## Glossary
576
+
577
+ *Clearly define terms in order to be accessible across audiences.*
578
+ -->
579
+
580
+ <!--
581
+ ## Model Card Authors
582
+
583
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
584
+ -->
585
+
586
+ <!--
587
+ ## Model Card Contact
588
+
589
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
590
+ -->
config.json ADDED
@@ -0,0 +1,45 @@
1
+ {
2
+ "architectures": [
3
+ "ModernBertModel"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "bos_token_id": 50281,
8
+ "classifier_activation": "gelu",
9
+ "classifier_bias": false,
10
+ "classifier_dropout": 0.0,
11
+ "classifier_pooling": "mean",
12
+ "cls_token_id": 50281,
13
+ "decoder_bias": true,
14
+ "deterministic_flash_attn": false,
15
+ "embedding_dropout": 0.0,
16
+ "eos_token_id": 50282,
17
+ "global_attn_every_n_layers": 3,
18
+ "global_rope_theta": 160000.0,
19
+ "gradient_checkpointing": false,
20
+ "hidden_activation": "gelu",
21
+ "hidden_size": 768,
22
+ "initializer_cutoff_factor": 2.0,
23
+ "initializer_range": 0.02,
24
+ "intermediate_size": 1152,
25
+ "layer_norm_eps": 1e-05,
26
+ "local_attention": 128,
27
+ "local_rope_theta": 10000.0,
28
+ "max_position_embeddings": 8192,
29
+ "mlp_bias": false,
30
+ "mlp_dropout": 0.0,
31
+ "model_type": "modernbert",
32
+ "norm_bias": false,
33
+ "norm_eps": 1e-05,
34
+ "num_attention_heads": 12,
35
+ "num_hidden_layers": 22,
36
+ "pad_token_id": 50283,
37
+ "position_embedding_type": "absolute",
38
+ "repad_logits_with_grad": false,
39
+ "sep_token_id": 50282,
40
+ "sparse_pred_ignore_index": -100,
41
+ "sparse_prediction": false,
42
+ "torch_dtype": "float32",
43
+ "transformers_version": "4.55.0",
44
+ "vocab_size": 50368
45
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "5.1.0",
4
+ "transformers": "4.55.0",
5
+ "pytorch": "2.8.0+cu128"
6
+ },
7
+ "prompts": {
8
+ "query": "",
9
+ "document": ""
10
+ },
11
+ "default_prompt_name": null,
12
+ "similarity_fn_name": "cosine",
13
+ "model_type": "SentenceTransformer"
14
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5d575d112c6f8bffbbb56118df9fb38a419c35f180215ffdab579b98ae3811b8
3
+ size 596070136
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
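The three modules listed above run in sequence: the Transformer produces one vector per token, Pooling collapses them into a single vector (mean pooling, according to `1_Pooling/config.json`), and Normalize rescales that vector to unit length. The following is a minimal pure-Python sketch of the last two stages; the helper names `mean_pool` and `l2_normalize` and the toy token vectors are assumptions for illustration, not the library's code.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    # Average only the token vectors whose mask entry is 1 (padding is skipped).
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]

def l2_normalize(vec):
    # Rescale to unit length so a dot product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Toy example: three token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled, sentence_embedding)
```

Because the final embedding has unit norm, cosine similarity (the `similarity_fn_name` configured for this model) and a plain dot product give the same ranking.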
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 8192,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": true,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "|||IP_ADDRESS|||",
5
+ "lstrip": false,
6
+ "normalized": true,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "1": {
12
+ "content": "<|padding|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "50254": {
20
+ "content": " ",
21
+ "lstrip": false,
22
+ "normalized": true,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": false
26
+ },
27
+ "50255": {
28
+ "content": " ",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": false
34
+ },
35
+ "50256": {
36
+ "content": " ",
37
+ "lstrip": false,
38
+ "normalized": true,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "50257": {
44
+ "content": " ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "50258": {
52
+ "content": " ",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "50259": {
60
+ "content": " ",
61
+ "lstrip": false,
62
+ "normalized": true,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ },
67
+ "50260": {
68
+ "content": " ",
69
+ "lstrip": false,
70
+ "normalized": true,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": false
74
+ },
75
+ "50261": {
76
+ "content": " ",
77
+ "lstrip": false,
78
+ "normalized": true,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": false
82
+ },
83
+ "50262": {
84
+ "content": " ",
85
+ "lstrip": false,
86
+ "normalized": true,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": false
90
+ },
91
+ "50263": {
92
+ "content": " ",
93
+ "lstrip": false,
94
+ "normalized": true,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": false
98
+ },
99
+ "50264": {
100
+ "content": " ",
101
+ "lstrip": false,
102
+ "normalized": true,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": false
106
+ },
107
+ "50265": {
108
+ "content": " ",
109
+ "lstrip": false,
110
+ "normalized": true,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": false
114
+ },
115
+ "50266": {
116
+ "content": " ",
117
+ "lstrip": false,
118
+ "normalized": true,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": false
122
+ },
123
+ "50267": {
124
+ "content": " ",
125
+ "lstrip": false,
126
+ "normalized": true,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": false
130
+ },
131
+ "50268": {
132
+ "content": " ",
133
+ "lstrip": false,
134
+ "normalized": true,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": false
138
+ },
139
+ "50269": {
140
+ "content": " ",
141
+ "lstrip": false,
142
+ "normalized": true,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": false
146
+ },
147
+ "50270": {
148
+ "content": " ",
149
+ "lstrip": false,
150
+ "normalized": true,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": false
154
+ },
155
+ "50271": {
156
+ "content": " ",
157
+ "lstrip": false,
158
+ "normalized": true,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": false
162
+ },
163
+ "50272": {
164
+ "content": " ",
165
+ "lstrip": false,
166
+ "normalized": true,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": false
170
+ },
171
+ "50273": {
172
+ "content": " ",
173
+ "lstrip": false,
174
+ "normalized": true,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "50274": {
180
+ "content": " ",
181
+ "lstrip": false,
182
+ "normalized": true,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "50275": {
188
+ "content": " ",
189
+ "lstrip": false,
190
+ "normalized": true,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "50276": {
196
+ "content": " ",
197
+ "lstrip": false,
198
+ "normalized": true,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
203
+ "50277": {
204
+ "content": "|||EMAIL_ADDRESS|||",
205
+ "lstrip": false,
206
+ "normalized": true,
207
+ "rstrip": false,
208
+ "single_word": false,
209
+ "special": false
210
+ },
211
+ "50278": {
212
+ "content": "|||PHONE_NUMBER|||",
213
+ "lstrip": false,
214
+ "normalized": true,
215
+ "rstrip": false,
216
+ "single_word": false,
217
+ "special": false
218
+ },
219
+ "50279": {
220
+ "content": "<|endoftext|>",
221
+ "lstrip": false,
222
+ "normalized": false,
223
+ "rstrip": false,
224
+ "single_word": false,
225
+ "special": true
226
+ },
227
+ "50280": {
228
+ "content": "[UNK]",
229
+ "lstrip": false,
230
+ "normalized": false,
231
+ "rstrip": false,
232
+ "single_word": false,
233
+ "special": true
234
+ },
235
+ "50281": {
236
+ "content": "[CLS]",
237
+ "lstrip": false,
238
+ "normalized": false,
239
+ "rstrip": false,
240
+ "single_word": false,
241
+ "special": true
242
+ },
243
+ "50282": {
244
+ "content": "[SEP]",
245
+ "lstrip": false,
246
+ "normalized": false,
247
+ "rstrip": false,
248
+ "single_word": false,
249
+ "special": true
250
+ },
251
+ "50283": {
252
+ "content": "[PAD]",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "50284": {
260
+ "content": "[MASK]",
261
+ "lstrip": true,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": true
266
+ },
267
+ "50285": {
268
+ "content": "[unused0]",
269
+ "lstrip": false,
270
+ "normalized": true,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "50286": {
276
+ "content": "[unused1]",
277
+ "lstrip": false,
278
+ "normalized": true,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "50287": {
284
+ "content": "[unused2]",
285
+ "lstrip": false,
286
+ "normalized": true,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ },
291
+ "50288": {
292
+ "content": "[unused3]",
293
+ "lstrip": false,
294
+ "normalized": true,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": false
298
+ },
299
+ "50289": {
300
+ "content": "[unused4]",
301
+ "lstrip": false,
302
+ "normalized": true,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": false
306
+ },
307
+ "50290": {
308
+ "content": "[unused5]",
309
+ "lstrip": false,
310
+ "normalized": true,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": false
314
+ },
315
+ "50291": {
316
+ "content": "[unused6]",
317
+ "lstrip": false,
318
+ "normalized": true,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": false
322
+ },
323
+ "50292": {
324
+ "content": "[unused7]",
325
+ "lstrip": false,
326
+ "normalized": true,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": false
330
+ },
331
+ "50293": {
332
+ "content": "[unused8]",
333
+ "lstrip": false,
334
+ "normalized": true,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": false
338
+ },
339
+ "50294": {
340
+ "content": "[unused9]",
341
+ "lstrip": false,
342
+ "normalized": true,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": false
346
+ },
347
+ "50295": {
348
+ "content": "[unused10]",
349
+ "lstrip": false,
350
+ "normalized": true,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": false
354
+ },
355
+ "50296": {
356
+ "content": "[unused11]",
357
+ "lstrip": false,
358
+ "normalized": true,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": false
362
+ },
363
+ "50297": {
364
+ "content": "[unused12]",
365
+ "lstrip": false,
366
+ "normalized": true,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": false
370
+ },
371
+ "50298": {
372
+ "content": "[unused13]",
373
+ "lstrip": false,
374
+ "normalized": true,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": false
378
+ },
379
+ "50299": {
380
+ "content": "[unused14]",
381
+ "lstrip": false,
382
+ "normalized": true,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": false
386
+ },
387
+ "50300": {
388
+ "content": "[unused15]",
389
+ "lstrip": false,
390
+ "normalized": true,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": false
394
+ },
395
+ "50301": {
396
+ "content": "[unused16]",
397
+ "lstrip": false,
398
+ "normalized": true,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": false
402
+ },
403
+ "50302": {
404
+ "content": "[unused17]",
405
+ "lstrip": false,
406
+ "normalized": true,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": false
410
+ },
411
+ "50303": {
412
+ "content": "[unused18]",
413
+ "lstrip": false,
414
+ "normalized": true,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": false
418
+ },
419
+ "50304": {
420
+ "content": "[unused19]",
421
+ "lstrip": false,
422
+ "normalized": true,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": false
426
+ },
427
+ "50305": {
428
+ "content": "[unused20]",
429
+ "lstrip": false,
430
+ "normalized": true,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": false
434
+ },
435
+ "50306": {
436
+ "content": "[unused21]",
437
+ "lstrip": false,
438
+ "normalized": true,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": false
442
+ },
443
+ "50307": {
444
+ "content": "[unused22]",
445
+ "lstrip": false,
446
+ "normalized": true,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": false
450
+ },
451
+ "50308": {
452
+ "content": "[unused23]",
453
+ "lstrip": false,
454
+ "normalized": true,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": false
458
+ },
459
+ "50309": {
460
+ "content": "[unused24]",
461
+ "lstrip": false,
462
+ "normalized": true,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": false
466
+ },
467
+ "50310": {
468
+ "content": "[unused25]",
469
+ "lstrip": false,
470
+ "normalized": true,
471
+ "rstrip": false,
472
+ "single_word": false,
473
+ "special": false
474
+ },
475
+ "50311": {
476
+ "content": "[unused26]",
477
+ "lstrip": false,
478
+ "normalized": true,
479
+ "rstrip": false,
480
+ "single_word": false,
481
+ "special": false
482
+ },
483
+ "50312": {
484
+ "content": "[unused27]",
485
+ "lstrip": false,
486
+ "normalized": true,
487
+ "rstrip": false,
488
+ "single_word": false,
489
+ "special": false
490
+ },
491
+ "50313": {
492
+ "content": "[unused28]",
493
+ "lstrip": false,
494
+ "normalized": true,
495
+ "rstrip": false,
496
+ "single_word": false,
497
+ "special": false
498
+ },
499
+ "50314": {
500
+ "content": "[unused29]",
501
+ "lstrip": false,
502
+ "normalized": true,
503
+ "rstrip": false,
504
+ "single_word": false,
505
+ "special": false
506
+ },
507
+ "50315": {
508
+ "content": "[unused30]",
509
+ "lstrip": false,
510
+ "normalized": true,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": false
514
+ },
515
+ "50316": {
516
+ "content": "[unused31]",
517
+ "lstrip": false,
518
+ "normalized": true,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": false
522
+ },
523
+ "50317": {
524
+ "content": "[unused32]",
525
+ "lstrip": false,
526
+ "normalized": true,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": false
530
+ },
531
+ "50318": {
532
+ "content": "[unused33]",
533
+ "lstrip": false,
534
+ "normalized": true,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": false
538
+ },
539
+ "50319": {
540
+ "content": "[unused34]",
541
+ "lstrip": false,
542
+ "normalized": true,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": false
546
+ },
547
+ "50320": {
548
+ "content": "[unused35]",
549
+ "lstrip": false,
550
+ "normalized": true,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": false
554
+ },
555
+ "50321": {
556
+ "content": "[unused36]",
557
+ "lstrip": false,
558
+ "normalized": true,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": false
562
+ },
563
+ "50322": {
564
+ "content": "[unused37]",
565
+ "lstrip": false,
566
+ "normalized": true,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": false
570
+ },
571
+ "50323": {
572
+ "content": "[unused38]",
573
+ "lstrip": false,
574
+ "normalized": true,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": false
578
+ },
579
+ "50324": {
580
+ "content": "[unused39]",
581
+ "lstrip": false,
582
+ "normalized": true,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": false
586
+ },
587
+ "50325": {
588
+ "content": "[unused40]",
589
+ "lstrip": false,
590
+ "normalized": true,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": false
594
+ },
595
+ "50326": {
596
+ "content": "[unused41]",
597
+ "lstrip": false,
598
+ "normalized": true,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": false
602
+ },
603
+ "50327": {
604
+ "content": "[unused42]",
605
+ "lstrip": false,
606
+ "normalized": true,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": false
610
+ },
611
+ "50328": {
612
+ "content": "[unused43]",
613
+ "lstrip": false,
614
+ "normalized": true,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": false
618
+ },
619
+ "50329": {
620
+ "content": "[unused44]",
621
+ "lstrip": false,
622
+ "normalized": true,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": false
626
+ },
627
+ "50330": {
628
+ "content": "[unused45]",
629
+ "lstrip": false,
630
+ "normalized": true,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": false
634
+ },
635
+ "50331": {
636
+ "content": "[unused46]",
637
+ "lstrip": false,
638
+ "normalized": true,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": false
642
+ },
643
+ "50332": {
644
+ "content": "[unused47]",
645
+ "lstrip": false,
646
+ "normalized": true,
647
+ "rstrip": false,
648
+ "single_word": false,
649
+ "special": false
650
+ },
651
+ "50333": {
652
+ "content": "[unused48]",
653
+ "lstrip": false,
654
+ "normalized": true,
655
+ "rstrip": false,
656
+ "single_word": false,
657
+ "special": false
658
+ },
659
+ "50334": {
660
+ "content": "[unused49]",
661
+ "lstrip": false,
662
+ "normalized": true,
663
+ "rstrip": false,
664
+ "single_word": false,
665
+ "special": false
666
+ },
667
+ "50335": {
668
+ "content": "[unused50]",
669
+ "lstrip": false,
670
+ "normalized": true,
671
+ "rstrip": false,
672
+ "single_word": false,
673
+ "special": false
674
+ },
675
+ "50336": {
676
+ "content": "[unused51]",
677
+ "lstrip": false,
678
+ "normalized": true,
679
+ "rstrip": false,
680
+ "single_word": false,
681
+ "special": false
682
+ },
683
+ "50337": {
684
+ "content": "[unused52]",
685
+ "lstrip": false,
686
+ "normalized": true,
687
+ "rstrip": false,
688
+ "single_word": false,
689
+ "special": false
690
+ },
691
+ "50338": {
692
+ "content": "[unused53]",
693
+ "lstrip": false,
694
+ "normalized": true,
695
+ "rstrip": false,
696
+ "single_word": false,
697
+ "special": false
698
+ },
699
+ "50339": {
700
+ "content": "[unused54]",
701
+ "lstrip": false,
702
+ "normalized": true,
703
+ "rstrip": false,
704
+ "single_word": false,
705
+ "special": false
706
+ },
707
+ "50340": {
708
+ "content": "[unused55]",
709
+ "lstrip": false,
710
+ "normalized": true,
711
+ "rstrip": false,
712
+ "single_word": false,
713
+ "special": false
714
+ },
715
+ "50341": {
716
+ "content": "[unused56]",
717
+ "lstrip": false,
718
+ "normalized": true,
719
+ "rstrip": false,
720
+ "single_word": false,
721
+ "special": false
722
+ },
723
+ "50342": {
724
+ "content": "[unused57]",
725
+ "lstrip": false,
726
+ "normalized": true,
727
+ "rstrip": false,
728
+ "single_word": false,
729
+ "special": false
730
+ },
731
+ "50343": {
732
+ "content": "[unused58]",
733
+ "lstrip": false,
734
+ "normalized": true,
735
+ "rstrip": false,
736
+ "single_word": false,
737
+ "special": false
738
+ },
739
+ "50344": {
740
+ "content": "[unused59]",
741
+ "lstrip": false,
742
+ "normalized": true,
743
+ "rstrip": false,
744
+ "single_word": false,
745
+ "special": false
746
+ },
747
+ "50345": {
748
+ "content": "[unused60]",
749
+ "lstrip": false,
750
+ "normalized": true,
751
+ "rstrip": false,
752
+ "single_word": false,
753
+ "special": false
754
+ },
755
+ "50346": {
756
+ "content": "[unused61]",
757
+ "lstrip": false,
758
+ "normalized": true,
759
+ "rstrip": false,
760
+ "single_word": false,
761
+ "special": false
762
+ },
763
+ "50347": {
764
+ "content": "[unused62]",
765
+ "lstrip": false,
766
+ "normalized": true,
767
+ "rstrip": false,
768
+ "single_word": false,
769
+ "special": false
770
+ },
771
+ "50348": {
772
+ "content": "[unused63]",
773
+ "lstrip": false,
774
+ "normalized": true,
775
+ "rstrip": false,
776
+ "single_word": false,
777
+ "special": false
778
+ },
779
+ "50349": {
780
+ "content": "[unused64]",
781
+ "lstrip": false,
782
+ "normalized": true,
783
+ "rstrip": false,
784
+ "single_word": false,
785
+ "special": false
786
+ },
787
+ "50350": {
788
+ "content": "[unused65]",
789
+ "lstrip": false,
790
+ "normalized": true,
791
+ "rstrip": false,
792
+ "single_word": false,
793
+ "special": false
794
+ },
795
+ "50351": {
796
+ "content": "[unused66]",
797
+ "lstrip": false,
798
+ "normalized": true,
799
+ "rstrip": false,
800
+ "single_word": false,
801
+ "special": false
802
+ },
803
+ "50352": {
804
+ "content": "[unused67]",
805
+ "lstrip": false,
806
+ "normalized": true,
807
+ "rstrip": false,
808
+ "single_word": false,
809
+ "special": false
810
+ },
811
+ "50353": {
812
+ "content": "[unused68]",
813
+ "lstrip": false,
814
+ "normalized": true,
815
+ "rstrip": false,
816
+ "single_word": false,
817
+ "special": false
818
+ },
819
+ "50354": {
820
+ "content": "[unused69]",
821
+ "lstrip": false,
822
+ "normalized": true,
823
+ "rstrip": false,
824
+ "single_word": false,
825
+ "special": false
826
+ },
827
+ "50355": {
828
+ "content": "[unused70]",
829
+ "lstrip": false,
830
+ "normalized": true,
831
+ "rstrip": false,
832
+ "single_word": false,
833
+ "special": false
834
+ },
835
+ "50356": {
836
+ "content": "[unused71]",
837
+ "lstrip": false,
838
+ "normalized": true,
839
+ "rstrip": false,
840
+ "single_word": false,
841
+ "special": false
842
+ },
843
+ "50357": {
844
+ "content": "[unused72]",
845
+ "lstrip": false,
846
+ "normalized": true,
847
+ "rstrip": false,
848
+ "single_word": false,
849
+ "special": false
850
+ },
851
+ "50358": {
852
+ "content": "[unused73]",
853
+ "lstrip": false,
854
+ "normalized": true,
855
+ "rstrip": false,
856
+ "single_word": false,
857
+ "special": false
858
+ },
859
+ "50359": {
860
+ "content": "[unused74]",
861
+ "lstrip": false,
862
+ "normalized": true,
863
+ "rstrip": false,
864
+ "single_word": false,
865
+ "special": false
866
+ },
867
+ "50360": {
868
+ "content": "[unused75]",
869
+ "lstrip": false,
870
+ "normalized": true,
871
+ "rstrip": false,
872
+ "single_word": false,
873
+ "special": false
874
+ },
875
+ "50361": {
876
+ "content": "[unused76]",
877
+ "lstrip": false,
878
+ "normalized": true,
879
+ "rstrip": false,
880
+ "single_word": false,
881
+ "special": false
882
+ },
883
+ "50362": {
884
+ "content": "[unused77]",
885
+ "lstrip": false,
886
+ "normalized": true,
887
+ "rstrip": false,
888
+ "single_word": false,
889
+ "special": false
890
+ },
891
+ "50363": {
892
+ "content": "[unused78]",
893
+ "lstrip": false,
894
+ "normalized": true,
895
+ "rstrip": false,
896
+ "single_word": false,
897
+ "special": false
898
+ },
899
+ "50364": {
900
+ "content": "[unused79]",
901
+ "lstrip": false,
902
+ "normalized": true,
903
+ "rstrip": false,
904
+ "single_word": false,
905
+ "special": false
906
+ },
907
+ "50365": {
908
+ "content": "[unused80]",
909
+ "lstrip": false,
910
+ "normalized": true,
911
+ "rstrip": false,
912
+ "single_word": false,
913
+ "special": false
914
+ },
915
+ "50366": {
916
+ "content": "[unused81]",
917
+ "lstrip": false,
918
+ "normalized": true,
919
+ "rstrip": false,
920
+ "single_word": false,
921
+ "special": false
922
+ },
923
+ "50367": {
924
+ "content": "[unused82]",
925
+ "lstrip": false,
926
+ "normalized": true,
927
+ "rstrip": false,
928
+ "single_word": false,
929
+ "special": false
930
+ }
931
+ },
932
+ "clean_up_tokenization_spaces": true,
933
+ "cls_token": "[CLS]",
934
+ "extra_special_tokens": {},
935
+ "mask_token": "[MASK]",
936
+ "model_input_names": [
937
+ "input_ids",
938
+ "attention_mask"
939
+ ],
940
+ "model_max_length": 8192,
941
+ "pad_token": "[PAD]",
942
+ "sep_token": "[SEP]",
943
+ "tokenizer_class": "PreTrainedTokenizerFast",
944
+ "unk_token": "[UNK]"
945
+ }
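
As a rough illustration of how the fields above fit together, the sketch below parses a small hand-copied excerpt of this config with Python's standard `json` module and lists the entries flagged `"special": true`. The `CONFIG_EXCERPT` string and the helper name `special_tokens` are illustrative assumptions, not part of the repository, and the excerpt covers only a few of the entries in `added_tokens_decoder`.

```python
import json

# Hand-copied excerpt of the tokenizer_config.json fields shown above.
# This is NOT the full file; only a few token entries are included.
CONFIG_EXCERPT = """
{
  "added_tokens_decoder": {
    "50279": {"content": "<|endoftext|>", "lstrip": false, "normalized": false,
              "rstrip": false, "single_word": false, "special": true},
    "50281": {"content": "[CLS]", "lstrip": false, "normalized": false,
              "rstrip": false, "single_word": false, "special": true},
    "50284": {"content": "[MASK]", "lstrip": true, "normalized": false,
              "rstrip": false, "single_word": false, "special": true},
    "50285": {"content": "[unused0]", "lstrip": false, "normalized": true,
              "rstrip": false, "single_word": false, "special": false}
  },
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "model_max_length": 8192,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
"""


def special_tokens(config: dict) -> dict[int, str]:
    """Map token id -> content for entries flagged "special": true."""
    return {
        int(token_id): entry["content"]
        for token_id, entry in config["added_tokens_decoder"].items()
        if entry["special"]
    }


config = json.loads(CONFIG_EXCERPT)
specials = special_tokens(config)

for token_id, content in sorted(specials.items()):
    print(token_id, content)
print("max length:", config["model_max_length"])
```

In the excerpt, the `[unusedN]` placeholder is filtered out because it carries `"special": false`, while `[MASK]` is the only entry with `"lstrip": true`.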