dnth committed on
Commit 28ebc96 · verified · 1 Parent(s): 899ad54

Add SetFit model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
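The pooling config enables only `pooling_mode_mean_tokens`, so a text's embedding is the average of its token vectors. The sketch below illustrates that idea in plain Python. It assumes padded positions are excluded through an attention mask, and it uses 3-dimensional toy vectors in place of the 768-dimensional ones; `mean_pool` is a hypothetical helper name, not a sentence-transformers function.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an input made only of padding
    return [s / count for s in summed]


# Toy input: two real tokens followed by one padding position.
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

The padding vector is ignored, so only the first two rows contribute to the average.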
README.md ADDED
@@ -0,0 +1,279 @@
1
+ ---
2
+ tags:
3
+ - setfit
4
+ - sentence-transformers
5
+ - text-classification
6
+ - generated_from_setfit_trainer
7
+ widget:
8
+ - text: The HR Learning & Development Specialist designs and delivers workforce upskilling
9
+ programmes that align with Singapore’s SkillsFuture initiatives, focusing on employee
10
+ growth, well-being, and career progression. He/She facilitates training workshops,
11
+ coaches managers on supportive leadership, and counsels staff on learning pathways,
12
+ ensuring inclusive and accessible development opportunities across diverse teams.
13
+ He/She collaborates with department heads to identify skill gaps and mobilises
14
+ internal and external resources to deliver targeted interventions, including digital
15
+ learning platforms and micro-credentialing solutions. He/She maintains accurate
16
+ records of training participation and compliance with MOM’s Tripartite Guidelines,
17
+ applying structured processes to track outcomes and report impact. He is empathetic,
18
+ a natural communicator, and adept at building trust with employees at all levels.
19
+ He thrives in Singapore’s dynamic labour landscape, where continuous learning
20
+ is prioritised and HR practices must reflect both human-centric values and regulatory
21
+ precision.
22
+ - text: The Systems Compliance Analyst is tasked with investigating anomalies in digital
23
+ workflow logs across enterprise systems, identifying root causes through data
24
+ pattern analysis, and developing corrective models to align operational processes
25
+ with Singapore’s PDPA and MAS regulatory standards. He/She manually audits hardware
26
+ and network infrastructure configurations, verifying physical server integrity,
27
+ cable terminations, and firmware compatibility using diagnostic tools, while ensuring
28
+ all system changes are documented in structured compliance repositories. He must
29
+ interpret complex audit findings, translate technical deviations into actionable
30
+ remediation plans, and enforce protocol adherence through documented control frameworks.
31
+ He operates within Singapore’s tightly regulated financial technology and logistics
32
+ sectors, where real-time system uptime and data integrity are critical. He is
33
+ detail-oriented with strong analytical reasoning, proficient in SQL, SIEM tools,
34
+ and endpoint monitoring systems, and demonstrates precision in record-keeping
35
+ under strict audit timelines. His work demands rigorous logical deduction, hands-on
36
+ technical troubleshooting, and unwavering adherence to procedural precision.
37
+ - text: The Industrial Process Auditor evaluates the efficiency and safety of manufacturing
38
+ equipment on Singapore’s automated production floors, investigating deviations
39
+ in machine performance data to pinpoint mechanical or calibration inconsistencies.
40
+ He/She performs physical inspections of conveyor systems, pneumatic actuators,
41
+ and robotic arms, using precision measurement tools to verify tolerances and record
42
+ maintenance logs, while correlating observed wear patterns with digital sensor
43
+ outputs to forecast failures. He designs standardized inspection checklists and
44
+ updates machine-specific compliance templates to align with SCCS and MOM occupational
45
+ safety directives. He operates within Singapore’s high-density electronics and
46
+ pharmaceutical manufacturing hubs, where minute deviations impact batch quality
47
+ and regulatory certification. He is technically adept in PLC diagnostics, metrology
48
+ instruments, and CMMS platforms, with a methodical approach to data logging and
49
+ root-cause analysis. He combines intuitive mechanical insight with disciplined
50
+ documentation practices, ensuring operational integrity through empirical observation
51
+ and structured reporting.
52
+ - text: The Talent Strategy Analyst designs and evaluates human capital initiatives
53
+ by interrogating workforce data to uncover patterns in performance, mobility,
54
+ and attrition, providing strategic recommendations that shape hiring, development,
55
+ and retention policies. He/She collaborates with business units to identify emerging
56
+ skill demands in Singapore’s high-growth sectors, using advanced analytics to
57
+ model future workforce needs and align training investments with SkillsFuture
58
+ priorities. He/She ensures all talent frameworks comply with statutory reporting
59
+ requirements and internal audit controls, maintaining precise records of deployment
60
+ outcomes and metrics. He influences c-suite decisions through compelling narratives
61
+ grounded in statistical insights and market benchmarking, often negotiating resource
62
+ allocation for high-impact programmes. He operates with precision in structured
63
+ reporting environments yet thrives in ambiguity, turning ambiguous data into clear
64
+ strategic pathways that balance innovation with governance. His leadership is
65
+ trusted for its clarity, consistency, and commitment to evidence-based decision-making.
66
+ - text: The HR Transformation Lead drives organisational change by aligning talent
67
+ strategies with business objectives, leveraging data-driven insights to redesign
68
+ workforce models and influence senior leadership on people initiatives. He/She
69
+ leads cross-functional teams to implement HR technology platforms, piloting digital
70
+ tools such as AI-powered talent analytics and workforce planning systems prevalent
71
+ in Singapore’s tech-forward industries. He/She investigates talent trends, interprets
72
+ workforce metrics, and constructs predictive models to anticipate skill gaps and
73
+ retention risks in sectors like finance and logistics. He ensures strict adherence
74
+ to Singapore’s Tripartite Guidelines and Tripartite Standards while managing budgets
75
+ and stakeholder expectations to secure buy-in across departments. He is a persuasive
76
+ communicator who translates complex HR analytics into actionable business cases,
77
+ consistently delivering scalable solutions that enhance operational efficiency
78
+ and employee engagement. His approach is methodical, grounded in evidence, and
79
+ rigorously compliant with employment regulations.
80
+ metrics:
81
+ - accuracy
82
+ pipeline_tag: text-classification
83
+ library_name: setfit
84
+ inference: true
85
+ base_model: sentence-transformers/paraphrase-mpnet-base-v2
86
+ model-index:
87
+ - name: SetFit with sentence-transformers/paraphrase-mpnet-base-v2
88
+ results:
89
+ - task:
90
+ type: text-classification
91
+ name: Text Classification
92
+ dataset:
93
+ name: Unknown
94
+ type: unknown
95
+ split: test
96
+ metrics:
97
+ - type: accuracy
98
+ value: 0.6666666666666666
99
+ name: Accuracy
100
+ ---
101
+
102
+ # SetFit with sentence-transformers/paraphrase-mpnet-base-v2
103
+
104
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
105
+
106
+ The model has been trained using an efficient few-shot learning technique that involves:
107
+
108
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
109
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
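The first step above depends on turning a handful of labelled texts into sentence pairs. A minimal sketch of that pairing idea follows, assuming same-label pairs get a target of 1.0 and different-label pairs a target of 0.0. The sample texts and the `make_pairs` helper are hypothetical, and this is not SetFit's own implementation.

```python
from itertools import combinations


def make_pairs(samples):
    """Return (text_a, text_b, target) triples; target is 1.0 for same-label pairs."""
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(samples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Placeholder texts; real inputs would be full job descriptions.
samples = [("job ad A", "IRC"), ("job ad B", "IRC"), ("job ad C", "SEC")]
pairs = make_pairs(samples)
for text_a, text_b, target in pairs:
    print(text_a, "|", text_b, "|", target)
```

A contrastive objective such as the card's `CosineSimilarityLoss` would then push the embedding similarity of each pair towards its target.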
110
+
111
+ ## Model Details
112
+
113
+ ### Model Description
114
+ - **Model Type:** SetFit
115
+ - **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
116
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
117
+ - **Maximum Sequence Length:** 512 tokens
118
+ - **Number of Classes:** 4 classes
119
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
120
+ <!-- - **Language:** Unknown -->
121
+ <!-- - **License:** Unknown -->
122
+
123
+ ### Model Sources
124
+
125
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
126
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
127
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
128
+
129
+ ### Model Labels
130
+ | Label | Examples |
131
+ |:------|:---------|
132
+ | IRC | <ul><li>'The Lab Equipment Technician-Analyst maintains and optimises high-precision laboratory instrumentation used in Singapore’s biotech and environmental testing facilities. He/She investigates irregular readings in chromatographs, spectrometers, and automated sample handlers, performs mechanical adjustments and component replacements, and validates recalibrations against metrological standards. He interprets performance anomalies through data trends, collaborates with researchers to refine testing parameters, and ensures all modifications are documented in compliance with ISO 17025 protocols. He works hands-on with vacuum systems, fluidic lines, and micro-sensors, often repairing or retrofitting hardware under strict cleanliness and contamination controls. He maintains digital logs of service history, calibration schedules, and consumable usage with meticulous accuracy. He is analytically driven, comfortable with technical manuals and schematics, and adept at balancing mechanical precision with data integrity in a regulated, high-stakes laboratory setting.'</li><li>'The Compliance Systems Operator analyses transactional patterns in financial compliance databases to detect anomalies in cross-border remittances, using automated query tools to flag deviations from MAS guidelines. He/She modifies script-based monitoring rules to adapt to evolving regulatory thresholds, executes manual verification of flagged cases by cross-referencing physical documents with electronic ledgers, and traces audit trails through multiple integrated systems to confirm accuracy. He operates terminal-based data entry platforms with strict input protocols, updates metadata tags in structured repositories, and ensures full traceability of all dataset modifications. He troubleshoots system errors in compliance dashboards, performs routine hardware checks on secure terminals, and liaises with compliance officers to refine detection logic. He combines investigative precision in pattern recognition with disciplined execution of procedural controls, maintaining tight alignment with Singapore’s stringent anti-money laundering and cyber-security standards in financial services.'</li><li>'The Environmental Monitoring Technologist collects, calibrates, and processes air and water quality data across Singapore’s industrial zones and water catchment areas using sensor arrays and field deployment kits. He/She analyses spatial-temporal trends in pollutant concentrations, identifies anomalies attributable to equipment drift or operational emissions, and performs mechanical maintenance on sampling rigs and dataloggers exposed to outdoor conditions. He installs and repairs sampling probes, replaces corrosion-resistant components, and ensures all hardware meets NEA’s calibration benchmarks. He documents findings in structured digital formats compliant with environmental reporting templates, cross-references field readings with regulatory thresholds, and prepares compliance reports for inspection. He is detail-oriented in data handling, mechanically proficient with field instruments, and skilled at isolating technical faults from environmental variability. His work supports Singapore’s green initiative through rigorous, hands-on environmental compliance and data-driven insight.'</li></ul> |
133
+ | EIC | <ul><li>'The Talent Acquisition Lead drives workforce planning and recruitment strategies across high-growth divisions, aligning hiring outcomes with business expansion targets in Singapore’s digital economy. He/She identifies talent gaps through data-driven analysis of market trends, workforce mobility patterns, and competitor benchmarking, then designs targeted sourcing campaigns to attract niche technical and managerial talent. He leverages HR analytics platforms to track funnel efficiency, time-to-hire metrics, and candidate conversion rates, continually refining processes to meet SLAs. He negotiates with recruitment agencies, manages vendor contracts, and presents cost-benefit analyses to senior leadership to secure budget approvals. He ensures strict compliance with MOM regulations and equitable hiring practices while advocating for inclusive employer branding. He is a persuasive communicator, comfortable influencing stakeholders without direct authority, and thrives in fast-paced environments where data interpretation and process optimization intersect. His decisiveness and structured approach ensure scalable, audit-ready hiring systems that support Singapore’s SkillsFuture workforce development goals.'</li><li>'The Training & Development Program Lead designs and scales corporate learning initiatives to close skills gaps in Singapore’s key economic sectors, from fintech to healthcare services. He/She analyzes workforce competency data, skills demand trends, and training completion rates to prioritize curriculum development and budget allocation. He negotiates partnerships with SkillsFuture-approved providers, secures funding through grants, and champions adoption of digital learning platforms across departments. He evaluates training efficacy through pre- and post-assessments, correlation with performance metrics, and retention analysis. He ensures all curricula align with WSQ standards and organizational compliance needs, maintaining accurate records of certifications and mandatory training. He is a self-driven motivator who influences department heads to invest in upskilling, combines interpretive insights with structured implementation, and thrives where data transparency meets organizational change. His work ensures Singapore’s workforce remains agile, future-ready, and aligned with national upskilling priorities.'</li><li>'The Operations Optimization Manager identifies inefficiencies in logistics and supply chain workflows by analyzing throughput data, vendor performance metrics, and warehouse cycle times across Singapore’s port and manufacturing ecosystems. He/She designs and pilots automation protocols, digital tracking systems, and throughput models to reduce delays and cut operational costs by 15% or more. He presents business cases to senior leadership, securing buy-in and budget allocation for process redesigns, and oversees change implementation across cross-functional teams. He negotiates service-level agreements with third-party providers, evaluates vendor compliance with ISO and SGMark standards, and maintains audit-ready documentation for regulatory inspections. He interprets KPIs from ERP and TMS platforms to forecast bottlenecks and recommend preemptive adjustments. He is strategic and assertive in driving adoption of new systems, balances quantitative analysis with stakeholder persuasion, and operates with precision in preserving continuity during transition periods. His work directly enhances Singapore’s position as a global logistics hub through data-backed innovation.'</li></ul> |
134
+ | ESC | <ul><li>'The Talent Acquisition Leader owns the end-to-end recruitment strategy for high-growth divisions, negotiating competitive offers with candidates while influencing hiring managers to adopt inclusive and skills-based selection criteria. He/She mentors recruiters and hiring teams on candidate experience best practices, encouraging empathetic engagement throughout the funnel, from outreach to onboarding. He/She maintains compliance with ETR and SFA guidelines by ensuring all job advertisements, background checks, and employment contracts adhere to local labour norms. He utilises ATS tools to analyse sourcing channels, time-to-hire metrics, and diversity benchmarks, refining processes to meet workforce demand in sectors like fintech and healthcare. He is a persuasive influencer who builds trust with stakeholders, a supportive guide to new hires navigating their first roles in Singapore, and a disciplined keeper of applicant data, interview logs, and recruitment KPIs.'</li><li>'The Learning & Development Manager drives the implementation of enterprise-wide upskilling programs, securing buy-in from business heads to prioritise training investments aligned with SkillsFuture priorities. He/She designs and delivers workshops on digital literacy, leadership competencies, and compliance standards, creating safe spaces for adult learners to grow while ensuring all materials meet MOE and CMD guidelines. He/She tracks course participation, feedback scores, and skill gain metrics using HRIS systems, ensuring accurate reporting for internal audits and government funding claims. He navigates competing departmental demands with strategic negotiation, aligning learning roadmaps with organisational goals. He balances motivational facilitation with administrative precision—managing schedules, vendor contracts, and certification records with rigour. His approach is outcome-driven, people-centred, and rooted in the Singapore context of lifelong learning and workforce resilience.'</li><li>'The HR Business Partner leads the design and delivery of people strategies that align with business growth targets, persuading senior leaders to adopt workforce initiatives that drive productivity and engagement. He/She fosters strong relationships across departments, coaching managers on talent development, conflict resolution, and performance improvement while ensuring strict adherence to Singapore’s Employment Act and CPF regulations. He/She leverages HR analytics platforms to track turnover trends, retention risks, and training ROI, translating data into actionable insights that influence budget allocations. He operates in a fast-paced, regulated environment where compliance deadlines and workforce planning cycles demand precision. He is a persuasive communicator who can influence without authority, an empathetic facilitator of team well-being, and a meticulous organiser of employee records, onboarding workflows, and policy documentation. His leadership is grounded in ethical practice, data integrity, and a relentless focus on employee experience within Singapore’s competitive labour market.'</li></ul> |
135
+ | SEC | <ul><li>'The Learning Experience Designer crafts immersive, game-based and peer-led training modules that improve staff competence and confidence in customer service, compliance, and digital tools. He/She collaborates with subject matter experts to align content with WSQ standards and SkillsFuture pathways, then pilots modules across departments, gathering feedback to refine delivery. He/She influences stakeholders by demonstrating program impact through pre- and post-assessment metrics and cost-benefit analyses. He ensures all training materials are documented, version-controlled, and archived per internal audit and regulatory requirements. He communicates with clarity and enthusiasm, adapting explanations for frontline staff, mid-managers, and corporate teams. He utilises LMS platforms and e-learning authoring tools to deliver scalable, mobile-friendly content. In Singapore’s service-driven economy, he prioritises experiential learning that enhances customer interaction skills, particularly in retail, aviation, and healthcare settings.'</li><li>'The Talent Engagement Lead drives initiatives that enhance employee satisfaction, retention, and organisational culture within Singapore’s competitive job market. He/She designs and implements wellness campaigns, recognition programs, and feedback loops that directly support staff well-being and morale. He/She influences senior leadership to allocate resources for employee experience improvements, negotiates with vendors for staff benefits, and presents data-driven business cases to justify HR investments. He/She manages attendance records, program budgets, and compliance documentation with precision under the Employment Act and Tripartite Guidelines. He communicates with authenticity, listens actively to pitch ideas to diverse stakeholders, and builds trust through consistent follow-through. He utilises HR analytics tools to measure engagement trends and adjust strategies in real time. Operating within Singapore’s high-turnover industries such as retail and hospitality, he ensures initiatives are culturally relevant, inclusive, and aligned with national workforce development priorities.'</li><li>'The Employee Relations Advisor provides frontline support to staff and managers on workplace conflicts, performance concerns, and policy interpretation, fostering a respectful and inclusive environment. He/She conducts confidential mediations, facilitates restart conversations, and coaches leaders on constructive feedback techniques. He/She influences organisational change by advocating for equitable practices and aligning HR policies with evolving labour norms under the Singapore Tripartite Standards. He manages case documentation, incident logs, and resolution timelines with strict adherence to data privacy rules and compliance frameworks. He communicates with clarity and emotional intelligence, resolving tensions through persuasive dialogue while maintaining neutrality. He leverages digital case management systems to track trends and proactively identify systemic issues. Operating in Singapore’s multicultural workplaces, he interprets cultural nuances in communication styles and ensures interventions are sensitive, timely, and aligned with SkillsFuture’s emphasis on lifelong learning and harmonious work environments.'</li></ul> |
136
+
137
+ ## Evaluation
138
+
139
+ ### Metrics
140
+ | Label | Accuracy |
141
+ |:--------|:---------|
142
+ | **all** | 0.6667 |
143
+
144
+ ## Uses
145
+
146
+ ### Direct Use for Inference
147
+
148
+ First install the SetFit library:
149
+
150
+ ```bash
151
+ pip install setfit
152
+ ```
153
+
154
+ Then you can load this model and run inference.
155
+
156
+ ```python
157
+ from setfit import SetFitModel
158
+
159
+ # Download from the 🤗 Hub
160
+ model = SetFitModel.from_pretrained("dnth/setfit-riasec-classifier-subset")
161
+ # Run inference
162
+ preds = model("The HR Learning & Development Specialist designs and delivers workforce upskilling programmes that align with Singapore’s SkillsFuture initiatives, focusing on employee growth, well-being, and career progression. He/She facilitates training workshops, coaches managers on supportive leadership, and counsels staff on learning pathways, ensuring inclusive and accessible development opportunities across diverse teams. He/She collaborates with department heads to identify skill gaps and mobilises internal and external resources to deliver targeted interventions, including digital learning platforms and micro-credentialing solutions. He/She maintains accurate records of training participation and compliance with MOM’s Tripartite Guidelines, applying structured processes to track outcomes and report impact. He is empathetic, a natural communicator, and adept at building trust with employees at all levels. He thrives in Singapore’s dynamic labour landscape, where continuous learning is prioritised and HR practices must reflect both human-centric values and regulatory precision.")
163
+ ```
164
+
165
+ <!--
166
+ ### Downstream Use
167
+
168
+ *List how someone could finetune this model on their own dataset.*
169
+ -->
170
+
171
+ <!--
172
+ ### Out-of-Scope Use
173
+
174
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
175
+ -->
176
+
177
+ <!--
178
+ ## Bias, Risks and Limitations
179
+
180
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
181
+ -->
182
+
183
+ <!--
184
+ ### Recommendations
185
+
186
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
187
+ -->
188
+
189
+ ## Training Details
190
+
191
+ ### Training Set Metrics
192
+ | Training set | Min | Median | Max |
193
+ |:-------------|:----|:-------|:----|
194
+ | Word count | 104 | 133.45 | 153 |
195
+
196
+ | Label | Training Sample Count |
197
+ |:------|:----------------------|
198
+ | IRC | 10 |
199
+ | EIC | 10 |
200
+ | ESC | 10 |
201
+ | SEC | 10 |
202
+
203
+ ### Training Hyperparameters
204
+ - batch_size: (8, 8)
205
+ - num_epochs: (4, 4)
206
+ - max_steps: -1
207
+ - sampling_strategy: oversampling
208
+ - body_learning_rate: (2e-05, 1e-05)
209
+ - head_learning_rate: 0.01
210
+ - loss: CosineSimilarityLoss
211
+ - distance_metric: cosine_distance
212
+ - margin: 0.25
213
+ - end_to_end: False
214
+ - use_amp: False
215
+ - warmup_proportion: 0.1
216
+ - l2_weight: 0.01
217
+ - seed: 42
218
+ - eval_max_steps: -1
219
+ - load_best_model_at_end: True
220
+
221
+ ### Training Results
222
+ | Epoch | Step | Training Loss | Validation Loss |
223
+ |:------:|:----:|:-------------:|:---------------:|
224
+ | 0.0067 | 1 | 0.1658 | - |
225
+ | 0.3333 | 50 | 0.15 | - |
226
+ | 0.6667 | 100 | 0.059 | - |
227
+ | 1.0 | 150 | 0.004 | 0.1356 |
228
+ | 1.3333 | 200 | 0.0005 | - |
229
+ | 1.6667 | 250 | 0.0003 | - |
230
+ | 2.0 | 300 | 0.0002 | 0.1236 |
231
+ | 2.3333 | 350 | 0.0002 | - |
232
+ | 2.6667 | 400 | 0.0001 | - |
233
+ | 3.0 | 450 | 0.0001 | 0.1201 |
234
+ | 3.3333 | 500 | 0.0001 | - |
235
+ | 3.6667 | 550 | 0.0001 | - |
236
+ | 4.0 | 600 | 0.0001 | 0.1187 |
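The step counts in the table (150 steps per epoch, 600 in total) can be reproduced from the training set size and batch size. The arithmetic below is one reading that matches those numbers. It assumes every unordered pair of distinct training texts is used once and that `oversampling` repeats the smaller group of pairs until it matches the larger one; the card does not state this, so treat it as an assumption.

```python
from math import comb

# Values taken from the card: 4 labels x 10 samples, batch size 8, 4 epochs.
num_classes = 4
samples_per_class = 10
batch_size = 8
num_epochs = 4

total_samples = num_classes * samples_per_class
positive_pairs = num_classes * comb(samples_per_class, 2)  # same-label pairs
negative_pairs = comb(total_samples, 2) - positive_pairs   # different-label pairs

# Assumption: oversampling balances both groups at the size of the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = pairs_per_epoch // batch_size
total_steps = steps_per_epoch * num_epochs

print(positive_pairs, negative_pairs, steps_per_epoch, total_steps)
print(round(1 / steps_per_epoch, 4))  # epoch fraction logged at step 1
```

Under these assumptions the first logged row (epoch 0.0067 at step 1) is simply 1/150.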
237
+
238
+ ### Framework Versions
239
+ - Python: 3.12.8
240
+ - SetFit: 1.1.3
241
+ - Sentence Transformers: 5.1.1
242
+ - Transformers: 4.56.2
243
+ - PyTorch: 2.8.0+cu128
244
+ - Datasets: 4.1.1
245
+ - Tokenizers: 0.22.1
246
+
247
+ ## Citation
248
+
249
+ ### BibTeX
250
+ ```bibtex
251
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
252
+ doi = {10.48550/ARXIV.2209.11055},
253
+ url = {https://arxiv.org/abs/2209.11055},
254
+ author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
255
+ keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
256
+ title = {Efficient Few-Shot Learning Without Prompts},
257
+ publisher = {arXiv},
258
+ year = {2022},
259
+ copyright = {Creative Commons Attribution 4.0 International}
260
+ }
261
+ ```
262
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,23 @@
+ {
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "dtype": "float32",
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "transformers_version": "4.56.2",
+   "vocab_size": 30527
+ }
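A few quantities follow directly from the values in `config.json`. The sketch below copies the relevant fields into an inline JSON string and derives them. The usable-position calculation assumes RoBERTa-style position ids that start at `pad_token_id + 1`; that is an assumption about the encoder implementation, not something this diff states.

```python
import json

# Subset of config.json, copied from the diff above.
config = json.loads("""
{
  "hidden_size": 768,
  "intermediate_size": 3072,
  "max_position_embeddings": 514,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "vocab_size": 30527
}
""")

# Each attention head works on an equal slice of the hidden dimension.
head_dim = config["hidden_size"] // config["num_attention_heads"]

# Width of the feed-forward layer relative to the hidden size.
ffn_ratio = config["intermediate_size"] // config["hidden_size"]

# ASSUMPTION: position ids start at pad_token_id + 1, so that many
# position slots are unavailable for real tokens.
usable_positions = config["max_position_embeddings"] - (config["pad_token_id"] + 1)

print(f"head dimension: {head_dim}")
print(f"feed-forward expansion: {ffn_ratio}x")
print(f"usable positions (under the stated assumption): {usable_positions}")
```

Under that assumption the usable position count is 512, which agrees with `max_seq_length` in `sentence_bert_config.json` and `model_max_length` in `tokenizer_config.json`.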
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "__version__": {
+     "sentence_transformers": "5.1.1",
+     "transformers": "4.56.2",
+     "pytorch": "2.8.0+cu128"
+   },
+   "model_type": "SentenceTransformer",
+   "prompts": {
+     "query": "",
+     "document": ""
+   },
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
config_setfit.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "labels": [
+     "IRC",
+     "EIC",
+     "ESC",
+     "SEC"
+   ],
+   "normalize_embeddings": false
+ }
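The `labels` list in `config_setfit.json` fixes the order of the four classes. The following is a minimal sketch of how a class index could be mapped back to a label name, assuming the classification head emits one score per label in this order. The `example_probs` values are hypothetical and do not come from the model.

```python
import json

# Contents of config_setfit.json, copied from the diff above.
setfit_config = json.loads(
    '{"labels": ["IRC", "EIC", "ESC", "SEC"], "normalize_embeddings": false}'
)

labels = setfit_config["labels"]
id2label = dict(enumerate(labels))
label2id = {name: idx for idx, name in id2label.items()}

# HYPOTHETICAL per-class probabilities for a single input text.
example_probs = [0.05, 0.10, 0.80, 0.05]

# Pick the index with the highest score and translate it to its label.
best_idx = max(range(len(example_probs)), key=example_probs.__getitem__)
predicted = id2label[best_idx]

print(id2label)
print(f"predicted label: {predicted}")
```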
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b7621e37ad4d4f6fcbc2dd378a0cd54809d95ed5dc696b74601388aad8fd29c1
+ size 437967672
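The three lines above are a Git LFS pointer, not the weights themselves. As a rough sketch, the pointer can be parsed into its fields, and the byte size gives an upper bound on the parameter count when divided by 4 bytes per value (`"dtype": "float32"` in `config.json`). The estimate ignores the safetensors header, so the true count is slightly lower.

```python
# Git LFS pointer for model.safetensors, copied from the diff above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:b7621e37ad4d4f6fcbc2dd378a0cd54809d95ed5dc696b74601388aad8fd29c1
size 437967672
"""

# Every pointer line has the form "<key> <value>".
fields = dict(line.split(" ", 1) for line in pointer_text.strip().splitlines())

hash_algo, digest = fields["oid"].split(":", 1)
size_bytes = int(fields["size"])

# float32 stores each parameter in 4 bytes; the file header is ignored here,
# so this is an upper bound rather than an exact count.
approx_params_millions = size_bytes / 4 / 1e6

print(f"hash: {hash_algo}, digest length: {len(digest)} hex chars")
print(f"size: {size_bytes / 1e6:.1f} MB")
print(f"approx. parameters: {approx_params_millions:.1f}M")
```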
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:da8b5145b52abf8c6b75b9176a27f7835aadf54858f97d1969f1d38bb201580d
+ size 25495
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,60 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "<s>",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "extra_special_tokens": {},
+   "mask_token": "<mask>",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff