Ambika14 committed
Commit d48703d · verified · 1 Parent(s): 06f9196

Upload folder using huggingface_hub

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
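Reviewer note: the config above enables only `pooling_mode_mean_tokens`, so the sentence vector is the average of the token embeddings. The following is a minimal sketch of that computation, not the sentence-transformers implementation. The function name `mean_pool`, the plain-list representation, and the toy 2-dimensional vectors are assumptions for illustration; the actual model works on 768-dimensional tensors, as `word_embedding_dimension` indicates.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of tokens whose attention-mask entry is non-zero."""
    dim = len(token_embeddings[0])
    # Keep only real tokens; padding positions carry a mask value of 0.
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    if not kept:
        # No real tokens: return a zero vector instead of dividing by zero.
        return [0.0] * dim
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: two real tokens and one padding token (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))
```

The padding token is excluded by the mask, so only the first two vectors contribute to the average.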
README.md ADDED
@@ -0,0 +1,761 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dense
+ - generated_from_trainer
+ - dataset_size:88
+ - loss:CachedMultipleNegativesRankingLoss
+ base_model: sentence-transformers/all-mpnet-base-v2
+ widget:
+ - source_sentence: udyam registration cancel <udyam_no> is still pending for cancellation
+     and its passing more than <NUM> days issue delayed cancellation of udyam registration
+     context the user is reporting that the cancellation of udyam registration for
+     udyam-up- <NUM> - <NUM> is still pending and has been delayed for more than <NUM>
+     days. details - udyam registration no udyam-up- <NUM> - <NUM> cancellation status
+     pending
+   sentences:
+   - Policy and Schemes. Definition of MSMEs (Clarifications related to definition)
+     Policy. this category pertains to grievances seeking policy interpretation and
+     clarification regarding the definition and classification of micro small and medium
+     enterprises msmes under the micro small and medium enterprises development msmed
+     act <NUM> as amended . the category encompasses disputes or doubts related to
+     the application of turnover investment and structural factors to specific enterprise
+     cases. key issues include turnover and investment threshold calculations treatment
+     of export turnover or goods and services tax gst classification of enterprises
+     as micro small or medium clubbing of multiple units or related businesses under
+     a single msme identity the category also captures concerns arising from the transition
+     between old and revised msme definitions including impact of reclassification
+     on eligibility continuity of benefits already availed applicable financial year
+     for revised criteria grievances in this category are clarification-driven rather
+     than system-error driven arising from the intersection of policy intent numerical
+     calculations and enterprise structure. example issues include turnover classification
+     discrepancies my turnover is within limits but udyam shows a higher msme category
+     please clarify the correct classification as per policy. export turnover treatment
+     export turnover has been included while determining msme status kindly clarify
+     whether it should be excluded. post-migration classification changes after migration
+     from uam to udyam my enterprise category has changed despite no change in investment
+     please confirm if this is correct. revised definition impact on eligibility due
+     to the revised msme definition my eligibility under schemes is affected kindly
+     clarify whether benefits already availed will continue. the operational procedural
+     policy and institutional causes of these grievances include
+   - Policy and Schemes. DBT / IT desk including Annual Report. dbt it desk including
+     the annual report in msme refers to the data dbt wing functioning under the office
+     of the development commissioner msme which is responsible for administering direct
+     benefit transfer dbt of subsidies under msme schemes managing it and digital infrastructure
+     and compiling the ministry s annual report. the wing oversees end-to-end dbt processes
+     for scheme reimbursements such as ict and cloud computing subsidies where msmes
+     initially incur eligible expenses and subsequently receive reimbursements directly
+     into aadhaar-linked bank accounts through the public financial management system
+     often after technical verification by agencies like telecommunications consultants
+     india limited. it ensures compliance with national dbt standards in coordination
+     with the dbt mission and national informatics centre maintains and upgrades msme
+     it systems including the udyam registration portal supports cloud-based it adoption
+     for msmes undertakes data analytics and mis reporting and onboards schemes to
+     the national dbt framework. the wing also prepares the annual report of the ministry
+     of msme consolidating performance indicators financial outlays scheme outcomes
+     udyam registration trends and macro-level contributions such as msme share in
+     gdp and employment which are used for parliament cabinet briefings and policy
+     evaluation. while this framework promotes transparency leak-proof subsidy delivery
+     evidence-based policymaking and digital efficiency stakeholders frequently raise
+     grievances related to dbt execution data accuracy it reliability and reporting
+     quality. examples of grievances include msmes experiencing delays in receipt of
+     approved ict or cloud service subsidies due to pfms transaction or verification
+     glitches reimbursement failures arising from aadhaar bank account linkage mismatches
+     despite valid udyam registration inaccuracies or under-reporting of scheme achievements
+     udyam registrations or msme gdp contribution in the annual report affecting policy
+     advocacy and planning temporary downtime or access issues on udyam or other msme
+     it portals during registration or subsidy claim periods and gaps in mis capture
+     where scheme data duplications or leakages are not properly reflected in dbt dashboards
+     or the annual report prompting appeals for correction and system strengthening.
+   - UAM/Udyam Registration/Certificate related issues. Time Taken for Cancellation
+     of UDYAM Certificate (Technical). this category refers to grievances concerning
+     delays in processing requests for cancellation of an existing udyam registration.
+     when a business owner submits a request to cancel a registration the request is
+     expected to be processed within a reasonable timeframe. however in some cases
+     users report that the cancellation request remains pending for an extended period.
+     grievances under this category usually involve complaints where the enterprise
+     owner has already submitted a cancellation request but the status continues to
+     show as pending or unprocessed. entrepreneurs may also report that they cannot
+     proceed with other actions related to their registration because the cancellation
+     has not yet been completed. in some situations users may have submitted the request
+     multiple times or may be seeking clarification about the delay in processing the
+     cancellation. these grievances are typically raised by msme proprietors partners
+     company directors or authorized representatives who previously requested cancellation
+     of their enterprise registration. business owners who closed their operations
+     or who submitted cancellation due to incorrect registration details may seek updates
+     on the status of their request. compliance managers accountants or consultants
+     handling enterprise registrations may also raise grievances when the cancellation
+     process takes longer than expected or prevents further registration-related actions
+     from being completed.
+ - source_sentence: sri fund for new unit iron handicrafts manufacturing unit issue
+     application for self reliant fund sri fund for new unit context the user is requesting
+     application for self reliant fund sri fund for a new unit specifically for an
+     iron handicrafts manufacturing unit. details - fund type sri fund unit type new
+     unit industry iron handicrafts manufacturing
+   sentences:
+   - UAM/Udyam Registration/Certificate related issues. Updation of Email ID/Mobile
+     No. Linked to UDYAM Certificate. this category includes grievances related to
+     updating or correcting the email id or mobile number associated with an existing
+     udyam registration. contact details provided during registration are used for
+     communication verification and authentication when accessing the enterprise profile
+     on the portal. if these contact details become outdated incorrect or inaccessible
+     the enterprise owner may face difficulty receiving otps accessing the portal or
+     managing the registration information. common grievances under this category include
+     requests to change the registered mobile number or email address because the original
+     number is no longer active the sim card has been lost the email account is no
+     longer accessible or the contact details were entered incorrectly during registration.
+     some complaints arise when the registered contact details belong to an employee
+     or consultant who is no longer associated with the enterprise preventing the current
+     owner from receiving verification messages. in other cases entrepreneurs report
+     that they cannot update contact details because the system requires authentication
+     through the old mobile number or email which they no longer have access to. these
+     grievances are typically raised by msme owners proprietors partners directors
+     of companies or authorized representatives responsible for managing business registrations.
+     small business owners who registered their enterprise personally may request updates
+     when their phone number or email changes. in some cases accountants consultants
+     or administrative staff handling compliance activities may also submit grievances
+     when they cannot access the registration due to outdated contact details. this
+     category therefore represents issues related specifically to correcting or updating
+     communication details associated with an existing udyam certificate.
+   - Marketing and Skilling. Export Promotion/WTO. the export promotion and wto-related
+     initiatives for msmes comprise a set of integrated measures under india s ministry
+     of msme and the ministry of commerce to strengthen the export ecosystem for micro
+     small and medium enterprises. these include the export promotion mission with
+     a long-term financial outlay to support msme exports interest subvention on pre-
+     and post-shipment export credit to reduce borrowing costs credit guarantee coverage
+     for collateral-free export finance and reimbursement support for participation
+     in international trade fairs buyer meets and market development activities with
+     higher assistance for first-time exporters and priority groups. the initiatives
+     also provide policy and legal support to msmes in matters related to wto compliance
+     trade remedies such as anti-dumping cases and dispute settlement issues. together
+     with complementary export facilitation instruments these measures aim to enhance
+     msme competitiveness diversify export markets integrate enterprises into global
+     value chains support labour-intensive sectors and sustain msmes significant contribution
+     to india s overall exports. examples of common grievances under these initiatives
+     include interest subvention limitations an msme exporter reaches the prescribed
+     annual credit ceiling midway through the year resulting in partial interest relief
+     despite continued export shipments. credit guarantee shortfall an exporter seeking
+     higher-value export finance receives lower-than-expected guarantee coverage due
+     to risk assessment norms increasing collateral or margin requirements. trade fair
+     reimbursement rejection a first-time exporter is denied marketing assistance reimbursement
+     because the overseas exhibition attended was not on the approved list despite
+     generating confirmed buyer interest. wto-related support inadequacy an msme facing
+     an anti-dumping investigation receives limited financial assistance for legal
+     and advisory expenses leaving a large portion of costs uncovered. implementation
+     or rollout delays eligible exporters are unable to access benefits during pilot
+     or initial phases due to delays by banks or implementing agencies in operationalising
+     scheme guidelines.
+   - Starter, Credit and Finance. Self Reliant Fund (SRI Fund). the self reliant india
+     sri fund is a category designed to address grievances raised by growth-stage msme
+     owners manufacturing or technology-focused enterprises and startups transitioning
+     into a scale phase. the primary purpose of the sri fund is to provide equity support
+     to scalable msmes. however several issues and challenges hinder its effective
+     implementation leading to grievances from the target beneficiaries. key issues
+     and scenarios <NUM> . inability to access daughter funds lack of clarity on which
+     funds to approach no publicly available list or contact details referred by sidbi
+     to fund managers who do not respond no acknowledgment after submitting an expression
+     of interest <NUM> . excessive delays in the investment process due diligence stretching
+     over many months without a clear decision repeated requests for similar documents
+     frequent postponement of investment committee meetings leaving enterprises in
+     prolonged uncertainty <NUM> . rejection without transparency proposals declined
+     without stated reasons applications marked unsuitable despite meeting published
+     eligibility criteria verbal assurances of support later withdrawn without formal
+     communication <NUM> . eligibility and interpretation disputes fund managers applying
+     scheme guidelines inconsistently disputes over turnover thresholds treating registered
+     msmes as ineligible startups applying unclear sectoral restrictions unevenly across
+     applicants <NUM> . post-approval or post-commitment issues term sheets issued
+     but funds not disbursed conditions altered after approval funds backing out due
+     to internal policy changes operational procedural policy or institutional causes
+ - source_sentence: recently we registered ourselves for udyam registration but after
+     scanning the qr code verification is failing. issue qr code verification failure
+     for udyam registration context the user is reporting that qr code verification
+     is failing after registering for udyam registration. details - registration type
+     udyam registration verification issue qr code verification failure
+   sentences:
+   - Technology, Quality and Institutions. Related to NSIC. this category encompasses
+     grievances related to the support and facilitation services provided by the national
+     small industries corporation nsic to micro small and medium enterprises msmes
+     . the scope of this category includes issues arising from the areas of raw material
+     assistance market access and risk mitigation through guarantees. specifically
+     it covers situations where approved raw material assistance is not released on
+     time supplier coordination fails after nsic approval material supplied through
+     nsic is delayed or does not meet specifications or documentation and regional
+     office processes stall procurement. the category also captures failures in marketing
+     support including - delayed or missing inclusion in tenders gem or psu vendor
+     listings - late communication of bid opportunities - problems in nsic-sponsored
+     exhibitions or buyer-connect programs additionally it includes issues related
+     to performance and emd guarantees such as - delayed issuance - incorrect formats
+     - non-renewal despite payment - rejection by psus - lack of response when guarantees
+     are invoked these grievances typically result in missed orders blocked working
+     capital contract delays or loss of business credibility and arise from execution
+     coordination or service delivery breakdowns rather than policy interpretation.
+     the category is further divided into the following subcategories <NUM> . corporate
+     communication single point registration scheme and exhibition consortia and tender
+     marketing <NUM> . internal audit and law recovery <NUM> . human resource <NUM>
+     . vigilance law recovery <NUM> . international cooperation <NUM> . bank guarantee
+     monitoring <NUM> . finance accounts <NUM> . national sc st hub <NUM> . chief vigilance
+     officer <NUM> . contract procurement grievance officer <NUM> . digital services
+     facilitation and training <NUM> .space marketing cell event management cell <NUM>
+     .raw material assistance bank guarantee bill discounting bank tieup csr administration
+     <NUM> .technology liaison officer for sc st pwd cmr <NUM> .epf trust superannuation
+     pension trust <NUM> .center public information officers cpio <NUM> .company secretary
+   - Policy and Schemes. Definition of MSMEs (Clarifications related to definition)
+     Policy. this category pertains to grievances seeking policy interpretation and
+     clarification regarding the definition and classification of micro small and medium
+     enterprises msmes under the micro small and medium enterprises development msmed
+     act <NUM> as amended . the category encompasses disputes or doubts related to
+     the application of turnover investment and structural factors to specific enterprise
+     cases. key issues include turnover and investment threshold calculations treatment
+     of export turnover or goods and services tax gst classification of enterprises
+     as micro small or medium clubbing of multiple units or related businesses under
+     a single msme identity the category also captures concerns arising from the transition
+     between old and revised msme definitions including impact of reclassification
+     on eligibility continuity of benefits already availed applicable financial year
+     for revised criteria grievances in this category are clarification-driven rather
+     than system-error driven arising from the intersection of policy intent numerical
+     calculations and enterprise structure. example issues include turnover classification
+     discrepancies my turnover is within limits but udyam shows a higher msme category
+     please clarify the correct classification as per policy. export turnover treatment
+     export turnover has been included while determining msme status kindly clarify
+     whether it should be excluded. post-migration classification changes after migration
+     from uam to udyam my enterprise category has changed despite no change in investment
+     please confirm if this is correct. revised definition impact on eligibility due
+     to the revised msme definition my eligibility under schemes is affected kindly
+     clarify whether benefits already availed will continue. the operational procedural
+     policy and institutional causes of these grievances include
+   - UAM/Udyam Registration/Certificate related issues. QR Code Printed on UDYAM Certificate
+     Not Readable (Technical). this category includes grievances related to qr codes
+     printed on the udyam certificate that cannot be scanned or read properly. the
+     qr code is intended to allow quick verification of the certificate and its associated
+     enterprise information. if the qr code cannot be scanned users may face difficulty
+     verifying the certificate or sharing it for official purposes. grievances under
+     this category typically involve situations where the qr code on the downloaded
+     or printed certificate appears blurred distorted or unresponsive when scanned
+     with a qr reader. some users report that the qr code does not open any verification
+     page after scanning while others find that the scanning application fails to recognize
+     the code at all. these issues may arise due to errors during certificate generation
+     problems with the downloaded file or printing-related distortions that make the
+     qr code unreadable. these grievances are generally raised by msme owners proprietors
+     partners directors or authorized representatives who use the udyam certificate
+     as official documentation for their enterprise. small business owners who attempt
+     to share the certificate for verification purposes may discover that the qr code
+     is not functioning correctly. consultants accountants or administrative staff
+     responsible for maintaining business documentation may also submit grievances
+     when they identify that the qr code on the certificate cannot be scanned or verified.
+ - source_sentence: insurancy company national insurance company limited branch name
+     of insurance company branch if other khamgaon branch date of application <NUM>
+     - <NUM> - <NUM> policy number <NUM> my claim is kept pending even after submitting
+     all the documents after changing all the requirements as changed by various surveyors.
+     issue delayed insurance claim under national insurance company limited context
+     the user is reporting that the insurance claim submitted on <NUM> - <NUM> - <NUM>
+     with policy number <NUM> is still pending despite submission of all required documents
+     as per changes made by various surveyors. details - policy number <NUM> claim
+     submission date <NUM> - <NUM> - <NUM> branch khamgaon
+   sentences:
+   - Technology, Quality and Institutions. Official Language Related Issues. official
+     language related issues in msme administration concern the implementation of hindi
+     rajbhasha in accordance with the official languages act <NUM> as amended across
+     the ministry of msme its development institutes field offices and attached organizations.
+     this framework mandates progressive use of hindi in official work bilingual hindi
+     english documentation replies in hindi to communications received in hindi availability
+     of hindi-enabled software on computers and regular training in hindi typing and
+     computing for officials. the ministry monitors compliance through official language
+     implementation committees quarterly progress reviews rajbhasha inspections and
+     conferences while ensuring that citizens charters schemes portals and public-facing
+     information are available bilingually. these measures aim to improve accessibility
+     for hindi-speaking msmes enhance transparency and inclusiveness strengthen regional
+     outreach especially in hindi-belt states and fulfill constitutional and administrative
+     obligations without restricting the use of english where required. examples of
+     grievances include non-hindi reply an msme submits an application or grievance
+     in hindi to a development institute but receives a response only in english contrary
+     to official language correspondence rules. bilingual documentation gap key documents
+     such as annual reports scheme guidelines or notices are issued only in english
+     or with incomplete hindi translations limiting accessibility for hindi-speaking
+     stakeholders. training shortfall field office staff are unable to type or process
+     files in hindi despite mandated hindi software and training provisions causing
+     delays in rajbhasha compliance. portal language issue hindi versions of portals
+     like udyam or champions contain missing pages partial translations or technical
+     glitches preventing rural or hindi-only users from completing registrations or
+     filing grievances. awareness and communication lapse regional msmes are not informed
+     in hindi about official language conferences workshops or policy updates leading
+     to missed participation and reduced stakeholder engagement.
+   - Starter, Credit and Finance. Insurance Claim related issues. this category encompasses
+     grievances related to insurance claims associated with various government-backed
+     and private insurance products. the scope includes <NUM> . esic employees state
+     insurance corporation insurance benefits <NUM> . epfo employees provident fund
+     organisation -linked insurance benefits including edli employees deposit linked
+     insurance <NUM> . cgtmse credit guarantee fund trust for micro and small enterprises
+     -linked insurance elements <NUM> . private or general business insurance products
+     where a government department psu public sector undertaking or bank acts as an
+     intermediary or implementing authority the category covers a range of issues including
+     opaque rejection decisions undocumented policy exclusions administrative closure
+     without explanation shifting of risk and liability onto msmes micro small and
+     medium enterprises or employees document and data mismatches across multiple systems
+     such as aadhaar uan universal account number employer filings bank records insurance
+     portals delays and non-responsiveness at esic epfo insurer field office levels
+     manual bottlenecks officer transfers lack of accountability jurisdictional overlaps
+     involving labour compliance banking conditions inter-agency disputes between insurers
+     banks employers and labour authorities example issues include rejected esic medical
+     reimbursement claims due to ineligibility despite continuous contribution history
+     denied epfo edli insurance claims due to alleged break in service caused by employer-side
+     portal errors rejected bank-linked business insurance claims based on undisclosed
+     policy clauses unhonoured cg
+   - Technology, Quality and Institutions. Support for entrepreneurial and managerial
+     development of SMEs through incubators- an NMCP Scheme. the support for entrepreneurial
+     and managerial development of smes through incubators scheme under the national
+     manufacturing competitiveness programme nmcp is an initiative of the ministry
+     of msme designed to nurture innovative technology-driven and knowledge-based ideas
+     by providing structured incubation support through approved business incubators
+     hosted in technical academic or research institutions. under the scheme financial
+     assistance of up to <NUM> lakh is provided per idea or incubated unit for product
+     development testing validation and commercialisation with an overall ceiling of
+     <NUM> . <NUM> lakh per incubator to support up to <NUM> ventures. in addition
+     host institutions may receive up to <NUM> . <NUM> lakh for minor infrastructure
+     and facility upgrades to strengthen incubation capabilities. the scheme follows
+     a tripartite arrangement among the ministry the host institution and the incubatee
+     with incubated enterprises contributing <NUM> to <NUM> of project costs depending
+     on their category. through access to laboratories workshops shared infrastructure
+     mentoring technical guidance and early-stage seed funding the scheme aims to transform
+     innovative ideas into viable and sustainable micro and small enterprises expand
+     the base of innovation-led entrepreneurship and move msmes beyond traditional
+     manufacturing and service activities. examples of common grievances under the
+     incubator scheme include instalment release delay after approval the host institution
+     receives only a partial initial instalment delaying laboratory setup and stalling
+     progress for multiple approved incubated ventures. idea selection bias a technically
+     sound student or individual entrepreneur proposal is rejected despite meeting
+     eligibility criteria due to preference given to existing msmes by the host incubator.
+     mentoring shortfall an incubated unit receives sanctioned financial assistance
+     but does not get the promised industry mentoring technical handholding or market
+     linkage support needed for commercialization. infrastructure inadequacy the limited
+     infrastructure grant is insufficient to procure essential workshop or testing
+     equipment restricting practical experimentation and prototype development. contribution
+     dispute a micro enterprise is asked to contribute a higher percentage of project
+     cost applicable to small enterprises creating financial strain and disputes during
+     project execution.
+ - source_sentence: dear sir mam i am trying to register udyam with my pan but error
+     showing udyam registration has already done through this pan and i have not registered
+     earlier so please guide me aadhaar <uam_no> pan <pan_no> mobile <phone_no> issue
+     clarification on existing udyam registration context the user is requesting clarification
+     as the udyam registration portal indicates that registration has already been
+     done through the pan although the user states that no registration was made. details
+     - aadhar no <NUM> pan no gnips2021g mobile no <NUM>
+   sentences:
+   - UAM/Udyam Registration/Certificate related issues. Cancellation of UDYAM Certificate
+     Request. this category includes grievances related to requests for cancellation
+     or deactivation of an existing udyam registration. in some cases businesses that
+     were previously registered as msmes may no longer operate may have undergone structural
+     changes or may have been registered incorrectly. when such situations occur the
+     enterprise owner may wish to cancel the existing udyam certificate to prevent
+     incorrect records or to allow proper registration in the future. grievances under
+     this category typically include requests to cancel a registration because the
+     business has permanently closed the enterprise was registered by mistake or the
+     registration was created with incorrect information. some entrepreneurs also request
+     cancellation when duplicate registrations exist for the same enterprise and they
+     want only one valid record to remain. another common grievance arises when the
+     enterprise was registered earlier under outdated or incorrect details and the
+     owner wants the registration cancelled before creating a new one with correct
+     information. these grievances are usually raised by proprietors partners directors
+     of companies or authorized representatives of msmes who are responsible for maintaining
+     the official records of the enterprise. small business owners who registered their
+     enterprises earlier but later discontinued operations may also request cancellation
+     to avoid confusion or misuse of the registration. in some cases accountants consultants
+     or compliance officers working on behalf of the enterprise may submit the grievance
+     if they identify that the existing udyam registration is no longer valid or should
+     be removed from the records.
+   - Marketing and Skilling. National SC ST HUB. national sc-st hub nssh is a central
+     sector scheme launched in <NUM> by the ministry of micro small and medium enterprises
+     and implemented by the national small industries corporation to empower scheduled
+     caste and scheduled tribe entrepreneurs and strengthen their participation in
+     the msme ecosystem. the scheme focuses on capacity building market access financial
+     facilitation and handholding support while also operationalizing the mandatory
+     <NUM> procurement target for sc st owned mses under the public procurement policy
+     for mses <NUM> . through a network of national sc-st hub offices across the country
+     the hub assists eligible sc st entrepreneurs holding at least <NUM> ownership
+     and control in activities such as udyam and gem registration participation in
+     government tenders access to credit and skill upgradation. financial support is
+     provided in the form of reimbursements for testing and certification charges from
+     recognized laboratories bank loan processing and bank guarantee fees membership
+     fees of export promotion councils onboarding costs for e-commerce and government
+     procurement platforms and fees for short-term skill and management training programs
+     at reputed institutions. by reducing entry barriers and providing structured handholding
+     nssh aims to enhance competitiveness ensure inclusive growth and enable sc st
384
+ entrepreneurs to scale up operations and integrate with formal supply chains.
385
+ examples of grievances reported under the scheme include rejection of reimbursement
386
+ claims where testing or certification expenses exceed the prescribed financial
387
+ ceiling despite compliance with quality standards blockage of financial assistance
388
+ due to delays or discrepancies in caste certificate verification even when enterprises
389
+ are otherwise registered as sc st-owned instances where sc st msmes fail to secure
390
+ tenders despite the mandated procurement quota because of non-compliance by procuring
391
+ cpses partial reimbursement of approved training or capacity-building expenses
392
+ owing to scheme-specific limits leading to out-of-pocket costs for entrepreneurs
393
+ and gaps in timely support from local nssh offices particularly in remote or north-eastern
394
+ regions affecting onboarding to procurement portals and access to scheme benefits.
395
+ - UAM/Udyam Registration/Certificate related issues. Existing / Unauthorized UDYAM
396
+ Registration Against PAN. this category refers to grievances where an entrepreneur
397
+ discovers that a udyam registration already exists against their pan either due
398
+ to duplicate registration or because someone else created the registration without
399
+ their authorization. since pan is used as a key identifier for enterprise registration
400
+ the presence of an existing registration can prevent the legitimate owner from
401
+ creating a new one or managing the enterprise details. grievances under this category
402
+ usually include complaints about duplicate registrations created for the same
403
+ enterprise or multiple registrations linked to the same pan. some business owners
404
+ report that when they attempt to register their enterprise the system indicates
405
+ that a registration already exists even though they are unaware of creating one
406
+ earlier. in other cases entrepreneurs may find that an employee consultant former
407
+ partner or third party registered the enterprise using the business pan without
408
+ informing the owner. there may also be situations where an earlier registration
409
+ contains incorrect enterprise information leading to confusion about the valid
410
+ record. such grievances are generally raised by business proprietors partners
411
+ of partnership firms directors of companies or authorized representatives responsible
412
+ for registering the enterprise under msme. these complaints may also be submitted
413
+ by compliance managers accountants or consultants who are attempting to complete
414
+ the msme registration process for the business but encounter an existing record
415
+ linked to the pan. the purpose of raising this grievance is to identify the existing
416
+ registration verify its legitimacy and resolve conflicts arising from duplicate
417
+ or unauthorized registrations associated with the enterprise s pan.
418
+ pipeline_tag: sentence-similarity
419
+ library_name: sentence-transformers
420
+ metrics:
421
+ - pearson_cosine
422
+ - spearman_cosine
423
+ model-index:
424
+ - name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
425
+ results:
426
+ - task:
427
+ type: semantic-similarity
428
+ name: Semantic Similarity
429
+ dataset:
430
+ name: Unknown
431
+ type: unknown
432
+ metrics:
433
+ - type: pearson_cosine
434
+ value: .nan
435
+ name: Pearson Cosine
436
+ - type: spearman_cosine
437
+ value: .nan
438
+ name: Spearman Cosine
439
+ ---
440
+
441
+ # SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
442
+
443
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
444
+
445
+ ## Model Details
446
+
447
+ ### Model Description
448
+ - **Model Type:** Sentence Transformer
449
+ - **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) <!-- at revision e8c3b32edf5434bc2275fc9bab85f82640a19130 -->
450
+ - **Maximum Sequence Length:** 128 tokens
451
+ - **Output Dimensionality:** 768 dimensions
452
+ - **Similarity Function:** Cosine Similarity
453
+ <!-- - **Training Dataset:** Unknown -->
454
+ <!-- - **Language:** Unknown -->
455
+ <!-- - **License:** Unknown -->
456
+
457
+ ### Model Sources
458
+
459
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
460
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
461
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
462
+
463
+ ### Full Model Architecture
464
+
465
+ ```
466
+ SentenceTransformer(
467
+ (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'MPNetModel'})
468
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
469
+ (2): Normalize()
470
+ )
471
+ ```
472
+
473
+ ## Usage
474
+
475
+ ### Direct Usage (Sentence Transformers)
476
+
477
+ First install the Sentence Transformers library:
478
+
479
+ ```bash
480
+ pip install -U sentence-transformers
481
+ ```
482
+
483
+ Then you can load this model and run inference.
484
+ ```python
485
+ from sentence_transformers import SentenceTransformer
486
+
487
+ # Download from the 🤗 Hub
488
+ model = SentenceTransformer("sentence_transformers_model_id")
489
+ # Run inference
490
+ sentences = [
491
+ 'dear sir mam i am trying to register udyam with my pan but error showing udyam registration has already done through this pan and i have not registered earlier so please guide me aadhaar <uam_no> pan <pan_no> mobile <phone_no> issue clarification on existing udyam registration context the user is requesting clarification as the udyam registration portal indicates that registration has already been done through the pan although the user states that no registration was made. details - aadhar no <NUM> pan no gnips2021g mobile no <NUM>',
492
+ 'UAM/Udyam Registration/Certificate related issues. Existing / Unauthorized UDYAM Registration Against PAN. this category refers to grievances where an entrepreneur discovers that a udyam registration already exists against their pan either due to duplicate registration or because someone else created the registration without their authorization. since pan is used as a key identifier for enterprise registration the presence of an existing registration can prevent the legitimate owner from creating a new one or managing the enterprise details. grievances under this category usually include complaints about duplicate registrations created for the same enterprise or multiple registrations linked to the same pan. some business owners report that when they attempt to register their enterprise the system indicates that a registration already exists even though they are unaware of creating one earlier. in other cases entrepreneurs may find that an employee consultant former partner or third party registered the enterprise using the business pan without informing the owner. there may also be situations where an earlier registration contains incorrect enterprise information leading to confusion about the valid record. such grievances are generally raised by business proprietors partners of partnership firms directors of companies or authorized representatives responsible for registering the enterprise under msme. these complaints may also be submitted by compliance managers accountants or consultants who are attempting to complete the msme registration process for the business but encounter an existing record linked to the pan. the purpose of raising this grievance is to identify the existing registration verify its legitimacy and resolve conflicts arising from duplicate or unauthorized registrations associated with the enterprise s pan.',
493
+ 'Marketing and Skilling. National SC ST HUB. national sc-st hub nssh is a central sector scheme launched in <NUM> by the ministry of micro small and medium enterprises and implemented by the national small industries corporation to empower scheduled caste and scheduled tribe entrepreneurs and strengthen their participation in the msme ecosystem. the scheme focuses on capacity building market access financial facilitation and handholding support while also operationalizing the mandatory <NUM> procurement target for sc st owned mses under the public procurement policy for mses <NUM> . through a network of national sc-st hub offices across the country the hub assists eligible sc st entrepreneurs holding at least <NUM> ownership and control in activities such as udyam and gem registration participation in government tenders access to credit and skill upgradation. financial support is provided in the form of reimbursements for testing and certification charges from recognized laboratories bank loan processing and bank guarantee fees membership fees of export promotion councils onboarding costs for e-commerce and government procurement platforms and fees for short-term skill and management training programs at reputed institutions. by reducing entry barriers and providing structured handholding nssh aims to enhance competitiveness ensure inclusive growth and enable sc st entrepreneurs to scale up operations and integrate with formal supply chains. examples of grievances reported under the scheme include rejection of reimbursement claims where testing or certification expenses exceed the prescribed financial ceiling despite compliance with quality standards blockage of financial assistance due to delays or discrepancies in caste certificate verification even when enterprises are otherwise registered as sc st-owned instances where sc st msmes fail to secure tenders despite the mandated procurement quota because of non-compliance by procuring cpses partial reimbursement of approved training or capacity-building expenses owing to scheme-specific limits leading to out-of-pocket costs for entrepreneurs and gaps in timely support from local nssh offices particularly in remote or north-eastern regions affecting onboarding to procurement portals and access to scheme benefits.',
494
+ ]
495
+ embeddings = model.encode(sentences)
496
+ print(embeddings.shape)
497
+ # (3, 768)
498
+
499
+ # Get the similarity scores for the embeddings
500
+ similarities = model.similarity(embeddings, embeddings)
501
+ print(similarities)
502
+ # tensor([[1.0000, 0.7751, 0.1988],
503
+ # [0.7751, 1.0000, 0.2777],
504
+ # [0.1988, 0.2777, 1.0000]])
505
+ ```
506
+
507
+ <!--
508
+ ### Direct Usage (Transformers)
509
+
510
+ <details><summary>Click to see the direct usage in Transformers</summary>
511
+
512
+ </details>
513
+ -->
514
+
515
+ <!--
516
+ ### Downstream Usage (Sentence Transformers)
517
+
518
+ You can finetune this model on your own dataset.
519
+
520
+ <details><summary>Click to expand</summary>
521
+
522
+ </details>
523
+ -->
524
+
525
+ <!--
526
+ ### Out-of-Scope Use
527
+
528
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
529
+ -->
530
+
531
+ ## Evaluation
532
+
533
+ ### Metrics
534
+
535
+ #### Semantic Similarity
536
+
537
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
538
+
539
+ | Metric | Value |
540
+ |:--------------------|:--------|
541
+ | pearson_cosine | nan |
542
+ | **spearman_cosine** | **nan** |
543
+
544
+ <!--
545
+ ## Bias, Risks and Limitations
546
+
547
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
548
+ -->
549
+
550
+ <!--
551
+ ### Recommendations
552
+
553
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
554
+ -->
555
+
556
+ ## Training Details
557
+
558
+ ### Training Dataset
559
+
560
+ #### Unnamed Dataset
561
+
562
+ * Size: 88 training samples
563
+ * Columns: <code>sentence_0</code> and <code>sentence_1</code>
564
+ * Approximate statistics based on the first 88 samples:
565
+ | | sentence_0 | sentence_1 |
566
+ |:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
567
+ | type | string | string |
568
+ | details | <ul><li>min: 46 tokens</li><li>mean: 118.39 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 128 tokens</li><li>mean: 128.0 tokens</li><li>max: 128 tokens</li></ul> |
569
+ * Samples:
570
+ | sentence_0 | sentence_1 |
571
+ |:-----------|:-----------|
572
+ | <code>sub - request for clarification on msme dev act <NUM> . dear sir <NUM> . your august office is kindly requested to define the specific word _x0080__x009c_tender _x0080__x009d_ as referred in the public procurement policy for micro and small enterprises mse _x0080__x0099_s order gazetted notification no. d.l.- <NUM> <NUM> dtd. <NUM> . <NUM> . <NUM> sub sec <NUM> heading as price quotation in tenders and further word _x0080__x009c_rate contract _x0080__x009d_ as referred in sub sec <NUM> developing micro and small enterprises vendors before substitution dt. <NUM> . <NUM> . <NUM> as extracted. _x0080__x009c_7. developing micro and small enterprise vendors. _x0080__x0093_the central ministries or departments or public sector undertakings shall take necessary steps to develop appropriate vendors by organizing vendor development programmes or buyer-seller meets and entering into rate contract with micro and small enterprises for a specified period in respect of periodic requirements also. _x...</code> | <code>Policy and Schemes. Related to Public Procurement by PSUs. this category pertains to grievances related to public sector undertakings psus violating or diluting mandatory msme procurement norms under the public procurement policy for msmes <NUM> and related guidelines including gem . the scope encompasses cases where psus fail to meet prescribed msme procurement quotas deny msmes their l1 price-matching rights bypass eligible msme vendors despite valid registration or design tenders with disproportionate eligibility conditions that effectively exclude msmes. key issues and scenarios within this category include failure to meet msme procurement quotas denial of l1 price-matching rights to msmes bypassing eligible msme vendors despite valid registration designing tenders with disproportionate eligibility conditions such as excessive turnover requirements prior psu experience requirements high emd pbg requirements unnecessary technical specifications post-award payment delays including wi...</code> |
573
+ | <code>banks approved clcs-tu loan for new machines but subsidy claim is rejected over minor tech list mismatch despite empanelled vendor. this ties up my finance without <NUM> aid. release subsidy and simplify verification for tech upgrades.special clcs-tu for sc st promises <NUM> subsidy but nodal agency delays processing my plant machinery finance claim for months with extra document demands. please fast-track special aid and approve higher subsidy for sc st beginners. issue delayed subsidy claim and non-approval under clcs-tu and special clcs for sc st context the user is reporting delayed subsidy claim and non-approval under clcs-tu and special clcs for sc st schemes citing minor technical list mismatch and extra document demands and requesting simplification of verification and fast-tracking of special aid. details - issue with clcs-tu loan minor tech list mismatch issue with special clcs for sc st delayed processing and extra document demands requested action simplify verification and ...</code> | <code>Starter, Credit and Finance. Credit Linked Capital Subsidy for Technology Upgradation (CLCS- TU) & Special CLCS for SC&ST. credit linked capital subsidy scheme for technology upgradation clcss tu and the special clcs for sc st entrepreneurs is a flagship technology modernisation program of the ministry of micro small and medium enterprises designed to help micro and small manufacturing enterprises upgrade to proven state-of-the-art technologies. under the standard clcss tu eligible mses receive an upfront capital subsidy of <NUM> on institutional term loans used for purchasing approved plant and machinery subject to a maximum subsidy of <NUM> lakh on an eligible investment ceiling of <NUM> crore across notified sub-sectors. the scheme is implemented through nodal agencies such as small industries development bank of india national bank for agriculture and rural development and national institute for entrepreneurship and small business development with technical vetting by expert bodies...</code> |
574
+ | <code>i am unable to change enterprise name or trade name in my udayam certificate pls give proper solution issue update of enterprise trade name in udyam certificate context the user is requesting an update of the enterprise trade name in the udyam certificate. details - enterprise trade name update required</code> | <code>UAM/Udyam Registration/Certificate related issues. Update Company/Owner Name Details. this category includes grievances related to corrections or updates to the name of the enterprise or the name of the owner associated with a udyam registration. accurate naming details are important for maintaining correct enterprise records and ensuring that the information recorded in the registration reflects the official business identity. grievances under this category typically arise when the name of the enterprise or the owner s name recorded during registration contains an error or needs to be updated due to changes in the business structure. for example the enterprise name may have been entered incorrectly during registration or the owner s name may not match official identification documents. in some cases the enterprise name may change due to business rebranding conversion of the business structure or correction of typographical errors made during the registration process. users may also re...</code> |
575
+ * Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
576
+ ```json
577
+ {
578
+ "scale": 20.0,
579
+ "similarity_fct": "cos_sim",
580
+ "mini_batch_size": 32,
581
+ "gather_across_devices": false
582
+ }
583
+ ```
584
+
585
+ ### Training Hyperparameters
586
+ #### Non-Default Hyperparameters
587
+
588
+ - `per_device_train_batch_size`: 32
589
+ - `per_device_eval_batch_size`: 32
590
+ - `num_train_epochs`: 5
591
+ - `fp16`: True
592
+ - `multi_dataset_batch_sampler`: round_robin
593
+
594
+ #### All Hyperparameters
595
+ <details><summary>Click to expand</summary>
596
+
597
+ - `do_predict`: False
598
+ - `eval_strategy`: no
599
+ - `prediction_loss_only`: True
600
+ - `per_device_train_batch_size`: 32
601
+ - `per_device_eval_batch_size`: 32
602
+ - `gradient_accumulation_steps`: 1
603
+ - `eval_accumulation_steps`: None
604
+ - `torch_empty_cache_steps`: None
605
+ - `learning_rate`: 5e-05
606
+ - `weight_decay`: 0.0
607
+ - `adam_beta1`: 0.9
608
+ - `adam_beta2`: 0.999
609
+ - `adam_epsilon`: 1e-08
610
+ - `max_grad_norm`: 1
611
+ - `num_train_epochs`: 5
612
+ - `max_steps`: -1
613
+ - `lr_scheduler_type`: linear
614
+ - `lr_scheduler_kwargs`: None
615
+ - `warmup_ratio`: None
616
+ - `warmup_steps`: 0
617
+ - `log_level`: passive
618
+ - `log_level_replica`: warning
619
+ - `log_on_each_node`: True
620
+ - `logging_nan_inf_filter`: True
621
+ - `enable_jit_checkpoint`: False
622
+ - `save_on_each_node`: False
623
+ - `save_only_model`: False
624
+ - `restore_callback_states_from_checkpoint`: False
625
+ - `use_cpu`: False
626
+ - `seed`: 42
627
+ - `data_seed`: None
628
+ - `bf16`: False
629
+ - `fp16`: True
630
+ - `bf16_full_eval`: False
631
+ - `fp16_full_eval`: False
632
+ - `tf32`: None
633
+ - `local_rank`: -1
634
+ - `ddp_backend`: None
635
+ - `debug`: []
636
+ - `dataloader_drop_last`: False
637
+ - `dataloader_num_workers`: 0
638
+ - `dataloader_prefetch_factor`: None
639
+ - `disable_tqdm`: False
640
+ - `remove_unused_columns`: True
641
+ - `label_names`: None
642
+ - `load_best_model_at_end`: False
643
+ - `ignore_data_skip`: False
644
+ - `fsdp`: []
645
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
646
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
647
+ - `parallelism_config`: None
648
+ - `deepspeed`: None
649
+ - `label_smoothing_factor`: 0.0
650
+ - `optim`: adamw_torch_fused
651
+ - `optim_args`: None
652
+ - `group_by_length`: False
653
+ - `length_column_name`: length
654
+ - `project`: huggingface
655
+ - `trackio_space_id`: trackio
656
+ - `ddp_find_unused_parameters`: None
657
+ - `ddp_bucket_cap_mb`: None
658
+ - `ddp_broadcast_buffers`: False
659
+ - `dataloader_pin_memory`: True
660
+ - `dataloader_persistent_workers`: False
661
+ - `skip_memory_metrics`: True
662
+ - `push_to_hub`: False
663
+ - `resume_from_checkpoint`: None
664
+ - `hub_model_id`: None
665
+ - `hub_strategy`: every_save
666
+ - `hub_private_repo`: None
667
+ - `hub_always_push`: False
668
+ - `hub_revision`: None
669
+ - `gradient_checkpointing`: False
670
+ - `gradient_checkpointing_kwargs`: None
671
+ - `include_for_metrics`: []
672
+ - `eval_do_concat_batches`: True
673
+ - `auto_find_batch_size`: False
674
+ - `full_determinism`: False
675
+ - `ddp_timeout`: 1800
676
+ - `torch_compile`: False
677
+ - `torch_compile_backend`: None
678
+ - `torch_compile_mode`: None
679
+ - `include_num_input_tokens_seen`: no
680
+ - `neftune_noise_alpha`: None
681
+ - `optim_target_modules`: None
682
+ - `batch_eval_metrics`: False
683
+ - `eval_on_start`: False
684
+ - `use_liger_kernel`: False
685
+ - `liger_kernel_config`: None
686
+ - `eval_use_gather_object`: False
687
+ - `average_tokens_across_devices`: True
688
+ - `use_cache`: False
689
+ - `prompts`: None
690
+ - `batch_sampler`: batch_sampler
691
+ - `multi_dataset_batch_sampler`: round_robin
692
+ - `router_mapping`: {}
693
+ - `learning_rate_mapping`: {}
694
+
695
+ </details>
696
+
697
+ ### Training Logs
698
+ | Epoch | Step | spearman_cosine |
699
+ |:-----:|:----:|:---------------:|
700
+ | 1.0 | 3 | nan |
701
+ | 2.0 | 6 | nan |
702
+ | 3.0 | 9 | nan |
703
+ | 4.0 | 12 | nan |
704
+ | 5.0 | 15 | nan |
705
+
706
+
707
+ ### Framework Versions
708
+ - Python: 3.12.12
709
+ - Sentence Transformers: 5.2.3
710
+ - Transformers: 5.0.0
711
+ - PyTorch: 2.10.0+cu128
712
+ - Accelerate: 1.12.0
713
+ - Datasets: 4.0.0
714
+ - Tokenizers: 0.22.2
715
+
716
+ ## Citation
717
+
718
+ ### BibTeX
719
+
720
+ #### Sentence Transformers
721
+ ```bibtex
722
+ @inproceedings{reimers-2019-sentence-bert,
723
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
724
+ author = "Reimers, Nils and Gurevych, Iryna",
725
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
726
+ month = "11",
727
+ year = "2019",
728
+ publisher = "Association for Computational Linguistics",
729
+ url = "https://arxiv.org/abs/1908.10084",
730
+ }
731
+ ```
732
+
733
+ #### CachedMultipleNegativesRankingLoss
734
+ ```bibtex
735
+ @misc{gao2021scaling,
736
+ title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
737
+ author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
738
+ year={2021},
739
+ eprint={2101.06983},
740
+ archivePrefix={arXiv},
741
+ primaryClass={cs.LG}
742
+ }
743
+ ```
744
+
745
+ <!--
746
+ ## Glossary
747
+
748
+ *Clearly define terms in order to be accessible across audiences.*
749
+ -->
750
+
751
+ <!--
752
+ ## Model Card Authors
753
+
754
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
755
+ -->
756
+
757
+ <!--
758
+ ## Model Card Contact
759
+
760
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
761
+ -->
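The README above lists `CachedMultipleNegativesRankingLoss` with `scale: 20.0` and `similarity_fct: cos_sim`. As a rough illustration only (this is not the library's implementation), the underlying in-batch-negatives objective can be sketched in plain Python: each `sentence_0` is scored against every `sentence_1` in the batch, the cosine similarities are multiplied by the scale, and cross-entropy is taken with the matching pair as the target. The function names are hypothetical, and the gradient caching that the "Cached" variant adds to save memory is left out.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def in_batch_negatives_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the scaled similarities.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)

# Toy batch: two anchors that match their own positives exactly.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
loss_aligned = in_batch_negatives_loss(anchors, positives)
loss_swapped = in_batch_negatives_loss(anchors, positives[::-1])
```

The aligned batch gives a loss near zero, while swapping the positives pushes it to roughly the scale value. With 88 training pairs and a batch size of 32, and assuming the final partial batch is kept (`dataloader_drop_last: False`), the last batch of each epoch would hold 24 pairs, so its anchors see fewer in-batch negatives than the others.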
config.json ADDED
@@ -0,0 +1,24 @@
1
+ {
2
+ "architectures": [
3
+ "MPNetModel"
4
+ ],
5
+ "attention_probs_dropout_prob": 0.1,
6
+ "bos_token_id": 0,
7
+ "dtype": "float32",
8
+ "eos_token_id": 2,
9
+ "hidden_act": "gelu",
10
+ "hidden_dropout_prob": 0.1,
11
+ "hidden_size": 768,
12
+ "initializer_range": 0.02,
13
+ "intermediate_size": 3072,
14
+ "layer_norm_eps": 1e-05,
15
+ "max_position_embeddings": 514,
16
+ "model_type": "mpnet",
17
+ "num_attention_heads": 12,
18
+ "num_hidden_layers": 12,
19
+ "pad_token_id": 1,
20
+ "relative_attention_num_buckets": 32,
21
+ "tie_word_embeddings": true,
22
+ "transformers_version": "5.0.0",
23
+ "vocab_size": 30527
24
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "5.2.3",
4
+ "transformers": "5.0.0",
5
+ "pytorch": "2.10.0+cu128"
6
+ },
7
+ "model_type": "SentenceTransformer",
8
+ "prompts": {
9
+ "query": "",
10
+ "document": ""
11
+ },
12
+ "default_prompt_name": null,
13
+ "similarity_fn_name": "cosine"
14
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1106aaac2105664f8b6024d114980b3eeacaa356bbbb07efb7018473fb2c8c01
3
+ size 437967648
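The Git LFS pointer above records only the hash and byte size of `model.safetensors`. As a sanity check, the size can be converted into an approximate parameter count, assuming every tensor is stored as `float32` (the `dtype` reported in `config.json`) and ignoring the small JSON header that the safetensors format places at the start of the file. The helper name is hypothetical.

```python
def approx_param_count(file_size_bytes: int, bytes_per_param: int = 4) -> int:
    """Upper bound on the parameter count: file size divided by bytes per value."""
    return file_size_bytes // bytes_per_param

size_bytes = 437_967_648  # "size" field from the LFS pointer
params = approx_param_count(size_bytes)
print(f"~{params / 1e6:.1f}M parameters")
```

Because the header is not subtracted, the true count is slightly lower than this estimate, which is still in the range expected for a 12-layer encoder with hidden size 768.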
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
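`modules.json` chains three modules: the transformer, the pooling layer configured in `1_Pooling/config.json` (`pooling_mode_mean_tokens: true`), and a normalization step. A minimal pure-Python sketch of what the last two stages compute is given here, assuming token embeddings and an attention mask are already available. The function names and the toy vectors are illustrative and not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, skipping padding."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for j, x in enumerate(vec):
                total[j] += x
    # Guard against an all-padding input to avoid division by zero.
    return [x / max(count, 1) for x in total]

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]

# Three token vectors; the last one is padding and must be ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vec = l2_normalize(mean_pool(tokens, mask))
```

Since the final vectors have unit length, the cosine similarity between two sentence embeddings reduces to their dot product.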
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 128,
3
+ "do_lower_case": false
4
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,16 @@
1
+ {
2
+ "backend": "tokenizers",
3
+ "bos_token": "<s>",
4
+ "cls_token": "<s>",
5
+ "do_lower_case": true,
6
+ "eos_token": "</s>",
7
+ "is_local": false,
8
+ "mask_token": "<mask>",
9
+ "model_max_length": 384,
10
+ "pad_token": "<pad>",
11
+ "sep_token": "</s>",
12
+ "strip_accents": null,
13
+ "tokenize_chinese_chars": true,
14
+ "tokenizer_class": "MPNetTokenizer",
15
+ "unk_token": "[UNK]"
16
+ }
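The configs in this commit state three different length limits: `max_seq_length: 128` in `sentence_bert_config.json`, `model_max_length: 384` in `tokenizer_config.json`, and `max_position_embeddings: 514` in `config.json`. The README reports 128 tokens as the maximum sequence length, so that is the limit inputs are cut to. The sketch below is a simplified stand-in for what the tokenizer does, not its actual code: it wraps content token ids in the `<s>` and `</s>` ids taken from `config.json` and truncates to the limit. The function name is hypothetical.

```python
BOS_ID, EOS_ID = 0, 2  # bos_token_id / eos_token_id from config.json

def encode_with_truncation(content_ids, max_seq_length=128):
    """Wrap content token ids in <s> ... </s>, truncating to max_seq_length."""
    budget = max_seq_length - 2  # leave room for the two special tokens
    return [BOS_ID] + list(content_ids[:budget]) + [EOS_ID]

long_input = list(range(10, 510))   # 500 placeholder token ids
short_input = [11, 12, 13, 14, 15]  # 5 placeholder token ids
long_encoded = encode_with_truncation(long_input)
short_encoded = encode_with_truncation(short_input)
```

In the README's training statistics every `sentence_1` reaches the 128-token cap, which suggests that the long category descriptions were truncated and only their opening portion contributed to training.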