andreinsardi committed · Commit 6ae175a · verified · 1 Parent(s): afc45b0

Initial release of SciBERT-SolarPhysics-Search (fine-tuned for solar physics)
1_Pooling/config.json ADDED
```json
{
    "word_embedding_dimension": 768,
    "pooling_mode_cls_token": false,
    "pooling_mode_mean_tokens": true,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
```
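This config enables exactly one pooling mode, masked mean pooling (`pooling_mode_mean_tokens: true`), which averages the token embeddings of non-padding tokens into one 768-dimensional sentence vector. A minimal numpy sketch of that computation; the function name `mean_pool` is illustrative, not part of the library:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Masked mean pooling, matching pooling_mode_mean_tokens=true above.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid divide-by-zero
    return summed / counts

# toy check: two tokens, the second masked out as padding
emb = np.array([[[1.0, 3.0], [5.0, 7.0]]])   # (1, 2, 2)
mask = np.array([[1, 0]])
print(mean_pool(emb, mask))                   # [[1. 3.]]
```

Padded positions contribute nothing to the sum and are excluded from the count, so sentences of different lengths pool correctly within one batch.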
README.md ADDED
---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:36416
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: deep learning; magnetic field measurement; muon g-2 experiment;
    tracking reconstruction
  sentences:
  - 'in this study we test 18 versions of five fundamental energy scaling laws that
    operate in large solar flares. we express scaling laws in terms of the magnetic
    potential field energy e<inf>p</inf>, the mean potential field strength b<inf>p</inf>,
    the free energy e<inf>free</inf>, the dissipated magnetic flare energy e<inf>diss</inf>,
    the magnetic length scale l, the thermal length scale l<inf>th</inf>, the mean
    helically twisted flux tube radius r, the sunspot radius r, the emission-measure-weighted
    flare temperature t<inf>e</inf>, the electron density n<inf>e</inf>, and the total
    emission measure em, measured from a data set of 173 goes m- and x-class flare
    events. the five categories of physical scaling laws include (i) a scaling law
    of the potential field energy, (ii) a scaling law for helical twisting, (iii)
    a scaling law for petschek-type magnetic reconnection, (iv) the rosner–tucker–vaiana
    scaling law, and (v) the shibata–yokoyama scaling law. we test the self-consistency
    of these theoretical scaling laws with observed parameters by requiring two criteria:
    a cross-correlation coefficient of ccc > 0.5 between the theoretically predicted
    scaling laws and observed values, and a linear regression fit with a slope of
    α ≈ 1 within one standard deviation σ. these two criteria enable us (i) to corroborate
    some existing (or modified) scaling laws, (ii) to reject other scaling laws that
    are not consistent with the observations, (iii) to probe the dimensionality of
    flare geometries, and (iv) to predict various energy parameters based on tested
    scaling laws. © 2020 elsevier b.v., all rights reserved.'
  - the run1 result of the fermilab muon g-2 experiment have shown a 4.2 standard
    deviation between the experimental measurement and theoretical prediction of a<inf>μ</inf>,
    strongly indicating a new physics signal. the fermilab experiment already accumulated
    21 times more data compared to the bnl experiment. the j-parc muon g-2 experiment
    will collect 3.5 times the statistics compared to fermilab. with the increases
    in the collected data volume, and limited by the speed and accuracy, the existing
    tracking reconstruction and magnetic field measurement method may not fully satisfy
    the requirement of the experiment. the breakthrough of the deep learning inspires
    new analysis method in the muon g-2 experiment. in this proceeding, we will present
    some preliminary research on the tracking reconstruction based on recurrent neural
    network (rnn), graph neural network (gnn) and the magnetic field measurement based
    on physics informed neural network (pinn). the preliminary results show that the
    deep learning method has enormous potential in these topics. © 2024 elsevier b.v.,
    all rights reserved.
  - derived from the boltzmann equation, the neutron transport equation describes
    the motions and interactions of neutrons with nuclei in nuclear devices such as
    nuclear reactors. the collision or fission effect are described as integral terms
    which arrive in an integro-differential neutron transport equation (idnt). only
    for mono-material or simple geometries conditions, elegant approximation can simplify
    the transport equation to provide analytic solutions. to solve this integro-differential
    equation becomes a practical engineering challenge. recent development of deep-learning
    techniques provides a new approach to solve them but for some complicated conditions,
    it is also time consuming. to optimize solving the integro-differential equation
    particularly under the deep-learning method, we propose to convert the integral
    terms in the integro-differential neutron transport equation into their corresponding
    antiderivatives, providing a set of fixed solution constraint conditions for these
    antiderivatives, thus yielding an exact differential neutron transport equation
    (ednt). the paper elucidates the physical meaning of the antiderivatives and analyzes
    the continuity and computational complexity of the new transport equation form.
    to illustrate the significant advantage of endt, numerical validations have been
    conducted using various numerical methods on typical benchmark problems. the numerical
    experiments demonstrate that the ednt is compatible with various numerical methods,
    including the finite difference method (fdm), finite volume method (fvm), and
    pinn. compared to the idnt, the ednt offers significant efficiency advantages,
    with reductions in computational time ranging from several times to several orders
    of magnitude. this ednt approach may also be applicable for other integro-differential
    transport theories such as radiative energy transport and has potential application
    in astrophysics or other fields. © 2025 elsevier b.v., all rights reserved.
- source_sentence: observations of equatorial plasma bubbles using a low-cost 630.0-nm
    all-sky imager in ishigaki island, japan
  sentences:
  - prediction of wave propagation in vegetated waters is crucial for the design and
    maintenance of coastal ecological protection systems. in this study, we propose
    a physics-informed neural network (pinn) model that incorporates physical constraints
    from the boussinesq equations for modeling wave propagation processes in vegetated
    waters. the results demonstrate that the pinn model effectively captures the evolution
    of regular wave propagation in rigid, non-submerged vegetated waters. compared
    to conventional numerical models, the pinn approach offers a more efficient preprocessing
    framework while maintaining comparable simulation accuracy with an average coefficient
    of determination (r<sup>2</sup>) of 0.942, an average root mean square error (rmse)
    of 1.84 × 10<sup>−3</sup> m and an average mean absolute error (mae) of 1.19 ×
    10<sup>−3</sup> m. moreover, the parametric inference framework embedded within
    pinn enables precise determination of the optimal drag coefficient (c<inf>d</inf>)
    through systematic assimilation of experimental measurements. additionally, the
    accuracy of both the simulation and the inferred c<inf>d</inf> improves as more
    external data are integrated into the model. © 2025 elsevier b.v., all rights
    reserved.
  - 'here, we introduce a low-cost airglow imaging system developed for observing
    plasma bubble signatures in 630.0-nm airglow emission from the f region of the
    ionosphere. the system is composed of a small camera, optical filter, and fish-eye
    lens, and is operated using free software that automatically records video from
    the camera. a pilot system was deployed in ishigaki island in the southern part
    of japan (lat 24.4, lon 124.4, mlat 19.6) and was operated for ~ 1.5 years from
    2014 to 2016 corresponding to the recent solar maximum period. the pilot observations
    demonstrated that it was difficult to identify the plasma bubble signature in
    the raw image captured every 4 s. however, the quality of the image could be improved
    by reducing the random noise of instrumental origin through an integration of
    30 consecutive raw images obtained in 2 min and further by subtracting the 1-h
    averaged background image. we compared the deviation images to those from a co-existing
    airglow imager of omtis, which is equipped with a back-illuminated cooled ccd
    camera with a high quantum efficiency of ~ 90%. it was confirmed that the low-cost
    airglow imager is capable of imaging the spatial structure of plasma bubbles,
    including their bifurcating traces. the results of these pilot observations in
    ishigaki island will allow us to distribute the low-cost imager in a wide area
    and construct a network for monitoring plasma bubbles and their space weather
    impacts on satellite navigation systems.[figure not available: see fulltext.].
    © 2020 elsevier b.v., all rights reserved.'
  - physics-informed neural networks are used to characterize the mass transport to
    the rotating disk electrode (rde), the most widely employed hydrodynamic electrode
    in electroanalysis. the pinn approach was first quantitatively verified via 1d
    simulations under the levich approximation for cyclic voltammetry and chronoamperometry,
    allowing comparison of the results with finite difference simulations and analytical
    equations. however, the levich approximation is only accurate for high schmidt
    numbers (sc > 1000). the pinn approach allowed consideration of smaller sc, achieving
    an analytical level of accuracy (error <0.1%) comparable with independent numerical
    evaluation and confirming that the errors in the levich equation can be as high
    as 3% when sc = 1000 for rapidly diffusing species in aqueous solution. entirely
    novel, the pinns permit the solution of the 2d diffusion equation under cylindrical
    geometry incorporating radial diffusion and reveal the rotating disk electrode
    edge effect as a consequence of the nonuniform accessibility of the disc with
    greater currents flowing near the extremities. the contribution to the total current
    is quantified as a function of the rotation speed, disk radius, and analyte diffusion
    coefficient. the success in extending the theory for the rotating disk electrode
    beyond the levich equation shows that pinns can be an easier and more powerful
    substitute for conventional methods, both analytical and simulation based. © 2023
    elsevier b.v., all rights reserved.
- source_sentence: aerodynamics; channel estimation; channel flow; image enhancement;
    optical flows; stream flow; velocimeters; vorticity; wakes; computer vision; fluid-dynamics;
    generalisation; image velocimetry; learning approach; optical-; particle image
    velocimetry; particle images; performance; streamflow monitoring; hydrodynamics;
    computer vision; data set; fluid dynamics; hydrodynamics; machine learning; particle
    image velocimetry; streamflow
  sentences:
  - we propose a novel approach for tackling scientific problems governed by differential
    equations, based on the concept of a physics-informed neural networks (pinns).
    the method involves evaluating the residuals of equations on subdomains of the
    computational zone via numerical integration. test functions and integral weights
    are embedded within convolutional filters to extract information from these residuals.
    our approach demonstrates exceptional parallel abilities when dealing with computational
    zones featuring large numbers of sub-domains, proving significantly more efficient
    than variational physics-informed neural networks with domain decomposition (hp-vpinns).
    by utilizing domain decomposition, we can further enhance the precision of our
    predictions when dealing with complex functions. in comparison to pinns, our approach
    boasts superior accuracy when fitting intricate functions. additionally, we showcase
    the efficacy of our approach in solving inverse problems, such as identifying
    nonuniform damage distributions within materials. our proposed approach offers
    tremendous potential for physics-informed neural networks to solve problems with
    complex geometries or nonlinearities that require decomposing the computational
    zone into numerous sub-domains. © 2023 elsevier b.v., all rights reserved.
  - the cantilever beam structures, like wind turbine towers, space masts, solar wings,
    and high-rise chimneys and buildings, are widely used engineering structures.
    it is crucial to fast and accurately predict their dynamic responses under complicated
    excitations. this paper establishes an improved physics-informed neural network
    (pinn) called fourier transformation-pinn (ft-pinn) for predicting the dynamic
    response of a cantilever beam subject to different boundary constraints and excitation
    conditions. the core idea of the ft-pinn is to use the latin hypercube sampling
    strategy for generating model training points and introduce multiple sets of control
    equations with different frequencies through fourier expansion to achieve high
    solving accuracy and efficiency for partial differential equations. two loss functions,
    including the mean square error and mean absolute error, are included in the ft-pinn
    for comparison. four test cases are designed to evaluate the performance of the
    ft-pinn and classic pinn in solving dynamic equations of a cantilever beam structure
    with different boundary and excitation conditions. it is validated that the ft-pinn
    model proposed in this paper has higher accuracy and efficiency than the classic
    pinn. this also provides a new approach for using pinn to handle local sharp gradients
    and complex high-frequency problems in vibration equations. © 2025 elsevier b.v.,
    all rights reserved.
  - the inference of velocity fields from the displacement of objects and/or fields
    visible within a series of consecutive images over known time intervals has been
    explored extensively within experimental fluid dynamics. real image sequences
    of environmental hydrodynamic flows, however, pose additional challenges for velocity
    field inference due to factors such as lighting inhomogeneity, particle density,
    camera orientation and stability. here we investigate the performance of classical
    and deep learning based velocity estimation methods on three experimental datasets;
    a hydrodynamics laboratory dataset of different flow types and two open-source
    datasets of aerial river footage from field campaigns. the river datasets are
    accompanied by observational datasets of in-situ measurements. in particular,
    we investigate the generalisation of deep learning based methods from ideal training
    conditions to real images. we consider three deep learning approaches; recurrent
    all-pairs-field transforms (raft), a physics-informed approach and an unsupervised
    learning approach (unliteflownet-piv). results indicate that raft, which achieves
    state-of-the-art performance on particle image datasets, showed good generalisation
    to the laboratory dataset and field imagery. the physics-informed approach performed
    similarly to raft across the laboratory dataset whilst generalisation to drone-based
    data proved challenging. across the laboratory dataset, unliteflownet-piv showed
    good performance within wake regions but an underestimation of channel flows and
    freestream regions with limited vorticity, also suffering under poor seeding density.
    limited fine-tuning of unliteflownet-piv on laboratory data, however, led to improved
    performance in these regions, indicating the potential of the unsupervised learning
    approach for environmental flows where 2d ground truth data sources are unavailable
    for training. © 2024 elsevier b.v., all rights reserved.
- source_sentence: deep neural networks; remote sensing; risk perception; satellite
    imagery; semantics; surface measurement; surface properties; temperature distribution;
    urban planning; weather forecasting; atmospheric modeling; down-scaling; high
    resolution; land surface; land surface temperature; multi-spectral; multi-spectral
    satellite imagery; spectral satellites; urban areas; atmospheric temperature;
    land surface; remote sensing; spatial resolution; surface structure; surface temperature;
    upper atmosphere; urban planning; china
  sentences:
  - estimating urban surface temperature at high resolution is crucial for effective
    urban planning for climate-driven risks. this high-resolution surface temperature
    over broader scales can usually be obtained via satellite remote sensing for historical
    period. however, it can be hard for future predictions. this article presents
    a physics informed hierarchical perception (pihp) network, a novel approach for
    accurate, high-resolution, and generalizable urban surface temperature estimation.
    the key to our approach is leveraging the implied temperature-related physics
    information of the land surface structure from high-resolution multispectral satellite
    images, thus achieving precise estimation or prediction for high spatial resolution
    urban surface temperature. specifically, a semantic category histogram is first
    designed to describe the land surface structures. based on this, a hierarchical
    urban surface perception network is proposed to capture the complex relationship
    between the underlying land surface features, upper atmosphere conditions, and
    the intracity temperature. the proposed pihp-net makes it possible to generate
    models that can generalize across different cities, thus estimating or predicting
    high-resolution urban surface temperature when the satellite land surface temperature
    (lst) observation is not available. experiments over various cities in different
    climate regions in china show, for the first time, errors less than 2 k (for most
    of the cases) at the high resolution (60-by-60 meters grids), thus making it possible
    to predict future intracity temperature from forcing meteorology and multispectral
    satellite imagery. © 2022 elsevier b.v., all rights reserved.
  - in recent years, the growing adoption of artificial intelligence across diverse
    scientific fields has significantly increased demand for advanced semiconductor
    chips, necessitating innovations in semiconductor material design. accurate prediction
    of semiconductor material properties is essential for improving chip performance,
    as these properties directly affect electrical, thermal, and mechanical characteristics.
    traditionally, density functional theory has been the gold standard for atomic-scale
    simulations in material property prediction; however, its high computational cost
    limits scalability. molecular dynamics simulations provide a scalable alternative
    by leveraging the power of machine learning force fields (mlffs); however, semiconductor
    systems present unique challenges due to non-equilibrium dynamics, surface defects,
    and impurities. these factors often result in out-of-distribution (ood) atomic
    configurations, which can significantly degrade model performance. to address
    this challenge, we propose physics-informed sharpness-aware minimization (pi-sam),
    a novel framework designed to enhance the prediction of semiconductor material
    properties across diverse datasets and challenging ood scenarios. specifically,
    pi-sam leverages sharpness-aware minimization to achieve flatter loss minima,
    improving the model's generalization. additionally, it incorporates physics-informed
    regularizations to enforce energy-force consistency and account for potential
    energy surface curvature, ensuring alignment with the underlying physical principles
    governing semiconductor behavior. experimental results demonstrate that our pi-sam
    outperforms competing methods, especially on ood datasets, underscoring its effectiveness
    in improving generalization. © 2025 elsevier b.v., all rights reserved.
  - this letter presents a novel convolutional neural network (cnn)-based methodology
    for robust and accurate open-circuit fault detection and submodule (sm) localization
    in modular multilevel converters. instead of an end-to-end classifier, the proposed
    method employs the cnn as a physics-informed feature extractor to enhance a foundational
    theoretical model by robustly estimating switching frequency harmonic components
    from arm voltage measurements. crucially, the cnn effectively mitigates the detrimental
    impacts of measurement noise and sampling frequency variations. this method offers
    low sensor requirements, adaptability to diverse operating conditions, and high
    computational efficiency. simulation results demonstrate a 61.3% overall performance
    improvement, showcasing enhanced detection speed, sm localization accuracy, and
    robustness compared to the theoretical model under practical constraints. experimental
    validation on a laboratory prototype further substantiates these improvements,
    achieving fault detection and localization on average 15 ms and 22.5 ms faster
    than the baseline theoretical model respectively, showcasing its practical applicability.
    © 2025 elsevier b.v., all rights reserved.
- source_sentence: data-driven ringed residual u-net scheme for full waveform inversion
  sentences:
  - physics-informed neural network (pinn) has aroused broad interest among fluid
    simulation researchers in recent years, representing a novel paradigm in this
    area where governing differential equations are encoded to provide a hybrid physics-based
    and data-driven deep learning framework. however, the lack of enough validations
    on more complex flow problems has restricted further development and application
    of pinn. our research applies the pinn to simulate a two-dimensional indoor turbulent
    airflow case to address the issue. although it is still quite challenging for
    the pinn to reach an ideal accuracy for the problem through a single purely physics-driven
    training, our research finds that the pinn prediction accuracy can be significantly
    improved by exploiting its ability to assimilate high-fidelity data during training,
    by which the prediction accuracy of pinn is enhanced by 53.2% for pressure, 34.6%
    for horizontal velocity, and 40.4% for vertical velocity, respectively. meanwhile,
    the influence of data points number is also studied, which suggests a balance
    between prediction accuracy and data acquisition cost can be reached. last but
    not least, applying reynolds-averaged navier-stokes (rans) equations and turbulence
    model has also been proved to improve prediction accuracy remarkably. after embedding
    the standard k-ε model to the pinn, the prediction accuracy was enhanced by 82.9%
    for pressure, 59.4% for horizontal velocity, and 70.5% for vertical velocity,
    respectively. these results suggest a promising step toward applications of pinn
    to more complex flow configurations. © 2024 elsevier b.v., all rights reserved.
  - amidst the increasing penetration of intermittent renewable generation and the
    persistent growth of load demands, voltage stability assumes a pivotal concern
    in smart grids. the real-time voltage stability assessment (vsa) under time-varying
    operating conditions becomes paramount. recent strides in real-time vsa, utilizing
    intelligent data-driven learning with measurements, mark significant progress.
    however, a critical and unresolved challenge with purely data-driven methods is
    their susceptibility to performance degradation, especially in out-of-sample scenarios.
    to this end, this article presents a physics-informed guided deep learning (pgdl)
    paradigm for the practical and accurate assessment of voltage stability margins
    (vsms), leveraging both physics-based and data-driven techniques. the pgdl architecture
    includes an improved temporal convolutional network (itcn) for the automatic extraction
    of representative temporal features necessary for vsa from measurement data. additionally,
    pgdl integrates physics-based features informed by domain-specific knowledge.
    a feature fusion scheme is then devised to merge deep-learned features with pertinent
    physics-based attributes. acknowledging the unique contributions of these feature
    modalities to vsa, a novel twin attention mechanism (tam) is proposed to adaptively
    adjust attention weights, prioritizing learned features and thus optimizing vsa
    performance. substantial experiments on power systems of different scales, coupled
    with comparative analyses against state-of-the-art benchmarks, illustrate the
    efficacy and merits of the proposed approach. © 2025 elsevier b.v., all rights
    reserved.
  - full waveform inversion (fwi) is a powerful means for accurately reconstructing
    subsurface velocity models at high resolution. yet it is nevertheless a nonlinear
    and ill-posed problem. physics-driven fwi methods employ gradient-based optimization
    algorithms to minimize the error between the observed seismic data and the synthetically
    generated seismic data. the solution may converge to a local rather than global
    minimum. the cycle-skipping problem occurs when the synthetic data exceed a half-wavelength
    shift relative to the observed data. fwi relies on an accurate initial velocity
    model to mitigate the cycle-skipping problem. moreover, due to the increasing
    size and desired resolution of seismic data, fwi costs a great deal of computational
    time. to obviate these problems, we present a data-driven fwi scheme based on
    a deep learning architecture called u-net. the network consists of the ringed
    residual unit, which integrates residual propagation and residual feedback. it
    beneficially achieves correspondence between the seismic data domain and the velocity
    model domain. the features of the shallow layers are connected with the deep layers
    by a skip connection to facilitate seismic data spatial information propagation
    and utilization. they improve inversion accuracy and make the network more generalizable
    and robust. we utilize the society of exploration geophysicists (segs)/european
    association of geoscientists and engineers (eage) overthrust and salt models to
    verify our proposed method's impressive performance. the experimental results
    clearly demonstrate that the proposed method can produce high-quality velocity
    models. compared with the conventional physics-informed fwi, it has advantages
    in both computational time and initial model dependence. © 2024 elsevier b.v.,
    all rights reserved.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned for solar physics literature search. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
<!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
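Because the final `Normalize()` module L2-normalizes every embedding, cosine similarity between two outputs reduces to a plain dot product. A small numpy sketch of that property (the helper `l2_normalize` is illustrative, not part of the library):

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    # Mirrors the Normalize() module: rescale to unit length along the last axis.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

a = l2_normalize(np.array([3.0, 4.0]))
b = l2_normalize(np.array([4.0, 3.0]))

# For unit vectors, the dot product equals the cosine similarity.
dot = float(a @ b)
cos = float((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(round(dot, 4), round(cos, 4))  # 0.96 0.96
```

This is why dot-product indexes (e.g. in approximate-nearest-neighbor stores) can be used interchangeably with cosine similarity for this model's embeddings.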

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("andreinsardi/SciBERT-SolarPhysics-Search")
# Run inference
sentences = [
    'data-driven ringed residual u-net scheme for full waveform inversion',
    "full waveform inversion (fwi) is a powerful means for accurately reconstructing subsurface velocity models at high resolution. yet it is nevertheless a nonlinear and ill-posed problem. physics-driven fwi methods employ gradient-based optimization algorithms to minimize the error between the observed seismic data and the synthetically generated seismic data. the solution may converge to a local rather than global minimum. the cycle-skipping problem occurs when the synthetic data exceed a half-wavelength shift relative to the observed data. fwi relies on an accurate initial velocity model to mitigate the cycle-skipping problem. moreover, due to the increasing size and desired resolution of seismic data, fwi costs a great deal of computational time. to obviate these problems, we present a data-driven fwi scheme based on a deep learning architecture called u-net. the network consists of the ringed residual unit, which integrates residual propagation and residual feedback. it beneficially achieves correspondence between the seismic data domain and the velocity model domain. the features of the shallow layers are connected with the deep layers by a skip connection to facilitate seismic data spatial information propagation and utilization. they improve inversion accuracy and make the network more generalizable and robust. we utilize the society of exploration geophysicists (segs)/european association of geoscientists and engineers (eage) overthrust and salt models to verify our proposed method's impressive performance. the experimental results clearly demonstrate that the proposed method can produce high-quality velocity models. compared with the conventional physics-informed fwi, it has advantages in both computational time and initial model dependence. © 2024 elsevier b.v., all rights reserved.",
    'amidst the increasing penetration of intermittent renewable generation and the persistent growth of load demands, voltage stability assumes a pivotal concern in smart grids. the real-time voltage stability assessment (vsa) under time-varying operating conditions becomes paramount. recent strides in real-time vsa, utilizing intelligent data-driven learning with measurements, mark significant progress. however, a critical and unresolved challenge with purely data-driven methods is their susceptibility to performance degradation, especially in out-of-sample scenarios. to this end, this article presents a physics-informed guided deep learning (pgdl) paradigm for the practical and accurate assessment of voltage stability margins (vsms), leveraging both physics-based and data-driven techniques. the pgdl architecture includes an improved temporal convolutional network (itcn) for the automatic extraction of representative temporal features necessary for vsa from measurement data. additionally, pgdl integrates physics-based features informed by domain-specific knowledge. a feature fusion scheme is then devised to merge deep-learned features with pertinent physics-based attributes. acknowledging the unique contributions of these feature modalities to vsa, a novel twin attention mechanism (tam) is proposed to adaptively adjust attention weights, prioritizing learned features and thus optimizing vsa performance. substantial experiments on power systems of different scales, coupled with comparative analyses against state-of-the-art benchmarks, illustrate the efficacy and merits of the proposed approach. © 2025 elsevier b.v., all rights reserved.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5779, 0.0253],
#         [0.5779, 1.0000, 0.0727],
#         [0.0253, 0.0727, 1.0000]])
```
395
+ ```
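
The snippet above encodes queries and abstracts into one 768-dimensional space. The retrieval step itself is independent of the model: rank corpus embeddings by cosine similarity against the query embedding. A minimal sketch with synthetic stand-ins for `model.encode(...)` outputs, so it runs without downloading the model:

```python
import numpy as np

def rank_by_cosine(query_emb, doc_embs):
    """Return document indices sorted from most to least similar to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    return np.argsort(-(d @ q))  # cosine similarity, descending

# Synthetic 768-d vectors standing in for model.encode(corpus) / model.encode(query)
rng = np.random.default_rng(0)
corpus_embs = rng.normal(size=(5, 768))
query_emb = corpus_embs[2] + 0.01 * rng.normal(size=768)  # near-duplicate of document 2

print(rank_by_cosine(query_emb, corpus_embs)[0])  # 2
```

The same ranking logic applies unchanged to real embeddings from this model; for large corpora, the sorted-dot-product step is what a vector index accelerates.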
## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 36,416 training samples
* Columns: <code>sentence_0</code> and <code>sentence_1</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence_0 | sentence_1 |
  |:--------|:-----------|:-----------|
  | type    | string     | string     |
  | details | <ul><li>min: 4 tokens</li><li>mean: 46.47 tokens</li><li>max: 269 tokens</li></ul> | <ul><li>min: 90 tokens</li><li>mean: 292.29 tokens</li><li>max: 512 tokens</li></ul> |
* Samples:
  | sentence_0 | sentence_1 |
  |:-----------|:-----------|
  | <code>digital twin; eddy current; electrical-mechanical response; mechanical property monitoring; multiscale modeling; plastic deformation; constitutive models; eddy current testing; electric network analysis; electric network parameters; plasticity testing; surface discharges; eddy-current; electrical-mechanical response; electromagnetics; mechanical; mechanical property monitoring; mechanical response; modelling framework; monitoring system; multiscale modeling; property; constitutive equations</code> | <code>this study aims to develop a thermodynamic modeling framework for the electromagnetic-plastic deformation response coupled with circuit analysis. to accomplish this objective, we derived the thermodynamic balance laws for materials exposed to electromagnetic fields while undergoing plastic deformation. the balance laws serve as the foundation for refining the connection between the plastic deformation and electrical conductivity of materials. this study also modeled the relationship between dislocation density and matthiessen's rule. the constitutive equations were subsequently implemented into a crystal plasticity model, thereby calibrating and validating the model. the derived modeling framework considers the 1st and 2nd laws of thermodynamics. the model was then transformed into a circuit model for a monitoring system by formulating equations to analyze the changes in material impedance resulting from the evolution of plastic deformation. this lays the groundwork for creating a moni...</code> |
  | <code>mechanism of the failed eruption of an intermediate solar filament</code> | <code>solar filament eruptions can generate coronal mass ejections (cmes), which are huge threats to space weather. thus, we need to understand their underlying mechanisms. although many authors have studied the mechanisms for several decades, we still do not fully understand in what conditions a filament can erupt to become a cme or not. previous studies have discussed extensively why a highly twisted and already erupted filament will be interrupted and considered that a strong overlying constraint field seems to be the key factor. however, few of them study filaments in the weak field, namely, quiescent filaments, as it is too hard to reconstruct the magnetic configuration there. here we show a case study, in which we can fully reconstruct the configuration of an intermediate filament with the mhd-relaxation extrapolation model and discuss its initial eruption and eventual failure. by analyzing the magnetic configuration, we suggest that the reconnection between the erupting magnetic flux ...</code> |
  | <code>long-term earth magnetosphere science orbit with earth-moon resonance orbit</code> | <code>we introduce the long-term earth magnetosphere science orbits designed to maintain a fixed orientation relative to earth's magnetosphere over extended durations. by leveraging the earth-moon resonant orbits, the spacecraft's argument of periapsis is aligned with the orientation of earth's magnetosphere, thereby enabling continuous observations. three specific earth–moon resonant orbits, characterized by distinct values of the jacobi integral, are identified to exhibit these properties of stable, magnetosphere-aligned evolution. this approach facilitates sustained monitoring of large-scale magnetospheric dynamics and opens new opportunities for focused science objectives. these include studying the interaction between the earth and the moon in shaping magnetospheric boundaries and probing magnetospheric vortices and other transient phenomena. the resultant long-term vantage point—achieved through careful resonance and orbital design—offers a platform for future space weather research, m...</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```

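`MultipleNegativesRankingLoss` trains with in-batch negatives: for the i-th (query, abstract) pair, every other abstract in the batch serves as a negative, and the loss is cross-entropy over scaled cosine similarities. A plain-NumPy sketch of that computation with synthetic embeddings — not the library's internal implementation, but the same math, using the `scale` of 20.0 configured above:

```python
import numpy as np

def mnrl_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities; pair i's positive is
    row i of `positives`, and all other rows act as in-batch negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                # (batch, batch) cosine similarities
    m = scores.max(axis=1, keepdims=True)     # stabilized log-softmax per row
    log_probs = scores - (m + np.log(np.exp(scores - m).sum(axis=1, keepdims=True)))
    return float(-np.diag(log_probs).mean())  # correct pairs sit on the diagonal

# Perfectly matched orthogonal pairs -> near-zero loss; shuffled pairs -> large loss
print(mnrl_loss(np.eye(4), np.eye(4)) < 1e-6)                    # True
print(mnrl_loss(np.eye(4), np.roll(np.eye(4), 1, axis=0)) > 10)  # True
```

Because negatives come for free from the batch, larger batch sizes (64 here) give the model more contrastive signal per step.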
### Training Hyperparameters
#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `num_train_epochs`: 2
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `project`: huggingface
- `trackio_space_id`: trackio
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: no
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: True
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>

### Training Logs
| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.8787 | 500  | 0.216         |
| 1.7575 | 1000 | 0.0434        |

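The epoch values are consistent with the dataset size and batch size reported above: 36,416 samples at a per-device batch size of 64 gives 569 full batches per epoch, so step 500 falls at epoch 0.8787 and step 1000 at epoch 1.7575:

```python
steps_per_epoch = 36416 // 64  # 569 full batches per epoch

print(round(500 / steps_per_epoch, 4))   # 0.8787
print(round(1000 / steps_per_epoch, 4))  # 1.7575
```
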

### Framework Versions
- Python: 3.12.12
- Sentence Transformers: 5.1.2
- Transformers: 4.57.1
- PyTorch: 2.8.0+cu126
- Accelerate: 1.11.0
- Datasets: 4.0.0
- Tokenizers: 0.22.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
638
+
639
+ <!--
640
+ ## Glossary
641
+
642
+ *Clearly define terms in order to be accessible across audiences.*
643
+ -->
644
+
645
+ <!--
646
+ ## Model Card Authors
647
+
648
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
649
+ -->
650
+
651
+ <!--
652
+ ## Model Card Contact
653
+
654
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
655
+ -->
config.json ADDED
```json
{
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "dtype": "float32",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.57.1",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 31090
}
```
config_sentence_transformers.json ADDED
```json
{
  "model_type": "SentenceTransformer",
  "__version__": {
    "sentence_transformers": "5.1.2",
    "transformers": "4.57.1",
    "pytorch": "2.8.0+cu126"
  },
  "prompts": {
    "query": "",
    "document": ""
  },
  "default_prompt_name": null,
  "similarity_fn_name": "cosine"
}
```
model.safetensors ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:bf4c576d38f1ac2755d159b1a5fa118103dfe7ff2357004cb9eb739136df87e7
size 439696224
```
modules.json ADDED
```json
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
```
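
Because the module pipeline ends with `Normalize`, every embedding leaves the model with unit L2 norm, so a plain dot product equals cosine similarity — useful when loading the embeddings into a vector index that only supports inner product. A quick numeric check with synthetic vectors standing in for real embeddings:

```python
import numpy as np

rng = np.random.default_rng(42)
emb = rng.normal(size=(3, 768))
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # what the Normalize module does

norms = np.linalg.norm(emb, axis=1)
dot = emb @ emb.T
cos = dot / np.outer(norms, norms)

print(np.allclose(norms, 1.0))  # True: unit-norm outputs
print(np.allclose(dot, cos))    # True: dot product == cosine similarity
```
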
sentence_bert_config.json ADDED
```json
{
  "max_seq_length": 512,
  "do_lower_case": false
}
```
special_tokens_map.json ADDED
```json
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
```
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
```json
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "104": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "max_length": 384,
  "model_max_length": 512,
  "never_split": null,
  "pad_to_multiple_of": null,
  "pad_token": "[PAD]",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "[SEP]",
  "stride": 0,
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "[UNK]"
}
```
vocab.txt ADDED
The diff for this file is too large to render. See raw diff