andreinsardi committed · Commit 6ae175a · verified · 1 Parent(s): afc45b0

Initial release of SciBERT-SolarPhysics-Search (fine-tuned for solar physics)
1_Pooling/config.json ADDED
```json
{
    "word_embedding_dimension": 768,
    "pooling_mode_cls_token": false,
    "pooling_mode_mean_tokens": true,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
```
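This config enables exactly one pooling mode, masked mean pooling (`pooling_mode_mean_tokens: true`), which averages the token embeddings of non-padding tokens into one 768-dimensional sentence vector. A minimal numpy sketch of that computation; the function name `mean_pool` is illustrative, not part of the library:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Masked mean pooling, matching pooling_mode_mean_tokens=true above.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid divide-by-zero
    return summed / counts

# toy check: two tokens, the second masked out as padding
emb = np.array([[[1.0, 3.0], [5.0, 7.0]]])   # (1, 2, 2)
mask = np.array([[1, 0]])
print(mean_pool(emb, mask))                   # [[1. 3.]]
```

Padded positions contribute nothing to the sum and are excluded from the count, so sentences of different lengths pool correctly within one batch.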
README.md ADDED
---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:36416
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: deep learning; magnetic field measurement; muon g-2 experiment;
    tracking reconstruction
  sentences:
  - 'in this study we test 18 versions of five fundamental energy scaling laws that
    operate in large solar flares. we express scaling laws in terms of the magnetic
    potential field energy e<inf>p</inf>, the mean potential field strength b<inf>p</inf>,
    the free energy e<inf>free</inf>, the dissipated magnetic flare energy e<inf>diss</inf>,
    the magnetic length scale l, the thermal length scale l<inf>th</inf>, the mean
    helically twisted flux tube radius r, the sunspot radius r, the emission-measure-weighted
    flare temperature t<inf>e</inf>, the electron density n<inf>e</inf>, and the total
    emission measure em, measured from a data set of 173 goes m- and x-class flare
    events. the five categories of physical scaling laws include (i) a scaling law
    of the potential field energy, (ii) a scaling law for helical twisting, (iii)
    a scaling law for petschek-type magnetic reconnection, (iv) the rosner–tucker–vaiana
    scaling law, and (v) the shibata–yokoyama scaling law. we test the self-consistency
    of these theoretical scaling laws with observed parameters by requiring two criteria:
    a cross-correlation coefficient of ccc > 0.5 between the theoretically predicted
    scaling laws and observed values, and a linear regression fit with a slope of
    α ≈ 1 within one standard deviation σ. these two criteria enable us (i) to corroborate
    some existing (or modified) scaling laws, (ii) to reject other scaling laws that
    are not consistent with the observations, (iii) to probe the dimensionality of
    flare geometries, and (iv) to predict various energy parameters based on tested
    scaling laws. © 2020 elsevier b.v., all rights reserved.'
  - the run1 result of the fermilab muon g-2 experiment have shown a 4.2 standard
    deviation between the experimental measurement and theoretical prediction of a<inf>μ</inf>,
    strongly indicating a new physics signal. the fermilab experiment already accumulated
    21 times more data compared to the bnl experiment. the j-parc muon g-2 experiment
    will collect 3.5 times the statistics compared to fermilab. with the increases
    in the collected data volume, and limited by the speed and accuracy, the existing
    tracking reconstruction and magnetic field measurement method may not fully satisfy
    the requirement of the experiment. the breakthrough of the deep learning inspires
    new analysis method in the muon g-2 experiment. in this proceeding, we will present
    some preliminary research on the tracking reconstruction based on recurrent neural
    network (rnn), graph neural network (gnn) and the magnetic field measurement based
    on physics informed neural network (pinn). the preliminary results show that the
    deep learning method has enormous potential in these topics. © 2024 elsevier b.v.,
    all rights reserved.
  - derived from the boltzmann equation, the neutron transport equation describes
    the motions and interactions of neutrons with nuclei in nuclear devices such as
    nuclear reactors. the collision or fission effect are described as integral terms
    which arrive in an integro-differential neutron transport equation (idnt). only
    for mono-material or simple geometries conditions, elegant approximation can simplify
    the transport equation to provide analytic solutions. to solve this integro-differential
    equation becomes a practical engineering challenge. recent development of deep-learning
    techniques provides a new approach to solve them but for some complicated conditions,
    it is also time consuming. to optimize solving the integro-differential equation
    particularly under the deep-learning method, we propose to convert the integral
    terms in the integro-differential neutron transport equation into their corresponding
    antiderivatives, providing a set of fixed solution constraint conditions for these
    antiderivatives, thus yielding an exact differential neutron transport equation
    (ednt). the paper elucidates the physical meaning of the antiderivatives and analyzes
    the continuity and computational complexity of the new transport equation form.
    to illustrate the significant advantage of endt, numerical validations have been
    conducted using various numerical methods on typical benchmark problems. the numerical
    experiments demonstrate that the ednt is compatible with various numerical methods,
    including the finite difference method (fdm), finite volume method (fvm), and
    pinn. compared to the idnt, the ednt offers significant efficiency advantages,
    with reductions in computational time ranging from several times to several orders
    of magnitude. this ednt approach may also be applicable for other integro-differential
    transport theories such as radiative energy transport and has potential application
    in astrophysics or other fields. © 2025 elsevier b.v., all rights reserved.
- source_sentence: observations of equatorial plasma bubbles using a low-cost 630.0-nm
    all-sky imager in ishigaki island, japan
  sentences:
  - prediction of wave propagation in vegetated waters is crucial for the design and
    maintenance of coastal ecological protection systems. in this study, we propose
    a physics-informed neural network (pinn) model that incorporates physical constraints
    from the boussinesq equations for modeling wave propagation processes in vegetated
    waters. the results demonstrate that the pinn model effectively captures the evolution
    of regular wave propagation in rigid, non-submerged vegetated waters. compared
    to conventional numerical models, the pinn approach offers a more efficient preprocessing
    framework while maintaining comparable simulation accuracy with an average coefficient
    of determination (r<sup>2</sup>) of 0.942, an average root mean square error (rmse)
    of 1.84 × 10<sup>−3</sup> m and an average mean absolute error (mae) of 1.19 ×
    10<sup>−3</sup> m. moreover, the parametric inference framework embedded within
    pinn enables precise determination of the optimal drag coefficient (c<inf>d</inf>)
    through systematic assimilation of experimental measurements. additionally, the
    accuracy of both the simulation and the inferred c<inf>d</inf> improves as more
    external data are integrated into the model. © 2025 elsevier b.v., all rights
    reserved.
  - 'here, we introduce a low-cost airglow imaging system developed for observing
    plasma bubble signatures in 630.0-nm airglow emission from the f region of the
    ionosphere. the system is composed of a small camera, optical filter, and fish-eye
    lens, and is operated using free software that automatically records video from
    the camera. a pilot system was deployed in ishigaki island in the southern part
    of japan (lat 24.4, lon 124.4, mlat 19.6) and was operated for ~ 1.5 years from
    2014 to 2016 corresponding to the recent solar maximum period. the pilot observations
    demonstrated that it was difficult to identify the plasma bubble signature in
    the raw image captured every 4 s. however, the quality of the image could be improved
    by reducing the random noise of instrumental origin through an integration of
    30 consecutive raw images obtained in 2 min and further by subtracting the 1-h
    averaged background image. we compared the deviation images to those from a co-existing
    airglow imager of omtis, which is equipped with a back-illuminated cooled ccd
    camera with a high quantum efficiency of ~ 90%. it was confirmed that the low-cost
    airglow imager is capable of imaging the spatial structure of plasma bubbles,
    including their bifurcating traces. the results of these pilot observations in
    ishigaki island will allow us to distribute the low-cost imager in a wide area
    and construct a network for monitoring plasma bubbles and their space weather
    impacts on satellite navigation systems.[figure not available: see fulltext.].
    © 2020 elsevier b.v., all rights reserved.'
  - physics-informed neural networks are used to characterize the mass transport to
    the rotating disk electrode (rde), the most widely employed hydrodynamic electrode
    in electroanalysis. the pinn approach was first quantitatively verified via 1d
    simulations under the levich approximation for cyclic voltammetry and chronoamperometry,
    allowing comparison of the results with finite difference simulations and analytical
    equations. however, the levich approximation is only accurate for high schmidt
    numbers (sc > 1000). the pinn approach allowed consideration of smaller sc, achieving
    an analytical level of accuracy (error <0.1%) comparable with independent numerical
    evaluation and confirming that the errors in the levich equation can be as high
    as 3% when sc = 1000 for rapidly diffusing species in aqueous solution. entirely
    novel, the pinns permit the solution of the 2d diffusion equation under cylindrical
    geometry incorporating radial diffusion and reveal the rotating disk electrode
    edge effect as a consequence of the nonuniform accessibility of the disc with
    greater currents flowing near the extremities. the contribution to the total current
    is quantified as a function of the rotation speed, disk radius, and analyte diffusion
    coefficient. the success in extending the theory for the rotating disk electrode
    beyond the levich equation shows that pinns can be an easier and more powerful
    substitute for conventional methods, both analytical and simulation based. © 2023
    elsevier b.v., all rights reserved.
- source_sentence: aerodynamics; channel estimation; channel flow; image enhancement;
    optical flows; stream flow; velocimeters; vorticity; wakes; computer vision; fluid-dynamics;
    generalisation; image velocimetry; learning approach; optical-; particle image
    velocimetry; particle images; performance; streamflow monitoring; hydrodynamics;
    computer vision; data set; fluid dynamics; hydrodynamics; machine learning; particle
    image velocimetry; streamflow
  sentences:
  - we propose a novel approach for tackling scientific problems governed by differential
    equations, based on the concept of a physics-informed neural networks (pinns).
    the method involves evaluating the residuals of equations on subdomains of the
    computational zone via numerical integration. test functions and integral weights
    are embedded within convolutional filters to extract information from these residuals.
    our approach demonstrates exceptional parallel abilities when dealing with computational
    zones featuring large numbers of sub-domains, proving significantly more efficient
    than variational physics-informed neural networks with domain decomposition (hp-vpinns).
    by utilizing domain decomposition, we can further enhance the precision of our
    predictions when dealing with complex functions. in comparison to pinns, our approach
    boasts superior accuracy when fitting intricate functions. additionally, we showcase
    the efficacy of our approach in solving inverse problems, such as identifying
    nonuniform damage distributions within materials. our proposed approach offers
    tremendous potential for physics-informed neural networks to solve problems with
    complex geometries or nonlinearities that require decomposing the computational
    zone into numerous sub-domains. © 2023 elsevier b.v., all rights reserved.
  - the cantilever beam structures, like wind turbine towers, space masts, solar wings,
    and high-rise chimneys and buildings, are widely used engineering structures.
    it is crucial to fast and accurately predict their dynamic responses under complicated
    excitations. this paper establishes an improved physics-informed neural network
    (pinn) called fourier transformation-pinn (ft-pinn) for predicting the dynamic
    response of a cantilever beam subject to different boundary constraints and excitation
    conditions. the core idea of the ft-pinn is to use the latin hypercube sampling
    strategy for generating model training points and introduce multiple sets of control
    equations with different frequencies through fourier expansion to achieve high
    solving accuracy and efficiency for partial differential equations. two loss functions,
    including the mean square error and mean absolute error, are included in the ft-pinn
    for comparison. four test cases are designed to evaluate the performance of the
    ft-pinn and classic pinn in solving dynamic equations of a cantilever beam structure
    with different boundary and excitation conditions. it is validated that the ft-pinn
    model proposed in this paper has higher accuracy and efficiency than the classic
    pinn. this also provides a new approach for using pinn to handle local sharp gradients
    and complex high-frequency problems in vibration equations. © 2025 elsevier b.v.,
    all rights reserved.
  - the inference of velocity fields from the displacement of objects and/or fields
    visible within a series of consecutive images over known time intervals has been
    explored extensively within experimental fluid dynamics. real image sequences
    of environmental hydrodynamic flows, however, pose additional challenges for velocity
    field inference due to factors such as lighting inhomogeneity, particle density,
    camera orientation and stability. here we investigate the performance of classical
    and deep learning based velocity estimation methods on three experimental datasets;
    a hydrodynamics laboratory dataset of different flow types and two open-source
    datasets of aerial river footage from field campaigns. the river datasets are
    accompanied by observational datasets of in-situ measurements. in particular,
    we investigate the generalisation of deep learning based methods from ideal training
    conditions to real images. we consider three deep learning approaches; recurrent
    all-pairs-field transforms (raft), a physics-informed approach and an unsupervised
    learning approach (unliteflownet-piv). results indicate that raft, which achieves
    state-of-the-art performance on particle image datasets, showed good generalisation
    to the laboratory dataset and field imagery. the physics-informed approach performed
    similarly to raft across the laboratory dataset whilst generalisation to drone-based
    data proved challenging. across the laboratory dataset, unliteflownet-piv showed
    good performance within wake regions but an underestimation of channel flows and
    freestream regions with limited vorticity, also suffering under poor seeding density.
    limited fine-tuning of unliteflownet-piv on laboratory data, however, led to improved
    performance in these regions, indicating the potential of the unsupervised learning
    approach for environmental flows where 2d ground truth data sources are unavailable
    for training. © 2024 elsevier b.v., all rights reserved.
- source_sentence: deep neural networks; remote sensing; risk perception; satellite
    imagery; semantics; surface measurement; surface properties; temperature distribution;
    urban planning; weather forecasting; atmospheric modeling; down-scaling; high
    resolution; land surface; land surface temperature; multi-spectral; multi-spectral
    satellite imagery; spectral satellites; urban areas; atmospheric temperature;
    land surface; remote sensing; spatial resolution; surface structure; surface temperature;
    upper atmosphere; urban planning; china
  sentences:
  - estimating urban surface temperature at high resolution is crucial for effective
    urban planning for climate-driven risks. this high-resolution surface temperature
    over broader scales can usually be obtained via satellite remote sensing for historical
    period. however, it can be hard for future predictions. this article presents
    a physics informed hierarchical perception (pihp) network, a novel approach for
    accurate, high-resolution, and generalizable urban surface temperature estimation.
    the key to our approach is leveraging the implied temperature-related physics
    information of the land surface structure from high-resolution multispectral satellite
    images, thus achieving precise estimation or prediction for high spatial resolution
    urban surface temperature. specifically, a semantic category histogram is first
    designed to describe the land surface structures. based on this, a hierarchical
    urban surface perception network is proposed to capture the complex relationship
    between the underlying land surface features, upper atmosphere conditions, and
    the intracity temperature. the proposed pihp-net makes it possible to generate
    models that can generalize across different cities, thus estimating or predicting
    high-resolution urban surface temperature when the satellite land surface temperature
    (lst) observation is not available. experiments over various cities in different
    climate regions in china show, for the first time, errors less than 2 k (for most
    of the cases) at the high resolution (60-by-60 meters grids), thus making it possible
    to predict future intracity temperature from forcing meteorology and multispectral
    satellite imagery. © 2022 elsevier b.v., all rights reserved.
  - in recent years, the growing adoption of artificial intelligence across diverse
    scientific fields has significantly increased demand for advanced semiconductor
    chips, necessitating innovations in semiconductor material design. accurate prediction
    of semiconductor material properties is essential for improving chip performance,
    as these properties directly affect electrical, thermal, and mechanical characteristics.
    traditionally, density functional theory has been the gold standard for atomic-scale
    simulations in material property prediction; however, its high computational cost
    limits scalability. molecular dynamics simulations provide a scalable alternative
    by leveraging the power of machine learning force fields (mlffs); however, semiconductor
    systems present unique challenges due to non-equilibrium dynamics, surface defects,
    and impurities. these factors often result in out-of-distribution (ood) atomic
    configurations, which can significantly degrade model performance. to address
    this challenge, we propose physics-informed sharpness-aware minimization (pi-sam),
    a novel framework designed to enhance the prediction of semiconductor material
    properties across diverse datasets and challenging ood scenarios. specifically,
    pi-sam leverages sharpness-aware minimization to achieve flatter loss minima,
    improving the model's generalization. additionally, it incorporates physics-informed
    regularizations to enforce energy-force consistency and account for potential
    energy surface curvature, ensuring alignment with the underlying physical principles
    governing semiconductor behavior. experimental results demonstrate that our pi-sam
    outperforms competing methods, especially on ood datasets, underscoring its effectiveness
    in improving generalization. © 2025 elsevier b.v., all rights reserved.
  - this letter presents a novel convolutional neural network (cnn)-based methodology
    for robust and accurate open-circuit fault detection and submodule (sm) localization
    in modular multilevel converters. instead of an end-to-end classifier, the proposed
    method employs the cnn as a physics-informed feature extractor to enhance a foundational
    theoretical model by robustly estimating switching frequency harmonic components
    from arm voltage measurements. crucially, the cnn effectively mitigates the detrimental
    impacts of measurement noise and sampling frequency variations. this method offers
    low sensor requirements, adaptability to diverse operating conditions, and high
    computational efficiency. simulation results demonstrate a 61.3% overall performance
    improvement, showcasing enhanced detection speed, sm localization accuracy, and
    robustness compared to the theoretical model under practical constraints. experimental
    validation on a laboratory prototype further substantiates these improvements,
    achieving fault detection and localization on average 15 ms and 22.5 ms faster
    than the baseline theoretical model respectively, showcasing its practical applicability.
    © 2025 elsevier b.v., all rights reserved.
- source_sentence: data-driven ringed residual u-net scheme for full waveform inversion
  sentences:
  - physics-informed neural network (pinn) has aroused broad interest among fluid
    simulation researchers in recent years, representing a novel paradigm in this
    area where governing differential equations are encoded to provide a hybrid physics-based
    and data-driven deep learning framework. however, the lack of enough validations
    on more complex flow problems has restricted further development and application
    of pinn. our research applies the pinn to simulate a two-dimensional indoor turbulent
    airflow case to address the issue. although it is still quite challenging for
    the pinn to reach an ideal accuracy for the problem through a single purely physics-driven
    training, our research finds that the pinn prediction accuracy can be significantly
    improved by exploiting its ability to assimilate high-fidelity data during training,
    by which the prediction accuracy of pinn is enhanced by 53.2% for pressure, 34.6%
    for horizontal velocity, and 40.4% for vertical velocity, respectively. meanwhile,
    the influence of data points number is also studied, which suggests a balance
    between prediction accuracy and data acquisition cost can be reached. last but
    not least, applying reynolds-averaged navier-stokes (rans) equations and turbulence
    model has also been proved to improve prediction accuracy remarkably. after embedding
    the standard k-ε model to the pinn, the prediction accuracy was enhanced by 82.9%
    for pressure, 59.4% for horizontal velocity, and 70.5% for vertical velocity,
    respectively. these results suggest a promising step toward applications of pinn
    to more complex flow configurations. © 2024 elsevier b.v., all rights reserved.
  - amidst the increasing penetration of intermittent renewable generation and the
    persistent growth of load demands, voltage stability assumes a pivotal concern
    in smart grids. the real-time voltage stability assessment (vsa) under time-varying
    operating conditions becomes paramount. recent strides in real-time vsa, utilizing
    intelligent data-driven learning with measurements, mark significant progress.
    however, a critical and unresolved challenge with purely data-driven methods is
    their susceptibility to performance degradation, especially in out-of-sample scenarios.
    to this end, this article presents a physics-informed guided deep learning (pgdl)
    paradigm for the practical and accurate assessment of voltage stability margins
    (vsms), leveraging both physics-based and data-driven techniques. the pgdl architecture
    includes an improved temporal convolutional network (itcn) for the automatic extraction
    of representative temporal features necessary for vsa from measurement data. additionally,
    pgdl integrates physics-based features informed by domain-specific knowledge.
    a feature fusion scheme is then devised to merge deep-learned features with pertinent
    physics-based attributes. acknowledging the unique contributions of these feature
    modalities to vsa, a novel twin attention mechanism (tam) is proposed to adaptively
    adjust attention weights, prioritizing learned features and thus optimizing vsa
    performance. substantial experiments on power systems of different scales, coupled
    with comparative analyses against state-of-the-art benchmarks, illustrate the
    efficacy and merits of the proposed approach. © 2025 elsevier b.v., all rights
    reserved.
  - full waveform inversion (fwi) is a powerful means for accurately reconstructing
    subsurface velocity models at high resolution. yet it is nevertheless a nonlinear
    and ill-posed problem. physics-driven fwi methods employ gradient-based optimization
    algorithms to minimize the error between the observed seismic data and the synthetically
    generated seismic data. the solution may converge to a local rather than global
    minimum. the cycle-skipping problem occurs when the synthetic data exceed a half-wavelength
    shift relative to the observed data. fwi relies on an accurate initial velocity
    model to mitigate the cycle-skipping problem. moreover, due to the increasing
    size and desired resolution of seismic data, fwi costs a great deal of computational
    time. to obviate these problems, we present a data-driven fwi scheme based on
    a deep learning architecture called u-net. the network consists of the ringed
    residual unit, which integrates residual propagation and residual feedback. it
    beneficially achieves correspondence between the seismic data domain and the velocity
    model domain. the features of the shallow layers are connected with the deep layers
    by a skip connection to facilitate seismic data spatial information propagation
    and utilization. they improve inversion accuracy and make the network more generalizable
    and robust. we utilize the society of exploration geophysicists (segs)/european
    association of geoscientists and engineers (eage) overthrust and salt models to
    verify our proposed method's impressive performance. the experimental results
    clearly demonstrate that the proposed method can produce high-quality velocity
    models. compared with the conventional physics-informed fwi, it has advantages
    in both computational time and initial model dependence. © 2024 elsevier b.v.,
    all rights reserved.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned for solar physics literature search. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
<!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
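Because the final `Normalize()` module L2-normalizes every embedding, cosine similarity between two outputs reduces to a plain dot product. A small numpy sketch of that property (the helper `l2_normalize` is illustrative, not part of the library):

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    # Mirrors the Normalize() module: rescale to unit length along the last axis.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

a = l2_normalize(np.array([3.0, 4.0]))
b = l2_normalize(np.array([4.0, 3.0]))

# For unit vectors, the dot product equals the cosine similarity.
dot = float(a @ b)
cos = float((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(round(dot, 4), round(cos, 4))  # 0.96 0.96
```

This is why dot-product indexes (e.g. in approximate-nearest-neighbor stores) can be used interchangeably with cosine similarity for this model's embeddings.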

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("andreinsardi/SciBERT-SolarPhysics-Search")
# Run inference
sentences = [
    'data-driven ringed residual u-net scheme for full waveform inversion',
    "full waveform inversion (fwi) is a powerful means for accurately reconstructing subsurface velocity models at high resolution. yet it is nevertheless a nonlinear and ill-posed problem. physics-driven fwi methods employ gradient-based optimization algorithms to minimize the error between the observed seismic data and the synthetically generated seismic data. the solution may converge to a local rather than global minimum. the cycle-skipping problem occurs when the synthetic data exceed a half-wavelength shift relative to the observed data. fwi relies on an accurate initial velocity model to mitigate the cycle-skipping problem. moreover, due to the increasing size and desired resolution of seismic data, fwi costs a great deal of computational time. to obviate these problems, we present a data-driven fwi scheme based on a deep learning architecture called u-net. the network consists of the ringed residual unit, which integrates residual propagation and residual feedback. it beneficially achieves correspondence between the seismic data domain and the velocity model domain. the features of the shallow layers are connected with the deep layers by a skip connection to facilitate seismic data spatial information propagation and utilization. they improve inversion accuracy and make the network more generalizable and robust. we utilize the society of exploration geophysicists (segs)/european association of geoscientists and engineers (eage) overthrust and salt models to verify our proposed method's impressive performance. the experimental results clearly demonstrate that the proposed method can produce high-quality velocity models. compared with the conventional physics-informed fwi, it has advantages in both computational time and initial model dependence. © 2024 elsevier b.v., all rights reserved.",
    'amidst the increasing penetration of intermittent renewable generation and the persistent growth of load demands, voltage stability assumes a pivotal concern in smart grids. the real-time voltage stability assessment (vsa) under time-varying operating conditions becomes paramount. recent strides in real-time vsa, utilizing intelligent data-driven learning with measurements, mark significant progress. however, a critical and unresolved challenge with purely data-driven methods is their susceptibility to performance degradation, especially in out-of-sample scenarios. to this end, this article presents a physics-informed guided deep learning (pgdl) paradigm for the practical and accurate assessment of voltage stability margins (vsms), leveraging both physics-based and data-driven techniques. the pgdl architecture includes an improved temporal convolutional network (itcn) for the automatic extraction of representative temporal features necessary for vsa from measurement data. additionally, pgdl integrates physics-based features informed by domain-specific knowledge. a feature fusion scheme is then devised to merge deep-learned features with pertinent physics-based attributes. acknowledging the unique contributions of these feature modalities to vsa, a novel twin attention mechanism (tam) is proposed to adaptively adjust attention weights, prioritizing learned features and thus optimizing vsa performance. substantial experiments on power systems of different scales, coupled with comparative analyses against state-of-the-art benchmarks, illustrate the efficacy and merits of the proposed approach. © 2025 elsevier b.v., all rights reserved.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5779, 0.0253],
#         [0.5779, 1.0000, 0.0727],
#         [0.0253, 0.0727, 1.0000]])
```
395
+ ```
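
The snippet above encodes queries and abstracts into one 768-dimensional space. The retrieval step itself is independent of the model: rank corpus embeddings by cosine similarity against the query embedding. A minimal sketch with synthetic stand-ins for `model.encode(...)` outputs, so it runs without downloading the model:

```python
import numpy as np

def rank_by_cosine(query_emb, doc_embs):
    """Return document indices sorted from most to least similar to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    return np.argsort(-(d @ q))  # cosine similarity, descending

# Synthetic 768-d vectors standing in for model.encode(corpus) / model.encode(query)
rng = np.random.default_rng(0)
corpus_embs = rng.normal(size=(5, 768))
query_emb = corpus_embs[2] + 0.01 * rng.normal(size=768)  # near-duplicate of document 2

print(rank_by_cosine(query_emb, corpus_embs)[0])  # 2
```

The same ranking logic applies unchanged to real embeddings from this model; for large corpora, the sorted-dot-product step is what a vector index accelerates.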
## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 36,416 training samples
* Columns: <code>sentence_0</code> and <code>sentence_1</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence_0 | sentence_1 |
  |:--------|:-----------|:-----------|
  | type    | string     | string     |
  | details | <ul><li>min: 4 tokens</li><li>mean: 46.47 tokens</li><li>max: 269 tokens</li></ul> | <ul><li>min: 90 tokens</li><li>mean: 292.29 tokens</li><li>max: 512 tokens</li></ul> |
* Samples:
  | sentence_0 | sentence_1 |
  |:-----------|:-----------|
  | <code>digital twin; eddy current; electrical-mechanical response; mechanical property monitoring; multiscale modeling; plastic deformation; constitutive models; eddy current testing; electric network analysis; electric network parameters; plasticity testing; surface discharges; eddy-current; electrical-mechanical response; electromagnetics; mechanical; mechanical property monitoring; mechanical response; modelling framework; monitoring system; multiscale modeling; property; constitutive equations</code> | <code>this study aims to develop a thermodynamic modeling framework for the electromagnetic-plastic deformation response coupled with circuit analysis. to accomplish this objective, we derived the thermodynamic balance laws for materials exposed to electromagnetic fields while undergoing plastic deformation. the balance laws serve as the foundation for refining the connection between the plastic deformation and electrical conductivity of materials. this study also modeled the relationship between dislocation density and matthiessen's rule. the constitutive equations were subsequently implemented into a crystal plasticity model, thereby calibrating and validating the model. the derived modeling framework considers the 1st and 2nd laws of thermodynamics. the model was then transformed into a circuit model for a monitoring system by formulating equations to analyze the changes in material impedance resulting from the evolution of plastic deformation. this lays the groundwork for creating a moni...</code> |
  | <code>mechanism of the failed eruption of an intermediate solar filament</code> | <code>solar filament eruptions can generate coronal mass ejections (cmes), which are huge threats to space weather. thus, we need to understand their underlying mechanisms. although many authors have studied the mechanisms for several decades, we still do not fully understand in what conditions a filament can erupt to become a cme or not. previous studies have discussed extensively why a highly twisted and already erupted filament will be interrupted and considered that a strong overlying constraint field seems to be the key factor. however, few of them study filaments in the weak field, namely, quiescent filaments, as it is too hard to reconstruct the magnetic configuration there. here we show a case study, in which we can fully reconstruct the configuration of an intermediate filament with the mhd-relaxation extrapolation model and discuss its initial eruption and eventual failure. by analyzing the magnetic configuration, we suggest that the reconnection between the erupting magnetic flux ...</code> |
  | <code>long-term earth magnetosphere science orbit with earth-moon resonance orbit</code> | <code>we introduce the long-term earth magnetosphere science orbits designed to maintain a fixed orientation relative to earth's magnetosphere over extended durations. by leveraging the earth-moon resonant orbits, the spacecraft's argument of periapsis is aligned with the orientation of earth's magnetosphere, thereby enabling continuous observations. three specific earth–moon resonant orbits, characterized by distinct values of the jacobi integral, are identified to exhibit these properties of stable, magnetosphere-aligned evolution. this approach facilitates sustained monitoring of large-scale magnetospheric dynamics and opens new opportunities for focused science objectives. these include studying the interaction between the earth and the moon in shaping magnetospheric boundaries and probing magnetospheric vortices and other transient phenomena. the resultant long-term vantage point—achieved through careful resonance and orbital design—offers a platform for future space weather research, m...</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```

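`MultipleNegativesRankingLoss` trains with in-batch negatives: for the i-th (query, abstract) pair, every other abstract in the batch serves as a negative, and the loss is cross-entropy over scaled cosine similarities. A plain-NumPy sketch of that computation with synthetic embeddings — not the library's internal implementation, but the same math, using the `scale` of 20.0 configured above:

```python
import numpy as np

def mnrl_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities; pair i's positive is
    row i of `positives`, and all other rows act as in-batch negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                # (batch, batch) cosine similarities
    m = scores.max(axis=1, keepdims=True)     # stabilized log-softmax per row
    log_probs = scores - (m + np.log(np.exp(scores - m).sum(axis=1, keepdims=True)))
    return float(-np.diag(log_probs).mean())  # correct pairs sit on the diagonal

# Perfectly matched orthogonal pairs -> near-zero loss; shuffled pairs -> large loss
print(mnrl_loss(np.eye(4), np.eye(4)) < 1e-6)                    # True
print(mnrl_loss(np.eye(4), np.roll(np.eye(4), 1, axis=0)) > 10)  # True
```

Because negatives come for free from the batch, larger batch sizes (64 here) give the model more contrastive signal per step.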
### Training Hyperparameters
#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `num_train_epochs`: 2
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `project`: huggingface
- `trackio_space_id`: trackio
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: no
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: True
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>

### Training Logs
| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.8787 | 500  | 0.216         |
| 1.7575 | 1000 | 0.0434        |

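The epoch values are consistent with the dataset size and batch size reported above: 36,416 samples at a per-device batch size of 64 gives 569 full batches per epoch, so step 500 falls at epoch 0.8787 and step 1000 at epoch 1.7575:

```python
steps_per_epoch = 36416 // 64  # 569 full batches per epoch

print(round(500 / steps_per_epoch, 4))   # 0.8787
print(round(1000 / steps_per_epoch, 4))  # 1.7575
```
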

### Framework Versions
- Python: 3.12.12
- Sentence Transformers: 5.1.2
- Transformers: 4.57.1
- PyTorch: 2.8.0+cu126
- Accelerate: 1.11.0
- Datasets: 4.0.0
- Tokenizers: 0.22.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
638
+
639
+ <!--
640
+ ## Glossary
641
+
642
+ *Clearly define terms in order to be accessible across audiences.*
643
+ -->
644
+
645
+ <!--
646
+ ## Model Card Authors
647
+
648
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
649
+ -->
650
+
651
+ <!--
652
+ ## Model Card Contact
653
+
654
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
655
+ -->
config.json ADDED
```json
{
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "dtype": "float32",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.57.1",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 31090
}
```
config_sentence_transformers.json ADDED
```json
{
  "model_type": "SentenceTransformer",
  "__version__": {
    "sentence_transformers": "5.1.2",
    "transformers": "4.57.1",
    "pytorch": "2.8.0+cu126"
  },
  "prompts": {
    "query": "",
    "document": ""
  },
  "default_prompt_name": null,
  "similarity_fn_name": "cosine"
}
```
model.safetensors ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:bf4c576d38f1ac2755d159b1a5fa118103dfe7ff2357004cb9eb739136df87e7
size 439696224
```
modules.json ADDED
```json
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
```
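
Because the module pipeline ends with `Normalize`, every embedding leaves the model with unit L2 norm, so a plain dot product equals cosine similarity — useful when loading the embeddings into a vector index that only supports inner product. A quick numeric check with synthetic vectors standing in for real embeddings:

```python
import numpy as np

rng = np.random.default_rng(42)
emb = rng.normal(size=(3, 768))
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # what the Normalize module does

norms = np.linalg.norm(emb, axis=1)
dot = emb @ emb.T
cos = dot / np.outer(norms, norms)

print(np.allclose(norms, 1.0))  # True: unit-norm outputs
print(np.allclose(dot, cos))    # True: dot product == cosine similarity
```
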
sentence_bert_config.json ADDED
```json
{
  "max_seq_length": 512,
  "do_lower_case": false
}
```
special_tokens_map.json ADDED
```json
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
```
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
```json
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "104": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "max_length": 384,
  "model_max_length": 512,
  "never_split": null,
  "pad_to_multiple_of": null,
  "pad_token": "[PAD]",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "[SEP]",
  "stride": 0,
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "[UNK]"
}
```
vocab.txt ADDED
The diff for this file is too large to render. See raw diff