SentenceTransformer based on Alibaba-NLP/gte-multilingual-base
This is a sentence-transformers model finetuned from Alibaba-NLP/gte-multilingual-base on the offshore_energy_v1 dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Alibaba-NLP/gte-multilingual-base
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: offshore_energy_v1
Model Sources
- Documentation: https://sbert.net
- Repository: https://github.com/UKPLab/sentence-transformers
- Hugging Face: https://huggingface.co/models?library=sentence-transformers
Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'NewModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
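Since pooling takes the CLS token and the final Normalize() module L2-normalizes the output, cosine similarity over these embeddings reduces to a dot product. A minimal sanity check of that property (assuming, as in the usage example below, that trust_remote_code=True is needed for the custom gte "NewModel" architecture):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Sampath1987/EnergyEmbed-nv1", trust_remote_code=True)
emb = model.encode(["offshore gas field development"])

# The Normalize() module gives every embedding unit L2 norm,
# so dot product and cosine similarity coincide.
print(emb.shape)               # (1, 768)
print(np.linalg.norm(emb[0]))  # ~1.0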
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the Hugging Face Hub. trust_remote_code=True may be required
# because the gte "NewModel" base architecture ships custom modeling code.
model = SentenceTransformer("Sampath1987/EnergyEmbed-nv1", trust_remote_code=True)

# Run inference on three texts from the offshore energy domain
sentences = [
'How does the predictive reservoir effectiveness model aid in the exploration of the Winduck Interval?',
'The latest Silurian to Early Devonian Winduck Interval of the extensive but poorly exposed Neckarboo Sub-basin, consists of several thousands of metres of a quartzose siliciclastic sandstone succession that has been divided into three sequence divisions called (in ascending parasequence order) parasequence A (coarse-grained quartz sandstone), parasequence B (fining-upward succession of sandstone with siltstone and sandstone beds thicken upward) and parasequence C (coarse-grained quartz sandstone with siltstone and interbedded calcareous sandstones). These three geophysically defined parasequences are separated by slightly discordant erosion surfaces. The erosion surfaces are characterised by abrupt breaks at the top of parasequences A and B and the surface at the top of parasequence B represents relatively local erosion. The top of parasequence C is marked by a major unconformity with the Snake Cave Interval. Gamma ray and self-potential signatures within the parasequences can be correlated throughout the Neckarboo Sub-basin. The three sequence divisions are further subdivided into depositional parasequences, which are readily recognised from core sedimentology and electrofacies analysis. The parasequences provide the framework for a detailed sedimentological analysis, which focuses on the identification of lithofacies successions and parasequences. Petrophysical data are recorded and their relationships to the depositional parasequences are discussed. This paper presents a predictive reservoir effectiveness model that has been developed to aid exploration of the Winduck Interval. The aim is to find the distribution of parasequences (based on variations in porosity, net effective thickness and lithofacies with burial depth) and to provide a dataset for lithostratigraphic units within the Winduck Interval and parameter input for exploration prospect evaluation. Parasequence stratigraphic analyses were obtained where there is good lithofacies control. The porosity and permeability results have been analyzed in a number of parasequences and poor reservoir quality may be due to the effects of structure and fluid flow. This approach provides for better and more precise stratigraphic trap analysis.',
'In this multi-Tcf subsea gas development off the North West coast of Australia, reservoir simulation supports the key business decisions and processes. An important factor when providing production forecasts is ensuring that a range of possible outcomes (low-mid-high) are captured accurately by the models. The output from these models may then be used by decision makers for evaluating different developments and scenarios. The design of experiments (DoE) is commonly employed to aid the evaluation of subsurface uncertainties and characterise the impact and influence to key model outcomes supporting development decisions.\nField production performance is often driven by uncertainty in reservoir outcome. This paper is helpful to practitioners involved in any computer modelling of petroleum reservoirs who are interested in capturing the uncertainty inherent in a field and building an appropriate workflow for the development and sensitivity of a range of models. Both model building and using DoE to evaluate developments and Value of Information (VoI) studies for reservoir management will be shared. Integrated DoE focusing on static, dynamic and well-based uncertainties will be illustrated.\nResults will cover:\n–\nLessons learned and best practices using ED (Experimental Design) to generate low-mid-high reservoir simulation models\n–\nUnderstanding reservoir and well based uncertainties separately\n–\nEvaluating incremental field developments using ED\n–\nUtilizing ED to anticipate range of surveillance responses\nFew papers exist on the integrated application of ED to giant gas fields using reservoir simulation. Firstly, this case study will highlight some pitfalls to avoid during the workflow. Secondly, the authors will discuss the important issue of how to integrate or separate static, dynamic, well and facility based uncertainties. Thirdly, the work will show the unique application of ED in VoI and field development scoping.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Pairwise cosine similarities between the three embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# a 3x3 similarity matrix
```
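For the semantic-search use case listed above, a minimal retrieval sketch could look like the following. The query and corpus strings are invented placeholders; util.semantic_search ranks corpus entries by cosine similarity:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Sampath1987/EnergyEmbed-nv1", trust_remote_code=True)

# Hypothetical corpus of offshore-energy abstracts
corpus = [
    "Reservoir simulation supports production forecasting for subsea gas fields.",
    "The Winduck Interval comprises a quartzose siliciclastic sandstone succession.",
]
query = "How are production forecasts made for subsea gas developments?"

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Top-k corpus entries per query, ranked by cosine similarity
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {corpus[hit['corpus_id']]}")
```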
Evaluation
Metrics
Triplet
| Metric          | Value |
|:----------------|:------|
| cosine_accuracy | 0.98  |
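Here cosine_accuracy is the fraction of held-out triplets for which the anchor embedding is closer (by cosine similarity) to its positive than to its negative. A minimal sketch of computing such a score with the library's TripletEvaluator, using invented triplets in place of the real offshore_energy_v1 evaluation split:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("Sampath1987/EnergyEmbed-nv1", trust_remote_code=True)

# Invented example triplets; the reported 0.98 was measured on offshore_energy_v1
evaluator = TripletEvaluator(
    anchors=["How are subsea gas production forecasts generated?"],
    positives=["Reservoir simulation supports production forecasting for subsea gas fields."],
    negatives=["The Winduck Interval comprises a quartzose sandstone succession."],
    name="offshore-energy-dev",
)
print(evaluator(model))  # e.g. {'offshore-energy-dev_cosine_accuracy': 1.0}
```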
Training Details
Training Dataset
offshore_energy_v1
Evaluation Dataset
offshore_energy_v1
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
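As a rough guide, a training run matching these non-default settings might look like the sketch below. It assumes offshore_energy_v1 provides anchor/positive/negative columns and uses the MultipleNegativesRankingLoss cited at the end of this card; the dataset path and split names are placeholders, since the card does not give its Hub location:

```python
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

# trust_remote_code=True assumed for the custom gte "NewModel" architecture
model = SentenceTransformer("Alibaba-NLP/gte-multilingual-base", trust_remote_code=True)

# Placeholder path: the card does not state where offshore_energy_v1 is hosted
dataset = load_dataset("offshore_energy_v1")

# In-batch negatives ranking loss, per the Citation section below
loss = MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="EnergyEmbed-nv1",
    num_train_epochs=1,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    eval_strategy="steps",
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],  # assumed split name
    loss=loss,
)
trainer.train()
```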
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
| Epoch  | Step | Validation Loss | ai-job-validation_cosine_accuracy |
|:------:|:----:|:---------------:|:---------------------------------:|
| 0.3568 | 1000 | 0.0982          | 0.9764                            |
| 0.7135 | 2000 | 0.0870          | 0.9800                            |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 5.1.0
- Transformers: 4.53.3
- PyTorch: 2.8.0+cu128
- Accelerate: 1.9.0
- Datasets: 4.0.0
- Tokenizers: 0.21.2
Citation
BibTeX
Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```