How to use Sampath1987/EnergyEmbed-v2-e3 with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Sampath1987/EnergyEmbed-v2-e3", trust_remote_code=True)
sentences = [
"How does the monitoring system for well integrity function after CO2 injection?",
"Drilling is a complex process and delivering a successful well requires identifying proper technologies and utilizing them efficiently to save time & cost. Today in Oil & Gas industry there is a huge focus on digital technologies to improve Drilling Process efficiency and PDO decided to implement an innovative approach of process optimization by implementing a unique project \"electronically Delivering the Limit (eDtL)\".\nThe overall approach with eDtL project was to implement a platform which can provide Drilling Operations team the technical limit for all Drilling Activities, which is the theoretical minimum time required to perform an activity, based on available knowledge and technology.\neDtL system utilizes rig sensors data transmitted in Real-Time from Drilling Rigs to automatically detect the Rig Activity and focus on identifying the areas of Drilling Performance Improvements and minimizing redundant tasks for rig and office teams. The identified opportunities are communicated with rig team for implementation and the performance is tracked again to highlight the improvements.\neDtL system also provides capability for continuous improvement of organizational processes by introducing automation of redundant tasks. One of such improvement was partial automation of Daily Drilling Report which was historically manually recorded by rig team daily.",
"ADNOC has embarked on a major Carbon Capture and Storage (CCS) project where large quantities of CO2 are injected into deep saline aquifers for permanent storage instead of releasing into the atmosphere.\nAn advanced chemical tracer technology was deployed in the first CCS project in the UAE for continuous CO2 monitoring to ensure permanent and safe CO2 storage. In case of containment breach, the chemical tracer technology can confirm the leakage and identify its source.\nAfter CO2 injection for permanent storage, any containment breaching would be detected in the shallow soil monitoring borehole. Few soil monitoring boreholes were excavated across the field in which Capillary Adsorption Tubes (CAT) were inserted for some time and replaced by another according to the sampling frequency plan. The tube is sent to the lab for CO2 leak detection and reporting. The high detection resolution is in the order of 0.1 parts per trillion (ppt). This has a positive impact on the system economics because smaller quantities of chemical tracer material are required.\nThe tracer injection monitoring system is ongoing in the first CO2 storage area of Abu Dhabi. The monitoring includes soil monitoring which are shallow boreholes. The soil monitoring boreholes were excavated close to the CO2 injection well to ensure that there are no well integrity issues developed due to thermal effects by CO2 injection. The soil monitoring boreholes to be verified by surface gas CO2 monitors. Soil monitors were located around the radial storage area, to detect CO2 leakage and to understand CO2 migration to the soil through the cap rock (in case of leakage). The monitoring system for caprock and well integrity will provide: Surface soil monitoring for cap rock integrity, integrity confirmation for legacy wells, integrity confirmation of injection well in the post-injection monitoring period, leakage quantification, leakage origin if multiple injectors. The monitoring system can continue for up to 30 years of the operational period as well as the full post-injection monitoring, measurement and verification horizon.\nThis paper presents a description of a sophisticated CO2 monitoring technology that is being deployed in UAE's first CCS project. CO2 tracer technology is considered as one of the most accurate methods to detect CO2 leakage at surface. Its high-detection resolution allows early leakage identification and early mitigation action. In addition, it proves to be relatively low cost, operationally easy to execute, and requires a small operational footprint.",
"Carbon Capture and Storage, as a solution to mitigate the increase in greenhouse gases emissions in the atmosphere, is still bringing intensive worldwide R&D activities. In particular, significant acceleration of in situ CCS experiments supports technical developments as well as acceptability of this technology. Among the major risks identified to this technology, wells are often considered to be the weakest spots with respect to CO2 confinement in the geological reservoir. Therefore, long-term well integrity performance assessment is one of the critical steps that must be addressed before large scale CCS technology deployment is accepted as a safe solution to reduce CO2 emissions.\nA risk-based methodology associated with well integrity is proposed within CO2 geological storage. The main objectives of this approach are to identify and quantify risks associated with CO2 leakages along wells over time (from tens to thousands of years), to evaluate risks and to propose relevant actions to reduce unacceptable risks. The methodological framework emphasized the use of the risk concept as a relevant criterion to (i) evaluate the overall performance of well confinement with respect to different stakes, (ii) include different levels of uncertainty associated to the studied system, and (iii) provide a reliable decision making support. For the quantification of risk, a coupled CO2 flow model (gas flow and degradation processes) was used to identify possible leakage pathways along the wellbore and quantify possible CO2 leakage towards sensitive targets (surface, fresh water, any aquifers…) for different scenarios. This approach offers an operational response to some of the challenges inherent to well integrity management over well lifecycle.\nThis paper focuses on the application of the methodology to a synthetic case based on an existing well. The practical outcomes and the added values will be presented: (i) an objective and structured process, (ii) scenarios identification and quantification of CO2 migration along the wellbore for each scenario, (iii) risk mapping, (iv) and operational action plans for risk treatment of well integrity."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model fine-tuned from Alibaba-NLP/gte-multilingual-base on the offshore_energy_v1 dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'NewModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
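The module list above indicates that the sentence vector is the hidden state of the first (CLS) token, scaled to unit L2 length. The following is a minimal NumPy sketch of those two steps, using random numbers in place of real Transformer outputs. The array names and the toy sequence length are illustrative assumptions, not part of the model's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the Transformer output: (batch, sequence_length, hidden_size).
# Real inputs may be up to 8192 tokens long; 5 tokens keeps the toy example small.
token_embeddings = rng.normal(size=(2, 5, 768))

# pooling_mode_cls_token=True: keep only the first token's vector per input.
cls_vectors = token_embeddings[:, 0, :]

# Normalize(): scale each pooled vector to unit L2 norm.
norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
sentence_embeddings = cls_vectors / norms

print(sentence_embeddings.shape)
print(np.linalg.norm(sentence_embeddings, axis=1))
```

Because the outputs have unit length, the cosine similarity between two embeddings equals their dot product.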
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
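# The 'NewModel' architecture ships as custom code, hence trust_remote_code=True
model = SentenceTransformer("Sampath1987/EnergyEmbed-v2-e3", trust_remote_code=True)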
# Run inference
sentences = [
'What occupational health hazards are anticipated with large construction projects during the energy transition?',
'endotoxins and fungi. The authors recommended that\nongoing real–time measurement of these exposures be\ncarried out to identify boundary conditions, phases, and\nsettings with the highest pollutant release. \n12 — Health in the energy transition \nGood quality studies are needed on the health effects of\nrenewable energy sources. Such studies should include\npopulations and patients with well-characterized exposure,\nhigh-quality information on outcome, and assessment of\npotential confounders. While retrospective (e.g., case-control)\nstudies might produce useful results, prospective longitudinal\nstudies would provide the strongest evidence. \nSeveral LCA studies have been conducted for the different\ntechnologies. These LCAs reported relative low levels of\nemissions during the lifecycle of renewable sources of\nenergy. Few of these studies included a comparison with\nfossil-based technologies. When more life cycle studies\nbecome available it would be important to include them\nin the literature review. While looking at the life cycle of a\ncertain technology, other health effects in the value chain\ncould potentially be identified (reference: UNECE on Carbon\nNeutrality in the UNECE Region: Integrated Life-cycle\nAssessment of Electricity Sources). \nAs of December 2024, very few occupational and public\nhealth hazards specific to energy transition technologies\nhave been identified. The energy transition is in an early stage\nand will evolve quickly, and additional hazards unique to\nenergy transition activities may emerge; the specifics of this\nare, at this time, uncertain. \nWhat is certain is that the energy transition will involve large\nconstruction projects whose risks (and effective methods to\nmanage those risks) are well-known and understood. Existing\noccupational health approaches will be able to manage\nthese risks effectively, provided the correct assessments are\nconducted properly.',
'institutionalized political structures to realize particular social objectives or serve particular\nconstituencies. \n**Non-hazardous waste:** Waste, other than Hazardous waste, resulting from company\noperations, including process and oil field wastes disposed of, on site or off site, as well as\noffice, commercial or packaging related wastes [ENV-7]. \n**Normalization:** The ratio of a quantitative indicator output (e.g. emissions) to an\naggregated measure of another output (e.g. oil and gas production or refinery throughput) \n[Module 1 _Reporting process_ ]. \n**Occupational illness:** An Employee or Contractor health condition or disorder requiring\nmedical treatment due to a workplace Incident, typically involving multiple exposures to\nhazardous substances or to physical agents. Examples include noise-induced hearing loss,\nrespiratory disease, and contact dermatitis [SHS-3]. \n**Occupational injury:** Harm of an Employee or Contractor resulting from a single\ninstantaneous workplace incident that results in medical treatment (beyond simple first aid),\nwork restrictions, days away from work (lost time) or a Fatality [SHS-3]. \n**Operating area:** An area where business activities take place with potential to interact with\nthe adjacent environment [ENV-4]. \n**Operation:** A generic term used to denote any kind of business activity involving productrelated processes, such as production, manufacturing and transport. Note: the term oil and\ngas operations used in the Guidance is intended to be broad and inclusive of other types of\nproduct, such as chemicals. \n**7.5**',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5463, 0.1943],
# [0.5463, 1.0000, 0.1698],
# [0.1943, 0.1698, 1.0000]])
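
The matrix above can be read as a retrieval result: row 0 holds the query's similarity to every input, so sorting that row (excluding the query itself) ranks the two passages. Here is a small sketch with the printed values copied into NumPy; the variable names are illustrative.

```python
import numpy as np

# Similarity values copied from the printed tensor above.
similarities = np.array([
    [1.0000, 0.5463, 0.1943],
    [0.5463, 1.0000, 0.1698],
    [0.1943, 0.1698, 1.0000],
])

# Row 0 is the query; columns 1 and 2 are the two candidate passages.
query_scores = similarities[0, 1:]

# Sort passages by descending similarity and map back to their input indices.
ranking = np.argsort(-query_scores) + 1
print(ranking.tolist())
# [1, 2]
```

The passage about health in the energy transition (index 1) ranks above the glossary excerpt (index 2), which matches the intent of the question.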
Dataset ai-job-validation, evaluated with TripletEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy | 0.97 |
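
For triplets, cosine accuracy is the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. The following is a hedged NumPy sketch of that definition. The function name and the toy 2-D vectors are made up for illustration, and this is not the evaluator's actual code.

```python
import numpy as np

def cosine_accuracy(anchors, positives, negatives):
    """Share of triplets where cos(anchor, positive) > cos(anchor, negative)."""
    def unit(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    a, p, n = unit(anchors), unit(positives), unit(negatives)
    pos_sim = np.sum(a * p, axis=1)  # row-wise cosine to the positive
    neg_sim = np.sum(a * n, axis=1)  # row-wise cosine to the negative
    return float(np.mean(pos_sim > neg_sim))

# Toy data: the first triplet is ranked correctly, the second is not.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
positives = np.array([[0.9, 0.1], [1.0, 0.2]])
negatives = np.array([[0.0, 1.0], [0.1, 1.0]])

print(cosine_accuracy(anchors, positives, negatives))
# 0.5
```

A value of 0.97 therefore means that roughly 97 of every 100 validation triplets are ordered correctly.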
Columns: anchor, positive, and negative

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |

| anchor | positive | negative |
|---|---|---|
| What statistical methods were employed to enhance the accuracy of comparisons in the field testing of shaped cutters? | As shaped polycrystalline diamond compact (PDC) cutter geometries become more prevalent across the industry, this paper statistically reviews field testing of novel shaped PDC cutters in a variety of challenging applications. Firstly, the paper identifies the improvement in efficiency when compared with conventional PDC cutter geometries. Secondly, it confirms the reliability and robustness of the aforementioned shaped cutter geometries. | This paper details the improvements to drilling performance and torsional response of fixed cutter bits when changing from a conventional 19-mm cutter diameter configuration to 25-mm cutter diameters for similar blade counts in two different hole sizes. Key performance metrics include rate of penetration (ROP), rerun-ability, torsional response, and ability to maintain tool-face control during directional drilling. |
| What are vapor recovery units (VRU) used for in oil and gas operations? | ## 4. Vapour recovery units | ##### 3.1.2 Reduction and recovery of glycol dehydration flash gas |
| What challenges are posed by fractures and faults in the completion of MRC wells? | The Maximum Reservoir Contact (MRC) concept was developed to improve well productivity and sustainability by maximizing the contact area with target reservoirs. MRC is a proven technology for the development of tight/non-economical reservoirs. Completion design for MRC wells plays a vital role in enhancing well deliverability, monitoring and accessibility. | The Clair field is the largest discovered oilfield on the UK continental shelf (UKCS) but has high reservoir uncertainty associated with a complex natural fracture network. The field area covers over 200 sq km with an estimated STOIIP of 7 billion barrels. The scale and complexity of the reservoir has led to a phased multi-platform development. |

MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false
}
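
To make the parameters above concrete, here is a simplified NumPy sketch of a multiple-negatives ranking objective. Each anchor is scored against every positive and every negative in the batch using cosine similarity multiplied by `scale`, and cross-entropy pushes the anchor's own positive to the top. The function name and toy vectors are assumptions for illustration. This is a sketch of the idea, not the library's implementation.

```python
import numpy as np

def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Cross-entropy over scaled cosine scores; the target for anchor i is positive i."""
    def unit(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    # Candidates: all in-batch positives followed by all explicit negatives.
    candidates = np.concatenate([positives, negatives], axis=0)
    logits = scale * unit(anchors) @ unit(candidates).T  # shape (batch, 2 * batch)

    # Numerically stable log-softmax over the candidates of each anchor.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    idx = np.arange(len(anchors))
    return float(-log_probs[idx, idx].mean())

# Toy batch (made-up vectors): positives aligned with anchors, negatives opposite.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
positives = anchors.copy()
negatives = -anchors

good = mnr_loss(anchors, positives, negatives)
bad = mnr_loss(anchors, positives[::-1], negatives)  # positives assigned to the wrong anchors
print(good, bad)
```

The loss is close to zero when every anchor is nearest to its own positive and grows when another candidate in the batch scores higher. A larger `scale` sharpens that contrast.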
Columns: anchor, positive, and negative

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |

| anchor | positive | negative |
|---|---|---|
| What is the importance of quantifying carbon emissions during cementing operations in decarbonization? | An important step in decarbonization is using an end-to-end approach to quantify carbon emissions during cementing operations. By careful analysis of the entire cementing operations process, it is then possible to measure and compare carbon emissions at various stages of the operation. Understanding and isolating the main drivers of the carbon emissions footprint enables making better choices and developing best alternatives with lower environmental impact. | Objectives/Scope |
| What factors must engineers consider during the drilling design phase? | The drilling of oil and gas wells involves several stages including the exploration phase, drilling design, and perforation techniques. In the exploration phase, geologists use seismic surveys to identify potential drilling locations. During the drilling design phase, engineers must consider factors such as wellbore stability, fluid mechanics, and formation pressures. Once the well is drilled, perforation techniques are applied to enhance the flow of hydrocarbons into the wellbore. The effectiveness of these techniques can significantly impact production rates and overall project success. | The extraction of crude oil and natural gas is typically carried out through drilling. Drilling uses different techniques to reach the petroleum reservoirs located deep underground. One key method is rotary drilling, where a drill bit is rotated while cutting through the earth's layers to create a wellbore. Rotary drilling is favored for its efficiency in penetrating hard rock layers. Another method is directional drilling, which allows operators to drill at various angles to reach reservoirs that are not directly beneath the drilling platform. This technique increases the area covered by the well and can optimize production. In addition, hydraulic fracturing enhances recovery rates by injecting fluids under high pressure to create fractures in the rock, increasing the permeability and allowing oil and gas to flow more freely. Lastly, the safety and environmental impacts of drilling techniques are a growing concern, and advancements are continually being sought to mitigate these effect... |
| How does the 'Dissolved pore network' concept enhance matrix permeability in the modeling of carbonate oil reservoirs? | In this paper, we present a case study of using dual porosity dual permeability (DPDP) simulation for an offshore Abu Dhabi carbonate oil reservoir exhibiting complex flow behavior through matrix, fracture system and conductive faults. The main objective of the study is to present and explain the reservoir flow behaviors by constructing and using advanced reservoir geologic and simulation models. The results of the study will be utilized as part of the inputs for full field development plan. | Integration of pressure-derived permeability thickness with other geological data plays a crucial role in estimating the apparent reservoir permeability, which is a key reservoir property required for reliable reservoir characterization as it governs fluid flow and greatly impacts decisions related to production, field development, and reservoir management. The geological model provides a representation of the subsurface reservoir, capturing the spatial distribution of lithology, porosity, permeability, and other geological properties. Analysis of pressure data provides valuable information on well condition, reservoir extent, and dynamic reservoir parameters. Integrating such data with the geological model is an enabler to better quantify and manage the uncertainty in the spatial 3D distribution of permeability away from well control. |

MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false
}
Non-default hyperparameters:
eval_strategy: steps
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
learning_rate: 2e-05
warmup_ratio: 0.1

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 3
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}

| Epoch | Step | Training Loss | Validation Loss | ai-job-validation_cosine_accuracy |
|---|---|---|---|---|
| 0.2967 | 1000 | - | 0.1458 | 0.9605 |
| 0.5935 | 2000 | - | 0.1217 | 0.9665 |
| 0.8902 | 3000 | - | 0.1095 | 0.9711 |
| 1.1869 | 4000 | - | 0.1131 | 0.9682 |
| 1.4837 | 5000 | 0.1672 | 0.1107 | 0.9687 |
| 1.7804 | 6000 | - | 0.1030 | 0.9709 |
| 2.0772 | 7000 | - | 0.1081 | 0.9693 |
| 2.3739 | 8000 | - | 0.1091 | 0.9691 |
| 2.6706 | 9000 | - | 0.1098 | 0.9691 |
| 2.9674 | 10000 | 0.0678 | 0.1065 | 0.9700 |
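
The log above, combined with `num_train_epochs: 3`, `warmup_ratio: 0.1`, `learning_rate: 2e-05` and `lr_scheduler_type: linear`, is enough for a rough reconstruction of the learning-rate schedule. The sketch below is back-of-the-envelope arithmetic, not output from the training run. It assumes a single device with `gradient_accumulation_steps: 1`, and because the epoch fractions in the table are rounded, the step counts are approximate.

```python
# Epoch 0.2967 was logged at step 1000, so one epoch is about 1000 / 0.2967 steps.
steps_per_epoch = round(1000 / 0.2967)
total_steps = steps_per_epoch * 3          # num_train_epochs: 3
warmup_steps = round(0.1 * total_steps)    # warmup_ratio: 0.1

def learning_rate(step, base_lr=2e-05):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0.0, (total_steps - step) / (total_steps - warmup_steps))
    return base_lr * remaining

print(steps_per_epoch, total_steps, warmup_steps)
# 3370 10110 1011
print(learning_rate(5000))
```

Under these assumptions, the first logged evaluation at step 1000 falls almost exactly at the end of warmup, and the learning rate has nearly reached zero by the last logged row at step 10000.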
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model: Alibaba-NLP/gte-multilingual-base