EnergyEmbed-v2-e2 / README.md
fine-tuned EnergyEmbed-v2-e2 2 epochs
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:53913
  - loss:MultipleNegativesRankingLoss
base_model: Alibaba-NLP/gte-multilingual-base
widget:
  - source_sentence: >-
      How does the monitoring system for well integrity function after CO2
      injection?
    sentences:
      - >-
        Drilling is a complex process and delivering a successful well requires
        identifying proper technologies and utilizing them efficiently to save
        time & cost. Today in the Oil & Gas industry there is a huge focus on
        digital technologies to improve Drilling Process efficiency, and PDO
        decided to implement an innovative approach to process optimization by
        implementing a unique project, "electronically Delivering the Limit
        (eDtL)".

        The overall approach with the eDtL project was to implement a platform which
        can provide the Drilling Operations team with the technical limit for all
        Drilling Activities, which is the theoretical minimum time required to
        perform an activity, based on available knowledge and technology.

        eDtL system utilizes rig sensors data transmitted in Real-Time from
        Drilling Rigs to automatically detect the Rig Activity and focus on
        identifying the areas of Drilling Performance Improvements and
        minimizing redundant tasks for rig and office teams. The identified
        opportunities are communicated to the rig team for implementation and
        the performance is tracked again to highlight the improvements.

        eDtL system also provides capability for continuous improvement of
        organizational processes by introducing automation of redundant tasks.
        One such improvement was the partial automation of the Daily Drilling
        Report, which was historically recorded manually by the rig team each day.
      - >-
        ADNOC has embarked on a major Carbon Capture and Storage (CCS) project
        where large quantities of CO2 are injected into deep saline aquifers for
        permanent storage instead of releasing into the atmosphere.

        An advanced chemical tracer technology was deployed in the first CCS
        project in the UAE for continuous CO2 monitoring to ensure permanent and
        safe CO2 storage. In case of containment breach, the chemical tracer
        technology can confirm the leakage and identify its source.

        After CO2 injection for permanent storage, any containment breach
        would be detected in the shallow soil monitoring boreholes. A few soil
        monitoring boreholes were excavated across the field, in which Capillary
        Adsorption Tubes (CAT) were inserted for some time and then replaced by
        new ones according to the sampling frequency plan. Each tube is sent to
        the lab for CO2 leak detection and reporting. The high detection
        resolution is in the order of 0.1 parts per trillion (ppt). This has a
        positive impact on the system economics because smaller quantities of
        chemical tracer material are required.

        The tracer injection monitoring system is ongoing in the first CO2
        storage area of Abu Dhabi. The monitoring includes soil monitoring in
        shallow boreholes. The soil monitoring boreholes were excavated
        close to the CO2 injection well to ensure that no well integrity
        issues have developed due to thermal effects of CO2 injection. The
        soil monitoring boreholes are to be verified by surface gas CO2 monitors.
        Soil monitors were located around the radial storage area, to detect CO2
        leakage and to understand CO2 migration to the soil through the cap rock
        (in case of leakage). The monitoring system for caprock and well
        integrity will provide: Surface soil monitoring for cap rock integrity,
        integrity confirmation for legacy wells, integrity confirmation of
        injection well in the post-injection monitoring period, leakage
        quantification, leakage origin if multiple injectors. The monitoring
        system can continue for up to 30 years of the operational period as well
        as the full post-injection monitoring, measurement and verification
        horizon.

        This paper presents a description of a sophisticated CO2 monitoring
        technology that is being deployed in UAE's first CCS project. CO2 tracer
        technology is considered as one of the most accurate methods to detect
        CO2 leakage at surface. Its high-detection resolution allows early
        leakage identification and early mitigation action. In addition, it
        proves to be relatively low cost, operationally easy to execute, and
        requires a small operational footprint.
      - >-
        Carbon Capture and Storage, as a solution to mitigate the increase in
        greenhouse gas emissions in the atmosphere, continues to drive
        intensive worldwide R&D activity. In particular, significant
        acceleration of in situ CCS experiments supports technical developments
        as well as acceptability of this technology. Among the major risks
        identified for this technology, wells are often considered to be the
        weakest spots with respect to CO2 confinement in the geological
        reservoir. Therefore, long-term well integrity performance assessment is
        one of the critical steps that must be addressed before large scale CCS
        technology deployment is accepted as a safe solution to reduce CO2
        emissions.

        A risk-based methodology associated with well integrity is proposed
        within CO2 geological storage. The main objectives of this approach are
        to identify and quantify risks associated with CO2 leakages along wells
        over time (from tens to thousands of years), to evaluate risks and to
        propose relevant actions to reduce unacceptable risks. The
        methodological framework emphasized the use of the risk concept as a
        relevant criterion to (i) evaluate the overall performance of well
        confinement with respect to different stakes, (ii) include different
        levels of uncertainty associated with the studied system, and (iii)
        provide a reliable decision making support. For the quantification of
        risk, a coupled CO2 flow model (gas flow and degradation processes) was
        used to identify possible leakage pathways along the wellbore and
        quantify possible CO2 leakage towards sensitive targets (surface, fresh
        water, any aquifers…) for different scenarios. This approach offers an
        operational response to some of the challenges inherent to well
        integrity management over well lifecycle.

        This paper focuses on the application of the methodology to a synthetic
        case based on an existing well. The practical outcomes and the added
        values will be presented: (i) an objective and structured process, (ii)
        scenarios identification and quantification of CO2 migration along the
        wellbore for each scenario, (iii) risk mapping, and (iv) operational
        action plans for risk treatment of well integrity.
  - source_sentence: >-
      How did the detailed pre-survey planning impact the success of the
      offshore seismic acquisition campaign?
    sentences:
      - >-
        We showcase an innovative campaigning and business-focused approach to
        reservoir monitoring of multiple fields using 4D (time-lapse) seismic.
        Benefits obtained in terms of cost, speed and the quality of insights
        gained are discussed, in comparison with a piecemeal approach.
        Challenges and lessons learned are described, with a view to this
        approach becoming more widely adopted and allowing 4D monitoring to be
        extended to smaller or more marginal fields.

        An offshore seismic acquisition campaign was planned and successfully
        executed for a sequence of four 4D monitor surveys for fields located
        within 250 km of each other on the greater Northwest Shelf of Australia.
        The four monitors were acquired in H1 2020 comprising (in this order):
        Pluto Gas Field M2 (second monitor), Brunello Gas Field M1 (first
        monitor), Laverda Oil Field M1 and Cimatti Oil Field M1.

        Cost savings expected from campaigning were realised, despite three
        cyclones during operations, with success largely attributed to detailed
        pre-survey planning. Also important were the choice of vessel and
        planning for operational flexibility. The baseline surveys were diverse
        and required careful planning to achieve repeatability between vintages
        over each field, and to optimise the acquisition sequence, minimising
        time required to reconfigure the streamer spreads between surveys. The
        Cimatti baseline survey was acquired using a dual-vessel operation;
        modelling, combined with now-standard steerable streamers, showed a
        single-vessel monitor survey was feasible. These optimisations provided
        cost savings incremental to the principal economy of sharing vessel
        mobilisation costs across the whole campaign.

        Both processing and evaluation (ongoing at the time of writing) are
        essentially separate per field, but follow a consistent approach.
        Processing is carried out by more than one contractor to debottleneck
        this phase, with products, including intermediate quality control (QC)
        volumes, delivered as pre-stack depth migrations. While full evaluation
        of the monitor surveys to static and dynamic reservoir model updates
        will continue beyond 2020, key initial reservoir insights are expected
        to emerge within days of processing completion, with some even earlier
        from QC volumes. Furthermore, concurrent 4D evaluations are expected to
        result in fruitful exchanges of ideas and technologies between fields.
      - >-
        Advances in seismic acquisition, processing, computing hardware and
        theory continue to enhance seismic-image quality. However, an investment
        decision on seismic projects should be based not only on technical criteria
        but also on a quantifiable expected value above all currently available field
        data including well information. This presentation will include a case
        history of a major carbonate oil field demonstrating how this value was
        estimated before a major reprocessing project and how this value is
        being achieved.

        This field contains over 1000 wellbore penetrations. A 3D seismic survey
        was acquired over the field during 2001–2002, but the reservoir
        development team believed that these data to date had added limited
        value. The motivation for evaluating the potential for further
        investment in seismic data was a multi-billion-dollar
        field-redevelopment plan.

        The Value of Information (VOI) exercise to justify a seismic project
        began with an evaluation of technical issues that limited the use of
        existing seismic data. Through a targeted fast track reprocessing effort
        it was determined that the existing survey had been designed and
        acquired adequately, and that the deficiencies in the dataset at the
        reservoir level are primarily caused by near-surface and overburden
        effects. The first-order impact is that mapped seismic surfaces exhibit
        a "roughness" primarily from the overlying "non-geologic" noise. There
        was concern that many subtle faults interpreted at the reservoir level
        could be "non-geologic" artifacts which resulted in reluctance to
        incorporate these into the reservoir model. Amplitude balancing issues
        in the original data precluded quantitative assessments such as porosity
        prediction. The targeted reprocessing also verified that existing
        algorithms and traditional workflows alone were insufficient to resolve
        the technical issues.

        Working with the reservoir development team the key business drivers for
        reprocessing were identified as follows:

        - Increase individual well productivity and recovery

        - Image and define new opportunities in current poor-data areas

        - Save on well cost by preventing re-drills

        - Improve overall field development plan

        Specific expected value metrics and risks were assigned to the above
        objectives and a VOI assessment was completed. It was estimated that
        successfully achieving the above business objectives would result in a
        potential value at least 15 times the cost of the reprocessing. This
        resulted in management approval of the full field reprocessing.

        Following completion of the seismic reprocessing, the project team
        objectively assessed whether the technical criteria had been achieved
        and whether the business criteria will be achieved. In both cases the team
        determined that value metrics will be met. The reprocessing has impacted
        drill-wells as well as field development planning. In addition, the
        reprocessed seismic data will produce additional potential value as a
        result of opportunities not recognized at the start of the project.
      - >-
        Sabiriyah Mauddud is a giant reservoir in NK under active water flood
        with about 200 producers and 32 injectors. The reservoir has no aquifer
        support or only insignificant energy support and has been on production
        since the 1960s.
        Water flood started in the year 1997, initially with a pilot and later
        on expanded to the full field in a phased manner. Initial development
        was on pattern flood concept with all vertical injectors & producers
        which has now been replaced with Produce High-Inject Low (PHIL) concept
        using horizontal wells. In light of the significance of this reservoir
        for Kuwait's production, all efforts are made to optimize the
        performance of this reservoir. To achieve this objective, Pressure
        monitoring & performance analysis is considered to be the backbone of
        all production as well as injection activities.

        This paper presents the methodology conceived and implemented to assess
        the reservoir pressure performance and estimate the current reservoir
        pressure in different segments/ blocks in an innovative way so as to
        maximize the value of "Water flooding" in North Kuwait area along with
        the meeting of production aspirations using ESP as artificial lift
        system in an optimized manner. Except for the RFT data in newly drilled
        wells, the availability of pressure data was limited during the recent
        past, making it necessary to integrate all the available information so
        as to build a powerful tool to be used for water flood monitoring. All
        available information, namely Repeat Formation Tester "RFT", Static
        Bottom Hole Pressure "SBHP" and Pump intake pressure "PIP" under dynamic
        and static conditions, was collected & analyzed. An initial study related
        to compartmentalization showed two main areas, north and south, based on
        comprehensive analysis of all the pressure points. The analysis also
        helped for identifying areas with good vertical connectivity and
        understanding segments with vertical barriers matching with the
        geological description. In order to have the latest pressure mapping,
        data were combined to have an integrated imagery of the pressure
        distribution across the reservoir. During the exercise, "Gaps" were
        identified which were filled in by the intake pressure live data as well
        as shut in data to have a meaningful mimic of the reservoir pressure to
        help the ongoing production as well as injection activities.

        Based on the innovative approach above, a surveillance plan has been
        made to further enhance the quality of the mapping. Several maps such as
        opportunity map; PVT properties map; layer wise pressure maps etc. have
        been generated for ready-to-use information to facilitate daily
        operations.

        The objective of the paper is to share the innovative, simple, smart and
        very useful approach adopted by North Kuwait to manage the giant
        Mauddud. This paper presents the methodology conceived and implemented
        to assess the reservoir pressure performance and estimate the current
        reservoir pressure in different segments/ blocks in an innovative way
        so as to maximize the value of "Water flooding" in North Kuwait area
        along with meeting the production aspirations using ESP as
        artificial lift system in an optimized manner.
  - source_sentence: What are the primary recovery techniques used in oil and gas extraction?
    sentences:
      - >-
        The extraction of oil and gas involves various techniques to enhance
        recovery rates. Primary recovery relies on the natural pressure of the
        reservoir, while secondary recovery techniques such as water flooding
        and gas injection are employed to increase output after primary methods
        become inefficient. Tertiary recovery methods, also known as enhanced
        oil recovery (EOR), use thermal, gas, or chemical injection to further
        improve extraction rates. Each method comes with its own cost
        implications and efficiency rates, which can significantly affect the
        overall economics of an oilfield development project.
      - >-
        In this paper, one of the areas of conflict observed in the
        performance of horizontal well standoff with respect to the development
        of thin oil rim reservoirs is examined.

        In a technical paper as part of the critical review of literature on the
        exploitation of thin oil rim reservoirs with large gas cap and aquifer,
        this author had highlighted the problem. As part of sensitivities on
        horizontal well standoff, Cosmos and Fatoke (2004) tested three
        positions (one-third, centre and two-third positions from the GOC) in a
        Niger Delta field. They concluded that the landing closest to the GOC
        (one-third position) yielded the lowest oil recovery compared to the
        centre and two-third positions. Surprisingly, the work done by Sai
        Garimella et al. (2011) in 60 ft Ghariff & Al Khlata shallow marine
        low-permeability sandstone reservoirs in a field in Oman showed a
        different result, with the one-third position indicating optimum
        recovery from a horizontal well. Interestingly, both authors' positions
        on the performance had support from other authors.

        This study used a 3D reservoir model, investigated different horizontal
        well standoff performances and applied permeability reduction to
        simulate different reservoir quality. The objective was to see if the
        reservoir quality was a factor in the different horizontal well standoff
        performance seen from different regions of the world while noting their
        different depositional environments. Results from the investigation are
        presented in this paper and show a different trend from both authors
        mentioned above.
      - >-
        The oil extraction process typically involves drilling a well into the
        earth's crust where oil deposits are located. The well is often lined
        with casing to prevent collapse and water intrusion. Once the well is
        drilled, various techniques such as primary recovery, secondary
        recovery, and tertiary recovery can be employed. Primary recovery uses
        natural reservoir pressure to extract oil, while secondary recovery
        employs water or gas injection to maintain pressure. Tertiary recovery,
        also known as enhanced oil recovery, uses techniques like thermal
        injection or chemicals to further reduce the viscosity of oil and
        increase extraction rates. Each of these methods has distinct
        implications on the yield and economic viability of oil extraction
        operations.
  - source_sentence: >-
      What advantages do helicopters have over fixed-wing aircraft for leak
      detection surveys?
    sentences:
      - >-
        The reservoir characteristics such as porosity and permeability are
        crucial for evaluating the potential of oil and gas fields. Porosity
        refers to the void spaces within rocks that can hold hydrocarbons, while
        permeability measures how easily fluids can flow through rock
        formations. These two properties significantly influence the extraction
        methods used and the overall productivity of a reservoir. Enhancing
        permeability through hydraulic fracturing has become a common technique
        in unconventional resource extraction, allowing for more efficient
        recovery of oil and gas from low-permeability reservoirs.
      - >-
        BP gas production operations in North America manage over 15,000 miles
        of onshore pipelines that make up our vast, complex, and aging gas
        gathering networks. Surveying these for leaks presents a huge resource
        challenge using current ground based technology and, in turn, impacts
        the assurance of the safety and integrity of these operations.

        The Exploration and Production Technology Group evaluated new leak
        detection technologies using laser, thermal imaging camera and a high
        speed gas sampling detector that were deployed on aircraft and used
        global positioning systems coordinates to survey gas gathering
        pipelines. Field trials on gas gathering systems in the North Texas,
        Anadarko asset showed that the laser and gas sampling based leak
        detection systems were the most accurate, but the video imaging from the
        thermal camera made a powerful statement. Helicopters proved to be more
        suitable than fixed-wing aircraft for leak detection surveys on gas
        gathering pipelines.

        The aerial leak detection technologies produce a significant increase in
        efficiency and productivity in managing the integrity of BP's gas
        gathering systems. While that improves business performance, perhaps
        more important is the fact that small gas leaks can be easily found
        before they become big ones. That reduces environmental damage and the
        potential for leaks to impact the public. The development and
        implementation of aerial leak detection in BP is being recognized as an
        integrity tool in providing a significantly improved integrity assurance
        to its gas gathering operations.
      - >-
        One prerequisite of any detection system is the risk analysis that
        estimates mainly the safety and environmental impacts of a loss of
        containment. From this prerequisite it is possible
        to consider a strategy for an early detection of a loss of containment,
        and to choose a method or a technology. Methods of detection belong to
        two main families:

        External-based Leak Detection Systems, which use local leak sensors to
        generate a leak alarm. The main external-based Leak Detection Systems
        are acoustic emission detectors, pressure detectors, fiber optic cable,
        vapor and / or liquid sensing cables;

        Internal-based Leak Detection Systems, which use normal field sensors
        (e.g. pressure transmitters, flowmeters) for leak detection and leak
        localization. The main internal Leak Detection Systems are:

        - balancing systems (line balance, volume balance, compensated mass
        balance etc.);

        - Real Time Transient Model;

        - pressure/ flow monitoring;

        - statistical analysis…

        The main external-based Leak Detection Systems were studied internally
        through different evaluation and development programs, and some of
        them were studied in operation.

        The main findings were as follows:

        The acoustic based detection is sensitive to external noises as well as
        some pipeline fluid (multi-phase, critical flow, transit phase) and
        pipeline elements (e.g. elbows, valves). This technology requires the
        management of high quantity of data, a significant tuning period, and
        many sensors connected to the pipeline. Distributed Acoustic Sensing
        (DAS) using the fiber optic cable media is currently used internally to
        detect real time intrusion.

        The pressure emission detectors may be insensitive and require accurate
        pressure measurement. This technology is difficult to apply on short
        lines, gas or multi-phase pipelines with transient phases.

        The vapor / liquid sensing cable technology needs to be physically close
        to the pipe to become wet in case of leakage. These sensitive cables
        should be replaced or cleaned after a leak. This technology is not
        easily suitable for long-distance applications. Their retrievable
        capability with the implementation of pulling chamber every few hundred
        meters needs to be carefully considered. In addition, this technology is
        highly sensitive. This implies that false alarms may occur in case of
        former contamination (presence of hydrocarbon). This technology is also
        sensitive to soil disruption and fluid properties, and is affected by
        ageing (sensitive polymer alteration). However, this technology is
        suitable for short distances and for detecting some leaks when there is
        no temperature variation between the fluid and the soil.

        The fiber optic solution was considered in depth for leak detection
        through several evaluation programs and, in particular, two PIT (Projet
        d’Innovation Technologique) projects. These two PIT projects were
        performed between 2015 and 2019 and presented at the following ADIPEC
        sessions:

        - (Baque, 2017) 2017 Abu Dhabi International Petroleum Exhibition &
        Conference SPE-188669-MS Early Gas Detection

        - (Baque, 2020) 2020 Abu Dhabi International Petroleum Exhibition &
        Conference SPE-203293-MS Fiber Optic Liquid Leakage Detection

        Note: Some parts of this manuscript are extracted from these two SPE
        documents, referred to as (Baque, 2017) and (Baque, 2020). Other
        evaluation and development programs not presented previously are also
        presented in this manuscript.
  - source_sentence: >-
      What occupational health hazards are anticipated with large construction
      projects during the energy transition?
    sentences:
      - >-
        institutionalized political structures to realize particular social
        objectives or serve particular constituencies.

        **Non-hazardous waste:** Waste, other than Hazardous waste, resulting
        from company operations, including process and oil field wastes disposed
        of, on site or off site, as well as office, commercial or packaging
        related wastes [ENV-7].

        **Normalization:** The ratio of a quantitative indicator output (e.g.
        emissions) to an aggregated measure of another output (e.g. oil and gas
        production or refinery throughput) [Module 1 _Reporting process_].

        **Occupational illness:** An Employee or Contractor health condition or
        disorder requiring medical treatment due to a workplace Incident,
        typically involving multiple exposures to hazardous substances or to
        physical agents. Examples include noise-induced hearing loss,
        respiratory disease, and contact dermatitis [SHS-3].

        **Occupational injury:** Harm of an Employee or Contractor resulting
        from a single instantaneous workplace incident that results in medical
        treatment (beyond simple first aid), work restrictions, days away from
        work (lost time) or a Fatality [SHS-3].

        **Operating area:** An area where business activities take place with
        potential to interact with the adjacent environment [ENV-4].

        **Operation:** A generic term used to denote any kind of business
        activity involving product-related processes, such as production,
        manufacturing and transport. Note: the term oil and gas operations used
        in the Guidance is intended to be broad and inclusive of other types of
        product, such as chemicals.

      - |-
        The broader work of the Directorate is carried out  
        by its four standing committees.  
        Safety Committee: This committee’s objective is the  
        core of the Directorate: to eliminate fatalities and  
        catastrophic process safety events in our industry.
        In pursuit of this aim, the committee develops
        and promotes the adoption of recommended
        practices – a task it performs both on its own and
        with partners and trade associations. The resulting
        publications lay a foundation for both safety
        and efficiency, and develop the motivated and
        empowered workforce needed to provide the world
        with clean, affordable energy.  
        Highlights of the committee’s 2023 activities
        include participation in events, the creation
        of expert groups, issuing of publications, and
        engagement in data reviews.  
        - Events: In 2023, in addition to the regular
        committee and subcommittee meetings,
        the committee held diving workshops in
        Rio De Janeiro and Paris. These meetings
        championed local stakeholders and sought
        to improve local diving performance. It also
        hosted two Aviation Procurement Managers
        Forums – one in London, the other in Houston  
        – to address industry contracting behaviours  
        and their impact on contractor resilience and
        safety. In addition, the committee conducted a
        Process Safety Workshop at the IOGP Summit
        in Indonesia. Finally, at this year’s Offshore
        Europe conference, IOGP Safety Director Steve
        Norton moderated a panel on learning from, and
        sharing, safety lessons.  
        - Expert Groups: The committee established
        three expert groups in 2023: two to revise
        existing Reports (365 on land transportation
        safety and 365-12 on in-vehicle monitoring), and
        one to consider adoption and implementation of
        recommended safety practices.  
        - Publications: The committee issued ten  
        guidance documents in 2023, covering critical
        areas such as diving, aviation, and process
        safety; see page 34 for a full list of publications.  
        - Data reviews: The committee published its
        annual compilations of safety performance data,
        covering occupational, process, aviation, and
        land transportation safety. IOGP has collected
        safety performance data from its Members
        since 1985 and our database is the largest in
        the upstream industry, providing companies
        with valuable information for benchmarking and
        performance improvement.  
      - |-
        endotoxins and fungi. The authors recommended that
        ongoing real–time measurement of these exposures be
        carried out to identify boundary conditions, phases, and
        settings with the highest pollutant release.  
        12 — Health in the energy transition  
        Good quality studies are needed on the health effects of
        renewable energy sources. Such studies should include
        populations and patients with well-characterized exposure,
        high-quality information on outcome, and assessment of
        potential confounders. While retrospective (e.g., case-control)
        studies might produce useful results, prospective longitudinal
        studies would provide the strongest evidence.  
        Several LCA studies have been conducted for the different
        technologies. These LCAs reported relative low levels of
        emissions during the lifecycle of renewable sources of
        energy. Few of these studies included a comparison with
        fossil-based technologies. When more life cycle studies
        become available it would be important to include them
        in the literature review. While looking at the life cycle of a
        certain technology, other health effects in the value chain
        could potentially be identified (reference: UNECE on Carbon
        Neutrality in the UNECE Region: Integrated Life-cycle
        Assessment of Electricity Sources).  
        As of December 2024, very few occupational and public
        health hazards specific to energy transition technologies
        have been identified. The energy transition is in an early stage
        and will evolve quickly, and additional hazards unique to
        energy transition activities may emerge; the specifics of this
        are, at this time, uncertain.  
        What is certain is that the energy transition will involve large
        construction projects whose risks (and effective methods to
        manage those risks) are well-known and understood. Existing
        occupational health approaches will be able to manage
        these risks effectively, provided the correct assessments are
        conducted properly.
datasets:
  - Sampath1987/offshore_energy_v1
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy
model-index:
  - name: SentenceTransformer based on Alibaba-NLP/gte-multilingual-base
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: ai job validation
          type: ai-job-validation
        metrics:
          - type: cosine_accuracy
            value: 0.9713607430458069
            name: Cosine Accuracy

SentenceTransformer based on Alibaba-NLP/gte-multilingual-base

This is a sentence-transformers model fine-tuned from Alibaba-NLP/gte-multilingual-base on the offshore_energy_v1 dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

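Model Description

  • Model Type: Sentence Transformer
  • Base model: Alibaba-NLP/gte-multilingual-base
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: offshore_energy_v1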

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'NewModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
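
The three modules listed here amount to: run the transformer, keep the hidden state of the first ([CLS]) token, and scale it to unit length. A minimal pure-Python sketch of the last two steps, assuming toy 4-dimensional token vectors in place of real 768-dimensional transformer output; `cls_pool` and `l2_normalize` are illustrative names, not library APIs:

```python
import math

def cls_pool(token_embeddings):
    """Keep only the first token's vector of each sequence (CLS pooling)."""
    return [sequence[0] for sequence in token_embeddings]

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit L2 norm, which is what a Normalize() step does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]

# Toy stand-in for transformer output: 2 sequences, 3 tokens each, 4 dimensions.
token_embeddings = [
    [[3.0, 4.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], [0.0, 2.0, 0.0, 0.0]],
    [[0.0, 0.0, 5.0, 12.0], [2.0, 0.0, 0.0, 0.0], [1.0, 0.0, 1.0, 0.0]],
]

sentence_embeddings = [l2_normalize(v) for v in cls_pool(token_embeddings)]
for embedding in sentence_embeddings:
    print(embedding)
```

Because every output vector has unit length, the cosine similarity between two embeddings equals their dot product.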

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Sampath1987/EnergyEmbed-v2-e2")
# Run inference
sentences = [
    'What occupational health hazards are anticipated with large construction projects during the energy transition?',
    'endotoxins and fungi. The authors recommended that\nongoing real–time measurement of these exposures be\ncarried out to identify boundary conditions, phases, and\nsettings with the highest pollutant release.  \n12 — Health in the energy transition  \nGood quality studies are needed on the health effects of\nrenewable energy sources. Such studies should include\npopulations and patients with well-characterized exposure,\nhigh-quality information on outcome, and assessment of\npotential confounders. While retrospective (e.g., case-control)\nstudies might produce useful results, prospective longitudinal\nstudies would provide the strongest evidence.  \nSeveral LCA studies have been conducted for the different\ntechnologies. These LCAs reported relative low levels of\nemissions during the lifecycle of renewable sources of\nenergy. Few of these studies included a comparison with\nfossil-based technologies. When more life cycle studies\nbecome available it would be important to include them\nin the literature review. While looking at the life cycle of a\ncertain technology, other health effects in the value chain\ncould potentially be identified (reference: UNECE on Carbon\nNeutrality in the UNECE Region: Integrated Life-cycle\nAssessment of Electricity Sources).  \nAs of December 2024, very few occupational and public\nhealth hazards specific to energy transition technologies\nhave been identified. The energy transition is in an early stage\nand will evolve quickly, and additional hazards unique to\nenergy transition activities may emerge; the specifics of this\nare, at this time, uncertain.  \nWhat is certain is that the energy transition will involve large\nconstruction projects whose risks (and effective methods to\nmanage those risks) are well-known and understood. Existing\noccupational health approaches will be able to manage\nthese risks effectively, provided the correct assessments are\nconducted properly.',
    'institutionalized political structures to realize particular social objectives or serve particular\nconstituencies.  \n**Non-hazardous waste:** Waste, other than Hazardous waste, resulting from company\noperations, including process and oil field wastes disposed of, on site or off site, as well as\noffice, commercial or packaging related wastes [ENV-7].  \n**Normalization:** The ratio of a quantitative indicator output (e.g. emissions) to an\naggregated measure of another output (e.g. oil and gas production or refinery throughput)  \n[Module 1 _Reporting process_ ].  \n**Occupational illness:** An Employee or Contractor health condition or disorder requiring\nmedical treatment due to a workplace Incident, typically involving multiple exposures to\nhazardous substances or to physical agents. Examples include noise-induced hearing loss,\nrespiratory disease, and contact dermatitis [SHS-3].  \n**Occupational injury:** Harm of an Employee or Contractor resulting from a single\ninstantaneous workplace incident that results in medical treatment (beyond simple first aid),\nwork restrictions, days away from work (lost time) or a Fatality [SHS-3].  \n**Operating area:** An area where business activities take place with potential to interact with\nthe adjacent environment [ENV-4].  \n**Operation:** A generic term used to denote any kind of business activity involving productrelated processes, such as production, manufacturing and transport. Note: the term oil and\ngas operations used in the Guidance is intended to be broad and inclusive of other types of\nproduct, such as chemicals.  \n**7.5**',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5659, 0.2068],
#         [0.5659, 1.0000, 0.1987],
#         [0.2068, 0.1987, 1.0000]])
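
For retrieval, the similarity scores are usually reduced to a ranking of passages for each query. A minimal sketch, assuming the embeddings are already unit-normalized and using small hand-written vectors in place of `model.encode` output; `rank_passages` is a hypothetical helper, not part of Sentence Transformers:

```python
def dot(u, v):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def rank_passages(query_embedding, passage_embeddings, top_k=2):
    """Return (passage_index, score) pairs sorted by descending similarity."""
    scores = [(i, dot(query_embedding, p)) for i, p in enumerate(passage_embeddings)]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]

# Toy unit vectors standing in for encoded text.
query = [1.0, 0.0, 0.0]
passages = [
    [0.0, 1.0, 0.0],  # unrelated to the query
    [0.8, 0.6, 0.0],  # closest to the query
    [0.6, 0.0, 0.8],  # partially related
]

ranking = rank_passages(query, passages)
print(ranking)
```

In practice `query` and `passages` would come from `model.encode(...)`, and `model.similarity` computes the same scores in batch.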

Evaluation

Metrics

Triplet

| Metric | Value |
|:--|:--|
| cosine_accuracy | 0.9714 |
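
Cosine accuracy on triplets is the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. A minimal sketch on toy 2-dimensional vectors; the function names are illustrative and this is not the library's evaluator:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_accuracy(triplets):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(
        1 for anchor, positive, negative in triplets
        if cosine(anchor, positive) > cosine(anchor, negative)
    )
    return correct / len(triplets)

triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),   # positive is closer
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),  # negative is closer: counted as wrong
    ([1.0, 0.0], [1.0, 0.2], [-1.0, 0.0]),  # positive is closer
]

accuracy = cosine_accuracy(triplets)
print(accuracy)  # 0.75
```

Read this way, the reported 0.9714 means that roughly 97% of the evaluation triplets were ordered correctly.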

Training Details

Training Dataset

offshore_energy_v1

  • Dataset: offshore_energy_v1 at 4e9339c
  • Size: 53,913 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    | | anchor | positive | negative |
    |:--|:--|:--|:--|
    | type | string | string | string |
    | details | min: 14 tokens; mean: 23.77 tokens; max: 42 tokens | min: 36 tokens; mean: 392.08 tokens; max: 961 tokens | min: 45 tokens; mean: 389.63 tokens; max: 1109 tokens |
  • Samples:
    anchor positive negative
    What statistical methods were employed to enhance the accuracy of comparisons in the field testing of shaped cutters? As shaped polycrystalline diamond compact (PDC) cutter geometries become more prevalent across the industry, this paper statistically reviews field testing of novel shaped PDC cutters in a variety of challenging applications. Firstly, the paper identifies the improvement in efficiency when compared with conventional PDC cutter geometries. Secondly, it confirms the reliability and robustness of the aforementioned shaped cutter geometries.
    After several years of field testing shaped PDC cutter geometries, the question of how they hold up against conventional cylinder-shaped cutters remains unanswered. This study looks at drill bits that have the same overall design; however, each bit has different shape configurations that are deployed in a range of hole sizes and drilling applications. Data was collected from more than 100 runs and included advanced dull evaluation techniques, data mining, and comparative analyses. During data collation and interpretation, several statistical methods we...
    This paper details the improvements to drilling performance and torsional response of fixed cutter bits when changing from a conventional 19-mm cutter diameter configuration to 25-mm cutter diameters for similar blade counts in two different hole sizes. Key performance metrics include rate of penetration (ROP), rerun-ability, torsional response, and ability to maintain tool-face control during directional drilling.
    A high-performance drilling application was selected with several existing offset wells using a 12¼-in., five-bladed, 19-mm (519) drill bit design, and a concept bit developed using 25-mm diameter cutters while maintaining comparable ancillary features. This was tested in the same field on both vertical and S-shape sections using the same bent-housing motor assembly and drilling performance compared to the existing offsets. A 17½-in. hole size application that experiences high drillstring vibration was also selected, and a 25-mm cutter diameter drill bit was designed with co...
    What are vapor recovery units (VRU) used for in oil and gas operations? ## 4. Vapour recovery units
    Vapor recovery units (VRU) are used to prevent emissions by capturing the streams and
    re-routing them either back to the process or for use as fuel. More details on the
    components, installation, and operation of VRU are captured in the following sections.
    ##### 3.1.2 Reduction and recovery of glycol dehydration flash gas
    Gas from the flash vessel will consist primarily of hydrocarbons and is continuously
    produced. If installed, a flash vessel will typically remove 90% or more of the entrained
    hydrocarbon gas and dissolved gases in the glycol leaving the contactor column.
    Glycol flash vessels typically operate at 3-7 barg [18], meaning there is generally a sufficient
    pressure drop for the flash gas to commonly be routed to flare or a low-pressure fuel gas
    system. If the composition of the flash gas prevents this, or there is no fuel gas system,
    then a Vapour Recovery Unit (VRU) may be needed for recovery into other process units.
    Minimization of the flash gas itself is also possible by optimizing the glycol flowrate,
    such as by adjusting the dry gas water temperature specification based on accurate site
    conditions because the water dew point needed could vary seasonally or from site to
    site by using more accurate ambient temperatur...
    What challenges are posed by fractures and faults in the completion of MRC wells? The Maximum Reservoir Contact (MRC) concept was developed to improve well productivity and sustainability by maximizing the contact area with target reservoirs. MRC is a proven technology for the development of tight/non-economical reservoirs. Completion design for MRC wells plays a vital role in enhancing well deliverability, monitoring and accessibility.
    MRC technology was put into application to appraise a tight and thin heterogeneous carbonate reservoir in a giant offshore field in Abu Dhabi. Different completion scenarios were simulated to select the best suited completion to achieve enhanced well deliverability, monitoring and accessibility.
    Heavy casing design with liner and tie-back system was finalized to maximize accessibility and achieve proper isolation behind casing. A special pre-perforated liner was also designed to eliminate the pressure drop across the wellbore. The MRC drain was divided mainly into two sections, blank pipe and pre-perforated liner equipped with swell ...
    The Clair field is the largest discovered oilfield on the UK continental shelf (UKCS) but has high reservoir uncertainty associated with a complex natural fracture network. The field area covers over 200 sq km with an estimated STOIIP of 7 billion barrels. The scale and complexity of the reservoir has led to a phased multi-platform development.
    Phase 1 started production in 2005 with 20 wells drilled prior to an extended drill break. Five new wells (A21 to A25) were drilled and brought online during 2016/17 which increased platform production by c.70%. The new wells incorporated historic lessons to mitigate the risk of wellbore instability in the overburden and be robust to the dynamic uncertainties of the fractured reservoir. Many of the well outcomes and risk events were predicted and mitigated effectively, however the new wells still provided some surprises.
    This paper presents a summary of the lessons from the historic Clair development wells which underpinned the recent drilling c...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
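
MultipleNegativesRankingLoss treats every other positive and negative in the batch as a distractor for each anchor: cosine similarities are multiplied by `scale` (20.0 here) and passed to a cross-entropy loss whose target is the anchor's own positive. A pure-Python sketch of that computation on toy vectors; it illustrates the idea and is not the library implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy over in-batch candidates (all positives + all negatives)."""
    candidates = positives + negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp; the correct candidate is positives[i].
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

# Toy batch of two triplets.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.2, 0.8]]
negatives = [[0.5, 0.5], [-1.0, 0.0]]

loss = mnr_loss(anchors, positives, negatives)
print(loss)
```

A larger `scale` sharpens the softmax, so a distractor that scores close to the true positive is penalized more heavily.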
    

Evaluation Dataset

offshore_energy_v1

  • Dataset: offshore_energy_v1 at 4e9339c
  • Size: 6,739 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    | | anchor | positive | negative |
    |:--|:--|:--|:--|
    | type | string | string | string |
    | details | min: 11 tokens; mean: 23.56 tokens; max: 52 tokens | min: 55 tokens; mean: 386.01 tokens; max: 1082 tokens | min: 45 tokens; mean: 382.6 tokens; max: 1175 tokens |
  • Samples:
    anchor positive negative
    What is the importance of quantifying carbon emissions during cementing operations in decarbonization? An important step in decarbonization is using an end-to-end approach to quantify carbon emissions during cementing operations. By careful analysis of the entire cementing operations process, it is then possible to measure and compare carbon emissions at various stages of the operation. Understanding and isolating the main drivers of the carbon emissions footprint enables making better choices and developing best alternatives with lower environmental impact.
    The methodology considers the lifecycle assessment of cement from quarry extraction to well abandonment, and includes steps such as manufacturing of raw materials, transportation and logistics, and operations in the field. For these stages, careful quantification of emissions is performed based on the manufacturer's carbon emissions of cementing products, transportation (distance and means) to the bulk plant and rig site, and equipment-related emissions such as blending and pumping units. In some cases, when assessing the footprint ...
    Objectives/Scope
    There are many different views on the Energy Transition. What is agreed is that to achieve current climate change targets, the journey to deep decarbonisation must start now. Scope 3 emissions are clearly the major contributor to total emissions and must be actively reduced. However, if Oil and Gas extraction is to be continued, then operators must understand, measure, and reduce Scope 1 and 2 emissions. This paper examines the constituent parts of typical Scope 1 emissions for O&G assets and discusses a credible pathway and initial steps towards decarbonisation of operations.
    Methods, Procedures, Process
    Emissions from typical assets are investigated: data is examined to determine the overall and individual contributions of Scope 1 emissions. A three tiered approach to emissions savings is presented:

    Reduce overall energy usage

    Seek to Remove environmental losses

    Replace energy supply with low carbon alternatives
    A simple method, used to assess carbon emissions,...
    What factors must engineers consider during the drilling design phase? The drilling of oil and gas wells involves several stages including the exploration phase, drilling design, and perforation techniques. In the exploration phase, geologists use seismic surveys to identify potential drilling locations. During the drilling design phase, engineers must consider factors such as wellbore stability, fluid mechanics, and formation pressures. Once the well is drilled, perforation techniques are applied to enhance the flow of hydrocarbons into the wellbore. The effectiveness of these techniques can significantly impact production rates and overall project success. The extraction of crude oil and natural gas is typically carried out through drilling. Drilling uses different techniques to reach the petroleum reservoirs located deep underground. One key method is rotary drilling, where a drill bit is rotated while cutting through the earth's layers to create a wellbore. Rotary drilling is favored for its efficiency in penetrating hard rock layers. Another method is directional drilling, which allows operators to drill at various angles to reach reservoirs that are not directly beneath the drilling platform. This technique increases the area covered by the well and can optimize production. In addition, hydraulic fracturing enhances recovery rates by injecting fluids under high pressure to create fractures in the rock, increasing the permeability and allowing oil and gas to flow more freely. Lastly, the safety and environmental impacts of drilling techniques are a growing concern, and advancements are continually being sought to mitigate these effect...
    How does the 'Dissolved pore network' concept enhance matrix permeability in the modeling of carbonate oil reservoirs? In this paper, we present a case study of using dual porosity dual permeability (DPDP) simulation for an offshore Abu Dhabi carbonate oil reservoir exhibiting complex flow behavior through matrix, fracture system and conductive faults. The main objective of the study is to present and explain the reservoir flow behaviors by constructing and using advanced reservoir geologic and simulation models. The results of the study will be utilized as part of the inputs for full field development plan.
    Initially, an extensive work on the faults and fractures characterization was conducted to properly integrate this information into a dynamic model using DPDP modeling approach. However, the poor response of some wells or field sectors indicated the insufficiency of this concept to capture the full complexity of the reservoir system. Consequently, a new geological concept was proposed to represent the effect of enhanced matrix permeability related to facies dissolution process in the reservoir mode...
    Integration of pressure-derived permeability thickness with other geological data plays a crucial role in estimating the apparent reservoir permeability, which is a key reservoir property required for reliable reservoir characterization as it governs fluid flow and greatly impacts decisions related to production, field development, and reservoir management. The geological model provides a representation of the subsurface reservoir, capturing the spatial distribution of lithology, porosity, permeability, and other geological properties. Analysis of pressure data provides valuable information on well condition, reservoir extent, and dynamic reservoir parameters. Integrating such data with the geological model is an enabler to better quantify and manage the uncertainty in the spatial 3D distribution of permeability away from well control.
    This work proposes a methodology to build high-resolution geological models based on the available dynamic data, seismic data, and geologic interpretati...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 2
  • warmup_ratio: 0.1
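
With `warmup_ratio: 0.1` and a linear scheduler, the learning rate climbs from 0 to 2e-05 over the first 10% of optimizer steps and then decays linearly back to 0. A sketch of that schedule, assuming a single device (16 samples per step, no gradient accumulation) and the 53,913-sample training set; the step counts are derived under those assumptions, not reported by the author:

```python
import math

TRAIN_SAMPLES = 53_913
BATCH_SIZE = 16
EPOCHS = 2
PEAK_LR = 2e-05
WARMUP_RATIO = 0.1

# The last partial batch is kept, so round up.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

def learning_rate(step):
    """Linear warmup to PEAK_LR, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(steps_per_epoch, total_steps, warmup_steps)
for step in (0, warmup_steps, 3000, total_steps):
    print(step, learning_rate(step))
```

Under these assumptions an epoch is 3,370 steps, which agrees with the training logs, where step 1000 corresponds to epoch 0.2967.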

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

| Epoch | Step | Training Loss | Validation Loss | ai-job-validation_cosine_accuracy |
|:--|:--|:--|:--|:--|
| 0.2967 | 1000 | - | 0.1417 | 0.9610 |
| 0.5935 | 2000 | - | 0.1199 | 0.9682 |
| 0.8902 | 3000 | - | 0.1082 | 0.9717 |
| 1.1869 | 4000 | - | 0.1102 | 0.9672 |
| 1.4837 | 5000 | 0.1614 | 0.1091 | 0.9679 |
| 1.7804 | 6000 | - | 0.1037 | 0.9714 |
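
When choosing a checkpoint from these logs, it helps to compare validation loss and accuracy side by side. A small sketch over the logged rows, with values copied from the logs; the tuple layout is only for illustration:

```python
# (epoch, step, validation_loss, cosine_accuracy) for each logged evaluation.
logs = [
    (0.2967, 1000, 0.1417, 0.9610),
    (0.5935, 2000, 0.1199, 0.9682),
    (0.8902, 3000, 0.1082, 0.9717),
    (1.1869, 4000, 0.1102, 0.9672),
    (1.4837, 5000, 0.1091, 0.9679),
    (1.7804, 6000, 0.1037, 0.9714),
]

best_by_loss = min(logs, key=lambda row: row[2])
best_by_accuracy = max(logs, key=lambda row: row[3])

print("lowest validation loss:", best_by_loss)
print("highest cosine accuracy:", best_by_accuracy)
```

The two criteria disagree slightly: step 6000 has the lowest validation loss (0.1037), while step 3000 has the highest accuracy (0.9717, against 0.9714 at step 6000). The reported cosine accuracy of 0.9714 matches the step 6000 row.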

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 5.1.0
  • Transformers: 4.53.3
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.9.0
  • Datasets: 4.0.0
  • Tokenizers: 0.21.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}