mpnet_step1 / README.md
metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:23175
  - loss:TripletLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
  - source_sentence: >-
      The First Trust Nasdaq Bank ETF (FTXO) seeks to replicate the performance
      of the Nasdaq US Smart Banks TM Index by investing at least 90% of its
      assets in the index's securities. This fund provides exposure to U.S.
      banking companies, selecting the most liquid stocks and ranking/weighting
      them based on factors including trailing volatility, value (cash flow to
      price), and growth (price returns). The index typically holds around 30
      liquid U.S. banking companies across retail banking, loans, and financial
      services, with an 8% cap on any single holding. The fund is
      non-diversified, and the index undergoes annual reconstitution and
      quarterly rebalancing.
    sentences:
      - >-
        The iShares Evolved U.S. Media and Entertainment ETF seeks to invest in
        U.S. listed common stocks of large-, mid-, and small-capitalization
        companies within the media and entertainment sector. Following an
        "Evolved" approach, the fund selects companies belonging to the Media
        and Entertainment Evolved Sector based on economic characteristics
        historically correlated with traditional sector definitions. Under
        normal circumstances, it allocates at least 80% of its net assets to
        these stocks, and the fund is non-diversified.
      - >-
        The Direxion Daily Healthcare Bull 3X Shares (CURE) is an ETF that seeks
        daily investment results, before fees and expenses, of 300% (3X) of the
        daily performance of the Health Care Select Sector Index. It invests at
        least 80% of its net assets in financial instruments designed to provide
        this 3X daily leveraged exposure. The underlying index tracks US listed
        healthcare companies, including pharmaceuticals, health care equipment
        and supplies, providers and services, biotechnology, life sciences
        tools, and health care technology, covering major large-cap names. CURE
        is non-diversified and intended strictly as a short-term tactical
        instrument, as it delivers its stated 3X exposure only for a single day,
        and returns over longer periods can significantly differ from three
        times the index's performance.
      - >-
        The Xtrackers MSCI Emerging Markets Climate Selection ETF seeks to track
        an emerging markets index focused on companies meeting specific climate
        criteria. Derived from the MSCI ACWI Select Climate 500 methodology, the
        underlying index selects eligible emerging market stocks using an
        optimization process designed to reduce greenhouse gas emission
        intensity (targeting 10% revenue-related and 7% financing-related
        reductions) and increase exposure to companies with SBTi-approved
        targets. The strategy also excludes controversial companies and
        evaluates companies based on broader ESG considerations. The fund is
        non-diversified and invests at least 80% of its assets in the component
        securities of this climate-focused emerging markets index.
  - source_sentence: >-
      The iShares S&P Small-Cap 600 Value ETF (IJS) seeks to track the
      investment results of the S&P SmallCap 600 Value Index, which consists of
      U.S. small-capitalization equities exhibiting value characteristics. This
      index selects value stocks from the S&P SmallCap 600 using factors such as
      book value to price, earnings to price, and sales to price ratios. The
      fund generally invests at least 80% of its assets in the component
      securities of its underlying index and may invest up to 20% in certain
      futures, options, swap contracts, cash, and cash equivalents. The
      underlying index undergoes annual rebalancing in December.
    sentences:
      - >-
        The Global X S&P 500 Risk Managed Income ETF seeks to track the Cboe S&P
        500 Risk Managed Income Index by investing at least 80% of its assets in
        index securities. The index's strategy involves holding the underlying
        stocks of the S&P 500 Index while applying an options collar,
        specifically selling at-the-money covered call options and buying
        monthly 5% out-of-the-money put options corresponding to the portfolio's
        value. This approach aims to generate income, ideally resulting in a net
        credit from the options premiums, and provide risk management, though
        selling at-the-money calls inherently caps the fund's potential for
        upside participation.
      - >-
        The Amplify International Enhanced Dividend Income ETF (IDVO), an
        actively managed fund recently updated to include CWP in its name, seeks
        to provide current income primarily and capital appreciation
        secondarily. The fund invests at least 80% of its assets in
        dividend-paying U.S. exchange-traded American depositary receipt (ADR)
        securities representing companies located outside the U.S., focusing on
        high-quality, large-cap constituents from the MSCI ACWI ex USA Index to
        offer international equity exposure in a domestic wrapper. It enhances
        income generation by opportunistically utilizing a tactical strategy of
        writing (selling) short-term, U.S. exchange-traded covered call option
        contracts on some or all of its individual holdings, targeting income
        from both dividends and option premiums. While aiming for country and
        sector diversification by selecting approximately 30-50 stocks, the fund
        is classified as non-diversified.
      - >-
        The Strive Emerging Markets Ex-China ETF seeks to track the total return
        performance of the Bloomberg Emerging Markets ex China Large & Mid Cap
        Index. This index comprises large and mid-capitalization equity
        securities from 24 emerging market economies, specifically excluding
        China. The index is market cap-weighted, includes common stocks and real
        estate investment trusts, and is rebalanced quarterly and reconstituted
        semi-annually. Under normal circumstances, the fund invests at least 80%
        of its assets in these emerging market securities, which may include
        depositary receipts representing securities included in the index.
  - source_sentence: >-
      The Fidelity MSCI Health Care Index ETF (FHLC) seeks to track the
      performance of the MSCI USA IMI Health Care 25/50 Index, which represents
      the broad U.S. health care sector. The ETF invests at least 80% of its
      assets in securities included in this market-cap-weighted index, which
      captures large, mid, and small-cap companies across over 10 subsectors.
      Employing a representative sampling strategy, the fund aims to correspond
      to the index's performance. The index incorporates a 25/50 capping
      methodology, is rebalanced quarterly, and its broad reach offers
      diversification across cap sizes and subsectors, potentially reducing
      concentration in dominant large pharma names and increasing exposure to
      areas like drug retailers and insurance. The fund is classified as
      non-diversified.
    sentences:
      - >-
        The SPDR S&P Health Care Equipment ETF (XHE) tracks the equal-weighted
        S&P Health Care Equipment Select Industry Index, which is derived from
        the U.S. total market and provides exposure to U.S. health care
        equipment and supplies companies. Employing a sampling strategy, the
        fund invests at least 80% of its assets in the index's securities, which
        are rebalanced quarterly. While encompassing companies of all cap sizes,
        the equal-weight methodology gives XHE a significant small-cap tilt,
        offering focused access to this narrow segment as an alternative for
        investors seeking to avoid the concentration found in broader,
        market-cap-weighted healthcare funds dominated by large pharmaceuticals
        or service providers.
      - >-
        The Global X Silver Miners ETF (SIL) seeks to provide investment results
        that correspond generally to the price and yield performance of the
        Solactive Global Silver Miners Total Return Index. This index is
        designed to measure the broad-based equity market performance of global
        companies primarily involved in the silver mining industry, including
        related activities like exploration and refining. The fund invests at
        least 80% of its total assets in the securities of this underlying index
        and related American and Global Depositary Receipts. The index is
        market-cap-weighted and typically comprises 20-40 stocks, while the fund
        itself is considered non-diversified.
      - >-
        The Invesco S&P 500 Equal Weight Energy ETF (RSPG) is a large-cap sector
        fund tracking an equal-weighted index comprising U.S. energy companies
        within the S&P 500 Index, classified according to the Global Industry
        Classification Standard (GICS). The ETF aims to invest at least 90% of
        its total assets in securities from this underlying index, which applies
        an equal-weighting methodology and rebalances quarterly. The index also
        includes a rule to ensure a minimum of 22 constituents, incorporating
        the largest energy companies from the S&P MidCap 400 Index if necessary
        to meet this count.
  - source_sentence: >-
      The VictoryShares Top Veteran Employers ETF (VTRN) was designed to track
      the Veterans Select Index, focusing on US-listed companies of any market
      capitalization that demonstrated support for US military veterans, service
      members, and their families primarily through employment opportunities and
      related policies. These companies were identified based on various sources
      like rankings and surveys and were typically weighted equally in the
      index. However, this fund is liquidating, and its last day of trading was
      October 11, 2021.
    sentences:
      - >-
        The Invesco S&P 500 Equal Weight Industrials ETF (RSPN) tracks an
        equal-weighted index of U.S. industrial stocks drawn from the S&P 500
        Index, specifically focusing on companies classified within the
        industrials sector according to the Global Industry Classification
        Standard (GICS). The fund generally invests at least 90% of its assets
        in these securities. This equal-weighting scheme offers a
        non-traditional approach compared to market-cap weighting, reducing the
        dominance of large-cap industrial conglomerates and lowering the
        portfolio's weighted average market capitalization. The underlying index
        is rebalanced on a quarterly basis.
      - >-
        The SP Funds Dow Jones Global Sukuk ETF (SPSK) is a passively managed
        fund designed to track the performance, before fees and expenses, of the
        Dow Jones Sukuk Total Return (ex-Reinvestment) Index. This index focuses
        on U.S. dollar-denominated, investment-grade sukuk, which are financial
        certificates similar to bonds, issued in global markets and structured
        to comply with Islamic religious law (Sharia) and its investment
        principles. Sharia compliance involves screening securities to exclude
        businesses such as tobacco, pornography, gambling, and interest-based
        finance, and issuers may include international financial institutions
        and foreign governments or agencies, including from emerging markets.
        Under normal circumstances, the fund attempts to invest substantially
        all (at least 80%) of its assets in the index's component securities,
        which are reconstituted and rebalanced monthly. The ETF is considered
        non-diversified.
      - >-
        The ALUM ETF, part of the USCF ETF Trust, is an actively managed fund
        utilizing a proprietary methodology to seek exposure to the price of
        aluminum through aluminum-based derivative investments. It primarily
        invests in aluminum futures but may also use cash-settled options,
        forward contracts, options on futures, and other options traded on US
        and non-US exchanges. The fund operates through a wholly owned Cayman
        Islands subsidiary to avoid issuing K-1 forms and may hold cash, cash
        equivalents, or investment grade fixed-income securities as collateral.
        This non-diversified fund is currently being delisted, with its last day
        of trading on an exchange scheduled for October 11, 2024.
  - source_sentence: >-
      The Sprott Gold Miners ETF (SGDM) seeks to track the performance of the
      Solactive Gold Miners Custom Factors Total Return Index. This index
      focuses on gold mining companies based in the U.S. and Canada whose shares
      trade on the Toronto Stock Exchange, New York Stock Exchange, or NASDAQ.
      The index employs a weighting methodology that begins with market
      capitalization and then adjusts based on three fundamental factors: higher
      revenue growth, lower debt-to-equity, and higher free cash flow yield. The
      fund is non-diversified and normally invests at least 90% of its net
      assets in securities included in this index.
    sentences:
      - >-
        The Sprott Gold Miners ETF (SGDM) seeks to track the performance of the
        Solactive Gold Miners Custom Factors Total Return Index. This index
        focuses on gold mining companies based in the U.S. and Canada whose
        shares trade on the Toronto Stock Exchange, New York Stock Exchange, or
        NASDAQ. The index employs a weighting methodology that begins with
        market capitalization and then adjusts based on three fundamental
        factors: higher revenue growth, lower debt-to-equity, and higher free
        cash flow yield. The fund is non-diversified and normally invests at
        least 90% of its net assets in securities included in this index.
      - >-
        The VanEck Biotech ETF (BBH) seeks to replicate the performance of the
        MVIS® US Listed Biotech 25 Index, which provides exposure to
        approximately 25 of the largest or leading U.S.-listed companies in the
        biotechnology industry. The fund normally invests at least 80% of its
        assets in securities comprising this market-cap-weighted index. The
        underlying index includes common stocks and depositary receipts of firms
        involved in the research, development, production, marketing, and sale
        of drugs based on genetic analysis and diagnostic equipment. While
        focusing on U.S.-listed companies, it may include foreign firms listed
        domestically, and medium-capitalization companies can be included.
        Reflecting the index's concentration, the fund is non-diversified and
        may have a top-heavy portfolio. The index is reviewed semi-annually.
      - >-
        The KraneShares Global Carbon Offset Strategy ETF (KSET) was the first
        US-listed ETF providing exposure to the global voluntary carbon market.
        It achieved this by investing primarily in liquid carbon offset credit
        futures, including CME-traded Global Emissions Offsets (GEOs) and
        Nature-Based Global Emission Offsets (N-GEOs), which are designed to
        help businesses meet greenhouse gas reduction goals. Tracking an index
        that weighted eligible futures based on liquidity, the fund sought
        exposure to the same carbon offset credit futures, typically those
        maturing within two years. The ETF was considered non-diversified and
        utilized a Cayman Island subsidiary. However, the fund was delisted,
        with its last day of trading on an exchange being March 14, 2024.
datasets:
  - hobbang/stage1-triplet-dataset
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2 on the stage1-triplet-dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Training Dataset: stage1-triplet-dataset

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
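
The module list above shows mean pooling over token embeddings (`pooling_mode_mean_tokens: True`) followed by `Normalize()`. As a rough illustration of what those two steps compute, here is a minimal pure-Python sketch on toy vectors. The function names `mean_pool` and `l2_normalize` are illustrative and are not library APIs; the real modules work on batched tensors.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    count = max(len(kept), 1)  # guard against an all-padding input
    return [sum(vec[i] for vec in kept) / count for i in range(dim)]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length, as a Normalize() step would."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


# Toy input: 3 token vectors of dimension 2; the last token is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)      # [2.0, 3.0]
sentence_vec = l2_normalize(pooled)   # unit-length sentence embedding
print(pooled, sentence_vec)
```

Because the output vectors have unit length, the dot product of two embeddings equals their cosine similarity.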

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'The Sprott Gold Miners ETF (SGDM) seeks to track the performance of the Solactive Gold Miners Custom Factors Total Return Index. This index focuses on gold mining companies based in the U.S. and Canada whose shares trade on the Toronto Stock Exchange, New York Stock Exchange, or NASDAQ. The index employs a weighting methodology that begins with market capitalization and then adjusts based on three fundamental factors: higher revenue growth, lower debt-to-equity, and higher free cash flow yield. The fund is non-diversified and normally invests at least 90% of its net assets in securities included in this index.',
    'The KraneShares Global Carbon Offset Strategy ETF (KSET) was the first US-listed ETF providing exposure to the global voluntary carbon market. It achieved this by investing primarily in liquid carbon offset credit futures, including CME-traded Global Emissions Offsets (GEOs) and Nature-Based Global Emission Offsets (N-GEOs), which are designed to help businesses meet greenhouse gas reduction goals. Tracking an index that weighted eligible futures based on liquidity, the fund sought exposure to the same carbon offset credit futures, typically those maturing within two years. The ETF was considered non-diversified and utilized a Cayman Island subsidiary. However, the fund was delisted, with its last day of trading on an exchange being March 14, 2024.',
    "The VanEck Biotech ETF (BBH) seeks to replicate the performance of the MVIS® US Listed Biotech 25 Index, which provides exposure to approximately 25 of the largest or leading U.S.-listed companies in the biotechnology industry. The fund normally invests at least 80% of its assets in securities comprising this market-cap-weighted index. The underlying index includes common stocks and depositary receipts of firms involved in the research, development, production, marketing, and sale of drugs based on genetic analysis and diagnostic equipment. While focusing on U.S.-listed companies, it may include foreign firms listed domestically, and medium-capitalization companies can be included. Reflecting the index's concentration, the fund is non-diversified and may have a top-heavy portfolio. The index is reviewed semi-annually.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
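
The similarity matrix from the snippet above can be turned into a simple nearest-neighbor lookup. The sketch below is one possible way to do that, assuming the matrix has been converted to nested lists (for example with `similarities.tolist()`). The helper `most_similar` and the toy scores are hypothetical and are not outputs of this model.

```python
def most_similar(similarities, index, top_k=1):
    """Return (other_index, score) pairs for the rows most similar to `index`."""
    scores = [
        (j, score)
        for j, score in enumerate(similarities[index])
        if j != index  # skip the trivial self-match on the diagonal
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical 3x3 matrix standing in for `similarities.tolist()`.
toy = [
    [1.00, 0.12, 0.35],
    [0.12, 1.00, 0.08],
    [0.35, 0.08, 1.00],
]
print(most_similar(toy, 0))  # [(2, 0.35)]
```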

Training Details

Training Dataset

stage1-triplet-dataset

  • Dataset: stage1-triplet-dataset at a0fb998
  • Size: 23,175 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    |         | anchor        | positive      | negative      |
    |---------|---------------|---------------|---------------|
    | type    | string        | string        | string        |
    | min     | 80 tokens     | 80 tokens     | 82 tokens     |
    | mean    | 148.35 tokens | 153.81 tokens | 150.74 tokens |
    | max     | 211 tokens    | 238 tokens    | 208 tokens    |
  • Samples:
    The three samples shown share the same anchor and positive and differ only in the negative.

    anchor (all three samples): The Invesco Financial Preferred ETF (PGF) seeks to track the ICE Exchange-Listed Fixed Rate Financial Preferred Securities Index, primarily by investing at least 90% of its total assets in the securities comprising the index. The underlying index is market capitalization weighted and designed to track the performance of exchange-listed, fixed rate, U.S. dollar denominated preferred securities, including functionally equivalent instruments, issued by U.S. financial companies. PGF provides a concentrated portfolio exclusively focused on financial-sector preferred securities and is considered non-diversified, holding both investment- and non-investment-grade securities within this focus.

    positive (all three samples): The FlexShares ESG & Climate Investment Grade Corporate Core Index Fund (FEIG) is a passively managed ETF designed to provide broad-market, core exposure to USD-denominated investment-grade corporate bonds. It seeks to track the performance of the Northern Trust ESG & Climate Investment Grade U.S. Corporate Core IndexSM, which selects bonds from a universe of USD-denominated, investment-grade corporate debt with maturities of at least one year. The index employs an optimization process to increase the aggregate ESG score and reduce aggregate climate-related risk among constituent companies, involving ranking firms on material ESG metrics, governance, and carbon risks, while excluding controversial companies and international initiative violators. Weights are also optimized to minimize systematic risk, and the index is rebalanced monthly. Under normal circumstances, the fund invests at least 80% of its assets in the index's securities.

    negative (sample 1): The Pacer Nasdaq-100 Top 50 Cash Cows Growth Leaders ETF (QQQG) seeks to track the Pacer Nasdaq 100 Top 50 Cash Cows Growth Leaders Index, which draws its universe from the Nasdaq-100 Index. Following a rules-based strategy, the fund screens these companies based on average projected free cash flows and earnings over the next two fiscal years, excluding financials, real estate, and those with negative projections. It then ranks identified stocks by their trailing twelve-month free cash flow margins and selects the top 50 names, weighted by price momentum. The portfolio is reconstituted and rebalanced quarterly. Aiming to identify quality growth leaders with strong cash flow generation, the fund seeks to invest at least 80% of assets in growth securities and is non-diversified.

    negative (sample 2): The Nuveen Global Net Zero Transition ETF (NTZG) was an actively managed fund that sought capital appreciation by investing in global equity securities. The fund focused on companies positioned to contribute to the transition to a net zero carbon economy through their current or planned efforts to reduce global greenhouse gas emissions. Utilizing bottom-up, fundamental analysis, NTZG invested in a range of companies, including climate leaders, firms with disruptive climate mitigation technologies, and high carbon emitters working towards real-world emissions decline. The fund aimed to align with the Paris Climate Agreement by seeking to lower portfolio carbon intensity annually towards a 2050 net zero goal and engaging with portfolio companies, while excluding companies involved in weapons and firearms and investing globally across market capitalizations with allocations to non-US and emerging markets. **Please note: The security has been delisted, and the last day of trading on an exc...

    negative (sample 3): The First Trust Expanded Technology ETF (XPND) is an actively managed fund seeking long-term capital appreciation by investing primarily in US stocks identified as "Expanded Technology Companies." Defined as companies whose operations are principally derived from or dependent upon technology, these include traditional information technology firms as well as tech-dependent companies in other sectors, such as communication services and consumer discretionary (like internet and direct marketing retail). The fund invests at least 80% of its net assets in common stocks of these companies. While concentrated in the information technology sector and considered non-diversified, XPND aims for expanded exposure through a portfolio of around 50 companies selected using a quantitative model based on factors like return on equity, momentum, and free cash flow growth. Portfolio weights are generally market-cap-based within set ranges, and the fund is reconstituted and rebalanced quarterly.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.05
    }
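
To make the two parameters above concrete: with a cosine distance metric, the per-triplet loss is `max(d(anchor, positive) - d(anchor, negative) + margin, 0)`, where `d` is one minus cosine similarity. The following is a minimal pure-Python sketch of that formula for a single triplet, using the margin of 0.05 listed above. The function names and toy vectors are illustrative; the library version operates on batches of tensors and averages the result.

```python
import math


def cosine_distance(u, v):
    """Cosine distance: 1 minus the cosine similarity of u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.05):
    """Hinge on the gap between anchor-positive and anchor-negative distance."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


anchor = [1.0, 0.0]
positive = [1.0, 0.0]   # same direction as the anchor
negative = [0.0, 1.0]   # orthogonal to the anchor

print(triplet_loss(anchor, positive, negative))  # 0.0
print(triplet_loss(anchor, negative, positive))  # roles swapped: large penalty
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so a small margin such as 0.05 only asks for a modest separation in cosine distance.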
    

Evaluation Dataset

stage1-triplet-dataset

  • Dataset: stage1-triplet-dataset at a0fb998
  • Size: 3,010 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    |         | anchor        | positive      | negative      |
    |---------|---------------|---------------|---------------|
    | type    | string        | string        | string        |
    | min     | 84 tokens     | 70 tokens     | 70 tokens     |
    | mean    | 152.57 tokens | 154.43 tokens | 150.04 tokens |
    | max     | 214 tokens    | 224 tokens    | 204 tokens    |
  • Samples:
    The three samples shown share the same anchor and positive and differ only in the negative.

    anchor (all three samples): The Global X S&P 500 Risk Managed Income ETF seeks to track the Cboe S&P 500 Risk Managed Income Index by investing at least 80% of its assets in index securities. The index's strategy involves holding the underlying stocks of the S&P 500 Index while applying an options collar, specifically selling at-the-money covered call options and buying monthly 5% out-of-the-money put options corresponding to the portfolio's value. This approach aims to generate income, ideally resulting in a net credit from the options premiums, and provide risk management, though selling at-the-money calls inherently caps the fund's potential for upside participation.

    positive (all three samples): The U.S. Global Technology and Aerospace & Defense ETF is an actively managed ETF seeking capital appreciation by investing in equity securities of companies expected to benefit from national defense efforts. These efforts include technological innovations and the development of products and services related to aerospace, physical, and cybersecurity defense, often in preparation for or in response to domestic, regional, or global conflicts. The fund is non-diversified.

    negative (sample 1): The BlackRock Future Climate and Sustainable Economy ETF (BECO) is an actively managed equity fund focused on the transition to a lower carbon economy and future climate themes. It seeks a relatively concentrated, non-diversified portfolio of globally-listed companies of any market capitalization, investing across multiple subthemes such as sustainable energy, resource efficiency, future transport, sustainable nutrition, and biodiversity. The fund utilizes proprietary environmental criteria, including carbon metrics, and aims to align with the Paris Climate Agreement goals for net-zero emissions by 2050, while excluding certain high-emission industries and companies violating the UN Global Compact. It also attempts to achieve a better aggregate environmental and ESG score than its benchmark, the MSCI ACWI Multiple Industries Select Index. Note that BECO is being delisted, with its last day of trading on an exchange scheduled for August 12, 2024.

    negative (sample 2): The iShares Energy Storage & Materials ETF (IBAT) seeks to track the STOXX Global Energy Storage and Materials Index, which measures the performance of equity securities of global companies involved in energy storage solutions, including hydrogen, fuel cells, and batteries, aiming to support the transition to a low carbon economy. Determined by STOXX Ltd., the index selects companies based on their exposure to the theme through revenue analysis and patent assessment, while also applying exclusionary ESG screens. The index is price-weighted, based on market capitalization with capping rules. The fund generally invests at least 90% of its assets in the component securities of its underlying index or substantially identical investments and is considered non-diversified.

    negative (sample 3): The Sprott Gold Miners ETF (SGDM) seeks to track the performance of the Solactive Gold Miners Custom Factors Total Return Index. This index focuses on gold mining companies based in the U.S. and Canada whose shares trade on the Toronto Stock Exchange, New York Stock Exchange, or NASDAQ. The index employs a weighting methodology that begins with market capitalization and then adjusts based on three fundamental factors: higher revenue growth, lower debt-to-equity, and higher free cash flow yield. The fund is non-diversified and normally invests at least 90% of its net assets in securities included in this index.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.05
    }
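
As a rough illustration of what these two parameters mean, the sketch below computes a triplet loss with cosine distance and a margin of 0.05 for a single (anchor, positive, negative) triplet. It uses plain Python lists rather than model embeddings, and the toy vectors and helper names are hypothetical, chosen only for this example.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def triplet_loss(anchor, positive, negative, margin=0.05):
    """max(d(a, p) - d(a, n) + margin, 0) with cosine distance d."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)

# Toy 2-D "embeddings" (hypothetical values, not outputs of this model).
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negative = [0.0, 1.0]

# Well-separated triplet: the margin is already satisfied, so the loss is zero.
easy = triplet_loss(anchor, positive, negative)
# Swapping positive and negative violates the margin and yields a positive loss.
hard = triplet_loss(anchor, negative, positive)
print(easy, hard)
```

Because the loss is clipped at zero, a batch in which every triplet already satisfies the margin contributes a loss of exactly 0.0, which is consistent with the zero entries late in the training log.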
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 3e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • bf16: True
  • dataloader_drop_last: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
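
The following sketch works out what these settings imply for the length of the run and the learning-rate schedule. It assumes a single training device, takes the dataset size (23175) from the card's `dataset_size` tag, uses the linear scheduler listed under All Hyperparameters, and assumes the warmup length is rounded up; the variable and function names are illustrative only.

```python
import math

DATASET_SIZE = 23175   # from the card's dataset_size tag
BATCH_SIZE = 16        # per_device_train_batch_size (single device assumed)
NUM_EPOCHS = 1         # num_train_epochs
WARMUP_RATIO = 0.1     # warmup_ratio
PEAK_LR = 3e-05        # learning_rate

# dataloader_drop_last=True discards the final incomplete batch.
steps_per_epoch = DATASET_SIZE // BATCH_SIZE
total_steps = steps_per_epoch * NUM_EPOCHS

# Assumption: the warmup step count is rounded up to the next integer.
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

def linear_lr(step):
    """Linear warmup to PEAK_LR, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / max(1, total_steps - warmup_steps)

print(steps_per_epoch, warmup_steps)
print(round(10 / steps_per_epoch, 4))  # epoch fraction reached at step 10
```

Under these assumptions the step-to-epoch ratio agrees with the Epoch column of the training log, where step 10 corresponds to epoch 0.0069.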

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 3e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

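| Epoch | Step | Training Loss | Validation Loss |
|:------|:-----|:--------------|:----------------|
| 0.0069 | 10 | 0.0448 | - |
| 0.0138 | 20 | 0.0354 | - |
| 0.0207 | 30 | 0.0293 | - |
| 0.0276 | 40 | 0.0381 | - |
| 0.0345 | 50 | 0.0228 | - |
| 0.0414 | 60 | 0.0238 | - |
| 0.0483 | 70 | 0.0229 | - |
| 0.0552 | 80 | 0.0148 | - |
| 0.0622 | 90 | 0.0175 | - |
| 0.0691 | 100 | 0.0161 | - |
| 0.0760 | 110 | 0.0124 | - |
| 0.0829 | 120 | 0.0111 | - |
| 0.0898 | 130 | 0.0165 | - |
| 0.0967 | 140 | 0.0162 | - |
| 0.1036 | 150 | 0.0141 | - |
| 0.1105 | 160 | 0.0116 | - |
| 0.1174 | 170 | 0.01 | - |
| 0.1243 | 180 | 0.0134 | - |
| 0.1312 | 190 | 0.0117 | - |
| 0.1381 | 200 | 0.0127 | 0.0131 |
| 0.1450 | 210 | 0.0083 | - |
| 0.1519 | 220 | 0.0116 | - |
| 0.1588 | 230 | 0.0099 | - |
| 0.1657 | 240 | 0.0086 | - |
| 0.1727 | 250 | 0.0099 | - |
| 0.1796 | 260 | 0.0047 | - |
| 0.1865 | 270 | 0.0052 | - |
| 0.1934 | 280 | 0.0086 | - |
| 0.2003 | 290 | 0.0084 | - |
| 0.2072 | 300 | 0.0068 | - |
| 0.2141 | 310 | 0.005 | - |
| 0.2210 | 320 | 0.0077 | - |
| 0.2279 | 330 | 0.0044 | - |
| 0.2348 | 340 | 0.0039 | - |
| 0.2417 | 350 | 0.0058 | - |
| 0.2486 | 360 | 0.0045 | - |
| 0.2555 | 370 | 0.0045 | - |
| 0.2624 | 380 | 0.0064 | - |
| 0.2693 | 390 | 0.0037 | - |
| 0.2762 | 400 | 0.0083 | 0.013 |
| 0.2831 | 410 | 0.0057 | - |
| 0.2901 | 420 | 0.0043 | - |
| 0.2970 | 430 | 0.0028 | - |
| 0.3039 | 440 | 0.0036 | - |
| 0.3108 | 450 | 0.0031 | - |
| 0.3177 | 460 | 0.0072 | - |
| 0.3246 | 470 | 0.0025 | - |
| 0.3315 | 480 | 0.0041 | - |
| 0.3384 | 490 | 0.0049 | - |
| 0.3453 | 500 | 0.0035 | - |
| 0.3522 | 510 | 0.0023 | - |
| 0.3591 | 520 | 0.0043 | - |
| 0.3660 | 530 | 0.0032 | - |
| 0.3729 | 540 | 0.0031 | - |
| 0.3798 | 550 | 0.0039 | - |
| 0.3867 | 560 | 0.0042 | - |
| 0.3936 | 570 | 0.0055 | - |
| 0.4006 | 580 | 0.0041 | - |
| 0.4075 | 590 | 0.0026 | - |
| 0.4144 | 600 | 0.002 | 0.0133 |
| 0.4213 | 610 | 0.0027 | - |
| 0.4282 | 620 | 0.0032 | - |
| 0.4351 | 630 | 0.0025 | - |
| 0.4420 | 640 | 0.0042 | - |
| 0.4489 | 650 | 0.0046 | - |
| 0.4558 | 660 | 0.0011 | - |
| 0.4627 | 670 | 0.0004 | - |
| 0.4696 | 680 | 0.0019 | - |
| 0.4765 | 690 | 0.0034 | - |
| 0.4834 | 700 | 0.0032 | - |
| 0.4903 | 710 | 0.0029 | - |
| 0.4972 | 720 | 0.0038 | - |
| 0.5041 | 730 | 0.0021 | - |
| 0.5110 | 740 | 0.0008 | - |
| 0.5180 | 750 | 0.0015 | - |
| 0.5249 | 760 | 0.0018 | - |
| 0.5318 | 770 | 0.0022 | - |
| 0.5387 | 780 | 0.0006 | - |
| 0.5456 | 790 | 0.0022 | - |
| 0.5525 | 800 | 0.0006 | 0.0160 |
| 0.5594 | 810 | 0.0021 | - |
| 0.5663 | 820 | 0.0013 | - |
| 0.5732 | 830 | 0.0019 | - |
| 0.5801 | 840 | 0.0017 | - |
| 0.5870 | 850 | 0.0008 | - |
| 0.5939 | 860 | 0.0012 | - |
| 0.6008 | 870 | 0.0003 | - |
| 0.6077 | 880 | 0.0009 | - |
| 0.6146 | 890 | 0.001 | - |
| 0.6215 | 900 | 0.0011 | - |
| 0.6285 | 910 | 0.0019 | - |
| 0.6354 | 920 | 0.0009 | - |
| 0.6423 | 930 | 0.0003 | - |
| 0.6492 | 940 | 0.0001 | - |
| 0.6561 | 950 | 0.0019 | - |
| 0.6630 | 960 | 0.0006 | - |
| 0.6699 | 970 | 0.0003 | - |
| 0.6768 | 980 | 0.0005 | - |
| 0.6837 | 990 | 0.0025 | - |
| 0.6906 | 1000 | 0.001 | 0.0154 |
| 0.6975 | 1010 | 0.0009 | - |
| 0.7044 | 1020 | 0.0004 | - |
| 0.7113 | 1030 | 0.0008 | - |
| 0.7182 | 1040 | 0.001 | - |
| 0.7251 | 1050 | 0.0018 | - |
| 0.7320 | 1060 | 0.002 | - |
| 0.7390 | 1070 | 0.0 | - |
| 0.7459 | 1080 | 0.0 | - |
| 0.7528 | 1090 | 0.0003 | - |
| 0.7597 | 1100 | 0.0002 | - |
| 0.7666 | 1110 | 0.0004 | - |
| 0.7735 | 1120 | 0.0004 | - |
| 0.7804 | 1130 | 0.0001 | - |
| 0.7873 | 1140 | 0.0002 | - |
| 0.7942 | 1150 | 0.001 | - |
| 0.8011 | 1160 | 0.0003 | - |
| 0.8080 | 1170 | 0.0003 | - |
| 0.8149 | 1180 | 0.0002 | - |
| 0.8218 | 1190 | 0.0002 | - |
| 0.8287 | 1200 | 0.0 | 0.0179 |
| 0.8356 | 1210 | 0.0006 | - |
| 0.8425 | 1220 | 0.0005 | - |
| 0.8494 | 1230 | 0.0015 | - |
| 0.8564 | 1240 | 0.0009 | - |
| 0.8633 | 1250 | 0.0007 | - |
| 0.8702 | 1260 | 0.0003 | - |
| 0.8771 | 1270 | 0.0003 | - |
| 0.8840 | 1280 | 0.0 | - |
| 0.8909 | 1290 | 0.0 | - |
| 0.8978 | 1300 | 0.0009 | - |
| 0.9047 | 1310 | 0.0011 | - |
| 0.9116 | 1320 | 0.0003 | - |
| 0.9185 | 1330 | 0.0 | - |
| 0.9254 | 1340 | 0.0002 | - |
| 0.9323 | 1350 | 0.0004 | - |
| 0.9392 | 1360 | 0.0004 | - |
| 0.9461 | 1370 | 0.0007 | - |
| 0.9530 | 1380 | 0.0006 | - |
| 0.9599 | 1390 | 0.0006 | - |
| 0.9669 | 1400 | 0.0005 | 0.0167 |
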
  • The bold row denotes the saved checkpoint.
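
Since `load_best_model_at_end` is enabled, the saved checkpoint should be the evaluation step with the lowest validation loss, assuming the default selection metric (evaluation loss) was used. The sketch below picks that step from the seven evaluation rows of the log; the values are copied as logged, so they are rounded to four decimals.

```python
# (step, validation loss) pairs copied from the evaluation rows of the log.
eval_log = [
    (200, 0.0131),
    (400, 0.013),
    (600, 0.0133),
    (800, 0.0160),
    (1000, 0.0154),
    (1200, 0.0179),
    (1400, 0.0167),
]

# Assumption: the best checkpoint is selected by lowest validation loss.
best_step, best_loss = min(eval_log, key=lambda row: row[1])
print(best_step, best_loss)

# Validation loss at the last evaluation relative to the best one.
final_step, final_loss = eval_log[-1]
print(round(final_loss / best_loss, 2))
```

The training loss keeps falling toward zero after step 400 while the validation loss rises, which is the usual signature of overfitting; restoring the best checkpoint rather than the last one guards against that.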

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.1.0+cu118
  • Accelerate: 1.6.0
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1
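
To compare a local environment against the versions above, something like the following sketch may help. It relies only on the standard library's `importlib.metadata`; the package names are the usual PyPI distribution names, and the helper name `compare_versions` is made up for this example.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name.
EXPECTED = {
    "sentence-transformers": "4.1.0",
    "transformers": "4.51.3",
    "torch": "2.1.0+cu118",
    "accelerate": "1.6.0",
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}

def compare_versions(expected=EXPECTED):
    """Map each package to (expected version, installed version or None)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report

report = compare_versions()
for name, (wanted, installed) in report.items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: expected {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the environment departs from the one used for training.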

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}