---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:23003
- loss:TripletLoss
base_model: intfloat/multilingual-e5-large-instruct
widget:
|
|
- source_sentence: The Merlyn.AI SectorSurfer Momentum ETF is designed to dynamically |
|
|
shift its investment strategy based on market conditions, tracking an index that |
|
|
utilizes an algorithmic Bull/Bear indicator assessing U.S. equity markets for |
|
|
advancing trends or elevated decline risk using factors like price-trend, momentum, |
|
|
value sentiment, and volatility. In Bull markets, it targets approximately a 70/30 |
|
|
domestic/foreign aggressive equity allocation by selecting six thematic ETFs (four |
|
|
sectors, two geopolitical), while in Bear markets, it seeks safety by choosing |
|
|
at least four momentum-leading bond, treasury, and gold safe-harbor ETFs, explicitly |
|
|
avoiding inverse and leveraged funds. The index is typically evaluated monthly, |
|
|
though the indicator can trigger strategy changes anytime during excessive market |
|
|
volatility. Under normal circumstances, at least 80% of the fund's assets are |
|
|
invested in the index's component securities; the fund is non-diversified. Please |
|
|
be aware this fund is closing, with its last day of trading scheduled for November |
|
|
10, 2023. |
|
|
sentences: |
|
|
- The BlackRock Future Climate and Sustainable Economy ETF (BECO) is an actively |
|
|
managed equity fund focused on the transition to a lower carbon economy and future |
|
|
climate themes. It seeks a relatively concentrated, non-diversified portfolio |
|
|
of globally-listed companies of any market capitalization, investing across multiple |
|
|
subthemes such as sustainable energy, resource efficiency, future transport, sustainable |
|
|
nutrition, and biodiversity. The fund utilizes proprietary environmental criteria, |
|
|
including carbon metrics, and aims to align with the Paris Climate Agreement goals |
|
|
for net-zero emissions by 2050, while excluding certain high-emission industries |
|
|
and companies violating the UN Global Compact. It also attempts to achieve a better |
|
|
aggregate environmental and ESG score than its benchmark, the MSCI ACWI Multiple |
|
|
Industries Select Index. Note that BECO is being delisted, with its last day of |
|
|
trading on an exchange scheduled for August 12, 2024. |
|
|
- The Direxion Daily Semiconductor Bull 3X Shares (SOXL) seeks daily investment |
|
|
results, before fees and expenses, of 300% of the daily performance of the ICE |
|
|
Semiconductor Index. To achieve this bullish, leveraged exposure, the fund invests |
|
|
at least 80% of its net assets in financial instruments, such as swap agreements, |
|
|
securities of the index, and ETFs that track the index. The underlying ICE Semiconductor |
|
|
Index is a rules-based, modified float-adjusted market capitalization-weighted |
|
|
index that tracks the performance of the thirty largest U.S. listed semiconductor |
|
|
companies. As a daily leveraged fund, SOXL rebalances daily, meaning results over |
|
|
periods longer than one day can differ significantly from 300% of the index's |
|
|
performance due to the effects of compounding; the fund is also non-diversified. |
|
|
- The KraneShares Trust ETF seeks investment results corresponding generally to |
|
|
the price and yield performance of the Solactive Global Luxury Index. Under normal |
|
|
circumstances, the fund invests at least 80% of its net assets in instruments |
|
|
in the underlying index or those with similar economic characteristics. This index |
|
|
is a modified, free float adjusted market capitalization weighted index designed |
|
|
to measure the equity performance of companies from global luxury-related sectors, |
|
|
such as travel & leisure, premium ware, and apparel, located in developed markets. |
|
|
The index selects the top 25 companies based on criteria including size, trading |
|
|
volume, and country of listing, applying a modified weighting approach where the |
|
|
top 5 securities receive higher allocations (with the largest capped at 10%) while |
|
|
others are capped at 4.5%. The index is rebalanced semi-annually. The fund is |
|
|
non-diversified and while targeting US investments, it maintains at least 40% |
|
|
of its assets in foreign entities or those with significant business activities |
|
|
outside the United States. |
|
|
- source_sentence: The Xtrackers MSCI Emerging Markets Climate Selection ETF seeks |
|
|
to track an emerging markets index focused on companies meeting specific climate |
|
|
criteria. Derived from the MSCI ACWI Select Climate 500 methodology, the underlying |
|
|
index selects eligible emerging market stocks using an optimization process designed |
|
|
to reduce greenhouse gas emission intensity (targeting 10% revenue-related and |
|
|
7% financing-related reductions) and increase exposure to companies with SBTi-approved |
|
|
targets. The strategy also excludes controversial companies and evaluates companies |
|
|
based on broader ESG considerations. The fund is non-diversified and invests at |
|
|
least 80% of its assets in the component securities of this climate-focused emerging |
|
|
markets index. |
|
|
sentences: |
|
|
- The First Trust Indxx NextG UCITS ETF seeks investment results that generally |
|
|
correspond to the price and yield of the Indxx 5G & NextG Thematic Index. This |
|
|
tiered-weighted index of global mid- and large-cap equities tracks companies dedicating |
|
|
significant resources to the research, development, and application of fifth generation |
|
|
(5G) and emerging next generation digital cellular technologies. The fund normally |
|
|
invests at least 90% of its net assets in the index's securities, which are primarily |
|
|
drawn from themes including 5G infrastructure and hardware (such as data/cell |
|
|
tower REITs and equipment manufacturers) and telecommunication service providers |
|
|
operating relevant cellular and wireless networks. |
|
|
- The iPath S&P MLP ETN tracks an S&P Dow Jones index designed to provide exposure |
|
|
to leading partnerships listed on major U.S. exchanges. Comprising master limited |
|
|
partnerships (MLPs) and similar publicly traded limited liability companies, these |
|
|
constituents are primarily classified within the GICS Energy Sector and GICS Gas |
|
|
Utilities Industry. |
|
|
- The First Trust NASDAQ ABA Community Bank Index Fund (QABA) seeks investment results |
|
|
corresponding generally to the NASDAQ OMX® ABA Community Bank TM Index, normally |
|
|
investing at least 90% of its net assets in the index's securities. The index |
|
|
tracks NASDAQ-listed US banks and thrifts of small, mid, and large capitalization, |
|
|
designed to capture the community banking industry. Uniquely, it deliberately |
|
|
excludes the 50 largest banks by asset size, banks with significant international |
|
|
operations, and those specializing in credit cards, specifically targeting true |
|
|
community banks and avoiding larger "mega-money centers." The index is market-cap-weighted |
|
|
and undergoes regular rebalancing and reconstitution, subject to certain issuer |
|
|
weight caps. |
|
|
- source_sentence: The VanEck Morningstar Wide Moat ETF (MOAT) seeks to replicate |
|
|
the performance of the Morningstar® Wide Moat Focus IndexSM by investing at least |
|
|
80% of its assets in the index's securities. The fund targets US companies that |
|
|
Morningstar identifies as having sustainable competitive advantages ("wide moat |
|
|
companies") based on a proprietary methodology considering quantitative and qualitative |
|
|
factors. Specifically, the index focuses on companies determined to have the highest |
|
|
fair value among these wide moat firms. MOAT holds a concentrated, equal-weighted |
|
|
portfolio, which typically involves around 40 names but can hold more, featuring |
|
|
a staggered rebalance schedule and potential sector biases. The fund is non-diversified |
|
|
and employs caps on turnover and sector exposure, resulting in a strategy that |
|
|
can significantly diverge from broader market coverage despite its focus on established |
|
|
companies with competitive advantages. |
|
|
sentences: |
|
|
- The Fidelity MSCI Industrials Index ETF (FIDU) aims to match the performance of |
|
|
the MSCI USA IMI Industrials 25/25 Index, which represents the broad U.S. industrial |
|
|
sector using a market-cap-weighted approach with a 25/25 capping methodology. |
|
|
The fund, launched in October 2013, provides plain-vanilla exposure and invests |
|
|
at least 80% of its assets in securities found within this index. It uses a representative |
|
|
sampling strategy rather than replicating the entire index, and the underlying |
|
|
index is rebalanced quarterly. |
|
|
- The KraneShares Electric Vehicles and Future Mobility Index ETF (KARS) seeks to |
|
|
track the price and yield performance of the Bloomberg Electric Vehicles Index |
|
|
by investing at least 80% of its net assets in corresponding instruments or those |
|
|
with similar economic characteristics. The underlying index is designed to measure |
|
|
the equity market performance of globally-listed companies significantly involved |
|
|
in the production of electric vehicles, components, or other initiatives enhancing |
|
|
future mobility, including areas like energy storage, autonomous navigation technology, |
|
|
lithium and copper mining, and hydrogen fuel cells. KARS holds a concentrated |
|
|
portfolio, typically around 32 companies, weighted by market capitalization subject |
|
|
to specific position caps, and is reconstituted and rebalanced quarterly. |
|
|
- The iPath S&P MLP ETN tracks an S&P Dow Jones index designed to provide exposure |
|
|
to leading partnerships listed on major U.S. exchanges. Comprising master limited |
|
|
partnerships (MLPs) and similar publicly traded limited liability companies, these |
|
|
constituents are primarily classified within the GICS Energy Sector and GICS Gas |
|
|
Utilities Industry. |
|
|
- source_sentence: The Global X Clean Water ETF (AQWA) seeks to provide exposure to |
|
|
the global water industry by tracking the Solactive Global Clean Water Industry |
|
|
Index. The fund invests at least 80% of its assets in securities of this index, |
|
|
which targets companies deriving a significant portion (at least 50%) of their |
|
|
revenue from water infrastructure, equipment, and services, including treatment, |
|
|
purification, conservation, and management. The index selection process uses proprietary |
|
|
technology like NLP to identify eligible firms, incorporates minimum ESG standards |
|
|
based on UN Global Compact principles, and includes the 40 highest-ranking companies, |
|
|
weighted by market capitalization with specific caps. Reconstituted and rebalanced |
|
|
semi-annually, the fund is considered non-diversified. |
|
|
sentences: |
|
|
- The First Trust Nasdaq Transportation ETF aims to track the Nasdaq US Smart Transportation |
|
|
TM Index, investing at least 90% of its net assets in the index's securities. |
|
|
This non-diversified fund provides exposure to a concentrated portfolio of approximately |
|
|
30 highly liquid U.S. transportation companies across various segments such as |
|
|
delivery, shipping, marine, railroads, trucking, airports, airlines, bridges, |
|
|
tunnels, and automobiles. The index selects companies based on liquidity and then |
|
|
ranks and weights them according to factors reflecting growth (price returns), |
|
|
value (cash flow-to-price), and low volatility, ensuring no single constituent |
|
|
exceeds 8%. The index undergoes annual reconstitution and quarterly rebalancing. |
|
|
- The Direxion Daily Healthcare Bull 3X Shares (CURE) is an ETF that seeks daily |
|
|
investment results, before fees and expenses, of 300% (3X) of the daily performance |
|
|
of the Health Care Select Sector Index. It invests at least 80% of its net assets |
|
|
in financial instruments designed to provide this 3X daily leveraged exposure. |
|
|
The underlying index tracks US listed healthcare companies, including pharmaceuticals, |
|
|
health care equipment and supplies, providers and services, biotechnology, life |
|
|
sciences tools, and health care technology, covering major large-cap names. CURE |
|
|
is non-diversified and intended strictly as a short-term tactical instrument, |
|
|
as it delivers its stated 3X exposure only for a single day, and returns over |
|
|
longer periods can significantly differ from three times the index's performance. |
|
|
- The BlackRock Future Climate and Sustainable Economy ETF (BECO) is an actively |
|
|
managed equity fund focused on the transition to a lower carbon economy and future |
|
|
climate themes. It seeks a relatively concentrated, non-diversified portfolio |
|
|
of globally-listed companies of any market capitalization, investing across multiple |
|
|
subthemes such as sustainable energy, resource efficiency, future transport, sustainable |
|
|
nutrition, and biodiversity. The fund utilizes proprietary environmental criteria, |
|
|
including carbon metrics, and aims to align with the Paris Climate Agreement goals |
|
|
for net-zero emissions by 2050, while excluding certain high-emission industries |
|
|
and companies violating the UN Global Compact. It also attempts to achieve a better |
|
|
aggregate environmental and ESG score than its benchmark, the MSCI ACWI Multiple |
|
|
Industries Select Index. Note that BECO is being delisted, with its last day of |
|
|
trading on an exchange scheduled for August 12, 2024. |
|
|
- source_sentence: The Horizon Kinetics Medical ETF (MEDX) is an actively-managed, |
|
|
non-diversified fund aiming for long-term capital growth by investing primarily |
|
|
in global companies (U.S. and foreign) within the medical research, pharmaceuticals, |
|
|
medical technology, and related industries. The fund typically focuses on companies |
|
|
generating at least 50% of their revenue from these areas and may include companies |
|
|
of any market capitalization, with an emphasis on those involved in cancer research |
|
|
and treatment. Under normal circumstances, at least 80% of assets are invested |
|
|
in equity securities, convertibles, and warrants of such companies. Portfolio |
|
|
selection and weighting are based on the adviser's evaluation and discretion. |
|
|
The fund may also temporarily invest up to 100% in US short-term debt or invest |
|
|
in non-convertible high-yield bonds. |
|
|
sentences: |
|
|
- The Fidelity MSCI Health Care Index ETF (FHLC) seeks to track the performance |
|
|
of the MSCI USA IMI Health Care 25/50 Index, which represents the broad U.S. health |
|
|
care sector. The ETF invests at least 80% of its assets in securities included |
|
|
in this market-cap-weighted index, which captures large, mid, and small-cap companies |
|
|
across over 10 subsectors. Employing a representative sampling strategy, the fund |
|
|
aims to correspond to the index's performance. The index incorporates a 25/50 |
|
|
capping methodology, is rebalanced quarterly, and its broad reach offers diversification |
|
|
across cap sizes and subsectors, potentially reducing concentration in dominant |
|
|
large pharma names and increasing exposure to areas like drug retailers and insurance. |
|
|
The fund is classified as non-diversified. |
|
|
- The SPDR S&P Oil & Gas Equipment & Services ETF (XES) seeks investment results |
|
|
corresponding generally to the total return performance of the S&P Oil & Gas Equipment |
|
|
& Services Select Industry Index. This index represents companies in the oil and |
|
|
gas equipment and services segment of the broad U.S. S&P Total Market Index (S&P |
|
|
TMI), including those involved in activities like wildcatting, drilling hardware, |
|
|
and related services. The index utilizes an equal-weighting methodology for its |
|
|
constituent companies, which are selected based on market capitalization and liquidity |
|
|
requirements and undergo quarterly rebalancing. The fund itself employs a sampling |
|
|
strategy, aiming to invest at least 80% of its total assets in the securities |
|
|
that comprise its benchmark index. |
|
|
- The VanEck Biotech ETF (BBH) seeks to replicate the performance of the MVIS® US |
|
|
Listed Biotech 25 Index, which provides exposure to approximately 25 of the largest |
|
|
or leading U.S.-listed companies in the biotechnology industry. The fund normally |
|
|
invests at least 80% of its assets in securities comprising this market-cap-weighted |
|
|
index. The underlying index includes common stocks and depositary receipts of |
|
|
firms involved in the research, development, production, marketing, and sale of |
|
|
drugs based on genetic analysis and diagnostic equipment. While focusing on U.S.-listed |
|
|
companies, it may include foreign firms listed domestically, and medium-capitalization |
|
|
companies can be included. Reflecting the index's concentration, the fund is non-diversified |
|
|
and may have a top-heavy portfolio. The index is reviewed semi-annually. |
|
|
datasets:
- hobbang/stage1-triplet-dataset-selected
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# SentenceTransformer based on intfloat/multilingual-e5-large-instruct

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [intfloat/multilingual-e5-large-instruct](https://huggingface.co/intfloat/multilingual-e5-large-instruct) on the [stage1-triplet-dataset-selected](https://huggingface.co/datasets/hobbang/stage1-triplet-dataset-selected) dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [intfloat/multilingual-e5-large-instruct](https://huggingface.co/intfloat/multilingual-e5-large-instruct) <!-- at revision 84344a23ee1820ac951bc365f1e91d094a911763 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - [stage1-triplet-dataset-selected](https://huggingface.co/datasets/hobbang/stage1-triplet-dataset-selected)
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
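To make the Pooling and Normalize stages concrete, here is a minimal pure-Python sketch of masked mean pooling followed by L2 normalization. The helper names `mean_pool` and `l2_normalize` and the toy 3-dimensional token vectors are illustrative assumptions, not part of the library; the actual model performs the same steps on batched 1024-dimensional tensors.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors, skipping padded positions (mask == 0)."""
    dim = len(token_vectors[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                summed[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [s / count for s in summed]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return list(vector)
    return [v / norm for v in vector]


# Toy input (hypothetical values): three token vectors, the last one is padding.
tokens = [[1.0, 2.0, 2.0], [3.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)   # mean over the two unmasked tokens
embedding = l2_normalize(pooled)   # unit-length sentence embedding
print(pooled, embedding)
```

Because the final stage normalizes every embedding to unit length, the cosine similarity between two embeddings equals their dot product.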
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub (replace the placeholder with this model's Hub ID)
model = SentenceTransformer("sentence_transformers_model_id")

# Run inference
sentences = [
    "The Horizon Kinetics Medical ETF (MEDX) is an actively-managed, non-diversified fund aiming for long-term capital growth by investing primarily in global companies (U.S. and foreign) within the medical research, pharmaceuticals, medical technology, and related industries. The fund typically focuses on companies generating at least 50% of their revenue from these areas and may include companies of any market capitalization, with an emphasis on those involved in cancer research and treatment. Under normal circumstances, at least 80% of assets are invested in equity securities, convertibles, and warrants of such companies. Portfolio selection and weighting are based on the adviser's evaluation and discretion. The fund may also temporarily invest up to 100% in US short-term debt or invest in non-convertible high-yield bonds.",
    "The VanEck Biotech ETF (BBH) seeks to replicate the performance of the MVIS® US Listed Biotech 25 Index, which provides exposure to approximately 25 of the largest or leading U.S.-listed companies in the biotechnology industry. The fund normally invests at least 80% of its assets in securities comprising this market-cap-weighted index. The underlying index includes common stocks and depositary receipts of firms involved in the research, development, production, marketing, and sale of drugs based on genetic analysis and diagnostic equipment. While focusing on U.S.-listed companies, it may include foreign firms listed domestically, and medium-capitalization companies can be included. Reflecting the index's concentration, the fund is non-diversified and may have a top-heavy portfolio. The index is reviewed semi-annually.",
    'The SPDR S&P Oil & Gas Equipment & Services ETF (XES) seeks investment results corresponding generally to the total return performance of the S&P Oil & Gas Equipment & Services Select Industry Index. This index represents companies in the oil and gas equipment and services segment of the broad U.S. S&P Total Market Index (S&P TMI), including those involved in activities like wildcatting, drilling hardware, and related services. The index utilizes an equal-weighting methodology for its constituent companies, which are selected based on market capitalization and liquidity requirements and undergo quarterly rebalancing. The fund itself employs a sampling strategy, aiming to invest at least 80% of its total assets in the securities that comprise its benchmark index.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
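A common next step is to rank candidate fund descriptions against a query description by similarity. The sketch below is an illustration only: it uses hand-made 3-dimensional vectors in place of real `model.encode` output, and `cosine_similarity`, `rank_candidates`, and the candidate labels are hypothetical names chosen for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(query, candidates):
    """Return (label, score) pairs sorted from most to least similar to the query."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for embeddings (assumed values); real vectors have 1024 dimensions.
query = [0.9, 0.1, 0.0]
candidates = {
    "biotech_fund": [0.8, 0.2, 0.1],
    "oil_services_fund": [0.0, 0.3, 0.9],
}

ranking = rank_candidates(query, candidates)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

With real embeddings, the same ordering can be read off one row of the matrix returned by `model.similarity`.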
## Training Details

### Training Dataset

#### stage1-triplet-dataset-selected

* Dataset: [stage1-triplet-dataset-selected](https://huggingface.co/datasets/hobbang/stage1-triplet-dataset-selected) at [18e0423](https://huggingface.co/datasets/hobbang/stage1-triplet-dataset-selected/tree/18e0423399bc6678e814264ca8c8acdf02dfce97)
* Size: 23,003 training samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string | string |
  | details | <ul><li>min: 94 tokens</li><li>mean: 170.87 tokens</li><li>max: 224 tokens</li></ul> | <ul><li>min: 29 tokens</li><li>mean: 174.15 tokens</li><li>max: 261 tokens</li></ul> | <ul><li>min: 72 tokens</li><li>mean: 174.89 tokens</li><li>max: 261 tokens</li></ul> |
* Samples:
  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | <code>The Invesco Financial Preferred ETF (PGF) seeks to track the ICE Exchange-Listed Fixed Rate Financial Preferred Securities Index, primarily by investing at least 90% of its total assets in the securities comprising the index. The underlying index is market capitalization weighted and designed to track the performance of exchange-listed, fixed rate, U.S. dollar denominated preferred securities, including functionally equivalent instruments, issued by U.S. financial companies. PGF provides a concentrated portfolio exclusively focused on financial-sector preferred securities and is considered non-diversified, holding both investment- and non-investment-grade securities within this focus.</code> | <code>The FlexShares ESG & Climate Investment Grade Corporate Core Index Fund (FEIG) is a passively managed ETF designed to provide broad-market, core exposure to USD-denominated investment-grade corporate bonds. It seeks to track the performance of the Northern Trust ESG & Climate Investment Grade U.S. Corporate Core IndexSM, which selects bonds from a universe of USD-denominated, investment-grade corporate debt with maturities of at least one year. The index employs an optimization process to increase the aggregate ESG score and reduce aggregate climate-related risk among constituent companies, involving ranking firms on material ESG metrics, governance, and carbon risks, while excluding controversial companies and international initiative violators. Weights are also optimized to minimize systematic risk, and the index is rebalanced monthly. Under normal circumstances, the fund invests at least 80% of its assets in the index's securities.</code> | <code>The Viridi Bitcoin Miners ETF primarily invests in companies engaged in Bitcoin mining, aiming to allocate at least 80% of its net assets, plus borrowings for investment purposes, to securities of such companies under normal circumstances. The fund focuses on U.S. and non-U.S. equity securities in developed markets, which may include investments via depositary receipts. It also specifically targets common stock from newly listed IPOs, shares derived from SPAC IPOs, and securities resulting from reverse mergers. This ETF is non-diversified.</code> |
  | <code>The Invesco Financial Preferred ETF (PGF) seeks to track the ICE Exchange-Listed Fixed Rate Financial Preferred Securities Index, primarily by investing at least 90% of its total assets in the securities comprising the index. The underlying index is market capitalization weighted and designed to track the performance of exchange-listed, fixed rate, U.S. dollar denominated preferred securities, including functionally equivalent instruments, issued by U.S. financial companies. PGF provides a concentrated portfolio exclusively focused on financial-sector preferred securities and is considered non-diversified, holding both investment- and non-investment-grade securities within this focus.</code> | <code>The Fidelity Sustainable High Yield ETF (FSYD) is an actively managed fund primarily seeking high income, and potentially capital growth, by investing at least 80% of its assets in global high-yield (below investment grade) debt securities. The fund focuses on issuers demonstrating proven or improving sustainability practices based on an evaluation of their individual environmental, social, and governance (ESG) profiles using a proprietary rating process. Its comprehensive selection approach also incorporates a multi-factor quantitative screening model and fundamental analysis of issuers, aiming to identify value and quality within the high-yield universe.</code> | <code>The ETFMG Prime Mobile Payments ETF seeks to track the performance of the Nasdaq CTA Global Digital Payments Index, which identifies companies engaged in the global digital payments industry across categories like card networks, infrastructure, software, processors, and solutions. Under normal circumstances, the fund invests at least 80% of its net assets in common stocks (including ADRs and GDRs) of these Mobile Payments Companies. It typically holds a narrow portfolio expected to contain up to 50 companies, weighted using a theme-adjusted market capitalization scheme, and is considered non-diversified.</code> |
  | <code>The Invesco Financial Preferred ETF (PGF) seeks to track the ICE Exchange-Listed Fixed Rate Financial Preferred Securities Index, primarily by investing at least 90% of its total assets in the securities comprising the index. The underlying index is market capitalization weighted and designed to track the performance of exchange-listed, fixed rate, U.S. dollar denominated preferred securities, including functionally equivalent instruments, issued by U.S. financial companies. PGF provides a concentrated portfolio exclusively focused on financial-sector preferred securities and is considered non-diversified, holding both investment- and non-investment-grade securities within this focus.</code> | <code>The First Trust TCW Securitized Plus ETF (DEED) is an actively-managed fund focused on U.S. securitized debt securities, aiming to maximize long-term total return and outperform the Bloomberg US Mortgage-Backed Securities Index. Under normal market conditions, the fund allocates at least 80% of its net assets to securitized debt, including asset-backed securities, residential and commercial mortgage-backed securities, and collateralized loan obligations (CLOs). At least 50% of total assets are invested in securities issued or guaranteed by the U.S. government, its agencies, or government-sponsored entities, while the balance may include non-government and privately-issued securitized debt. The fund invests across various maturities and credit qualities (junk and investment-grade), using proprietary research to identify undervalued securities, and may utilize OTC derivatives for up to 25% of the portfolio.</code> | <code>The First Trust Growth Strength UCITS ETF aims to track the price and yield of The Growth Strength Index. Passively managed, the fund normally invests at least 80% of its assets in the index's common stocks and REIT components. The index selects 50 equal-weighted, well-capitalized, large-cap US companies from the top 500 US securities by market capitalization based on fundamental criteria such as return on equity, long-term debt levels, liquidity, positive shareholder equity, and a composite ranking based on 3-year revenue and cash flow growth. The resulting portfolio is non-diversified and rebalanced quarterly.</code> |
* Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters: |
|
|
```json |
|
|
{ |
|
|
"distance_metric": "TripletDistanceMetric.COSINE", |
|
|
"triplet_margin": 0.05 |
|
|
} |
|
|
``` |
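To make the two parameters above concrete, here is a minimal pure-Python sketch of a cosine-distance triplet loss with a margin of 0.05. It is an illustration of the formula `max(d(a, p) - d(a, n) + margin, 0)` with `d = 1 - cosine similarity`, not the library's implementation; the helper names and the toy 2-D vectors are made up for this example.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.05):
    """Hinge on the gap between anchor-positive and anchor-negative distance."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


# Toy embeddings (hypothetical, for illustration only).
anchor = [1.0, 0.0]
positive = [1.0, 0.0]   # identical direction to the anchor
negative = [0.0, 1.0]   # orthogonal to the anchor

# Well-separated triplet: the negative is already farther than the margin.
easy = triplet_loss(anchor, positive, negative)
# Swapped roles: the "positive" is far and the "negative" is close.
hard = triplet_loss(anchor, negative, positive)
print(easy, hard)
```

With a margin this small, a triplet stops contributing to the loss as soon as the negative is 0.05 farther from the anchor (in cosine distance) than the positive.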
|
|
|
|
|
### Evaluation Dataset |
|
|
|
|
|
#### stage1-triplet-dataset-selected |
|
|
|
|
|
* Dataset: [stage1-triplet-dataset-selected](https://huggingface.co/datasets/hobbang/stage1-triplet-dataset-selected) at [18e0423](https://huggingface.co/datasets/hobbang/stage1-triplet-dataset-selected/tree/18e0423399bc6678e814264ca8c8acdf02dfce97) |
|
|
* Size: 388 evaluation samples |
|
|
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code> |
|
|
* Approximate statistics based on the first 388 samples: |
|
|
| | anchor | positive | negative | |
|
|
|:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------| |
|
|
| type | string | string | string | |
|
|
| details | <ul><li>min: 85 tokens</li><li>mean: 176.98 tokens</li><li>max: 271 tokens</li></ul> | <ul><li>min: 85 tokens</li><li>mean: 176.83 tokens</li><li>max: 271 tokens</li></ul> | <ul><li>min: 85 tokens</li><li>mean: 175.41 tokens</li><li>max: 271 tokens</li></ul> | |
|
|
* Samples: |
|
|
| anchor | positive | negative | |
|
|
|:-------|:---------|:---------| |
|
|
| <code>The Global X S&P 500 Risk Managed Income ETF seeks to track the Cboe S&P 500 Risk Managed Income Index by investing at least 80% of its assets in index securities. The index's strategy involves holding the underlying stocks of the S&P 500 Index while applying an options collar, specifically selling at-the-money covered call options and buying monthly 5% out-of-the-money put options corresponding to the portfolio's value. This approach aims to generate income, ideally resulting in a net credit from the options premiums, and provide risk management, though selling at-the-money calls inherently caps the fund's potential for upside participation.</code> | <code>The U.S. Global Technology and Aerospace & Defense ETF is an actively managed ETF seeking capital appreciation by investing in equity securities of companies expected to benefit from national defense efforts. These efforts include technological innovations and the development of products and services related to aerospace, physical, and cybersecurity defense, often in preparation for or in response to domestic, regional, or global conflicts. The fund is non-diversified.</code> | <code>The KraneShares Global Carbon Offset Strategy ETF (KSET) was the first US-listed ETF providing exposure to the global voluntary carbon market. It achieved this by investing primarily in liquid carbon offset credit futures, including CME-traded Global Emissions Offsets (GEOs) and Nature-Based Global Emission Offsets (N-GEOs), which are designed to help businesses meet greenhouse gas reduction goals. Tracking an index that weighted eligible futures based on liquidity, the fund sought exposure to the same carbon offset credit futures, typically those maturing within two years. The ETF was considered non-diversified and utilized a Cayman Island subsidiary. However, the fund was delisted, with its last day of trading on an exchange being March 14, 2024.</code> | |
|
|
| <code>The Global X S&P 500 Risk Managed Income ETF seeks to track the Cboe S&P 500 Risk Managed Income Index by investing at least 80% of its assets in index securities. The index's strategy involves holding the underlying stocks of the S&P 500 Index while applying an options collar, specifically selling at-the-money covered call options and buying monthly 5% out-of-the-money put options corresponding to the portfolio's value. This approach aims to generate income, ideally resulting in a net credit from the options premiums, and provide risk management, though selling at-the-money calls inherently caps the fund's potential for upside participation.</code> | <code>The JPMorgan Social Advancement ETF (UPWD) is an actively managed, non-diversified fund that seeks to invest globally in companies facilitating social and economic advancements and empowerment across the socioeconomic spectrum. Primarily holding common stocks, depositary receipts, and REITs, the fund targets themes including essential amenities, affordable housing, healthcare, education, attainable financing, and the digital ecosystem, potentially investing in companies of various sizes, including small-caps, across U.S., foreign, and emerging markets with possible concentration in specific sectors. Security selection follows a proprietary three-step process involving exclusions, thematic ranking using a ThemeBot, and a sustainable investment inclusion process combined with fundamental research. Please note that this security is being delisted, with its last day of trading scheduled for December 15, 2023.</code> | <code>The Direxion Daily Gold Miners Index Bull 2X Shares (NUGT) is designed to provide 200% of the daily performance of the NYSE Arca Gold Miners Index, before fees and expenses. This market-cap-weighted index comprises publicly traded global companies, primarily involved in gold mining and to a lesser extent silver mining, operating in both developed and emerging markets. NUGT achieves its objective by investing at least 80% of its net assets in financial instruments providing 2X daily leveraged exposure to the index. As a leveraged fund intended for daily results, NUGT is designed for short-term trading, typically held for only one trading day, and holding it for longer periods can lead to performance results that differ significantly from the stated daily target due to the effects of compounding. The fund is also non-diversified.</code> | |
|
|
| <code>The Global X S&P 500 Risk Managed Income ETF seeks to track the Cboe S&P 500 Risk Managed Income Index by investing at least 80% of its assets in index securities. The index's strategy involves holding the underlying stocks of the S&P 500 Index while applying an options collar, specifically selling at-the-money covered call options and buying monthly 5% out-of-the-money put options corresponding to the portfolio's value. This approach aims to generate income, ideally resulting in a net credit from the options premiums, and provide risk management, though selling at-the-money calls inherently caps the fund's potential for upside participation.</code> | <code>The Xtrackers MSCI Emerging Markets ESG Leaders Equity ETF tracks an index of large- and mid-cap emerging market stocks that emphasize strong environmental, social, and governance (ESG) characteristics. The index first excludes companies involved in specific controversial industries. From the remaining universe, it ranks stocks based on MSCI ESG scores, including a controversy component, to identify and select the highest-ranking ESG leaders, effectively screening out ESG laggards. To maintain market-like country and sector weights, the index selects the top ESG-scoring stocks within each sector until a specified market capitalization threshold is reached. Selected stocks are then weighted by market capitalization within their respective sectors. The fund typically invests over 80% of its assets in the securities of this underlying index.</code> | <code>The BlackRock Future Climate and Sustainable Economy ETF (BECO) is an actively managed equity fund focused on the transition to a lower carbon economy and future climate themes. It seeks a relatively concentrated, non-diversified portfolio of globally-listed companies of any market capitalization, investing across multiple subthemes such as sustainable energy, resource efficiency, future transport, sustainable nutrition, and biodiversity. The fund utilizes proprietary environmental criteria, including carbon metrics, and aims to align with the Paris Climate Agreement goals for net-zero emissions by 2050, while excluding certain high-emission industries and companies violating the UN Global Compact. It also attempts to achieve a better aggregate environmental and ESG score than its benchmark, the MSCI ACWI Multiple Industries Select Index. Note that BECO is being delisted, with its last day of trading on an exchange scheduled for August 12, 2024.</code> | |
|
|
* Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters: |
|
|
```json |
|
|
{ |
|
|
"distance_metric": "TripletDistanceMetric.COSINE", |
|
|
"triplet_margin": 0.05 |
|
|
} |
|
|
``` |
|
|
|
|
|
### Training Hyperparameters |
|
|
#### Non-Default Hyperparameters |
|
|
|
|
|
- `eval_strategy`: steps |
|
|
- `per_device_train_batch_size`: 32 |
|
|
- `per_device_eval_batch_size`: 16 |
|
|
- `learning_rate`: 2e-06 |
|
|
- `num_train_epochs`: 1 |
|
|
- `warmup_ratio`: 0.1 |
|
|
- `bf16`: True |
|
|
- `dataloader_drop_last`: True |
|
|
- `load_best_model_at_end`: True |
|
|
- `batch_sampler`: no_duplicates |
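The `learning_rate` and `warmup_ratio` values above combine with the default linear scheduler into a ramp-up followed by a linear decay. The sketch below reproduces that shape in plain Python. The function name is hypothetical, and the total of 718 optimizer steps is an assumption (23,003 training samples from the `dataset_size:23003` metadata tag, batch size 32, incomplete last batch dropped, one device, one epoch).

```python
import math


def linear_warmup_lr(step, total_steps, base_lr=2e-06, warmup_ratio=0.1):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 718  # assumption: floor(23003 / 32) steps in the single epoch

for step in (0, 36, 72, 395, 718):
    print(step, linear_warmup_lr(step, TOTAL_STEPS))
```

Under these assumptions the peak rate of 2e-06 is reached at about step 72, so the first validation point in the training logs (step 100) already falls in the decay phase.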
|
|
|
|
|
#### All Hyperparameters |
|
|
<details><summary>Click to expand</summary> |
|
|
|
|
|
- `overwrite_output_dir`: False |
|
|
- `do_predict`: False |
|
|
- `eval_strategy`: steps |
|
|
- `prediction_loss_only`: True |
|
|
- `per_device_train_batch_size`: 32 |
|
|
- `per_device_eval_batch_size`: 16 |
|
|
- `per_gpu_train_batch_size`: None |
|
|
- `per_gpu_eval_batch_size`: None |
|
|
- `gradient_accumulation_steps`: 1 |
|
|
- `eval_accumulation_steps`: None |
|
|
- `torch_empty_cache_steps`: None |
|
|
- `learning_rate`: 2e-06 |
|
|
- `weight_decay`: 0.0 |
|
|
- `adam_beta1`: 0.9 |
|
|
- `adam_beta2`: 0.999 |
|
|
- `adam_epsilon`: 1e-08 |
|
|
- `max_grad_norm`: 1.0 |
|
|
- `num_train_epochs`: 1 |
|
|
- `max_steps`: -1 |
|
|
- `lr_scheduler_type`: linear |
|
|
- `lr_scheduler_kwargs`: {} |
|
|
- `warmup_ratio`: 0.1 |
|
|
- `warmup_steps`: 0 |
|
|
- `log_level`: passive |
|
|
- `log_level_replica`: warning |
|
|
- `log_on_each_node`: True |
|
|
- `logging_nan_inf_filter`: True |
|
|
- `save_safetensors`: True |
|
|
- `save_on_each_node`: False |
|
|
- `save_only_model`: False |
|
|
- `restore_callback_states_from_checkpoint`: False |
|
|
- `no_cuda`: False |
|
|
- `use_cpu`: False |
|
|
- `use_mps_device`: False |
|
|
- `seed`: 42 |
|
|
- `data_seed`: None |
|
|
- `jit_mode_eval`: False |
|
|
- `use_ipex`: False |
|
|
- `bf16`: True |
|
|
- `fp16`: False |
|
|
- `fp16_opt_level`: O1 |
|
|
- `half_precision_backend`: auto |
|
|
- `bf16_full_eval`: False |
|
|
- `fp16_full_eval`: False |
|
|
- `tf32`: None |
|
|
- `local_rank`: 0 |
|
|
- `ddp_backend`: None |
|
|
- `tpu_num_cores`: None |
|
|
- `tpu_metrics_debug`: False |
|
|
- `debug`: [] |
|
|
- `dataloader_drop_last`: True |
|
|
- `dataloader_num_workers`: 0 |
|
|
- `dataloader_prefetch_factor`: None |
|
|
- `past_index`: -1 |
|
|
- `disable_tqdm`: False |
|
|
- `remove_unused_columns`: True |
|
|
- `label_names`: None |
|
|
- `load_best_model_at_end`: True |
|
|
- `ignore_data_skip`: False |
|
|
- `fsdp`: [] |
|
|
- `fsdp_min_num_params`: 0 |
|
|
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False} |
|
|
- `tp_size`: 0 |
|
|
- `fsdp_transformer_layer_cls_to_wrap`: None |
|
|
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None} |
|
|
- `deepspeed`: None |
|
|
- `label_smoothing_factor`: 0.0 |
|
|
- `optim`: adamw_torch |
|
|
- `optim_args`: None |
|
|
- `adafactor`: False |
|
|
- `group_by_length`: False |
|
|
- `length_column_name`: length |
|
|
- `ddp_find_unused_parameters`: None |
|
|
- `ddp_bucket_cap_mb`: None |
|
|
- `ddp_broadcast_buffers`: False |
|
|
- `dataloader_pin_memory`: True |
|
|
- `dataloader_persistent_workers`: False |
|
|
- `skip_memory_metrics`: True |
|
|
- `use_legacy_prediction_loop`: False |
|
|
- `push_to_hub`: False |
|
|
- `resume_from_checkpoint`: None |
|
|
- `hub_model_id`: None |
|
|
- `hub_strategy`: every_save |
|
|
- `hub_private_repo`: None |
|
|
- `hub_always_push`: False |
|
|
- `gradient_checkpointing`: False |
|
|
- `gradient_checkpointing_kwargs`: None |
|
|
- `include_inputs_for_metrics`: False |
|
|
- `include_for_metrics`: [] |
|
|
- `eval_do_concat_batches`: True |
|
|
- `fp16_backend`: auto |
|
|
- `push_to_hub_model_id`: None |
|
|
- `push_to_hub_organization`: None |
|
|
- `mp_parameters`: |
|
|
- `auto_find_batch_size`: False |
|
|
- `full_determinism`: False |
|
|
- `torchdynamo`: None |
|
|
- `ray_scope`: last |
|
|
- `ddp_timeout`: 1800 |
|
|
- `torch_compile`: False |
|
|
- `torch_compile_backend`: None |
|
|
- `torch_compile_mode`: None |
|
|
- `include_tokens_per_second`: False |
|
|
- `include_num_input_tokens_seen`: False |
|
|
- `neftune_noise_alpha`: None |
|
|
- `optim_target_modules`: None |
|
|
- `batch_eval_metrics`: False |
|
|
- `eval_on_start`: False |
|
|
- `use_liger_kernel`: False |
|
|
- `eval_use_gather_object`: False |
|
|
- `average_tokens_across_devices`: False |
|
|
- `prompts`: None |
|
|
- `batch_sampler`: no_duplicates |
|
|
- `multi_dataset_batch_sampler`: proportional |
|
|
|
|
|
</details> |
|
|
|
|
|
### Training Logs |
|
|
| Epoch | Step | Training Loss | Validation Loss | |
|
|
|:----------:|:-------:|:-------------:|:---------------:| |
|
|
| 0.0139 | 10 | 0.0367 | - | |
|
|
| 0.0279 | 20 | 0.0378 | - | |
|
|
| 0.0418 | 30 | 0.0346 | - | |
|
|
| 0.0557 | 40 | 0.0337 | - | |
|
|
| 0.0696 | 50 | 0.0328 | - | |
|
|
| 0.0836 | 60 | 0.0291 | - | |
|
|
| 0.0975 | 70 | 0.0257 | - | |
|
|
| 0.1114 | 80 | 0.0206 | - | |
|
|
| 0.1253 | 90 | 0.0201 | - | |
|
|
| 0.1393 | 100 | 0.0208 | 0.0132 | |
|
|
| 0.1532 | 110 | 0.0167 | - | |
|
|
| 0.1671 | 120 | 0.0167 | - | |
|
|
| 0.1811 | 130 | 0.0156 | - | |
|
|
| 0.1950 | 140 | 0.0153 | - | |
|
|
| 0.2089 | 150 | 0.0125 | - | |
|
|
| 0.2228 | 160 | 0.0141 | - | |
|
|
| 0.2368 | 170 | 0.0153 | - | |
|
|
| 0.2507 | 180 | 0.0142 | - | |
|
|
| 0.2646 | 190 | 0.0095 | - | |
|
|
| 0.2786 | 200 | 0.0144 | 0.0111 | |
|
|
| 0.2925 | 210 | 0.0132 | - | |
|
|
| 0.3064 | 220 | 0.0107 | - | |
|
|
| 0.3203 | 230 | 0.0116 | - | |
|
|
| 0.3343 | 240 | 0.0134 | - | |
|
|
| 0.3482 | 250 | 0.0112 | - | |
|
|
| 0.3621 | 260 | 0.0115 | - | |
|
|
| 0.3760 | 270 | 0.0124 | - | |
|
|
| 0.3900 | 280 | 0.0126 | - | |
|
|
| 0.4039 | 290 | 0.0105 | - | |
|
|
| 0.4178 | 300 | 0.0111 | 0.0109 | |
|
|
| 0.4318 | 310 | 0.0136 | - | |
|
|
| 0.4457 | 320 | 0.0123 | - | |
|
|
| 0.4596 | 330 | 0.0113 | - | |
|
|
| 0.4735 | 340 | 0.0125 | - | |
|
|
| 0.4875 | 350 | 0.0082 | - | |
|
|
| 0.5014 | 360 | 0.0102 | - | |
|
|
| 0.5153 | 370 | 0.0081 | - | |
|
|
| 0.5292 | 380 | 0.0115 | - | |
|
|
| 0.5432 | 390 | 0.0107 | - | |
|
|
| 0.5571     | 400     | 0.0120        | 0.0106          | |
|
|
| 0.5710 | 410 | 0.0094 | - | |
|
|
| 0.5850 | 420 | 0.0099 | - | |
|
|
| 0.5989 | 430 | 0.0105 | - | |
|
|
| 0.6128 | 440 | 0.0101 | - | |
|
|
| 0.6267 | 450 | 0.0099 | - | |
|
|
| 0.6407 | 460 | 0.0106 | - | |
|
|
| 0.6546 | 470 | 0.0099 | - | |
|
|
| 0.6685 | 480 | 0.0108 | - | |
|
|
| 0.6825     | 490     | 0.0100        | -               | |
|
|
| **0.6964** | **500** | **0.0084** | **0.0102** | |
|
|
| 0.7103 | 510 | 0.0092 | - | |
|
|
| 0.7242 | 520 | 0.0084 | - | |
|
|
| 0.7382 | 530 | 0.0077 | - | |
|
|
| 0.7521 | 540 | 0.0096 | - | |
|
|
| 0.7660 | 550 | 0.0099 | - | |
|
|
| 0.7799 | 560 | 0.0103 | - | |
|
|
| 0.7939 | 570 | 0.0082 | - | |
|
|
| 0.8078     | 580     | 0.0090        | -               | |
|
|
| 0.8217 | 590 | 0.0078 | - | |
|
|
| 0.8357 | 600 | 0.0091 | 0.0104 | |
|
|
| 0.8496 | 610 | 0.0088 | - | |
|
|
| 0.8635 | 620 | 0.0103 | - | |
|
|
| 0.8774 | 630 | 0.0109 | - | |
|
|
| 0.8914 | 640 | 0.0072 | - | |
|
|
| 0.9053 | 650 | 0.0084 | - | |
|
|
| 0.9192 | 660 | 0.0099 | - | |
|
|
| 0.9331     | 670     | 0.0080        | -               | |
|
|
| 0.9471 | 680 | 0.0081 | - | |
|
|
| 0.9610 | 690 | 0.0075 | - | |
|
|
| 0.9749 | 700 | 0.0096 | 0.0103 | |
|
|
| 0.9889 | 710 | 0.0089 | - | |
|
|
|
|
|
* The bold row denotes the saved checkpoint. |
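The Epoch column and the choice of the bold row can be cross-checked with a few lines of arithmetic. This is a sketch under stated assumptions: the training set has 23,003 samples (the `dataset_size:23003` tag in the card metadata), training ran on a single device, and "saved checkpoint" means the evaluation step with the lowest validation loss, which is what `load_best_model_at_end` selects by default. The variable names are illustrative.

```python
TRAIN_SAMPLES = 23003   # from the dataset_size tag in the card metadata
BATCH_SIZE = 32         # per_device_train_batch_size

# dataloader_drop_last=True discards the incomplete final batch.
steps_per_epoch = TRAIN_SAMPLES // BATCH_SIZE
print(steps_per_epoch)  # 718


def epoch_at(step):
    """Fraction of the epoch completed at a given optimizer step."""
    return round(step / steps_per_epoch, 4)


# (step, validation loss) pairs copied from the table above.
validation = [
    (100, 0.0132), (200, 0.0111), (300, 0.0109), (400, 0.0106),
    (500, 0.0102), (600, 0.0104), (700, 0.0103),
]

# Pick the evaluation step with the lowest validation loss.
best_step, best_loss = min(validation, key=lambda pair: pair[1])
print(best_step, best_loss, epoch_at(best_step))
```

The result matches the bold row: step 500, at roughly 70% of the epoch, has the lowest validation loss of the seven evaluations.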
|
|
|
|
|
### Framework Versions |
|
|
- Python: 3.10.12 |
|
|
- Sentence Transformers: 4.1.0 |
|
|
- Transformers: 4.51.3 |
|
|
- PyTorch: 2.1.0+cu118 |
|
|
- Accelerate: 1.6.0 |
|
|
- Datasets: 3.5.0 |
|
|
- Tokenizers: 0.21.1 |
|
|
|
|
|
## Citation |
|
|
|
|
|
### BibTeX |
|
|
|
|
|
#### Sentence Transformers |
|
|
```bibtex |
|
|
@inproceedings{reimers-2019-sentence-bert, |
|
|
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks", |
|
|
author = "Reimers, Nils and Gurevych, Iryna", |
|
|
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing", |
|
|
month = "11", |
|
|
year = "2019", |
|
|
publisher = "Association for Computational Linguistics", |
|
|
url = "https://arxiv.org/abs/1908.10084", |
|
|
} |
|
|
``` |
|
|
|
|
|
#### TripletLoss |
|
|
```bibtex |
|
|
@misc{hermans2017defense, |
|
|
title={In Defense of the Triplet Loss for Person Re-Identification}, |
|
|
author={Alexander Hermans and Lucas Beyer and Bastian Leibe}, |
|
|
year={2017}, |
|
|
eprint={1703.07737}, |
|
|
archivePrefix={arXiv}, |
|
|
primaryClass={cs.CV} |
|
|
} |
|
|
``` |
|
|
|
|
|