---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:128997
- loss:MultipleNegativesRankingLoss
base_model: suhwan3/mpnet_step1
widget:
- source_sentence: The Global X S&P 500 Risk Managed Income ETF seeks to track the
    Cboe S&P 500 Risk Managed Income Index by investing at least 80% of its assets
    in index securities. The index's strategy involves holding the underlying stocks
    of the S&P 500 Index while applying an options collar, specifically selling at-the-money
    covered call options and buying monthly 5% out-of-the-money put options corresponding
    to the portfolio's value. This approach aims to generate income, ideally resulting
    in a net credit from the options premiums, and provide risk management, though
    selling at-the-money calls inherently caps the fund's potential for upside participation.
  sentences:
  - Nasdaq, Inc. operates as a technology company that serves capital markets and
    other industries worldwide. The Market Technology segment includes anti financial
    crime technology business, which offers Nasdaq Trade Surveillance, a SaaS solution
    for brokers and other market participants to assist them in complying with market
    rules, regulations, and internal market surveillance policies; Nasdaq Automated
    Investigator, a cloud-deployed anti-money laundering tool; and Verafin, a SaaS
    technology provider of anti-financial crime management solutions. This segment
    also handles assets, such as cash equities, equity derivatives, currencies, interest-bearing
    securities, commodities, energy products, and digital currencies. The Investment
    Intelligence segment sells and distributes historical and real-time market data;
    develops and licenses Nasdaq-branded indexes and financial products; and provides
    investment insights and workflow solutions. The Corporate Platforms segment operates
    listing platforms; and offers investor relations intelligence and governance solutions.
    As of December 31, 2021, it had 4,178 companies listed securities on The Nasdaq
    Stock Market, including 1,632 listings on The Nasdaq Global Select Market; 1,169
    on The Nasdaq Global Market; and 1,377 on The Nasdaq Capital Market. The Market
    Services segment includes equity derivative trading and clearing, cash equity
    trading, fixed income and commodities trading and clearing, and trade management
    service businesses. This segment operates various exchanges and other marketplace
    facilities across various asset classes, which include derivatives, commodities,
    cash equity, debt, structured products, and exchange traded products; and provides
    broker, clearing, settlement, and central depository services. The company was
    formerly known as The NASDAQ OMX Group, Inc. and changed its name to Nasdaq, Inc.
    in September 2015. Nasdaq, Inc. was founded in 1971 and is headquartered in New
    York, New York.
  - Jabil Inc. provides manufacturing services and solutions worldwide. The company
    operates in two segments, Electronics Manufacturing Services and Diversified Manufacturing
    Services. It offers electronics design, production, and product management services.
    The company provides electronic design services, such as application-specific
    integrated circuit design, firmware development, and rapid prototyping services;
    and designs plastic and metal enclosures that include the electro-mechanics, such
    as the printed circuit board assemblies (PCBA). It also specializes in the three-dimensional
    mechanical design comprising the analysis of electronic, electro-mechanical, and
    optical assemblies, as well as offers various industrial design, mechanism development,
    and tooling management services. In addition, the company provides computer-assisted
    design services consisting of PCBA design, as well as PCBA design validation and
    verification services; and other consulting services, such as the generation of
    a bill of materials, approved vendor list, and assembly equipment configuration
    for various PCBA designs. Further, it offers product and process validation services,
    such as product system, product safety, regulatory compliance, and reliability
    tests, as well as manufacturing test solution development services. Additionally,
    the company provides systems assembly, test, direct-order fulfillment, and configure-to-order
    services. It serves 5G, wireless and cloud, digital print and retail, industrial
    and semi-cap, networking and storage, automotive and transportation, connected
    devices, healthcare and packaging, and mobility industries. The company was formerly
    known as Jabil Circuit, Inc. and changed its name to Jabil Inc. in June 2017.
    Jabil Inc. was founded in 1966 and is headquartered in Saint Petersburg, Florida.
  - 'Realty Income, The Monthly Dividend Company, is an S&P 500 company dedicated
    to providing stockholders with dependable monthly income. The company is structured
    as a REIT, and its monthly dividends are supported by the cash flow from over
    6,500 real estate properties owned under long-term lease agreements with our commercial
    clients. To date, the company has declared 608 consecutive common stock monthly
    dividends throughout its 52-year operating history and increased the dividend
    109 times since Realty Income''s public listing in 1994 (NYSE: O). The company
    is a member of the S&P 500 Dividend Aristocrats index. Additional information
    about the company can be obtained from the corporate website at www.realtyincome.com.'
- source_sentence: The iShares U.S. Telecommunications ETF (IYZ) seeks to track the
    investment results of the Russell 1000 Telecommunications RIC 22.5/45 Capped Index,
    which measures the performance of the U.S. telecommunications sector of the U.S.
    equity market as defined by FTSE Russell. This market-cap-weighted index includes
    large-cap companies involved in telecom equipment and service provision and is
    subject to regulatory capping that limits single holdings to 22.5% and aggregate
    large holdings to 45%. The fund generally invests at least 80% of its assets in
    the component securities of its underlying index and is non-diversified; the underlying
    index is rebalanced quarterly.
  sentences:
  - Kanzhun Limited operates an online recruitment platform, BOSS Zhipin in the People's
    Republic of China. Its recruitment platform assists the recruitment process between
    job seekers and employers for enterprises, and corporations. The company was founded
    in 2013 and is headquartered in Beijing, the People's Republic of China.
  - Frontier Communications Parent, Inc., together with its subsidiaries, provides
    communications services for consumer and business customers in 25 states in the
    United States. It offers data and Internet, voice, video, and other services.
    The company was formerly known as Frontier Communications Corporation and changed
    its name to Frontier Communications Parent, Inc. in April 2021. Frontier Communications
    Parent, Inc. was incorporated in 1935 and is based in Norwalk, Connecticut.
  - Broadcom Inc. designs, develops, and supplies various semiconductor devices with
    a focus on complex digital and mixed signal complementary metal oxide semiconductor
    based devices and analog III-V based products worldwide. The company operates
    in two segments, Semiconductor Solutions and Infrastructure Software. It provides
    set-top box system-on-chips (SoCs); cable, digital subscriber line, and passive
    optical networking central office/consumer premise equipment SoCs; wireless local
    area network access point SoCs; Ethernet switching and routing merchant silicon
    products; embedded processors and controllers; serializer/deserializer application
    specific integrated circuits; optical and copper, and physical layers; and fiber
    optic transmitter and receiver components. The company also offers RF front end
    modules, filters, and power amplifiers; Wi-Fi, Bluetooth, and global positioning
    system/global navigation satellite system SoCs; custom touch controllers; serial
    attached small computer system interface, and redundant array of independent disks
    controllers and adapters; peripheral component interconnect express switches;
    fiber channel host bus adapters; read channel based SoCs; custom flash controllers;
    preamplifiers; and optocouplers, industrial fiber optics, and motion control encoders
    and subsystems. Its products are used in various applications, including enterprise
    and data center networking, home connectivity, set-top boxes, broadband access,
    telecommunication equipment, smartphones and base stations, data center servers
    and storage systems, factory automation, power generation and alternative energy
    systems, and electronic displays. Broadcom Inc. was incorporated in 2018 and is
    headquartered in San Jose, California.
- source_sentence: The Xtrackers MSCI Emerging Markets ESG Leaders Equity ETF tracks
    an index of large- and mid-cap emerging market stocks that emphasize strong environmental,
    social, and governance (ESG) characteristics. The index first excludes companies
    involved in specific controversial industries. From the remaining universe, it
    ranks stocks based on MSCI ESG scores, including a controversy component, to identify
    and select the highest-ranking ESG leaders, effectively screening out ESG laggards.
    To maintain market-like country and sector weights, the index selects the top
    ESG-scoring stocks within each sector until a specified market capitalization
    threshold is reached. Selected stocks are then weighted by market capitalization
    within their respective sectors. The fund typically invests over 80% of its assets
    in the securities of this underlying index.
  sentences:
  - Info Edge (India) Limited operates as an online classifieds company in the areas
    of recruitment, matrimony, real estate, and education and related services in
    India and internationally. It operates through Recruitment Solutions, 99acres,
    and Other segments. The company offers recruitment services through naukri.com,
    an online job website for job seekers and corporate customers, including hiring
    consultants; firstnaukri.com, a job search network for college students and recent
    graduates; naukrigulf.com, a website catering to Gulf markets; and quadranglesearch.com,
    a site that provides off-line placement services to middle and senior management,
    as well as Highorbit/iimjobs.com, zwayam.com, hirist.com, doselect.com, ambitionbox.com,
    bigshyft.com, and jobhai.com. It also provides 99acres.com, which offers listing
    of properties for sale, purchase, and rent; Jeevansathi.com, an online matrimonial
    classifieds services; and shiksha.com, an education classified website that helps
    students to decide their undergraduate and postgraduate options by providing useful
    information on careers, exams, colleges, and courses, as well as operates multiple
    dating platforms on the web through its mobile apps Aisle, Anbe, Arike and HeyDil.
    In addition, the company provides internet, computer, and electronic and related
    services; and software development, consultancy, technical support for consumer
    companies, SAAS providers, and other services in the field of information technology
    and product development, as well as brokerage services in the real estate sector.
    Further, it acts as an investment adviser and manager, financial and management
    consultant, and sponsor of alternative investment funds, as well as provides advertising
    space for colleges and universities on www.shiksha.com. Info Edge (India) Limited
    was incorporated in 1995 and is based in Noida, India.
  - China Overseas Land & Investment Limited, an investment holding company, engages
    in the property development and investment, and other operations in the People's
    Republic of China and the United Kingdom. The company operates through Property
    Development, Property Investment, and Other Operations segments. It is involved
    in the investment, development, and rental of residential and commercial properties;
    issuance of guaranteed notes and corporate bonds; and hotel operation activities.
    The company also provides construction and building design consultancy services.
    In addition, it engages in the investment and financing, land consolidation, regional
    planning, engineering construction, industrial import, commercial operation, and
    property management. Further, the company offers urban services, including office
    buildings, flexible working space, shopping malls, star-rated hotels, long-term
    rental apartments, logistics parks, and architectural design and construction.
    The company was founded in 1979 and is based in Central, Hong Kong. China Overseas
    Land & Investment Limited is a subsidiary of China Overseas Holdings Limited.
  - Mastercard Incorporated, a technology company, provides transaction processing
    and other payment-related products and services in the United States and internationally.
    It facilitates the processing of payment transactions, including authorization,
    clearing, and settlement, as well as delivers other payment-related products and
    services. The company offers integrated products and value-added services for
    account holders, merchants, financial institutions, businesses, governments, and
    other organizations, such as programs that enable issuers to provide consumers
    with credits to defer payments; prepaid programs and management services; commercial
    credit and debit payment products and solutions; and payment products and solutions
    that allow its customers to access funds in deposit and other accounts. It also
    provides value-added products and services comprising cyber and intelligence solutions
    for parties to transact, as well as proprietary insights, drawing on principled
    use of consumer, and merchant data services. In addition, the company offers analytics,
    test and learn, consulting, managed services, loyalty, processing, and payment
    gateway solutions for e-commerce merchants. Further, it provides open banking
    and digital identity platforms services. The company offers payment solutions
    and services under the MasterCard, Maestro, and Cirrus. Mastercard Incorporated
    was founded in 1966 and is headquartered in Purchase, New York.
- source_sentence: The Global X S&P 500 Risk Managed Income ETF seeks to track the
    Cboe S&P 500 Risk Managed Income Index by investing at least 80% of its assets
    in index securities. The index's strategy involves holding the underlying stocks
    of the S&P 500 Index while applying an options collar, specifically selling at-the-money
    covered call options and buying monthly 5% out-of-the-money put options corresponding
    to the portfolio's value. This approach aims to generate income, ideally resulting
    in a net credit from the options premiums, and provide risk management, though
    selling at-the-money calls inherently caps the fund's potential for upside participation.
  sentences:
  - Incyte Corporation, a biopharmaceutical company, focuses on the discovery, development,
    and commercialization of proprietary therapeutics in the United States and internationally.
    The company offers JAKAFI, a drug for the treatment of myelofibrosis and polycythemia
    vera; PEMAZYRE, a fibroblast growth factor receptor kinase inhibitor that act
    as oncogenic drivers in various liquid and solid tumor types; and ICLUSIG, a kinase
    inhibitor to treat chronic myeloid leukemia and philadelphia-chromosome positive
    acute lymphoblastic leukemia. Its clinical stage products include ruxolitinib,
    a steroid-refractory chronic graft-versus-host-diseases (GVHD); itacitinib, which
    is in Phase II/III clinical trial to treat naive chronic GVHD; and pemigatinib
    for treating bladder cancer, cholangiocarcinoma, myeloproliferative syndrome,
    and tumor agnostic. In addition, the company engages in developing Parsaclisib,
    which is in Phase II clinical trial for follicular lymphoma, marginal zone lymphoma,
    and mantel cell lymphoma. Additionally, it develops Retifanlimab that is in Phase
    II clinical trials for MSI-high endometrial cancer, merkel cell carcinoma, and
    anal cancer, as well as in Phase II clinical trials for patients with non-small
    cell lung cancer. It has collaboration agreements with Novartis International
    Pharmaceutical Ltd.; Eli Lilly and Company; Agenus Inc.; Calithera Biosciences,
    Inc; MacroGenics, Inc.; Merus N.V.; Syros Pharmaceuticals, Inc.; Innovent Biologics,
    Inc.; Zai Lab Limited; Cellenkos, Inc.; and Nimble Therapeutics, as well as clinical
    collaborations with MorphoSys AG and Xencor, Inc. to investigate the combination
    of tafasitamab, plamotamab, and lenalidomide in patients with relapsed or refractory
    diffuse large B-cell lymphoma, and relapsed or refractory follicular lymphoma.
    The company was incorporated in 1991 and is headquartered in Wilmington, Delaware.
  - Omnicom Group Inc., together with its subsidiaries, provides advertising, marketing,
    and corporate communications services. It provides a range of services in the
    areas of advertising, customer relationship management, public relations, and
    healthcare. The company's services include advertising, branding, content marketing,
    corporate social responsibility consulting, crisis communications, custom publishing,
    data analytics, database management, digital/direct marketing, digital transformation,
    entertainment marketing, experiential marketing, field marketing, financial/corporate
    business-to-business advertising, graphic arts/digital imaging, healthcare marketing
    and communications, and in-store design services. Its services also comprise interactive
    marketing, investor relations, marketing research, media planning and buying,
    merchandising and point of sale, mobile marketing, multi-cultural marketing, non-profit
    marketing, organizational communications, package design, product placement, promotional
    marketing, public affairs, retail marketing, sales support, search engine marketing,
    shopper marketing, social media marketing, and sports and event marketing services.
    It operates in the United States, Canada, Puerto Rico, South America, Mexico,
    Europe, the Middle East, Africa, Australia, Greater China, India, Japan, Korea,
    New Zealand, Singapore, and other Asian countries. The company was incorporated
    in 1944 and is based in New York, New York.
  - NetApp, Inc. provides cloud-led and data-centric services to manage and share
    data on-premises, and private and public clouds worldwide. It operates in two
    segments, Hybrid Cloud and Public Could. The company offers intelligent data management
    software, such as NetApp ONTAP, NetApp Snapshot, NetApp SnapCenter Backup Management,
    NetApp SnapMirror Data Replication, NetApp SnapLock Data Compliance, NetApp ElementOS
    software, and NetApp SANtricity software; and storage infrastructure solutions,
    including NetApp All-Flash FAS series, NetApp Fabric Attached Storage, NetApp
    FlexPod, NetApp E/EF series, NetApp StorageGRID, and NetApp SolidFire. It also
    provides cloud storage and data services comprising NetApp Cloud Volumes ONTAP,
    Azure NetApp Files, Amazon FSx for NetApp ONTAP, NetApp Cloud Volumes Service
    for Google Cloud, NetApp Cloud Sync, NetApp Cloud Tiering, NetApp Cloud Backup,
    NetApp Cloud Data Sense, and NetApp Cloud Volumes Edge Cache; and cloud operations
    services, such as NetApp Cloud Insights, Spot Ocean Kubernetes Suite, Spot Security,
    Spot Eco, and Spot CloudCheckr. In addition, the company offers application-aware
    data management service under the NetApp Astra name; and professional and support
    services, such as strategic consulting, professional, managed, and support services.
    Further, it provides assessment, design, implementation, and migration services.
    The company serves the energy, financial service, government, technology, internet,
    life science, healthcare service, manufacturing, media, entertainment, animation,
    video postproduction, and telecommunication markets through a direct sales force
    and an ecosystem of partners. NetApp, Inc. was incorporated in 1992 and is headquartered
    in San Jose, California.
- source_sentence: The Global X S&P 500 Risk Managed Income ETF seeks to track the
    Cboe S&P 500 Risk Managed Income Index by investing at least 80% of its assets
    in index securities. The index's strategy involves holding the underlying stocks
    of the S&P 500 Index while applying an options collar, specifically selling at-the-money
    covered call options and buying monthly 5% out-of-the-money put options corresponding
    to the portfolio's value. This approach aims to generate income, ideally resulting
    in a net credit from the options premiums, and provide risk management, though
    selling at-the-money calls inherently caps the fund's potential for upside participation.
  sentences:
  - Walgreens Boots Alliance, Inc. operates as a pharmacy-led health and beauty retail
    company. It operates through two segments, the United States and International.
    The United States segment sells prescription drugs and an assortment of retail
    products, including health, wellness, beauty, personal care, consumable, and general
    merchandise products through its retail drugstores. It also provides central specialty
    pharmacy services and mail services. As of August 31, 2021, this segment operated
    8,965 retail stores under the Walgreens and Duane Reade brands in the United States;
    and five specialty pharmacies. The International segment sells prescription drugs;
    and health and wellness, beauty, personal care, and other consumer products through
    its pharmacy-led health and beauty retail stores and optical practices, as well
    as through boots.com and an integrated mobile application. It also engages in
    pharmaceutical wholesaling and distribution business in Germany. As of August
    31, 2021, this segment operated 4,031 retail stores under the Boots, Benavides,
    and Ahumada in the United Kingdom, Thailand, Norway, the Republic of Ireland,
    the Netherlands, Mexico, and Chile; and 548 optical practices, including 160 on
    a franchise basis. Walgreens Boots Alliance, Inc. was founded in 1901 and is based
    in Deerfield, Illinois.
  - Middlesex Water Company owns and operates regulated water utility and wastewater
    systems. It operates in two segments, Regulated and Non-Regulated. The Regulated
    segment collects, treats, and distributes water on a retail and wholesale basis
    to residential, commercial, industrial, and fire protection customers, as well
    as provides regulated wastewater systems in New Jersey and Delaware. The Non-Regulated
    segment provides non-regulated contract services for the operation and maintenance
    of municipal and private water and wastewater systems in New Jersey and Delaware.
    The company was incorporated in 1896 and is headquartered in Iselin, New Jersey.
  - Liberty Broadband Corporation engages in the communications businesses. It operates
    through GCI Holdings and Charter segments. The GCI Holdings segment provides a
    range of wireless, data, video, voice, and managed services to residential customers,
    businesses, governmental entities, and educational and medical institutions primarily
    in Alaska under the GCI brand. The Charter segment offers subscription-based video
    services comprising video on demand, high-definition television, and digital video
    recorder service; local and long-distance calling, voicemail, call waiting, caller
    ID, call forwarding, and other voice services, as well as international calling
    services; and Spectrum TV. It also provides internet services, including an in-home
    Wi-Fi product that provides customers with high-performance wireless routers and
    managed Wi-Fi services; advanced community Wi-Fi; mobile internet; and a security
    suite that offers protection against computer viruses and spyware. In addition,
    this segment offers internet access, data networking, fiber connectivity to cellular
    towers and office buildings, video entertainment, and business telephone services;
    advertising services on cable television networks and digital outlets; and operates
    regional sports and news networks. Liberty Broadband Corporation was incorporated
    in 2014 and is based in Englewood, Colorado.
datasets:
- hobbang/stage2-dataset
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# SentenceTransformer based on suhwan3/mpnet_step1

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [suhwan3/mpnet_step1](https://huggingface.co/suhwan3/mpnet_step1) on the [stage2-dataset](https://huggingface.co/datasets/hobbang/stage2-dataset) dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [suhwan3/mpnet_step1](https://huggingface.co/suhwan3/mpnet_step1) <!-- at revision 8857c26669998d56b0735085b269cfc7890ca67d -->
- **Maximum Sequence Length:** 384 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - [stage2-dataset](https://huggingface.co/datasets/hobbang/stage2-dataset)
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
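
The last two modules in the architecture above, mean pooling followed by `Normalize()`, can be illustrated with a small dependency-free sketch. This is not the library's implementation: the helper names `mean_pool`, `l2_normalize`, and `dot` are hypothetical, the toy vectors are 2-dimensional rather than 768-dimensional, and the sketch assumes that mean pooling averages only the non-padding token vectors.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in summed]


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy 2-dimensional "token embeddings" (illustrative values only).
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]  # the third position is padding and is ignored
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
```

Because the embeddings have unit length after normalization, the dot product of two embeddings equals their cosine similarity, which is consistent with the cosine similarity function listed for this model.
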

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
# "sentence_transformers_model_id" is a placeholder; replace it with this model's Hub ID.
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    "The Global X S&P 500 Risk Managed Income ETF seeks to track the Cboe S&P 500 Risk Managed Income Index by investing at least 80% of its assets in index securities. The index's strategy involves holding the underlying stocks of the S&P 500 Index while applying an options collar, specifically selling at-the-money covered call options and buying monthly 5% out-of-the-money put options corresponding to the portfolio's value. This approach aims to generate income, ideally resulting in a net credit from the options premiums, and provide risk management, though selling at-the-money calls inherently caps the fund's potential for upside participation.",
    'Walgreens Boots Alliance, Inc. operates as a pharmacy-led health and beauty retail company. It operates through two segments, the United States and International. The United States segment sells prescription drugs and an assortment of retail products, including health, wellness, beauty, personal care, consumable, and general merchandise products through its retail drugstores. It also provides central specialty pharmacy services and mail services. As of August 31, 2021, this segment operated 8,965 retail stores under the Walgreens and Duane Reade brands in the United States; and five specialty pharmacies. The International segment sells prescription drugs; and health and wellness, beauty, personal care, and other consumer products through its pharmacy-led health and beauty retail stores and optical practices, as well as through boots.com and an integrated mobile application. It also engages in pharmaceutical wholesaling and distribution business in Germany. As of August 31, 2021, this segment operated 4,031 retail stores under the Boots, Benavides, and Ahumada in the United Kingdom, Thailand, Norway, the Republic of Ireland, the Netherlands, Mexico, and Chile; and 548 optical practices, including 160 on a franchise basis. Walgreens Boots Alliance, Inc. was founded in 1901 and is based in Deerfield, Illinois.',
    'Liberty Broadband Corporation engages in the communications businesses. It operates through GCI Holdings and Charter segments. The GCI Holdings segment provides a range of wireless, data, video, voice, and managed services to residential customers, businesses, governmental entities, and educational and medical institutions primarily in Alaska under the GCI brand. The Charter segment offers subscription-based video services comprising video on demand, high-definition television, and digital video recorder service; local and long-distance calling, voicemail, call waiting, caller ID, call forwarding, and other voice services, as well as international calling services; and Spectrum TV. It also provides internet services, including an in-home Wi-Fi product that provides customers with high-performance wireless routers and managed Wi-Fi services; advanced community Wi-Fi; mobile internet; and a security suite that offers protection against computer viruses and spyware. In addition, this segment offers internet access, data networking, fiber connectivity to cellular towers and office buildings, video entertainment, and business telephone services; advertising services on cable television networks and digital outlets; and operates regional sports and news networks. Liberty Broadband Corporation was incorporated in 2014 and is based in Englewood, Colorado.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
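
To turn one row of the similarity matrix into a ranking, for example to order the company descriptions by their similarity to the fund description in the first row, something like the following could be used. It is a minimal sketch: `rank_candidates` is a hypothetical helper, and the scores and labels below are placeholder values chosen for illustration, not outputs of this model.

```python
def rank_candidates(scores, labels, top_k=None):
    """Return (label, score) pairs sorted from most to least similar."""
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Placeholder scores standing in for one row of a similarity matrix
# (illustrative numbers only, not produced by the model).
example_scores = [0.12, 0.31]
example_labels = ["candidate A", "candidate B"]
ranking = rank_candidates(example_scores, example_labels)
```

With the inference snippet above, the scores would instead come from `similarities[0][1:].tolist()`, paired with short labels for `sentences[1:]`.
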
## Training Details

### Training Dataset

#### stage2-dataset

* Dataset: [stage2-dataset](https://huggingface.co/datasets/hobbang/stage2-dataset) at [cd393c2](https://huggingface.co/datasets/hobbang/stage2-dataset/tree/cd393c24f4017971e95aa6f73736f2fcb45e30a0)
* Size: 128,997 training samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive |
  |:--------|:-------|:---------|
  | type    | string | string |
  | details | <ul><li>min: 101 tokens</li><li>mean: 143.15 tokens</li><li>max: 186 tokens</li></ul> | <ul><li>min: 35 tokens</li><li>mean: 238.69 tokens</li><li>max: 384 tokens</li></ul> |
* Samples:
  | anchor | positive |
  |:-------|:---------|
  | <code>The Invesco Financial Preferred ETF (PGF) seeks to track the ICE Exchange-Listed Fixed Rate Financial Preferred Securities Index, primarily by investing at least 90% of its total assets in the securities comprising the index. The underlying index is market capitalization weighted and designed to track the performance of exchange-listed, fixed rate, U.S. dollar denominated preferred securities, including functionally equivalent instruments, issued by U.S. financial companies. PGF provides a concentrated portfolio exclusively focused on financial-sector preferred securities and is considered non-diversified, holding both investment- and non-investment-grade securities within this focus.</code> | <code>JPMorgan Chase & Co. operates as a financial services company worldwide. It operates through four segments: Consumer & Community Banking (CCB), Corporate & Investment Bank (CIB), Commercial Banking (CB), and Asset & Wealth Management (AWM). The CCB segment offers s deposit, investment and lending products, payments, and services to consumers; lending, deposit, and cash management and payment solutions to small businesses; mortgage origination and servicing activities; residential mortgages and home equity loans; and credit card, auto loan, and leasing services. The CIB segment provides investment banking products and services, including corporate strategy and structure advisory, and equity and debt markets capital-raising services, as well as loan origination and syndication; payments and cross-border financing; and cash and derivative instruments, risk management solutions, prime brokerage, and research. This segment also offers securities services, including custody, fund accounting ...</code> |
  | <code>The Invesco Financial Preferred ETF (PGF) seeks to track the ICE Exchange-Listed Fixed Rate Financial Preferred Securities Index, primarily by investing at least 90% of its total assets in the securities comprising the index. The underlying index is market capitalization weighted and designed to track the performance of exchange-listed, fixed rate, U.S. dollar denominated preferred securities, including functionally equivalent instruments, issued by U.S. financial companies. PGF provides a concentrated portfolio exclusively focused on financial-sector preferred securities and is considered non-diversified, holding both investment- and non-investment-grade securities within this focus.</code> | <code>JPMorgan Chase & Co. operates as a financial services company worldwide. It operates through four segments: Consumer & Community Banking (CCB), Corporate & Investment Bank (CIB), Commercial Banking (CB), and Asset & Wealth Management (AWM). The CCB segment offers s deposit, investment and lending products, payments, and services to consumers; lending, deposit, and cash management and payment solutions to small businesses; mortgage origination and servicing activities; residential mortgages and home equity loans; and credit card, auto loan, and leasing services. The CIB segment provides investment banking products and services, including corporate strategy and structure advisory, and equity and debt markets capital-raising services, as well as loan origination and syndication; payments and cross-border financing; and cash and derivative instruments, risk management solutions, prime brokerage, and research. This segment also offers securities services, including custody, fund accounting ...</code> |
  | <code>The Invesco Financial Preferred ETF (PGF) seeks to track the ICE Exchange-Listed Fixed Rate Financial Preferred Securities Index, primarily by investing at least 90% of its total assets in the securities comprising the index. The underlying index is market capitalization weighted and designed to track the performance of exchange-listed, fixed rate, U.S. dollar denominated preferred securities, including functionally equivalent instruments, issued by U.S. financial companies. PGF provides a concentrated portfolio exclusively focused on financial-sector preferred securities and is considered non-diversified, holding both investment- and non-investment-grade securities within this focus.</code> | <code>The Allstate Corporation, together with its subsidiaries, provides property and casualty, and other insurance products in the United States and Canada. The company operates through Allstate Protection; Protection Services; Allstate Health and Benefits; and Run-off Property-Liability segments. The Allstate Protection segment offers private passenger auto and homeowners insurance; other personal lines products; and commercial lines products under the Allstate and Encompass brand names. The Protection Services segment provides consumer product protection plans and related technical support for mobile phones, consumer electronics, furniture, and appliances; finance and insurance products, including vehicle service contracts, guaranteed asset protection waivers, road hazard tire and wheel, and paint and fabric protection; towing, jump-start, lockout, fuel delivery, and tire change services; device and mobile data collection services; data and analytic solutions using automotive telematics i...</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

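As a rough illustration of what these two parameters control, the sketch below computes an in-batch-negatives ranking loss in plain Python: the cosine similarity between each anchor and every positive in the batch is multiplied by `scale`, and cross-entropy is taken with the matching positive as the target. The helper names (`cos_sim`, `mnr_loss`) and the toy 2-D vectors are illustrative assumptions, not the library's implementation.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over the batch; positives[i] is the target for anchors[i].

    Every other positive in the batch acts as a negative for anchors[i].
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the row of scaled similarities.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical values, chosen only to show the behaviour).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive is close to its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each positive is close to the wrong anchor

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(f"aligned: {loss_aligned:.4f}, swapped: {loss_swapped:.4f}")
```

A larger `scale` sharpens the softmax over in-batch candidates, so the loss penalizes a misranked positive more heavily.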
### Evaluation Dataset

#### stage2-dataset

* Dataset: [stage2-dataset](https://huggingface.co/datasets/hobbang/stage2-dataset) at [cd393c2](https://huggingface.co/datasets/hobbang/stage2-dataset/tree/cd393c24f4017971e95aa6f73736f2fcb45e30a0)
* Size: 16,944 evaluation samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive |
  |:--------|:-------|:---------|
  | type    | string | string |
  | details | <ul><li>min: 135 tokens</li><li>mean: 149.21 tokens</li><li>max: 214 tokens</li></ul> | <ul><li>min: 42 tokens</li><li>mean: 252.75 tokens</li><li>max: 384 tokens</li></ul> |
* Samples:
  | anchor | positive |
  |:-------|:---------|
  | <code>The Global X S&P 500 Risk Managed Income ETF seeks to track the Cboe S&P 500 Risk Managed Income Index by investing at least 80% of its assets in index securities. The index's strategy involves holding the underlying stocks of the S&P 500 Index while applying an options collar, specifically selling at-the-money covered call options and buying monthly 5% out-of-the-money put options corresponding to the portfolio's value. This approach aims to generate income, ideally resulting in a net credit from the options premiums, and provide risk management, though selling at-the-money calls inherently caps the fund's potential for upside participation.</code> | <code>Apple Inc. designs, manufactures, and markets smartphones, personal computers, tablets, wearables, and accessories worldwide. The company offers iPhone, a line of smartphones; Mac, a line of personal computers; iPad, a line of multi-purpose tablets; and wearables, home, and accessories comprising AirPods, Apple TV, Apple Watch, Beats products, and HomePod. It also provides AppleCare support and cloud services; and operates various platforms, including the App Store that allow customers to discover and download applications and digital content, such as books, music, video, games, and podcasts, as well as advertising services include third-party licensing arrangements and its own advertising platforms. In addition, the company offers various subscription-based services, such as Apple Arcade, a game subscription service; Apple Fitness+, a personalized fitness service; Apple Music, which offers users a curated listening experience with on-demand radio stations; Apple News+, a subscription ...</code> |
  | <code>The Global X S&P 500 Risk Managed Income ETF seeks to track the Cboe S&P 500 Risk Managed Income Index by investing at least 80% of its assets in index securities. The index's strategy involves holding the underlying stocks of the S&P 500 Index while applying an options collar, specifically selling at-the-money covered call options and buying monthly 5% out-of-the-money put options corresponding to the portfolio's value. This approach aims to generate income, ideally resulting in a net credit from the options premiums, and provide risk management, though selling at-the-money calls inherently caps the fund's potential for upside participation.</code> | <code>Microsoft Corporation develops, licenses, and supports software, services, devices, and solutions worldwide. The company operates in three segments: Productivity and Business Processes, Intelligent Cloud, and More Personal Computing. The Productivity and Business Processes segment offers Office, Exchange, SharePoint, Microsoft Teams, Office 365 Security and Compliance, Microsoft Viva, and Skype for Business; Skype, Outlook.com, OneDrive, and LinkedIn; and Dynamics 365, a set of cloud-based and on-premises business solutions for organizations and enterprise divisions. The Intelligent Cloud segment licenses SQL, Windows Servers, Visual Studio, System Center, and related Client Access Licenses; GitHub that provides a collaboration platform and code hosting service for developers; Nuance provides healthcare and enterprise AI solutions; and Azure, a cloud platform. It also offers enterprise support, Microsoft consulting, and nuance professional services to assist customers in developing, de...</code> |
  | <code>The Global X S&P 500 Risk Managed Income ETF seeks to track the Cboe S&P 500 Risk Managed Income Index by investing at least 80% of its assets in index securities. The index's strategy involves holding the underlying stocks of the S&P 500 Index while applying an options collar, specifically selling at-the-money covered call options and buying monthly 5% out-of-the-money put options corresponding to the portfolio's value. This approach aims to generate income, ideally resulting in a net credit from the options premiums, and provide risk management, though selling at-the-money calls inherently caps the fund's potential for upside participation.</code> | <code>NVIDIA Corporation provides graphics, and compute and networking solutions in the United States, Taiwan, China, and internationally. The company's Graphics segment offers GeForce GPUs for gaming and PCs, the GeForce NOW game streaming service and related infrastructure, and solutions for gaming platforms; Quadro/NVIDIA RTX GPUs for enterprise workstation graphics; vGPU software for cloud-based visual and virtual computing; automotive platforms for infotainment systems; and Omniverse software for building 3D designs and virtual worlds. Its Compute & Networking segment provides Data Center platforms and systems for AI, HPC, and accelerated computing; Mellanox networking and interconnect solutions; automotive AI Cockpit, autonomous driving development agreements, and autonomous vehicle solutions; cryptocurrency mining processors; Jetson for robotics and other embedded platforms; and NVIDIA AI Enterprise and other software. The company's products are used in gaming, professional visualizat...</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 32
- `learning_rate`: 3e-05
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `bf16`: True
- `dataloader_drop_last`: True
- `load_best_model_at_end`: True
- `batch_sampler`: no_duplicates

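A quick sanity check on how these settings interact, assuming a single device and no gradient accumulation (the full list reports `gradient_accumulation_steps`: 1): with 128,997 training samples, a batch size of 64, and `dataloader_drop_last` enabled, one epoch comes to 2,015 optimizer steps, and a `warmup_ratio` of 0.1 covers roughly the first 202 of them. The snippet is plain arithmetic rather than anything read back from the trainer, and the `no_duplicates` batch sampler may change the exact batch count slightly.

```python
import math

train_samples = 128_997   # training set size reported in this card
batch_size = 64           # per_device_train_batch_size
warmup_ratio = 0.1

# dataloader_drop_last=True discards the final incomplete batch,
# so integer division gives the number of steps in one epoch.
steps_per_epoch = train_samples // batch_size

# Assumption: warmup length is the step count times the ratio, rounded up.
warmup_steps = math.ceil(steps_per_epoch * warmup_ratio)


def epoch_fraction(step):
    """Fraction of the epoch completed after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch)
print(warmup_steps)
print(epoch_fraction(300), epoch_fraction(1800))
```

The fractions agree with the Epoch column of the training logs (step 300 at 0.1489, step 1800 at 0.8933), which is consistent with the single-device assumption.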
#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 32
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 3e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: True
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `tp_size`: 0
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
<details><summary>Click to expand</summary>

| Epoch | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0050 | 10 | 4.6656 | - |
| 0.0099 | 20 | 4.4733 | - |
| 0.0149 | 30 | 4.0093 | - |
| 0.0199 | 40 | 3.9259 | - |
| 0.0248 | 50 | 3.8315 | - |
| 0.0298 | 60 | 3.673 | - |
| 0.0347 | 70 | 3.5076 | - |
| 0.0397 | 80 | 3.4416 | - |
| 0.0447 | 90 | 3.4362 | - |
| 0.0496 | 100 | 3.3934 | - |
| 0.0546 | 110 | 3.3343 | - |
| 0.0596 | 120 | 3.3018 | - |
| 0.0645 | 130 | 3.2882 | - |
| 0.0695 | 140 | 3.3027 | - |
| 0.0744 | 150 | 3.2177 | - |
| 0.0794 | 160 | 3.2708 | - |
| 0.0844 | 170 | 3.2645 | - |
| 0.0893 | 180 | 3.1939 | - |
| 0.0943 | 190 | 3.0575 | - |
| 0.0993 | 200 | 3.0799 | - |
| 0.1042 | 210 | 3.0824 | - |
| 0.1092 | 220 | 3.0693 | - |
| 0.1141 | 230 | 3.1014 | - |
| 0.1191 | 240 | 3.0458 | - |
| 0.1241 | 250 | 3.04 | - |
| 0.1290 | 260 | 3.0311 | - |
| 0.1340 | 270 | 2.9778 | - |
| 0.1390 | 280 | 3.0701 | - |
| 0.1439 | 290 | 2.9039 | - |
| 0.1489 | 300 | 3.0449 | 2.5685 |
| 0.1538 | 310 | 2.8896 | - |
| 0.1588 | 320 | 3.0527 | - |
| 0.1638 | 330 | 3.0153 | - |
| 0.1687 | 340 | 2.869 | - |
| 0.1737 | 350 | 2.9678 | - |
| 0.1787 | 360 | 2.9756 | - |
| 0.1836 | 370 | 2.9348 | - |
| 0.1886 | 380 | 2.9967 | - |
| 0.1935 | 390 | 2.8953 | - |
| 0.1985 | 400 | 2.9546 | - |
| 0.2035 | 410 | 2.9919 | - |
| 0.2084 | 420 | 2.8487 | - |
| 0.2134 | 430 | 2.7609 | - |
| 0.2184 | 440 | 2.9126 | - |
| 0.2233 | 450 | 2.8991 | - |
| 0.2283 | 460 | 2.9272 | - |
| 0.2333 | 470 | 2.9084 | - |
| 0.2382 | 480 | 2.7963 | - |
| 0.2432 | 490 | 2.822 | - |
| 0.2481 | 500 | 2.9376 | - |
| 0.2531 | 510 | 2.8969 | - |
| 0.2581 | 520 | 2.7745 | - |
| 0.2630 | 530 | 2.8103 | - |
| 0.2680 | 540 | 2.8189 | - |
| 0.2730 | 550 | 2.8322 | - |
| 0.2779 | 560 | 2.7627 | - |
| 0.2829 | 570 | 2.7796 | - |
| 0.2878 | 580 | 2.8515 | - |
| 0.2928 | 590 | 2.8758 | - |
| 0.2978 | 600 | 2.7963 | 2.4142 |
| 0.3027 | 610 | 2.8259 | - |
| 0.3077 | 620 | 2.829 | - |
| 0.3127 | 630 | 2.7699 | - |
| 0.3176 | 640 | 2.7311 | - |
| 0.3226 | 650 | 2.735 | - |
| 0.3275 | 660 | 2.7306 | - |
| 0.3325 | 670 | 2.7467 | - |
| 0.3375 | 680 | 2.7494 | - |
| 0.3424 | 690 | 2.7386 | - |
| 0.3474 | 700 | 2.8513 | - |
| 0.3524 | 710 | 2.673 | - |
| 0.3573 | 720 | 2.8101 | - |
| 0.3623 | 730 | 2.7527 | - |
| 0.3672 | 740 | 2.7213 | - |
| 0.3722 | 750 | 2.753 | - |
| 0.3772 | 760 | 2.8034 | - |
| 0.3821 | 770 | 2.8288 | - |
| 0.3871 | 780 | 2.613 | - |
| 0.3921 | 790 | 2.7315 | - |
| 0.3970 | 800 | 2.8077 | - |
| 0.4020 | 810 | 2.7442 | - |
| 0.4069 | 820 | 2.7351 | - |
| 0.4119 | 830 | 2.7643 | - |
| 0.4169 | 840 | 2.8984 | - |
| 0.4218 | 850 | 2.7377 | - |
| 0.4268 | 860 | 2.7021 | - |
| 0.4318 | 870 | 2.6756 | - |
| 0.4367 | 880 | 2.7852 | - |
| 0.4417 | 890 | 2.7531 | - |
| 0.4467 | 900 | 2.6636 | 2.3456 |
| 0.4516 | 910 | 2.7089 | - |
| 0.4566 | 920 | 2.8029 | - |
| 0.4615 | 930 | 2.721 | - |
| 0.4665 | 940 | 2.5606 | - |
| 0.4715 | 950 | 2.6397 | - |
| 0.4764 | 960 | 2.6563 | - |
| 0.4814 | 970 | 2.7163 | - |
| 0.4864 | 980 | 2.6225 | - |
| 0.4913 | 990 | 2.645 | - |
| 0.4963 | 1000 | 2.6576 | - |
| 0.5012 | 1010 | 2.7019 | - |
| 0.5062 | 1020 | 2.7195 | - |
| 0.5112 | 1030 | 2.7242 | - |
| 0.5161 | 1040 | 2.6729 | - |
| 0.5211 | 1050 | 2.7637 | - |
| 0.5261 | 1060 | 2.677 | - |
| 0.5310 | 1070 | 2.7018 | - |
| 0.5360 | 1080 | 2.6469 | - |
| 0.5409 | 1090 | 2.7186 | - |
| 0.5459 | 1100 | 2.6728 | - |
| 0.5509 | 1110 | 2.6694 | - |
| 0.5558 | 1120 | 2.7839 | - |
| 0.5608 | 1130 | 2.5834 | - |
| 0.5658 | 1140 | 2.6905 | - |
| 0.5707 | 1150 | 2.7223 | - |
| 0.5757 | 1160 | 2.7235 | - |
| 0.5806 | 1170 | 2.636 | - |
| 0.5856 | 1180 | 2.6314 | - |
| 0.5906 | 1190 | 2.5941 | - |
| 0.5955 | 1200 | 2.7827 | 2.2911 |
| 0.6005 | 1210 | 2.6104 | - |
| 0.6055 | 1220 | 2.6148 | - |
| 0.6104 | 1230 | 2.6355 | - |
| 0.6154 | 1240 | 2.6269 | - |
| 0.6203 | 1250 | 2.6003 | - |
| 0.6253 | 1260 | 2.6256 | - |
| 0.6303 | 1270 | 2.6326 | - |
| 0.6352 | 1280 | 2.681 | - |
| 0.6402 | 1290 | 2.5776 | - |
| 0.6452 | 1300 | 2.7528 | - |
| 0.6501 | 1310 | 2.6076 | - |
| 0.6551 | 1320 | 2.5784 | - |
| 0.6600 | 1330 | 2.6064 | - |
| 0.6650 | 1340 | 2.5757 | - |
| 0.6700 | 1350 | 2.5851 | - |
| 0.6749 | 1360 | 2.6007 | - |
| 0.6799 | 1370 | 2.5674 | - |
| 0.6849 | 1380 | 2.6984 | - |
| 0.6898 | 1390 | 2.6202 | - |
| 0.6948 | 1400 | 2.6729 | - |
| 0.6998 | 1410 | 2.6683 | - |
| 0.7047 | 1420 | 2.6355 | - |
| 0.7097 | 1430 | 2.6033 | - |
| 0.7146 | 1440 | 2.6834 | - |
| 0.7196 | 1450 | 2.6597 | - |
| 0.7246 | 1460 | 2.6298 | - |
| 0.7295 | 1470 | 2.6232 | - |
| 0.7345 | 1480 | 2.5672 | - |
| 0.7395 | 1490 | 2.5139 | - |
| 0.7444 | 1500 | 2.6248 | 2.3090 |
| 0.7494 | 1510 | 2.6417 | - |
| 0.7543 | 1520 | 2.6197 | - |
| 0.7593 | 1530 | 2.6911 | - |
| 0.7643 | 1540 | 2.5542 | - |
| 0.7692 | 1550 | 2.6584 | - |
| 0.7742 | 1560 | 2.6182 | - |
| 0.7792 | 1570 | 2.6301 | - |
| 0.7841 | 1580 | 2.5629 | - |
| 0.7891 | 1590 | 2.5965 | - |
| 0.7940 | 1600 | 2.5722 | - |
| 0.7990 | 1610 | 2.5835 | - |
| 0.8040 | 1620 | 2.5901 | - |
| 0.8089 | 1630 | 2.6055 | - |
| 0.8139 | 1640 | 2.6019 | - |
| 0.8189 | 1650 | 2.6421 | - |
| 0.8238 | 1660 | 2.6049 | - |
| 0.8288 | 1670 | 2.5351 | - |
| 0.8337 | 1680 | 2.6158 | - |
| 0.8387 | 1690 | 2.5994 | - |
| 0.8437 | 1700 | 2.5816 | - |
| 0.8486 | 1710 | 2.5848 | - |
| 0.8536 | 1720 | 2.6138 | - |
| 0.8586 | 1730 | 2.5811 | - |
| 0.8635 | 1740 | 2.5933 | - |
| 0.8685 | 1750 | 2.5869 | - |
| 0.8734 | 1760 | 2.5464 | - |
| 0.8784 | 1770 | 2.6842 | - |
| 0.8834 | 1780 | 2.6312 | - |
| 0.8883 | 1790 | 2.5621 | - |
| 0.8933 | 1800 | 2.6103 | 2.2858 |

</details>

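Because `load_best_model_at_end` is enabled and a validation loss is logged every 300 steps, the checkpoint that is kept is presumably the one with the lowest validation loss. The sketch below picks that step from the values in the log table above; it assumes the selection metric was the evaluation loss (lower is better), which this card does not state explicitly.

```python
# Validation losses copied from the training log table (step -> loss).
validation_loss = {
    300: 2.5685,
    600: 2.4142,
    900: 2.3456,
    1200: 2.2911,
    1500: 2.3090,
    1800: 2.2858,
}

# Assumption: "best" means the lowest validation loss among evaluated steps.
best_step = min(validation_loss, key=validation_loss.get)
print(best_step, validation_loss[best_step])
```

The validation loss is not monotonic (it rises at step 1500 before improving again at step 1800), which is why the best evaluated checkpoint need not be the most recent one in general.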
### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 4.1.0
- Transformers: 4.51.3
- PyTorch: 2.1.0+cu118
- Accelerate: 1.6.0
- Datasets: 3.5.0
- Tokenizers: 0.21.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
