SentenceTransformer based on tyuan73/ModernBERT-large-with-new-tokenizer
This is a sentence-transformers model finetuned from tyuan73/ModernBERT-large-with-new-tokenizer on the processed_yahoo_finance_stockmarket_news dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: tyuan73/ModernBERT-large-with-new-tokenizer
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: processed_yahoo_finance_stockmarket_news
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
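The Pooling module above uses mean pooling (`pooling_mode_mean_tokens: True`): the sentence embedding is the average of the token embeddings over non-padded positions. A minimal NumPy sketch of that operation, with toy dimensions standing in for the model's real sequence length and 1024-dim embeddings:

```python
import numpy as np

# Toy token embeddings: batch of 1 sequence, 4 tokens, 3 dims
token_embeddings = np.array([[[1.0, 2.0, 3.0],
                              [4.0, 5.0, 6.0],
                              [7.0, 8.0, 9.0],
                              [0.0, 0.0, 0.0]]])  # last token is padding
attention_mask = np.array([[1, 1, 1, 0]])  # padding excluded from the mean

mask = attention_mask[..., None].astype(float)  # (batch, seq, 1)
summed = (token_embeddings * mask).sum(axis=1)  # sum of real tokens
counts = mask.sum(axis=1)                       # number of real tokens
sentence_embedding = summed / counts            # mean over non-padded tokens

print(sentence_embedding)  # [[4. 5. 6.]]
```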
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("tyuan73/finetuned-modernbert-finance-large")
sentences = [
"Stocks will rally 5% in just 5 days after the Fed's rate decision this week, Fundstrat says",
'Photo by Cindy Ord/Getty Images for Yahoo The stock market could see a 5% gain over the next week, according to Fundstrat\'s Tom Lee. The rally would be sparked by a dovish Fed FOMC meeting on Wednesday that all but confirms imminent interest rate cuts. "These are significant gains, implying the S&P 500 could gain 200-300 points in the next week," Lee said. The stock market is poised to surge as much as 5% in the next week, according to a Wednesday note from Fundstrat. The research firm said it expects an explosive rally in the S&P 500 to materialize in the five days following the Federal Reserve\'s FOMC meeting on Wednesday. While the Fed is not expected to cut interest rates at its July FOMC meeting, it is expected to signal that a rate cut is all but certain when it meets again in September. "The key premise is the Fed is likely to commit to a September rate cut of at least 25bp. A possibility of more than that is not necessary. And while bond markets have priced in 100% probability of this, equity investors likely will not be convinced until the Fed affirms this as such," Fundstrat\'s Tom Lee said. The near certainty of a rate cut from the Fed in September should spark a risk-on rally for stocks, especially given that the Nasdaq 100 has already experienced a near-10% correction in recent weeks, according to the note. "Overall, we believe risk-on moment is coming," Lee said. Lee\'s confidence in a strong rally post-Fed meeting is based on the fact that recent Fed meetings have sparked a big rally in stocks. In the past two years, when stocks were down heading into a Fed FOMC meeting, stocks saw a five-day gain of as much as 5.5% and a median gain of 3.4%. "These are significant gains, implying the S&P 500 could gain 200-300 points in the next week. This is very compelling in our view," Lee said. And while a 25 basis point interest rate cut may not seem like much, it has real-world economic impacts that could ultimately influence the US housing market in a big way. "Here are some tangible reasons a Fed cut makes sense: 30-year mortgage has excess spread to 10-year due to uncertainty. The spread could shrink from 270 basis points to 170 basis points (50-year average)," Lee explained. Interest rate cuts from the Fed, however small, would also help alleviate an ongoing slowdown in the housing, durables, and auto markets, Lee said. A 5% rally in the S&P 500 would catapult the index to fresh record highs, completely erasing its 5% decline over the past few weeks. Read the original article on Business Insider',
'AMBALA, India (Reuters) - Indian farmers demanding higher prices for their crops will postpone a planned protest march to New Delhi until unions hold another round of talks with government ministers on Sunday.\n\nAgriculture Minister Arjun Munda, who met farmers\' representatives on Thursday along with Commerce Minister Piyush Goyal and Minister of State for Home Affairs Nityanand Rai, said the talks were "positive".\n\n"We have decided that the next meeting to take the discussion forward will take place on Sunday at 6 pm...We believe we will all find a solution together peacefully," he told reporters following Thursday\'s meeting.\n\nProtest leader Jagjit Singh Dallewal also told reporters the farmers would hold off their march for now.\n\n"When the meetings have started, if we move forward (towards Delhi) then how will meetings happen?" Dallewal told reporters, adding that the protest "will continue peacefully".\n\nThousands of farmers had embarked on the "Delhi Chalo", or "Let\'s go to Delhi" march earlier this week to press the government to set a minimum price for their produce, but they were stopped by security forces about 200 kms (125 miles) away from the capital, triggering clashes.\n\nThe protests erupted a few months before India is due to hold national elections in which Prime Minister Narendra Modi is seeking a third term. Farmers form an influential voting bloc.\n\nThe farmers remained camped on the border between Punjab and Haryana states on Friday. Security forces have used concrete and metal barricades, as well as drones carrying tear gas canisters, to stop them from advancing.\n\nThe protest comes two years after Modi\'s government, following a similar protest movement, repealed some farm laws and promised to find ways to ensure support prices for all produce.\n\n(Writing by Sakshi Dayal; editing by Miral Fahmy)',
]
# Encode the sentences into dense vectors
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 1024)

# Compute pairwise similarity scores (cosine similarity by default)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # [3, 3]
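The similarity function configured for this model is cosine similarity, so `model.similarity` is equivalent to L2-normalizing each embedding row and taking dot products. A NumPy sketch of that computation (toy 4-dim vectors standing in for the 1024-dim embeddings):

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity: normalize rows, then dot products."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    normalized = embeddings / norms
    return normalized @ normalized.T

# Toy (3, 4) embeddings in place of the model's (3, 1024) output
emb = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [1.0, 1.0, 0.0, 0.0]])
sims = cosine_similarity_matrix(emb)
print(sims.shape)  # (3, 3); diagonal is 1.0 (each vector vs. itself)
```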
Evaluation
Metrics
Triplet
| Metric          | Value |
|:----------------|:------|
| cosine_accuracy | 1.0   |
Training Details
Training Dataset
processed_yahoo_finance_stockmarket_news
Evaluation Dataset
processed_yahoo_finance_stockmarket_news
Training Hyperparameters
Non-Default Hyperparameters
eval_strategy: steps
per_device_train_batch_size: 2
per_device_eval_batch_size: 2
num_train_epochs: 1
warmup_ratio: 0.1
fp16: True
gradient_checkpointing: True
batch_sampler: no_duplicates
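The `no_duplicates` batch sampler matters because the model was trained with MultipleNegativesRankingLoss (cited at the bottom of this card), which treats every other positive in the batch as a negative for a given anchor; a duplicate example in the batch would become a false negative. A plain-Python sketch of that in-batch-negatives objective, using toy 2-dim embeddings (the function name `mnr_loss` is ours; `scale=20.0` mirrors the library's default temperature):

```python
import math

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch negatives: anchor i is scored against every positive j,
    and cross-entropy pushes the score of j == i above all others."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    def cos(u, v):
        return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

    losses = []
    for i, a in enumerate(anchors):
        scores = [scale * cos(a, p) for p in positives]
        # negative log-softmax of the matching positive's score
        log_softmax = scores[i] - math.log(sum(math.exp(s) for s in scores))
        losses.append(-log_softmax)
    return sum(losses) / len(losses)

# Toy batch of 2 (anchor, positive) pairs; each pair is well aligned,
# so the loss should be small but positive.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.6, 0.4], [0.4, 0.6]]
print(mnr_loss(anchors, positives))
```

With only 2 examples per device (`per_device_train_batch_size: 2`), each anchor sees just one in-batch negative, which is one reason duplicate-free batches are worth enforcing.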
All Hyperparameters
Click to expand
overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 2
per_device_eval_batch_size: 2
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: True
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
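With `lr_scheduler_type: linear` and `warmup_ratio: 0.1`, the learning rate ramps from 0 up to the peak `learning_rate: 5e-05` over the first 10% of training steps, then decays linearly back to 0. A sketch of that schedule under those assumptions (not the library's own code; the 960-step total is illustrative):

```python
def linear_warmup_lr(step, total_steps, peak_lr=5e-05, warmup_ratio=0.1):
    """Linear warmup for the first warmup_ratio of steps, then linear decay."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

total = 960  # illustrative number of optimizer steps for 1 epoch
print(linear_warmup_lr(96, total))   # 5e-05: peak reached at end of warmup
print(linear_warmup_lr(960, total))  # 0.0: fully decayed at the final step
```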
Training Logs
| Epoch  | Step | Training Loss | Validation Loss | test_cosine_accuracy |
|:-------|:-----|:--------------|:----------------|:---------------------|
| 0.1042 | 100  | 1.8001        | 2.5929          | -                    |
| 0.2083 | 200  | 1.3643        | 0.9732          | -                    |
| 0.3125 | 300  | 0.8984        | 1.1093          | -                    |
| 0.4167 | 400  | 1.0902        | 0.9527          | -                    |
| 0.5208 | 500  | 0.913         | 0.5731          | -                    |
| 0.625  | 600  | 0.8656        | 1.4353          | -                    |
| 0.7292 | 700  | 0.807         | 0.7052          | -                    |
| 0.8333 | 800  | 0.7846        | 0.7704          | -                    |
| 0.9375 | 900  | 0.5951        | 0.9136          | -                    |
| -1     | -1   | -             | -               | 1.0                  |
Framework Versions
- Python: 3.11.11
- Sentence Transformers: 3.4.1
- Transformers: 4.49.0
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.3.2
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}