SentenceTransformer based on Qwen/Qwen3-Embedding-0.6B
This is a sentence-transformers model fine-tuned from Qwen/Qwen3-Embedding-0.6B on the finanical-rag-embedding-dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Qwen/Qwen3-Embedding-0.6B
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: finanical-rag-embedding-dataset
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'PeftModelForFeatureExtraction'})
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
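The Pooling and Normalize stages above are simple to reproduce in isolation. A minimal NumPy sketch (toy token embeddings, not real model outputs, and ignoring prompt handling) of mean pooling over non-padding tokens followed by L2 normalization:

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Mean-pool token embeddings over non-padding positions, then L2-normalize.

    token_embeddings: (seq_len, dim); attention_mask: (seq_len,) of 0/1.
    """
    mask = attention_mask[:, None].astype(token_embeddings.dtype)  # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)                 # sum over real tokens only
    pooled = summed / mask.sum()                                   # mean over real tokens
    return pooled / np.linalg.norm(pooled)                         # unit length, as Normalize() does

# Toy example: 3 tokens (last one is padding), dimension 4
tokens = np.array([[1.0, 0.0, 2.0, 0.0],
                   [3.0, 0.0, 0.0, 0.0],
                   [9.0, 9.0, 9.0, 9.0]])  # padding row, masked out below
mask = np.array([1, 1, 0])
emb = mean_pool_and_normalize(tokens, mask)
print(np.linalg.norm(emb))  # ≈ 1.0: embeddings leave the pipeline unit-normalized
```

Because of the final Normalize() layer, downstream cosine similarity reduces to a plain dot product.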
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("elnurgar/qwen3-embedding-finetuned")
# Run inference
sentences = [
"How much did GM Financial's primary source of cash from finance charge income increase in 2023 compared to the previous year?",
'In the year ended December 31, 2023, Net cash provided by operating activities increased primarily due to an increase in finance charge income of $1.7 billion.',
"A corporate entity referred to as a management services organization (MSO) provides various management services and keeps the physician entity 'friendly' through a stock transfer restriction agreement and/or other relationships. The fees under the management services arrangement must comply with state fee splitting laws, which in some states may prohibit percentage-based fees.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, 0.5147, -0.0829],
# [ 0.5147, 1.0000, -0.0916],
# [-0.0829, -0.0916, 1.0000]])
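Since the pipeline ends in a Normalize() layer, the embeddings are unit vectors, so the cosine similarity that `model.similarity` computes reduces to a matrix product. A minimal NumPy equivalent, using toy vectors rather than real embeddings:

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity; for unit-norm rows this is just E @ E.T."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    return unit @ unit.T

# Toy 3 x 4 "embeddings"
E = np.array([[1.0, 0.0, 0.0, 0.0],
              [1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
S = cosine_similarity_matrix(E)
print(np.round(S, 4))
# The diagonal is 1.0 (each vector against itself); the off-diagonal entries
# are the query-document scores used for ranking in semantic search.
```

Ranking a corpus against a query is then a matter of sorting one row of this matrix in descending order.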
Evaluation
Metrics
Information Retrieval
- Dataset: ir-eval
- Evaluated with InformationRetrievalEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.6971 |
| cosine_accuracy@3 | 0.8471 |
| cosine_accuracy@5 | 0.8843 |
| cosine_accuracy@10 | 0.9143 |
| cosine_precision@1 | 0.6971 |
| cosine_precision@3 | 0.2824 |
| cosine_precision@5 | 0.1769 |
| cosine_precision@10 | 0.0914 |
| cosine_recall@1 | 0.6971 |
| cosine_recall@3 | 0.8471 |
| cosine_recall@5 | 0.8843 |
| cosine_recall@10 | 0.9143 |
| cosine_ndcg@10 | 0.8111 |
| cosine_mrr@10 | 0.7774 |
| cosine_map@100 | 0.7808 |
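The @k metrics above come from ranking the corpus for each query and checking where the relevant document lands. A hedged sketch of how accuracy@k and MRR@k are computed from ranked results (toy rankings, not the evaluator's actual internals):

```python
def accuracy_at_k(rankings, k):
    """Fraction of queries whose relevant document appears in the top k.

    rankings: list of 0-based positions of the relevant document per query.
    """
    return sum(1 for r in rankings if r < k) / len(rankings)

def mrr_at_k(rankings, k):
    """Mean reciprocal rank, counting only hits within the top k."""
    return sum(1.0 / (r + 1) for r in rankings if r < k) / len(rankings)

# Toy: 4 queries; relevant document found at ranks 1, 2, 3, and 15 (0-based)
ranks = [0, 1, 2, 14]
print(accuracy_at_k(ranks, 1))   # 0.25 — only the first query hits at rank 1
print(accuracy_at_k(ranks, 10))  # 0.75 — three of four within the top 10
print(mrr_at_k(ranks, 10))       # (1 + 1/2 + 1/3) / 4 ≈ 0.4583
```

With exactly one relevant document per query, recall@k equals accuracy@k and precision@k equals accuracy@k / k, which is why the table's recall rows mirror the accuracy rows (e.g. precision@10 of 0.0914 is 0.9143 / 10).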
Training Details
Training Dataset
finanical-rag-embedding-dataset
- Dataset: finanical-rag-embedding-dataset at e0b1781
- Size: 6,300 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:

  | | anchor | positive |
  |---|---|---|
  | type | string | string |
  | details | min: 7 tokens, mean: 20.9 tokens, max: 45 tokens | min: 9 tokens, mean: 47.3 tokens, max: 512 tokens |

- Samples:

  | anchor | positive |
  |---|---|
  | What was the amount of cash generated from operations by the company in fiscal year 2023? | Highlights during fiscal year 2023 include the following: We generated $18,085 million of cash from operations. |
  | How much were unrealized losses on U.S. government and agency securities for those held for 12 months or greater as of June 30, 2023? | U.S. government and agency securities \| $ \| 7,950 \| \| $ \| (336 \| ) \| $ \| 45,273 \| $ \| (3,534 \| ) \| $ \| 53,223 \| $ \| (3,870 \| ) |
  | How is the impairment of assets assessed for projects still under development? | For assets under development, assets are grouped and assessed for impairment by estimating the undiscounted cash flows, which include remaining construction costs, over the asset's remaining useful life. If cash flows do not exceed the carrying amount, impairment based on fair value versus carrying value is considered. |

- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
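MultipleNegativesRankingLoss treats each (anchor, positive) pair in a batch as the correct match and every other in-batch positive as a negative: the anchor-positive cosine-similarity matrix is scaled (scale = 20.0 here) and scored with cross-entropy against the diagonal. A minimal NumPy sketch under those assumptions, not the library's actual implementation:

```python
import numpy as np

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities; the target is the diagonal."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                       # (batch, batch) similarity logits
    logits -= logits.max(axis=1, keepdims=True)      # stabilize the softmax
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # correct match for anchor i is positive i

anchors = np.eye(4, 8)  # four orthogonal toy "embeddings"
aligned = multiple_negatives_ranking_loss(anchors, anchors)                    # near zero
mismatched = multiple_negatives_ranking_loss(anchors, np.roll(anchors, 1, axis=0))
print(aligned < mismatched)  # True: aligned pairs give a much lower loss
```

The `no_duplicates` batch sampler used for training complements this loss: it keeps duplicate texts out of a batch so that an in-batch "negative" is never accidentally a true positive.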
Evaluation Dataset
finanical-rag-embedding-dataset
- Dataset: finanical-rag-embedding-dataset at e0b1781
- Size: 700 evaluation samples
- Columns: anchor and positive
- Approximate statistics based on the first 700 samples:

  | | anchor | positive |
  |---|---|---|
  | type | string | string |
  | details | min: 8 tokens, mean: 21.52 tokens, max: 54 tokens | min: 4 tokens, mean: 50.92 tokens, max: 512 tokens |

- Samples:

  | anchor | positive |
  |---|---|
  | How much were the company's debt obligations as of December 31, 2023? | The company's debt obligations as of December 31, 2023, totaled $2,299,887 thousand. |
  | What are the specific structures and legal considerations for a management services organization (MSO) in relation to its relationship with physician owners? | A corporate entity referred to as a management services organization (MSO) provides various management services and keeps the physician entity 'friendly' through a stock transfer restriction agreement and/or other relationships. The fees under the management services arrangement must comply with state fee splitting laws, which in some states may prohibit percentage-based fees. |
  | Where does Eli Lilly and Company manufacture and distribute its products? | We manufacture and distribute our products through facilities in the United States (U.S.), including Puerto Rico, and in Europe and Asia. |

- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 32
- gradient_accumulation_steps: 8
- learning_rate: 0.0002
- weight_decay: 0.1
- num_train_epochs: 4
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- fp16_full_eval: True
- tf32: True
- load_best_model_at_end: True
- batch_sampler: no_duplicates
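A quick sanity check of how these hyperparameters combine with the 6,300-sample training set (assuming a single device, which the card does not state explicitly):

```python
import math

train_samples = 6_300
per_device_train_batch_size = 16
gradient_accumulation_steps = 8
num_train_epochs = 4
warmup_ratio = 0.1

# Effective batch: per-device batch x accumulation steps (single device assumed)
effective_batch = per_device_train_batch_size * gradient_accumulation_steps  # 128
steps_per_epoch = math.ceil(train_samples / effective_batch)                 # 50 optimizer steps
total_steps = steps_per_epoch * num_train_epochs                             # 200, matching the final training-log row
warmup_steps = int(warmup_ratio * total_steps)                               # 20 warmup steps before cosine decay

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)  # 128 50 200 20
```

The 200 total optimizer steps agree with the last row of the training logs below, which ends at epoch 4.0, step 200.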
All Hyperparameters
Click to expand
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 8
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 0.0002
- weight_decay: 0.1
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: True
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
| Epoch | Step | Training Loss | Validation Loss | ir-eval_cosine_ndcg@10 |
|---|---|---|---|---|
| 0.0406 | 2 | 0.3124 | - | - |
| 0.0812 | 4 | 0.2676 | 0.4594 | 0.7143 |
| 0.1218 | 6 | 0.2724 | - | - |
| 0.1624 | 8 | 0.2373 | 0.2985 | 0.7329 |
| 0.2030 | 10 | 0.1475 | - | - |
| 0.2437 | 12 | 0.0728 | 0.1531 | 0.7410 |
| 0.2843 | 14 | 0.0654 | - | - |
| 0.3249 | 16 | 0.0655 | 0.1202 | 0.7349 |
| 0.3655 | 18 | 0.0604 | - | - |
| 0.4061 | 20 | 0.1237 | 0.1128 | 0.7402 |
| 0.4467 | 22 | 0.0528 | - | - |
| 0.4873 | 24 | 0.062 | 0.1033 | 0.7504 |
| 0.5279 | 26 | 0.0244 | - | - |
| 0.5685 | 28 | 0.0432 | 0.0921 | 0.7601 |
| 0.6091 | 30 | 0.0301 | - | - |
| 0.6497 | 32 | 0.0637 | 0.0839 | 0.7712 |
| 0.6904 | 34 | 0.0542 | - | - |
| 0.7310 | 36 | 0.0256 | 0.0778 | 0.7810 |
| 0.7716 | 38 | 0.0397 | - | - |
| 0.8122 | 40 | 0.0191 | 0.0736 | 0.7879 |
| 0.8528 | 42 | 0.0403 | - | - |
| 0.8934 | 44 | 0.0354 | 0.0702 | 0.7903 |
| 0.9340 | 46 | 0.0291 | - | - |
| 0.9746 | 48 | 0.0534 | 0.0696 | 0.7956 |
| 1.0 | 50 | 0.0229 | - | - |
| 1.0406 | 52 | 0.0364 | 0.0718 | 0.7911 |
| 1.0812 | 54 | 0.0219 | - | - |
| 1.1218 | 56 | 0.0347 | 0.0725 | 0.7903 |
| 1.1624 | 58 | 0.0303 | - | - |
| 1.2030 | 60 | 0.0679 | 0.0725 | 0.7914 |
| 1.2437 | 62 | 0.0249 | - | - |
| 1.2843 | 64 | 0.032 | 0.0733 | 0.7948 |
| 1.3249 | 66 | 0.0285 | - | - |
| 1.3655 | 68 | 0.0226 | 0.0752 | 0.7939 |
| 1.4061 | 70 | 0.0176 | - | - |
| 1.4467 | 72 | 0.0257 | 0.0749 | 0.7991 |
| 1.4873 | 74 | 0.0131 | - | - |
| 1.5279 | 76 | 0.0362 | 0.0733 | 0.8033 |
| 1.5685 | 78 | 0.0199 | - | - |
| 1.6091 | 80 | 0.0073 | 0.0726 | 0.8030 |
| 1.6497 | 82 | 0.0152 | - | - |
| 1.6904 | 84 | 0.0078 | 0.0721 | 0.8042 |
| 1.7310 | 86 | 0.035 | - | - |
| 1.7716 | 88 | 0.0267 | 0.0700 | 0.8029 |
| 1.8122 | 90 | 0.0114 | - | - |
| 1.8528 | 92 | 0.0438 | 0.0674 | 0.8068 |
| 1.8934 | 94 | 0.0244 | - | - |
| 1.9340 | 96 | 0.0125 | 0.0666 | 0.8051 |
| 1.9746 | 98 | 0.0463 | - | - |
| 2.0 | 100 | 0.0095 | 0.0670 | 0.8100 |
| 2.0406 | 102 | 0.0251 | - | - |
| 2.0812 | 104 | 0.0163 | 0.0670 | 0.8073 |
| 2.1218 | 106 | 0.0126 | - | - |
| 2.1624 | 108 | 0.025 | 0.0666 | 0.8086 |
| 2.2030 | 110 | 0.0261 | - | - |
| 2.2437 | 112 | 0.0313 | 0.0672 | 0.8073 |
| 2.2843 | 114 | 0.0197 | - | - |
| 2.3249 | 116 | 0.022 | 0.0664 | 0.8055 |
| 2.3655 | 118 | 0.019 | - | - |
| 2.4061 | 120 | 0.0121 | 0.0654 | 0.8071 |
| 2.4467 | 122 | 0.0093 | - | - |
| 2.4873 | 124 | 0.022 | 0.0649 | 0.8059 |
| 2.5279 | 126 | 0.0125 | - | - |
| 2.5685 | 128 | 0.0206 | 0.0647 | 0.8043 |
| 2.6091 | 130 | 0.012 | - | - |
| 2.6497 | 132 | 0.0271 | 0.0646 | 0.8093 |
| 2.6904 | 134 | 0.0257 | - | - |
| 2.7310 | 136 | 0.0097 | 0.0637 | 0.8066 |
| 2.7716 | 138 | 0.0348 | - | - |
| 2.8122 | 140 | 0.0349 | 0.0637 | 0.8081 |
| 2.8528 | 142 | 0.0215 | - | - |
| 2.8934 | 144 | 0.0106 | 0.0631 | 0.8067 |
| 2.9340 | 146 | 0.0421 | - | - |
| 2.9746 | 148 | 0.0093 | 0.0625 | 0.8096 |
| 3.0 | 150 | 0.008 | - | - |
| 3.0406 | 152 | 0.0144 | 0.0621 | 0.8079 |
| 3.0812 | 154 | 0.0531 | - | - |
| 3.1218 | 156 | 0.0088 | 0.0622 | 0.8091 |
| 3.1624 | 158 | 0.0093 | - | - |
| 3.2030 | 160 | 0.018 | 0.0619 | 0.8081 |
| 3.2437 | 162 | 0.0127 | - | - |
| 3.2843 | 164 | 0.0091 | 0.0620 | 0.8101 |
| 3.3249 | 166 | 0.0121 | - | - |
| 3.3655 | 168 | 0.0021 | 0.0618 | 0.8092 |
| 3.4061 | 170 | 0.0072 | - | - |
| 3.4467 | 172 | 0.0178 | 0.0617 | 0.8090 |
| 3.4873 | 174 | 0.0256 | - | - |
| 3.5279 | 176 | 0.0156 | 0.0619 | 0.8105 |
| 3.5685 | 178 | 0.0223 | - | - |
| 3.6091 | 180 | 0.0215 | 0.0617 | 0.8112 |
| 3.6497 | 182 | 0.0084 | - | - |
| 3.6904 | 184 | 0.0156 | 0.0617 | 0.8100 |
| 3.7310 | 186 | 0.0292 | - | - |
| 3.7716 | 188 | 0.0138 | 0.0619 | 0.8105 |
| 3.8122 | 190 | 0.0072 | - | - |
| 3.8528 | 192 | 0.0103 | 0.0614 | 0.8097 |
| 3.8934 | 194 | 0.0102 | - | - |
| 3.9340 | 196 | 0.0176 | 0.0617 | 0.8096 |
| 3.9746 | 198 | 0.016 | - | - |
| 4.0 | 200 | 0.0037 | 0.0618 | 0.8111 |
- The bold row denotes the saved checkpoint.
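With load_best_model_at_end enabled and no metric_for_best_model listed among the hyperparameters, the saved checkpoint is presumably the one with the lowest validation loss. A sketch that recovers it from the logged (step, validation loss) pairs above:

```python
# (step, validation_loss) pairs from the evaluation rows of the training log
val_losses = {
    4: 0.4594, 8: 0.2985, 12: 0.1531, 16: 0.1202, 20: 0.1128,
    24: 0.1033, 28: 0.0921, 32: 0.0839, 36: 0.0778, 40: 0.0736,
    44: 0.0702, 48: 0.0696, 52: 0.0718, 56: 0.0725, 60: 0.0725,
    64: 0.0733, 68: 0.0752, 72: 0.0749, 76: 0.0733, 80: 0.0726,
    84: 0.0721, 88: 0.0700, 92: 0.0674, 96: 0.0666, 100: 0.0670,
    104: 0.0670, 108: 0.0666, 112: 0.0672, 116: 0.0664, 120: 0.0654,
    124: 0.0649, 128: 0.0647, 132: 0.0646, 136: 0.0637, 140: 0.0637,
    144: 0.0631, 148: 0.0625, 152: 0.0621, 156: 0.0622, 160: 0.0619,
    164: 0.0620, 168: 0.0618, 172: 0.0617, 176: 0.0619, 180: 0.0617,
    184: 0.0617, 188: 0.0619, 192: 0.0614, 196: 0.0617, 200: 0.0618,
}
best_step = min(val_losses, key=val_losses.get)
print(best_step, val_losses[best_step])  # 192 0.0614
```

By this criterion the saved checkpoint would be step 192 (epoch 3.85), not the final step 200.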
Framework Versions
- Python: 3.12.12
- Sentence Transformers: 5.1.2
- Transformers: 4.57.2
- PyTorch: 2.9.0+cu126
- Accelerate: 1.12.0
- Datasets: 4.3.0
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}