Training in progress, step 200, checkpoint


Files changed (9)

last-checkpoint/2_Dense/model.safetensors +2 -2
last-checkpoint/3_Dense/model.safetensors +2 -2
last-checkpoint/README.md +15 -13
last-checkpoint/config.json +3 -4
last-checkpoint/config_sentence_transformers.json +1 -1
last-checkpoint/model.safetensors +2 -2
last-checkpoint/optimizer.pt +2 -2
last-checkpoint/trainer_state.json +8 -8
last-checkpoint/training_args.bin +2 -2

last-checkpoint/2_Dense/model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fed83dfd00c1a0263f07eea8794b9265914ae7b3dc5c76729cf3807e2861adc3
-size 9437272

 version https://git-lfs.github.com/spec/v1
+oid sha256:1e06277ca8787b7fa33c7a991a49e7c44cedc64537c9a587e3eabe4480d98101
+size 4718680

last-checkpoint/3_Dense/model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:87b5e471a0697253d32e596ab7ab53200a19437e9d28de12f4dc211852102b58
-size 9437272

 version https://git-lfs.github.com/spec/v1
+oid sha256:6a1747a1bbfdac934f7ee3e281dffa826558868f822e91fdd1f85e39c452033a
+size 4718680

last-checkpoint/README.md CHANGED

@@ -424,7 +424,7 @@ print(query_embeddings.shape, document_embeddings.shape)
 # Get the similarity scores for the embeddings
 similarities = model.similarity(query_embeddings, document_embeddings)
 print(similarities)
-# tensor([[ 0.9179,  0.0553, -0.0070]])
 ```
 <!--
@@ -488,7 +488,7 @@ You can finetune this model on your own dataset.
   {
       "scale": 20.0,
       "similarity_fct": "cos_sim",
-      "mini_batch_size": 32,
       "gather_across_devices": false
   }
   ```
@@ -503,7 +503,7 @@ You can finetune this model on your own dataset.
 - `push_to_hub`: True
 - `hub_model_id`: guyhadad01/EncodeRec_300M_Toys
 - `hub_strategy`: checkpoint
-- `prompts`: task: search result | query:
 #### All Hyperparameters
 <details><summary>Click to expand</summary>
@@ -545,7 +545,6 @@ You can finetune this model on your own dataset.
 - `seed`: 42
 - `data_seed`: None
 - `jit_mode_eval`: False
-- `use_ipex`: False
 - `bf16`: True
 - `fp16`: False
 - `fp16_opt_level`: O1
@@ -572,6 +571,7 @@ You can finetune this model on your own dataset.
 - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
 - `fsdp_transformer_layer_cls_to_wrap`: None
 - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
 - `deepspeed`: None
 - `label_smoothing_factor`: 0.0
 - `optim`: adamw_torch
@@ -579,6 +579,8 @@ You can finetune this model on your own dataset.
 - `adafactor`: False
 - `group_by_length`: False
 - `length_column_name`: length
 - `ddp_find_unused_parameters`: None
 - `ddp_bucket_cap_mb`: None
 - `ddp_broadcast_buffers`: False
@@ -611,7 +613,7 @@ You can finetune this model on your own dataset.
 - `torch_compile_backend`: None
 - `torch_compile_mode`: None
 - `include_tokens_per_second`: False
-- `include_num_input_tokens_seen`: False
 - `neftune_noise_alpha`: None
 - `optim_target_modules`: None
 - `batch_eval_metrics`: False
@@ -619,8 +621,8 @@ You can finetune this model on your own dataset.
 - `use_liger_kernel`: False
 - `liger_kernel_config`: None
 - `eval_use_gather_object`: False
-- `average_tokens_across_devices`: False
-- `prompts`: task: search result | query:
 - `batch_sampler`: batch_sampler
 - `multi_dataset_batch_sampler`: proportional
 - `router_mapping`: {}
@@ -631,20 +633,20 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch  | Step | Training Loss |
 |:------:|:----:|:-------------:|
-| 0.0463 | 50   | 0.4695        |
-| 0.0926 | 100  | 0.2072        |
-| 0.1389 | 150  | 0.2185        |
-| 0.1852 | 200  | 0.2196        |
 ### Framework Versions
 - Python: 3.12.11
 - Sentence Transformers: 5.1.0
-- Transformers: 4.55.2
 - PyTorch: 2.7.1+cu126
 - Accelerate: 1.10.0
 - Datasets: 3.6.0
-- Tokenizers: 0.21.4
 ## Citation

 # Get the similarity scores for the embeddings
 similarities = model.similarity(query_embeddings, document_embeddings)
 print(similarities)
+# tensor([[0.8959, 0.0632, 0.0102]])
 ```
 <!--
   {
       "scale": 20.0,
       "similarity_fct": "cos_sim",
+      "mini_batch_size": 64,
       "gather_across_devices": false
   }
   ```
 - `push_to_hub`: True
 - `hub_model_id`: guyhadad01/EncodeRec_300M_Toys
 - `hub_strategy`: checkpoint
+- `prompts`: {'question': 'task: search result | query: ', 'passage_text': 'title: none | text: '}
 #### All Hyperparameters
 <details><summary>Click to expand</summary>
 - `seed`: 42
 - `data_seed`: None
 - `jit_mode_eval`: False
 - `bf16`: True
 - `fp16`: False
 - `fp16_opt_level`: O1
 - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
 - `fsdp_transformer_layer_cls_to_wrap`: None
 - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `parallelism_config`: None
 - `deepspeed`: None
 - `label_smoothing_factor`: 0.0
 - `optim`: adamw_torch
 - `adafactor`: False
 - `group_by_length`: False
 - `length_column_name`: length
+- `project`: huggingface
+- `trackio_space_id`: trackio
 - `ddp_find_unused_parameters`: None
 - `ddp_bucket_cap_mb`: None
 - `ddp_broadcast_buffers`: False
 - `torch_compile_backend`: None
 - `torch_compile_mode`: None
 - `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: no
 - `neftune_noise_alpha`: None
 - `optim_target_modules`: None
 - `batch_eval_metrics`: False
 - `use_liger_kernel`: False
 - `liger_kernel_config`: None
 - `eval_use_gather_object`: False
+- `average_tokens_across_devices`: True
+- `prompts`: {'question': 'task: search result | query: ', 'passage_text': 'title: none | text: '}
 - `batch_sampler`: batch_sampler
 - `multi_dataset_batch_sampler`: proportional
 - `router_mapping`: {}
 ### Training Logs
 | Epoch  | Step | Training Loss |
 |:------:|:----:|:-------------:|
+| 0.0463 | 50   | 0.2551        |
+| 0.0926 | 100  | 0.1353        |
+| 0.1389 | 150  | 0.1541        |
+| 0.1852 | 200  | 0.1499        |
 ### Framework Versions
 - Python: 3.12.11
 - Sentence Transformers: 5.1.0
+- Transformers: 4.57.0
 - PyTorch: 2.7.1+cu126
 - Accelerate: 1.10.0
 - Datasets: 3.6.0
+- Tokenizers: 0.22.1
 ## Citation
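
Note on the `prompts` change in this README diff: a single prompt string is replaced by a mapping from column name to prompt. As a rough illustration of what such a mapping implies, the sketch below prepends each prompt to the text in the matching column. It is plain Python and not the Sentence Transformers implementation; `apply_prompts` and the sample row are hypothetical, and only the two prompt strings come from the diff.

```python
# Prompt mapping copied from the updated README hyperparameters.
PROMPTS = {
    "question": "task: search result | query: ",
    "passage_text": "title: none | text: ",
}


def apply_prompts(row: dict[str, str], prompts: dict[str, str]) -> dict[str, str]:
    """Hypothetical helper: prepend the column's prompt (if any) to its text."""
    return {column: prompts.get(column, "") + text for column, text in row.items()}


# Made-up example row; the column names follow the mapping keys above.
row = {
    "question": "wooden train set",
    "passage_text": "Classic wooden railway with 40 pieces.",
}
prompted = apply_prompts(row, PROMPTS)
print(prompted["question"])
print(prompted["passage_text"])
```

Columns without an entry in the mapping are left unchanged in this sketch, which is an assumption made for the example.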

last-checkpoint/config.json CHANGED

@@ -7,7 +7,7 @@
   "attention_dropout": 0.0,
   "attn_logit_softcapping": null,
   "bos_token_id": 2,
-  "dtype": "float32",
   "eos_token_id": 1,
   "final_logit_softcapping": null,
   "head_dim": 256,
@@ -52,9 +52,8 @@
   "rope_local_base_freq": 10000.0,
   "rope_scaling": null,
   "rope_theta": 1000000.0,
-  "sliding_window": 512,
-  "torch_dtype": "float32",
-  "transformers_version": "4.55.2",
   "use_bidirectional_attention": true,
   "use_cache": true,
   "vocab_size": 262144

   "attention_dropout": 0.0,
   "attn_logit_softcapping": null,
   "bos_token_id": 2,
+  "dtype": "bfloat16",
   "eos_token_id": 1,
   "final_logit_softcapping": null,
   "head_dim": 256,
   "rope_local_base_freq": 10000.0,
   "rope_scaling": null,
   "rope_theta": 1000000.0,
+  "sliding_window": 257,
+  "transformers_version": "4.57.0",
   "use_bidirectional_attention": true,
   "use_cache": true,
   "vocab_size": 262144

last-checkpoint/config_sentence_transformers.json CHANGED

@@ -2,7 +2,7 @@
   "model_type": "SentenceTransformer",
   "__version__": {
     "sentence_transformers": "5.1.0",
-    "transformers": "4.55.2",
     "pytorch": "2.7.1+cu126"
   },
   "prompts": {

   "model_type": "SentenceTransformer",
   "__version__": {
     "sentence_transformers": "5.1.0",
+    "transformers": "4.57.0",
     "pytorch": "2.7.1+cu126"
   },
   "prompts": {

last-checkpoint/model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a3569d3e977da69f1e6effdc5cfd35f2c31712c0f88b884d222e1b60040f0e26
-size 1211486072

 version https://git-lfs.github.com/spec/v1
+oid sha256:abac25e1e6fdf3255533a12e513ba6078edbe7d810a3fa975b6d4d0639fab536
+size 605759848
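
Note on the file sizes: the weight files in this commit shrink to roughly half their previous size. That would be consistent with the `dtype` change from `float32` (4 bytes per value) to `bfloat16` (2 bytes per value) in config.json, although this reading is an inference from the numbers and not something the commit states. A minimal check, using only the byte sizes listed in the Git LFS pointers of this diff:

```python
# (old_size, new_size) in bytes, copied from the Git LFS pointers in this diff.
SIZES = {
    "2_Dense/model.safetensors": (9_437_272, 4_718_680),
    "3_Dense/model.safetensors": (9_437_272, 4_718_680),
    "model.safetensors": (1_211_486_072, 605_759_848),
    "optimizer.pt": (2_460_919_051, 1_230_592_267),
}

# Ratio of new size to old size for each file.
ratios = {name: new / old for name, (old, new) in SIZES.items()}

for name, ratio in ratios.items():
    print(f"{name:28s} new/old = {ratio:.4f}")
```

Each ratio comes out slightly above 0.5 instead of exactly 0.5, as would be expected if a fixed-size portion of each file (such as a header) did not shrink with the tensor data.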

last-checkpoint/optimizer.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:01580aac4f96298117d8589da031aaea8e5b6ce4a27bb8c251eb3a20d7cf5c0e
-size 2460919051

 version https://git-lfs.github.com/spec/v1
+oid sha256:af6ae0591084794587e796774ea539c6d9d1c58565ec4d0bf461ec38c34219ab
+size 1230592267

last-checkpoint/trainer_state.json CHANGED

@@ -11,30 +11,30 @@
   "log_history": [
     {
       "epoch": 0.046296296296296294,
-      "grad_norm": 6.824543476104736,
       "learning_rate": 2.2685185185185187e-05,
-      "loss": 0.4695,
       "step": 50
     },
     {
       "epoch": 0.09259259259259259,
-      "grad_norm": 6.649824142456055,
       "learning_rate": 4.5833333333333334e-05,
-      "loss": 0.2072,
       "step": 100
     },
     {
       "epoch": 0.1388888888888889,
-      "grad_norm": 5.679929733276367,
       "learning_rate": 4.7890946502057616e-05,
-      "loss": 0.2185,
       "step": 150
     },
     {
       "epoch": 0.18518518518518517,
-      "grad_norm": 4.710780620574951,
       "learning_rate": 4.531893004115226e-05,
-      "loss": 0.2196,
       "step": 200
     }
   ],

   "log_history": [
     {
       "epoch": 0.046296296296296294,
+      "grad_norm": 5.875,
       "learning_rate": 2.2685185185185187e-05,
+      "loss": 0.2551,
       "step": 50
     },
     {
       "epoch": 0.09259259259259259,
+      "grad_norm": 6.84375,
       "learning_rate": 4.5833333333333334e-05,
+      "loss": 0.1353,
       "step": 100
     },
     {
       "epoch": 0.1388888888888889,
+      "grad_norm": 5.375,
       "learning_rate": 4.7890946502057616e-05,
+      "loss": 0.1541,
       "step": 150
     },
     {
       "epoch": 0.18518518518518517,
+      "grad_norm": 4.75,
       "learning_rate": 4.531893004115226e-05,
+      "loss": 0.1499,
       "step": 200
     }
   ],
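
Note on the logged values: the learning-rate schedule is identical on both sides of this diff, while the logged loss is lower at every step in the new run. The sketch below tabulates the per-step difference and derives the number of steps per epoch from the logged `epoch` and `step` fields. All numbers are copied from the `log_history` entries above; the variable names are illustrative.

```python
# Training loss per logged step, copied from log_history in this diff.
old_loss = {50: 0.4695, 100: 0.2072, 150: 0.2185, 200: 0.2196}
new_loss = {50: 0.2551, 100: 0.1353, 150: 0.1541, 200: 0.1499}

# Difference (new minus old) at each logged step; negative means lower loss.
deltas = {step: new_loss[step] - old_loss[step] for step in old_loss}

for step, delta in deltas.items():
    print(f"step {step:3d}: {old_loss[step]:.4f} -> {new_loss[step]:.4f} ({delta:+.4f})")

# The epoch field is step / steps_per_epoch, so the first entry gives:
steps_per_epoch = round(50 / 0.046296296296296294)
print("steps per epoch:", steps_per_epoch)
```

Because several settings changed between the two runs (for example `mini_batch_size`, `dtype`, and `prompts`), the loss difference cannot be attributed to any single one of them from this diff alone.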

last-checkpoint/training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d08e873fcc6af5914cd3b33b8457f079faba879550636b5ef8bf74269ab02c7c
-size 6161

 version https://git-lfs.github.com/spec/v1
+oid sha256:d251256d6a17063ebe50c1a916e869c5121c6daeb0ba390c2cedfa45a16a448e
+size 6289