foochun committed
Commit 87a8c5e · verified · 1 Parent(s): 49eae45

finetuned with additional names

Files changed (3)
  1. README.md +38 -39
  2. config.json +1 -1
  3. model.safetensors +1 -1
README.md CHANGED
@@ -3,7 +3,7 @@ tags:
 - sentence-transformers
 - cross-encoder
 - generated_from_trainer
-- dataset_size:72905
+- dataset_size:82744
 - loss:MultipleNegativesRankingLoss
 base_model: BAAI/bge-reranker-base
 pipeline_tag: text-ranking
@@ -50,11 +50,11 @@ from sentence_transformers import CrossEncoder
 model = CrossEncoder("foochun/bge-reranker-ft")
 # Get scores for pairs of texts
 pairs = [
-    ['zach koh yong liang', 'yong liang koh zach'],
-    ['zulkifli bin mohamad', 'zulkifli bin muhammad'],
-    ['rahman bin mohd rashid', 'rahman mohammed rashid'],
-    ['mohd syukri bin bakar', 'muhd syukri bakar'],
-    ['carmen tan fang kiat', 'tan fang kiat'],
+    ['quinn toh heng yi', 'heng yi toh quinn'],
+    ['mohd iskandi bin hassan', 'muhd iskandi hassan'],
+    ['quinn ng ee siu', 'quinn ee siu ng'],
+    ['malini doraisamy', 'malini doraisamy'],
+    ['see shan fui', 'shanfui see'],
 ]
 scores = model.predict(pairs)
 print(scores.shape)
@@ -62,13 +62,13 @@ print(scores.shape)
 
 # Or rank different texts based on similarity to a single text
 ranks = model.rank(
-    'zach koh yong liang',
+    'quinn toh heng yi',
     [
-        'yong liang koh zach',
-        'zulkifli bin muhammad',
-        'rahman mohammed rashid',
-        'muhd syukri bakar',
-        'tan fang kiat',
+        'heng yi toh quinn',
+        'muhd iskandi hassan',
+        'quinn ee siu ng',
+        'malini doraisamy',
+        'shanfui see',
     ]
 )
 # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
@@ -116,19 +116,19 @@ You can finetune this model on your own dataset.
 
 #### Unnamed Dataset
 
-* Size: 72,905 training samples
+* Size: 82,744 training samples
 * Columns: <code>query</code>, <code>pos</code>, and <code>neg</code>
 * Approximate statistics based on the first 1000 samples:
-  |         | query | pos | neg |
-  |:--------|:------|:----|:----|
-  | type    | string | string | string |
-  | details | <ul><li>min: 9 characters</li><li>mean: 19.91 characters</li><li>max: 45 characters</li></ul> | <ul><li>min: 9 characters</li><li>mean: 17.64 characters</li><li>max: 40 characters</li></ul> | <ul><li>min: 9 characters</li><li>mean: 17.95 characters</li><li>max: 37 characters</li></ul> |
+  |         | query | pos | neg |
+  |:--------|:------|:----|:----|
+  | type    | string | string | string |
+  | details | <ul><li>min: 9 characters</li><li>mean: 19.16 characters</li><li>max: 42 characters</li></ul> | <ul><li>min: 9 characters</li><li>mean: 17.11 characters</li><li>max: 37 characters</li></ul> | <ul><li>min: 9 characters</li><li>mean: 17.7 characters</li><li>max: 38 characters</li></ul> |
 * Samples:
-  | query | pos | neg |
-  |:------|:----|:----|
-  | <code>sim hong soon</code> | <code>sim hong soon</code> | <code>sim soon hong</code> |
-  | <code>raja mariam binti raja sharif</code> | <code>raja mariam raja sharif</code> | <code>zuraidah binti dollah</code> |
-  | <code>saw ann fui</code> | <code>fui saw ann</code> | <code>ann saw fui</code> |
+  | query | pos | neg |
+  |:------|:----|:----|
+  | <code>brandon teh min jun</code> | <code>jun teh min</code> | <code>brandon min teh jun</code> |
+  | <code>suling anak peroi</code> | <code>suling anak peroi</code> | <code>suling anak rahim</code> |
+  | <code>chin sze tian</code> | <code>szetian chin</code> | <code>chin sze tian wong</code> |
 * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#multiplenegativesrankingloss) with these parameters:
   ```json
   {
@@ -142,19 +142,19 @@ You can finetune this model on your own dataset.
 
 #### Unnamed Dataset
 
-* Size: 10,415 evaluation samples
+* Size: 11,820 evaluation samples
 * Columns: <code>query</code>, <code>pos</code>, and <code>neg</code>
 * Approximate statistics based on the first 1000 samples:
-  |         | query | pos | neg |
-  |:--------|:------|:----|:----|
-  | type    | string | string | string |
-  | details | <ul><li>min: 9 characters</li><li>mean: 19.95 characters</li><li>max: 43 characters</li></ul> | <ul><li>min: 9 characters</li><li>mean: 17.8 characters</li><li>max: 42 characters</li></ul> | <ul><li>min: 8 characters</li><li>mean: 18.33 characters</li><li>max: 36 characters</li></ul> |
+  |         | query | pos | neg |
+  |:--------|:------|:----|:----|
+  | type    | string | string | string |
+  | details | <ul><li>min: 10 characters</li><li>mean: 19.08 characters</li><li>max: 45 characters</li></ul> | <ul><li>min: 9 characters</li><li>mean: 17.02 characters</li><li>max: 40 characters</li></ul> | <ul><li>min: 9 characters</li><li>mean: 17.58 characters</li><li>max: 44 characters</li></ul> |
 * Samples:
-  | query | pos | neg |
-  |:------|:----|:----|
-  | <code>zach koh yong liang</code> | <code>yong liang koh zach</code> | <code>liang yong koh zach</code> |
-  | <code>zulkifli bin mohamad</code> | <code>zulkifli bin muhammad</code> | <code>razak bin ibrahim</code> |
-  | <code>rahman bin mohd rashid</code> | <code>rahman mohammed rashid</code> | <code>fauzi bin mohd</code> |
+  | query | pos | neg |
+  |:------|:----|:----|
+  | <code>quinn toh heng yi</code> | <code>heng yi toh quinn</code> | <code>toh yi heng</code> |
+  | <code>mohd iskandi bin hassan</code> | <code>muhd iskandi hassan</code> | <code>puteri balqis binti megat sulaiman</code> |
+  | <code>quinn ng ee siu</code> | <code>quinn ee siu ng</code> | <code>quinn ee ng siu</code> |
 * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#multiplenegativesrankingloss) with these parameters:
   ```json
   {
@@ -243,7 +243,6 @@ You can finetune this model on your own dataset.
 - `fsdp`: []
 - `fsdp_min_num_params`: 0
 - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
-- `tp_size`: 0
 - `fsdp_transformer_layer_cls_to_wrap`: None
 - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
 - `deepspeed`: None
@@ -301,18 +300,18 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch  | Step | Training Loss |
 |:------:|:----:|:-------------:|
-| 0.0009 | 1    | 0.5117        |
-| 0.8772 | 1000 | 0.0955        |
-| 1.7544 | 2000 | 0.005         |
-| 2.6316 | 3000 | 0.0039        |
+| 0.0008 | 1    | 0.4707        |
+| 0.7734 | 1000 | 0.1114        |
+| 1.5468 | 2000 | 0.0051        |
+| 2.3202 | 3000 | 0.0046        |
 
 
 ### Framework Versions
 - Python: 3.11.9
 - Sentence Transformers: 4.1.0
-- Transformers: 4.51.3
+- Transformers: 4.52.4
 - PyTorch: 2.6.0+cu124
-- Accelerate: 1.6.0
+- Accelerate: 1.7.0
 - Datasets: 3.6.0
 - Tokenizers: 0.21.1
 
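The README's updated usage example returns `rank` results as `[{'corpus_id': ..., 'score': ...}, ...]`. A minimal sketch of how that output shape can drive a name-match decision, using a stand-in token-overlap scorer so it runs without downloading the actual foochun/bge-reranker-ft model (the helper names and the 0.5 threshold are illustrative assumptions, not part of the model card):

```python
# Sketch: turn per-pair scores into a ranked, thresholded candidate list,
# mirroring the [{'corpus_id': ..., 'score': ...}, ...] shape in the README.
# `score_fn` stands in for model.predict on (query, candidate) pairs.

def rank_candidates(query, candidates, score_fn, threshold=0.5):
    """Score every candidate against the query, best first; keep only
    candidates the scorer considers likely matches."""
    scored = [
        {"corpus_id": i, "score": score_fn(query, c)}
        for i, c in enumerate(candidates)
    ]
    ranked = sorted(scored, key=lambda d: d["score"], reverse=True)
    return [d for d in ranked if d["score"] >= threshold]

# Toy scorer: Jaccard overlap of name tokens (NOT the real model's scores).
def token_overlap(a, b):
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / max(len(ta | tb), 1)

matches = rank_candidates(
    "quinn toh heng yi",
    ["heng yi toh quinn", "malini doraisamy"],
    token_overlap,
)
print(matches)  # only the token-identical reordering survives the threshold
```

With the real `CrossEncoder`, `score_fn` would be replaced by a call to `model.predict` on the batched pairs; the sorting and thresholding logic stays the same.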
config.json CHANGED
@@ -30,7 +30,7 @@
     "version": "4.1.0"
   },
   "torch_dtype": "float32",
-  "transformers_version": "4.51.3",
+  "transformers_version": "4.52.4",
   "type_vocab_size": 1,
   "use_cache": true,
   "vocab_size": 250002
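The `transformers_version` bump (4.51.3 → 4.52.4) records the library version used when the checkpoint was saved. When comparing such version strings, plain string comparison is wrong (`"4.9" > "4.10"` lexicographically), so a small sketch of a numeric comparison (the helper name is illustrative):

```python
# Sketch: compare dotted version strings numerically, not lexicographically.
def parse_version(v):
    return tuple(int(part) for part in v.split("."))

saved = parse_version("4.52.4")     # version written by this commit
previous = parse_version("4.51.3")  # version before the commit

print(saved > previous)  # True
```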
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:edc64662e2fe56e8a890faf4992682b1605b018ba49b2acb609a13667cead4ce
+oid sha256:590bafb40b20dad3f7206e0dd682b70c7d962305730ffde246762e9b04328fba
 size 1112201932
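The `model.safetensors` change is a git-lfs pointer update: the `oid` is the SHA-256 of the full file contents, so a new hash with an unchanged `size` means same-shaped retrained weights. A minimal sketch of verifying downloaded bytes against a pointer, using a toy payload rather than the real 1.1 GB weights (the helper names are illustrative):

```python
import hashlib

# Sketch: check file bytes against a git-lfs pointer's oid and size.
# The pointer text below is generated from a toy payload, not the real model.

def parse_lfs_pointer(text):
    """Parse 'key value' lines of a git-lfs v1 pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return algo, digest, int(fields["size"])

def verify(data, pointer_text):
    algo, digest, size = parse_lfs_pointer(pointer_text)
    assert algo == "sha256"  # the only algorithm used by the v1 spec
    return len(data) == size and hashlib.sha256(data).hexdigest() == digest

payload = b"toy weights"
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(payload).hexdigest()}\n"
    f"size {len(payload)}\n"
)
print(verify(payload, pointer))  # True
```

The same check against the pointer shown above would confirm whether a locally downloaded `model.safetensors` matches this commit's weights.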