Upload fine-tuned chart reranker model

Files changed (9)

.gitattributes +1 -0
README.md +46 -51
config.json +10 -9
eval/CrossEncoderCorrelationEvaluator_validation_results.csv +5 -3
model.safetensors +2 -2
special_tokens_map.json +20 -6
tokenizer.json +0 -0
tokenizer_config.json +19 -22
training_info.txt +7 -4

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -4,16 +4,16 @@ tags:
 - cross-encoder
 - reranker
 - generated_from_trainer
-- dataset_size:8000
 - loss:BinaryCrossEntropyLoss
-base_model: cross-encoder/ms-marco-MiniLM-L6-v2
 pipeline_tag: text-ranking
 library_name: sentence-transformers
 metrics:
 - pearson
 - spearman
 model-index:
-- name: CrossEncoder based on cross-encoder/ms-marco-MiniLM-L6-v2
   results:
   - task:
       type: cross-encoder-correlation
@@ -23,22 +23,22 @@ model-index:
       type: validation
     metrics:
     - type: pearson
-      value: 0.8481096700155641
       name: Pearson
     - type: spearman
-      value: 0.8528646396544212
       name: Spearman
 ---
-# CrossEncoder based on cross-encoder/ms-marco-MiniLM-L6-v2
-This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/ms-marco-MiniLM-L6-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L6-v2) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
 ## Model Details
 ### Model Description
 - **Model Type:** Cross Encoder
-- **Base model:** [cross-encoder/ms-marco-MiniLM-L6-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L6-v2) <!-- at revision c5ee24cb16019beea0893ab7796b1df96625c6b8 -->
 - **Maximum Sequence Length:** 512 tokens
 - **Number of Output Labels:** 1 label
 <!-- - **Training Dataset:** Unknown -->
@@ -70,11 +70,11 @@ from sentence_transformers import CrossEncoder
 model = CrossEncoder("cross_encoder_model_id")
 # Get scores for pairs of texts
 pairs = [
-    ['prix blé tendre bio Indre et Loire 2025', 'Chart Title: "Wheat (US Soft Red Winter) Spot Price", Collections: Commodity Prices'],
-    ['oil prices', 'Chart Title: "West Texas Intermediate Crude Oil - Price in United States", Collections: Commodities::EIAEnergyIndicators::TimeseriesManager'],
-    ['Nvidia earnings AI chip demand', 'Chart Title: "Nvidia Quarterly Price to Earnings", Collections: Companies::CompanyComputedRatiosV2::TimeseriesManager'],
-    ['show me tesla stock performance 2020 to 2025', 'Title: "Manakoa Services Corporation Stock Performance"\n  Collections: Companies\n  Chart Type: company:private\n  Sources: S&P Global'],
-    ['Samsung A56 5G mémoire', 'Chart Title: "Samsung Publishing Co., Ltd Stock Prices", Info: Stock details for company Samsung Publishing Co., Ltd, Collections: Company Card, Chart Type: company:finance'],
 ]
 scores = model.predict(pairs)
 print(scores.shape)
@@ -82,13 +82,13 @@ print(scores.shape)
 # Or rank different texts based on similarity to a single text
 ranks = model.rank(
-    'prix blé tendre bio Indre et Loire 2025',
     [
-        'Chart Title: "Wheat (US Soft Red Winter) Spot Price", Collections: Commodity Prices',
-        'Chart Title: "West Texas Intermediate Crude Oil - Price in United States", Collections: Commodities::EIAEnergyIndicators::TimeseriesManager',
-        'Chart Title: "Nvidia Quarterly Price to Earnings", Collections: Companies::CompanyComputedRatiosV2::TimeseriesManager',
-        'Title: "Manakoa Services Corporation Stock Performance"\n  Collections: Companies\n  Chart Type: company:private\n  Sources: S&P Global',
-        'Chart Title: "Samsung Publishing Co., Ltd Stock Prices", Info: Stock details for company Samsung Publishing Co., Ltd, Collections: Company Card, Chart Type: company:finance',
     ]
 )
 # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
@@ -129,8 +129,8 @@ You can finetune this model on your own dataset.
 | Metric       | Value      |
 |:-------------|:-----------|
-| pearson      | 0.8481     |
-| **spearman** | **0.8529** |
 <!--
 ## Bias, Risks and Limitations
@@ -150,19 +150,19 @@ You can finetune this model on your own dataset.
 #### Unnamed Dataset
-* Size: 8,000 training samples
 * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
-  |         | sentence_0                                                                                      | sentence_1                                                                                       | label                                                          |
-  |:--------|:------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|:---------------------------------------------------------------|
-  | type    | string                                                                                          | string                                                                                           | float                                                          |
-  | details | <ul><li>min: 3 characters</li><li>mean: 51.78 characters</li><li>max: 1024 characters</li></ul> | <ul><li>min: 49 characters</li><li>mean: 136.27 characters</li><li>max: 716 characters</li></ul> | <ul><li>min: 0.2</li><li>mean: 0.52</li><li>max: 1.0</li></ul> |
 * Samples:
-  | sentence_0                                           | sentence_1                                                                                                                                               | label            |
-  |:-----------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
-  | <code>prix blé tendre bio Indre et Loire 2025</code> | <code>Chart Title: "Wheat (US Soft Red Winter) Spot Price", Collections: Commodity Prices</code>                                                         | <code>0.4</code> |
-  | <code>oil prices</code>                              | <code>Chart Title: "West Texas Intermediate Crude Oil - Price in United States", Collections: Commodities::EIAEnergyIndicators::TimeseriesManager</code> | <code>0.8</code> |
-  | <code>Nvidia earnings AI chip demand</code>          | <code>Chart Title: "Nvidia Quarterly Price to Earnings", Collections: Companies::CompanyComputedRatiosV2::TimeseriesManager</code>                       | <code>0.4</code> |
 * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
   ```json
   {
@@ -175,8 +175,9 @@ You can finetune this model on your own dataset.
 #### Non-Default Hyperparameters
 - `eval_strategy`: steps
-- `per_device_train_batch_size`: 16
-- `per_device_eval_batch_size`: 16
 #### All Hyperparameters
 <details><summary>Click to expand</summary>
@@ -185,8 +186,8 @@ You can finetune this model on your own dataset.
 - `do_predict`: False
 - `eval_strategy`: steps
 - `prediction_loss_only`: True
-- `per_device_train_batch_size`: 16
-- `per_device_eval_batch_size`: 16
 - `per_gpu_train_batch_size`: None
 - `per_gpu_eval_batch_size`: None
 - `gradient_accumulation_steps`: 1
@@ -198,7 +199,7 @@ You can finetune this model on your own dataset.
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
 - `max_grad_norm`: 1
-- `num_train_epochs`: 3
 - `max_steps`: -1
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
@@ -306,21 +307,15 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch | Step | Training Loss | validation_spearman |
 |:-----:|:----:|:-------------:|:-------------------:|
-| 0.2   | 100  | -             | 0.7038              |
-| 0.4   | 200  | -             | 0.7816              |
-| 0.6   | 300  | -             | 0.8134              |
-| 0.8   | 400  | -             | 0.8216              |
-| 1.0   | 500  | 0.8021        | 0.8296              |
-| 1.2   | 600  | -             | 0.8358              |
-| 1.4   | 700  | -             | 0.8418              |
-| 1.6   | 800  | -             | 0.8418              |
-| 1.8   | 900  | -             | 0.8478              |
-| 2.0   | 1000 | 0.5726        | 0.8471              |
-| 2.2   | 1100 | -             | 0.8487              |
-| 2.4   | 1200 | -             | 0.8497              |
-| 2.6   | 1300 | -             | 0.8522              |
-| 2.8   | 1400 | -             | 0.8523              |
-| 3.0   | 1500 | 0.5616        | 0.8529              |
 ### Framework Versions

 - cross-encoder
 - reranker
 - generated_from_trainer
+- dataset_size:3999
 - loss:BinaryCrossEntropyLoss
+base_model: cross-encoder/mmarco-mMiniLMv2-L12-H384-v1
 pipeline_tag: text-ranking
 library_name: sentence-transformers
 metrics:
 - pearson
 - spearman
 model-index:
+- name: CrossEncoder based on cross-encoder/mmarco-mMiniLMv2-L12-H384-v1
   results:
   - task:
       type: cross-encoder-correlation
       type: validation
     metrics:
     - type: pearson
+      value: 0.7551794832253556
       name: Pearson
     - type: spearman
+      value: 0.8052608880870304
       name: Spearman
 ---
+# CrossEncoder based on cross-encoder/mmarco-mMiniLMv2-L12-H384-v1
+This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/mmarco-mMiniLMv2-L12-H384-v1](https://huggingface.co/cross-encoder/mmarco-mMiniLMv2-L12-H384-v1) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
 ## Model Details
 ### Model Description
 - **Model Type:** Cross Encoder
+- **Base model:** [cross-encoder/mmarco-mMiniLMv2-L12-H384-v1](https://huggingface.co/cross-encoder/mmarco-mMiniLMv2-L12-H384-v1) <!-- at revision 1427fd652930e4ba29e8149678df786c240d8825 -->
 - **Maximum Sequence Length:** 512 tokens
 - **Number of Output Labels:** 1 label
 <!-- - **Training Dataset:** Unknown -->
 model = CrossEncoder("cross_encoder_model_id")
 # Get scores for pairs of texts
 pairs = [
+    ['NVIDIA stock price trend from February 2024 to February 2025', 'Title: "Nvidia Stockpile (Annual)"\nCollections: Companies\nDatasets: StandardIncomeStatement\nChart Type: timeseries:eav_v2\nCanonical forms: "Stockpile"="inventory"\nSources: S&P Global'],
+    ['What is the price of Costco stock? Answer in as few words as possible.', 'Title: "Costco Quarterly Price to Earnings, Costco Stock (Annual)"\nCollections: Companies\nDatasets: StandardIncomeStatement, CompanyComputedRatiosV2\nChart Type: timeseries:eav_v2\nCanonical forms: "Price to Earnings"="computed_ratio_last_close_price_to_earnings", "Stock"="inventory"'],
+    ['Who was named EY World Entrepreneur Of The Year 2024?', 'Title: "World Overview"\nCollections: Companies\nChart Type: company:private\nSources: S&P Global'],
+    ['dubbed movies streaming', 'Title: "How Brits subscribe to film service subscriptions e.g. Sky Go (United Kingdom)"\nCollections: YouGov Trackers\nDatasets: YouGovTrackerValueV2\nChart Type: survey:timeseries\nSources: YouGov'],
+    ['Virtual Reality (VR) – Meta Quest 3', 'Title: "Meta Overview"\nCollections: Companies\nChart Type: company:finance\nCanonical forms: "Meta"="Meta Platforms, Inc.", "Overview"="Stock Overview"\nSources: S&P Global'],
 ]
 scores = model.predict(pairs)
 print(scores.shape)
 # Or rank different texts based on similarity to a single text
 ranks = model.rank(
+    'NVIDIA stock price trend from February 2024 to February 2025',
     [
+        'Title: "Nvidia Stockpile (Annual)"\nCollections: Companies\nDatasets: StandardIncomeStatement\nChart Type: timeseries:eav_v2\nCanonical forms: "Stockpile"="inventory"\nSources: S&P Global',
+        'Title: "Costco Quarterly Price to Earnings, Costco Stock (Annual)"\nCollections: Companies\nDatasets: StandardIncomeStatement, CompanyComputedRatiosV2\nChart Type: timeseries:eav_v2\nCanonical forms: "Price to Earnings"="computed_ratio_last_close_price_to_earnings", "Stock"="inventory"',
+        'Title: "World Overview"\nCollections: Companies\nChart Type: company:private\nSources: S&P Global',
+        'Title: "How Brits subscribe to film service subscriptions e.g. Sky Go (United Kingdom)"\nCollections: YouGov Trackers\nDatasets: YouGovTrackerValueV2\nChart Type: survey:timeseries\nSources: YouGov',
+        'Title: "Meta Overview"\nCollections: Companies\nChart Type: company:finance\nCanonical forms: "Meta"="Meta Platforms, Inc.", "Overview"="Stock Overview"\nSources: S&P Global',
     ]
 )
 # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
 | Metric       | Value      |
 |:-------------|:-----------|
+| pearson      | 0.7552     |
+| **spearman** | **0.8053** |
 <!--
 ## Bias, Risks and Limitations
 #### Unnamed Dataset
+* Size: 3,999 training samples
 * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
+  |         | sentence_0                                                                                    | sentence_1                                                                                       | label                                                          |
+  |:--------|:----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|:---------------------------------------------------------------|
+  | type    | string                                                                                        | string                                                                                           | float                                                          |
+  | details | <ul><li>min: 3 characters</li><li>mean: 43.12 characters</li><li>max: 99 characters</li></ul> | <ul><li>min: 76 characters</li><li>mean: 181.15 characters</li><li>max: 393 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.46</li><li>max: 1.0</li></ul> |
 * Samples:
+  | sentence_0                                                                          | sentence_1                                                                                                                                                                                                                                                                                                          | label            |
+  |:------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
+  | <code>NVIDIA stock price trend from February 2024 to February 2025</code>           | <code>Title: "Nvidia Stockpile (Annual)"<br>Collections: Companies<br>Datasets: StandardIncomeStatement<br>Chart Type: timeseries:eav_v2<br>Canonical forms: "Stockpile"="inventory"<br>Sources: S&P Global</code>                                                                                                  | <code>0.0</code> |
+  | <code>What is the price of Costco stock? Answer in as few words as possible.</code> | <code>Title: "Costco Quarterly Price to Earnings, Costco Stock (Annual)"<br>Collections: Companies<br>Datasets: StandardIncomeStatement, CompanyComputedRatiosV2<br>Chart Type: timeseries:eav_v2<br>Canonical forms: "Price to Earnings"="computed_ratio_last_close_price_to_earnings", "Stock"="inventory"</code> | <code>0.5</code> |
+  | <code>Who was named EY World Entrepreneur Of The Year 2024?</code>                  | <code>Title: "World Overview"<br>Collections: Companies<br>Chart Type: company:private<br>Sources: S&P Global</code>                                                                                                                                                                                                | <code>0.0</code> |
 * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
   ```json
   {
 #### Non-Default Hyperparameters
 - `eval_strategy`: steps
+- `per_device_train_batch_size`: 32
+- `per_device_eval_batch_size`: 32
+- `num_train_epochs`: 5
 #### All Hyperparameters
 <details><summary>Click to expand</summary>
 - `do_predict`: False
 - `eval_strategy`: steps
 - `prediction_loss_only`: True
+- `per_device_train_batch_size`: 32
+- `per_device_eval_batch_size`: 32
 - `per_gpu_train_batch_size`: None
 - `per_gpu_eval_batch_size`: None
 - `gradient_accumulation_steps`: 1
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
 - `max_grad_norm`: 1
+- `num_train_epochs`: 5
 - `max_steps`: -1
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
 ### Training Logs
 | Epoch | Step | Training Loss | validation_spearman |
 |:-----:|:----:|:-------------:|:-------------------:|
+| 0.8   | 100  | -             | 0.7305              |
+| 1.0   | 125  | -             | 0.7516              |
+| 1.6   | 200  | -             | 0.7809              |
+| 2.0   | 250  | -             | 0.7922              |
+| 2.4   | 300  | -             | 0.7947              |
+| 3.0   | 375  | -             | 0.8022              |
+| 3.2   | 400  | -             | 0.7995              |
+| 4.0   | 500  | 0.5555        | 0.8045              |
+| 4.8   | 600  | -             | 0.8053              |
 ### Framework Versions

config.json CHANGED

@@ -1,11 +1,12 @@
 {
   "architectures": [
-    "BertForSequenceClassification"
+    "XLMRobertaForSequenceClassification"
   ],
   "attention_probs_dropout_prob": 0.1,
+  "bos_token_id": 0,
   "classifier_dropout": null,
   "dtype": "float32",
-  "gradient_checkpointing": false,
+  "eos_token_id": 2,
   "hidden_act": "gelu",
   "hidden_dropout_prob": 0.1,
   "hidden_size": 384,
@@ -17,19 +18,19 @@
   "label2id": {
     "LABEL_0": 0
   },
-  "layer_norm_eps": 1e-12,
-  "max_position_embeddings": 512,
-  "model_type": "bert",
+  "layer_norm_eps": 1e-05,
+  "max_position_embeddings": 514,
+  "model_type": "xlm-roberta",
   "num_attention_heads": 12,
-  "num_hidden_layers": 6,
-  "pad_token_id": 0,
+  "num_hidden_layers": 12,
+  "pad_token_id": 1,
   "position_embedding_type": "absolute",
   "sentence_transformers": {
     "activation_fn": "torch.nn.modules.linear.Identity",
     "version": "5.1.1"
   },
   "transformers_version": "4.57.1",
-  "type_vocab_size": 2,
+  "type_vocab_size": 1,
   "use_cache": true,
-  "vocab_size": 30522
+  "vocab_size": 250002
 }

eval/CrossEncoderCorrelationEvaluator_validation_results.csv CHANGED

@@ -1,4 +1,6 @@
 epoch,steps,Pearson_Correlation,Spearman_Correlation
-1.0,500,0.8334498280984426,0.8296374514172629
-2.0,1000,0.8444343598056561,0.8471494664684638
-3.0,1500,0.8481096700155641,0.8528646396544212
+1.0,125,0.7309011622271578,0.7516436739892058
+2.0,250,0.7588798492491784,0.7921942665271138
+3.0,375,0.7523098419884638,0.8021607473982901
+4.0,500,0.7556591422221105,0.8044702495085688
+5.0,625,0.7553359223115874,0.8050527106824349
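
The per-epoch rows added to this CSV can be compared programmatically. The following is a minimal sketch, assuming the new rows are pasted inline as a string; the variable names are illustrative and not part of the repository.

```python
import csv
import io

# Rows copied from the updated validation results CSV in this commit.
CSV_TEXT = """epoch,steps,Pearson_Correlation,Spearman_Correlation
1.0,125,0.7309011622271578,0.7516436739892058
2.0,250,0.7588798492491784,0.7921942665271138
3.0,375,0.7523098419884638,0.8021607473982901
4.0,500,0.7556591422221105,0.8044702495085688
5.0,625,0.7553359223115874,0.8050527106824349
"""

# Parse every field as a float so rows can be compared numerically.
rows = [
    {key: float(value) for key, value in row.items()}
    for row in csv.DictReader(io.StringIO(CSV_TEXT))
]

# Pick the epoch with the highest value for each correlation metric.
best_spearman = max(rows, key=lambda r: r["Spearman_Correlation"])
best_pearson = max(rows, key=lambda r: r["Pearson_Correlation"])

print("best Spearman at epoch", best_spearman["epoch"])
print("best Pearson at epoch", best_pearson["epoch"])
```

In these rows Spearman keeps rising through epoch 5, while Pearson peaks at epoch 2, so the preferred checkpoint depends on which metric matters for ranking.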

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:93357cfe857f758d0ab0429d2076e1599cd7661ab2cc03f999bede0267e1167c
-size 90866412
+oid sha256:5f41f04568b485258127e43e4fb378afcbdb017f968f848d88687ca2ba76591e
+size 470588492
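
The weights file grows from 90,866,412 to 470,588,492 bytes. The sketch below estimates how much of that growth the larger vocabulary could explain. It assumes the word-embedding matrix has shape `vocab_size × hidden_size` and is stored as float32, using the values from `config.json` in this commit; all other weights, including the six additional layers, are ignored, so this is a rough estimate rather than an exact accounting.

```python
BYTES_PER_FLOAT32 = 4
HIDDEN_SIZE = 384  # "hidden_size" is unchanged between the two configs


def word_embedding_bytes(vocab_size: int, hidden_size: int = HIDDEN_SIZE) -> int:
    """Approximate size of a float32 word-embedding matrix in bytes."""
    return vocab_size * hidden_size * BYTES_PER_FLOAT32


old_embed = word_embedding_bytes(30522)    # previous "vocab_size"
new_embed = word_embedding_bytes(250002)   # new "vocab_size"

file_growth = 470588492 - 90866412         # sizes from the LFS pointers
embed_growth = new_embed - old_embed
share = embed_growth / file_growth

print(f"embedding growth: {embed_growth:,} bytes")
print(f"file growth:      {file_growth:,} bytes")
print(f"share explained:  {share:.1%}")
```

Under these assumptions the embedding table accounts for most of the size increase; the remainder is consistent with the deeper 12-layer encoder.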

special_tokens_map.json CHANGED

@@ -1,34 +1,48 @@
 {
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
   "cls_token": {
-    "content": "[CLS]",
+    "content": "<s>",
     "lstrip": false,
     "normalized": false,
     "rstrip": false,
     "single_word": false
   },
-  "mask_token": {
-    "content": "[MASK]",
+  "eos_token": {
+    "content": "</s>",
     "lstrip": false,
     "normalized": false,
     "rstrip": false,
     "single_word": false
   },
+  "mask_token": {
+    "content": "<mask>",
+    "lstrip": true,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
   "pad_token": {
-    "content": "[PAD]",
+    "content": "<pad>",
     "lstrip": false,
     "normalized": false,
     "rstrip": false,
     "single_word": false
   },
   "sep_token": {
-    "content": "[SEP]",
+    "content": "</s>",
     "lstrip": false,
     "normalized": false,
     "rstrip": false,
     "single_word": false
   },
   "unk_token": {
-    "content": "[UNK]",
+    "content": "<unk>",
     "lstrip": false,
     "normalized": false,
     "rstrip": false,

tokenizer.json CHANGED

The diff for this file is too large to render.

tokenizer_config.json CHANGED

@@ -1,58 +1,55 @@
 {
   "added_tokens_decoder": {
     "0": {
-      "content": "[PAD]",
+      "content": "<s>",
       "lstrip": false,
       "normalized": false,
       "rstrip": false,
       "single_word": false,
       "special": true
     },
-    "100": {
-      "content": "[UNK]",
+    "1": {
+      "content": "<pad>",
       "lstrip": false,
       "normalized": false,
       "rstrip": false,
       "single_word": false,
       "special": true
     },
-    "101": {
-      "content": "[CLS]",
+    "2": {
+      "content": "</s>",
       "lstrip": false,
       "normalized": false,
       "rstrip": false,
       "single_word": false,
       "special": true
     },
-    "102": {
-      "content": "[SEP]",
+    "3": {
+      "content": "<unk>",
       "lstrip": false,
       "normalized": false,
       "rstrip": false,
       "single_word": false,
       "special": true
     },
-    "103": {
-      "content": "[MASK]",
-      "lstrip": false,
+    "250001": {
+      "content": "<mask>",
+      "lstrip": true,
       "normalized": false,
       "rstrip": false,
       "single_word": false,
       "special": true
     }
   },
-  "clean_up_tokenization_spaces": true,
-  "cls_token": "[CLS]",
-  "do_basic_tokenize": true,
-  "do_lower_case": true,
+  "bos_token": "<s>",
+  "clean_up_tokenization_spaces": false,
+  "cls_token": "<s>",
+  "eos_token": "</s>",
   "extra_special_tokens": {},
-  "mask_token": "[MASK]",
+  "mask_token": "<mask>",
   "model_max_length": 512,
-  "never_split": null,
-  "pad_token": "[PAD]",
-  "sep_token": "[SEP]",
-  "strip_accents": null,
-  "tokenize_chinese_chars": true,
-  "tokenizer_class": "BertTokenizer",
-  "unk_token": "[UNK]"
+  "pad_token": "<pad>",
+  "sep_token": "</s>",
+  "tokenizer_class": "XLMRobertaTokenizer",
+  "unk_token": "<unk>"
 }

training_info.txt CHANGED

@@ -1,6 +1,9 @@
-Base Model: cross-encoder/ms-marco-MiniLM-L6-v2
-Training Samples: 8000
-Epochs: 3
-Batch Size: 16
+Base Model: cross-encoder/mmarco-mMiniLMv2-L12-H384-v1
+Training Samples: 3999
+Epochs: 5
+Batch Size: 32
 Learning Rate: 2e-05
+Weight Decay: 0.01
+Scheduler: warmuplinear
+Warmup Steps: 100
 Max Length: 512
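
The step counts in the training logs follow from the sample count, batch size, and epoch count listed here. The sketch below reproduces that arithmetic, assuming a single device, `gradient_accumulation_steps` of 1 (as listed in the README hyperparameters), and that the final partial batch of each epoch is kept. The helper name `total_steps` is illustrative.

```python
import math


def total_steps(num_samples: int, batch_size: int, epochs: int) -> tuple[int, int]:
    """Return (steps per epoch, total optimizer steps) for one device."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# Previous run: 8000 samples, batch size 16, 3 epochs.
old_per_epoch, old_total = total_steps(8000, 16, 3)

# This commit: 3999 samples, batch size 32, 5 epochs.
new_per_epoch, new_total = total_steps(3999, 32, 5)

# Fraction of the new run spent in the 100 warmup steps.
warmup_fraction = 100 / new_total

print(old_per_epoch, old_total)
print(new_per_epoch, new_total)
print(f"warmup fraction: {warmup_fraction:.0%}")
```

The results match the step columns in the evaluation CSV (500 and 125 steps per epoch, 1500 and 625 steps in total), and the 100 warmup steps cover 16% of the new run.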