Instructions to use TakoData/chart-reranker with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use TakoData/chart-reranker with sentence-transformers:
from sentence_transformers import CrossEncoder

model = CrossEncoder("TakoData/chart-reranker", trust_remote_code=True)

query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
]

scores = model.predict([(query, passage) for passage in passages])
print(scores)
- Notebooks
- Google Colab
- Kaggle
Upload fine-tuned chart reranker model
- .gitattributes +1 -0
- README.md +46 -51
- config.json +10 -9
- eval/CrossEncoderCorrelationEvaluator_validation_results.csv +5 -3
- model.safetensors +2 -2
- special_tokens_map.json +20 -6
- tokenizer.json +0 -0
- tokenizer_config.json +19 -22
- training_info.txt +7 -4
.gitattributes
CHANGED
|
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
|
| 33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
|
|
|
|
|
| 33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
+
tokenizer.json filter=lfs diff=lfs merge=lfs -text
|
README.md
CHANGED
|
@@ -4,16 +4,16 @@ tags:
|
|
| 4 |
- cross-encoder
|
| 5 |
- reranker
|
| 6 |
- generated_from_trainer
|
| 7 |
-
- dataset_size:
|
| 8 |
- loss:BinaryCrossEntropyLoss
|
| 9 |
-
base_model: cross-encoder/
|
| 10 |
pipeline_tag: text-ranking
|
| 11 |
library_name: sentence-transformers
|
| 12 |
metrics:
|
| 13 |
- pearson
|
| 14 |
- spearman
|
| 15 |
model-index:
|
| 16 |
-
- name: CrossEncoder based on cross-encoder/
|
| 17 |
results:
|
| 18 |
- task:
|
| 19 |
type: cross-encoder-correlation
|
|
@@ -23,22 +23,22 @@ model-index:
|
|
| 23 |
type: validation
|
| 24 |
metrics:
|
| 25 |
- type: pearson
|
| 26 |
-
value: 0.
|
| 27 |
name: Pearson
|
| 28 |
- type: spearman
|
| 29 |
-
value: 0.
|
| 30 |
name: Spearman
|
| 31 |
---
|
| 32 |
|
| 33 |
-
# CrossEncoder based on cross-encoder/
|
| 34 |
|
| 35 |
-
This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/
|
| 36 |
|
| 37 |
## Model Details
|
| 38 |
|
| 39 |
### Model Description
|
| 40 |
- **Model Type:** Cross Encoder
|
| 41 |
-
- **Base model:** [cross-encoder/
|
| 42 |
- **Maximum Sequence Length:** 512 tokens
|
| 43 |
- **Number of Output Labels:** 1 label
|
| 44 |
<!-- - **Training Dataset:** Unknown -->
|
|
@@ -70,11 +70,11 @@ from sentence_transformers import CrossEncoder
|
|
| 70 |
model = CrossEncoder("cross_encoder_model_id")
|
| 71 |
# Get scores for pairs of texts
|
| 72 |
pairs = [
|
| 73 |
-
['
|
| 74 |
-
['
|
| 75 |
-
['
|
| 76 |
-
['
|
| 77 |
-
['
|
| 78 |
]
|
| 79 |
scores = model.predict(pairs)
|
| 80 |
print(scores.shape)
|
|
@@ -82,13 +82,13 @@ print(scores.shape)
|
|
| 82 |
|
| 83 |
# Or rank different texts based on similarity to a single text
|
| 84 |
ranks = model.rank(
|
| 85 |
-
'
|
| 86 |
[
|
| 87 |
-
'
|
| 88 |
-
'
|
| 89 |
-
'
|
| 90 |
-
'Title: "
|
| 91 |
-
'
|
| 92 |
]
|
| 93 |
)
|
| 94 |
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
|
|
@@ -129,8 +129,8 @@ You can finetune this model on your own dataset.
|
|
| 129 |
|
| 130 |
| Metric | Value |
|
| 131 |
|:-------------|:-----------|
|
| 132 |
-
| pearson | 0.
|
| 133 |
-
| **spearman** | **0.
|
| 134 |
|
| 135 |
<!--
|
| 136 |
## Bias, Risks and Limitations
|
|
@@ -150,19 +150,19 @@ You can finetune this model on your own dataset.
|
|
| 150 |
|
| 151 |
#### Unnamed Dataset
|
| 152 |
|
| 153 |
-
* Size:
|
| 154 |
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
|
| 155 |
* Approximate statistics based on the first 1000 samples:
|
| 156 |
-
| | sentence_0
|
| 157 |
-
|:--------|:----------------------------------------------------------------------------------------------
|
| 158 |
-
| type | string
|
| 159 |
-
| details | <ul><li>min: 3 characters</li><li>mean:
|
| 160 |
* Samples:
|
| 161 |
-
| sentence_0
|
| 162 |
-
|:-----------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
|
| 163 |
-
| <code>
|
| 164 |
-
| <code>
|
| 165 |
-
| <code>
|
| 166 |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
|
| 167 |
```json
|
| 168 |
{
|
|
@@ -175,8 +175,9 @@ You can finetune this model on your own dataset.
|
|
| 175 |
#### Non-Default Hyperparameters
|
| 176 |
|
| 177 |
- `eval_strategy`: steps
|
| 178 |
-
- `per_device_train_batch_size`:
|
| 179 |
-
- `per_device_eval_batch_size`:
|
|
|
|
| 180 |
|
| 181 |
#### All Hyperparameters
|
| 182 |
<details><summary>Click to expand</summary>
|
|
@@ -185,8 +186,8 @@ You can finetune this model on your own dataset.
|
|
| 185 |
- `do_predict`: False
|
| 186 |
- `eval_strategy`: steps
|
| 187 |
- `prediction_loss_only`: True
|
| 188 |
-
- `per_device_train_batch_size`:
|
| 189 |
-
- `per_device_eval_batch_size`:
|
| 190 |
- `per_gpu_train_batch_size`: None
|
| 191 |
- `per_gpu_eval_batch_size`: None
|
| 192 |
- `gradient_accumulation_steps`: 1
|
|
@@ -198,7 +199,7 @@ You can finetune this model on your own dataset.
|
|
| 198 |
- `adam_beta2`: 0.999
|
| 199 |
- `adam_epsilon`: 1e-08
|
| 200 |
- `max_grad_norm`: 1
|
| 201 |
-
- `num_train_epochs`:
|
| 202 |
- `max_steps`: -1
|
| 203 |
- `lr_scheduler_type`: linear
|
| 204 |
- `lr_scheduler_kwargs`: {}
|
|
@@ -306,21 +307,15 @@ You can finetune this model on your own dataset.
|
|
| 306 |
### Training Logs
|
| 307 |
| Epoch | Step | Training Loss | validation_spearman |
|
| 308 |
|:-----:|:----:|:-------------:|:-------------------:|
|
| 309 |
-
| 0.
|
| 310 |
-
|
|
| 311 |
-
|
|
| 312 |
-
|
|
| 313 |
-
|
|
| 314 |
-
|
|
| 315 |
-
|
|
| 316 |
-
|
|
| 317 |
-
|
|
| 318 |
-
| 2.0 | 1000 | 0.5726 | 0.8471 |
|
| 319 |
-
| 2.2 | 1100 | - | 0.8487 |
|
| 320 |
-
| 2.4 | 1200 | - | 0.8497 |
|
| 321 |
-
| 2.6 | 1300 | - | 0.8522 |
|
| 322 |
-
| 2.8 | 1400 | - | 0.8523 |
|
| 323 |
-
| 3.0 | 1500 | 0.5616 | 0.8529 |
|
| 324 |
|
| 325 |
|
| 326 |
### Framework Versions
|
|
|
|
| 4 |
- cross-encoder
|
| 5 |
- reranker
|
| 6 |
- generated_from_trainer
|
| 7 |
+
- dataset_size:3999
|
| 8 |
- loss:BinaryCrossEntropyLoss
|
| 9 |
+
base_model: cross-encoder/mmarco-mMiniLMv2-L12-H384-v1
|
| 10 |
pipeline_tag: text-ranking
|
| 11 |
library_name: sentence-transformers
|
| 12 |
metrics:
|
| 13 |
- pearson
|
| 14 |
- spearman
|
| 15 |
model-index:
|
| 16 |
+
- name: CrossEncoder based on cross-encoder/mmarco-mMiniLMv2-L12-H384-v1
|
| 17 |
results:
|
| 18 |
- task:
|
| 19 |
type: cross-encoder-correlation
|
|
|
|
| 23 |
type: validation
|
| 24 |
metrics:
|
| 25 |
- type: pearson
|
| 26 |
+
value: 0.7551794832253556
|
| 27 |
name: Pearson
|
| 28 |
- type: spearman
|
| 29 |
+
value: 0.8052608880870304
|
| 30 |
name: Spearman
|
| 31 |
---
|
| 32 |
|
| 33 |
+
# CrossEncoder based on cross-encoder/mmarco-mMiniLMv2-L12-H384-v1
|
| 34 |
|
| 35 |
+
This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/mmarco-mMiniLMv2-L12-H384-v1](https://huggingface.co/cross-encoder/mmarco-mMiniLMv2-L12-H384-v1) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
|
| 36 |
|
| 37 |
## Model Details
|
| 38 |
|
| 39 |
### Model Description
|
| 40 |
- **Model Type:** Cross Encoder
|
| 41 |
+
- **Base model:** [cross-encoder/mmarco-mMiniLMv2-L12-H384-v1](https://huggingface.co/cross-encoder/mmarco-mMiniLMv2-L12-H384-v1) <!-- at revision 1427fd652930e4ba29e8149678df786c240d8825 -->
|
| 42 |
- **Maximum Sequence Length:** 512 tokens
|
| 43 |
- **Number of Output Labels:** 1 label
|
| 44 |
<!-- - **Training Dataset:** Unknown -->
|
|
|
|
| 70 |
model = CrossEncoder("cross_encoder_model_id")
|
| 71 |
# Get scores for pairs of texts
|
| 72 |
pairs = [
|
| 73 |
+
['NVIDIA stock price trend from February 2024 to February 2025', 'Title: "Nvidia Stockpile (Annual)"\nCollections: Companies\nDatasets: StandardIncomeStatement\nChart Type: timeseries:eav_v2\nCanonical forms: "Stockpile"="inventory"\nSources: S&P Global'],
|
| 74 |
+
['What is the price of Costco stock? Answer in as few words as possible.', 'Title: "Costco Quarterly Price to Earnings, Costco Stock (Annual)"\nCollections: Companies\nDatasets: StandardIncomeStatement, CompanyComputedRatiosV2\nChart Type: timeseries:eav_v2\nCanonical forms: "Price to Earnings"="computed_ratio_last_close_price_to_earnings", "Stock"="inventory"'],
|
| 75 |
+
['Who was named EY World Entrepreneur Of The Year 2024?', 'Title: "World Overview"\nCollections: Companies\nChart Type: company:private\nSources: S&P Global'],
|
| 76 |
+
['dubbed movies streaming', 'Title: "How Brits subscribe to film service subscriptions e.g. Sky Go (United Kingdom)"\nCollections: YouGov Trackers\nDatasets: YouGovTrackerValueV2\nChart Type: survey:timeseries\nSources: YouGov'],
|
| 77 |
+
['Virtual Reality (VR) – Meta Quest 3', 'Title: "Meta Overview"\nCollections: Companies\nChart Type: company:finance\nCanonical forms: "Meta"="Meta Platforms, Inc.", "Overview"="Stock Overview"\nSources: S&P Global'],
|
| 78 |
]
|
| 79 |
scores = model.predict(pairs)
|
| 80 |
print(scores.shape)
|
|
|
|
| 82 |
|
| 83 |
# Or rank different texts based on similarity to a single text
|
| 84 |
ranks = model.rank(
|
| 85 |
+
'NVIDIA stock price trend from February 2024 to February 2025',
|
| 86 |
[
|
| 87 |
+
'Title: "Nvidia Stockpile (Annual)"\nCollections: Companies\nDatasets: StandardIncomeStatement\nChart Type: timeseries:eav_v2\nCanonical forms: "Stockpile"="inventory"\nSources: S&P Global',
|
| 88 |
+
'Title: "Costco Quarterly Price to Earnings, Costco Stock (Annual)"\nCollections: Companies\nDatasets: StandardIncomeStatement, CompanyComputedRatiosV2\nChart Type: timeseries:eav_v2\nCanonical forms: "Price to Earnings"="computed_ratio_last_close_price_to_earnings", "Stock"="inventory"',
|
| 89 |
+
'Title: "World Overview"\nCollections: Companies\nChart Type: company:private\nSources: S&P Global',
|
| 90 |
+
'Title: "How Brits subscribe to film service subscriptions e.g. Sky Go (United Kingdom)"\nCollections: YouGov Trackers\nDatasets: YouGovTrackerValueV2\nChart Type: survey:timeseries\nSources: YouGov',
|
| 91 |
+
'Title: "Meta Overview"\nCollections: Companies\nChart Type: company:finance\nCanonical forms: "Meta"="Meta Platforms, Inc.", "Overview"="Stock Overview"\nSources: S&P Global',
|
| 92 |
]
|
| 93 |
)
|
| 94 |
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
|
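The candidate passages in the example pairs all share one line-oriented layout: Title, Collections, optional Datasets, Chart Type, optional Canonical forms, optional Sources. A minimal sketch of a helper that assembles such a string follows. `format_chart_candidate` and its parameter names are hypothetical, not part of this repository or of sentence-transformers, and the field order and comma-joining are inferred only from the sample rows.

```python
from typing import Mapping, Optional, Sequence


def format_chart_candidate(
    title: str,
    collections: Sequence[str],
    chart_type: str,
    datasets: Optional[Sequence[str]] = None,
    canonical_forms: Optional[Mapping[str, str]] = None,
    sources: Optional[Sequence[str]] = None,
) -> str:
    """Hypothetical helper: build a candidate string in the layout of the samples.

    Optional fields are skipped when absent, as they are in the sample rows.
    """
    lines = [f'Title: "{title}"', "Collections: " + ", ".join(collections)]
    if datasets:
        lines.append("Datasets: " + ", ".join(datasets))
    lines.append(f"Chart Type: {chart_type}")
    if canonical_forms:
        pairs = ", ".join(f'"{k}"="{v}"' for k, v in canonical_forms.items())
        lines.append(f"Canonical forms: {pairs}")
    if sources:
        lines.append("Sources: " + ", ".join(sources))
    return "\n".join(lines)


# Rebuild the first sample candidate from its fields.
candidate = format_chart_candidate(
    title="Nvidia Stockpile (Annual)",
    collections=["Companies"],
    datasets=["StandardIncomeStatement"],
    chart_type="timeseries:eav_v2",
    canonical_forms={"Stockpile": "inventory"},
    sources=["S&P Global"],
)
print(candidate)
```

The resulting string can be used as the second element of a `[query, candidate]` pair passed to `model.predict`, or as one entry of the document list given to `model.rank`.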
|
|
|
| 129 |
|
| 130 |
| Metric | Value |
|
| 131 |
|:-------------|:-----------|
|
| 132 |
+
| pearson | 0.7552 |
|
| 133 |
+
| **spearman** | **0.8053** |
|
| 134 |
|
| 135 |
<!--
|
| 136 |
## Bias, Risks and Limitations
|
|
|
|
| 150 |
|
| 151 |
#### Unnamed Dataset
|
| 152 |
|
| 153 |
+
* Size: 3,999 training samples
|
| 154 |
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
|
| 155 |
* Approximate statistics based on the first 1000 samples:
|
| 156 |
+
| | sentence_0 | sentence_1 | label |
|
| 157 |
+
|:--------|:----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|:---------------------------------------------------------------|
|
| 158 |
+
| type | string | string | float |
|
| 159 |
+
| details | <ul><li>min: 3 characters</li><li>mean: 43.12 characters</li><li>max: 99 characters</li></ul> | <ul><li>min: 76 characters</li><li>mean: 181.15 characters</li><li>max: 393 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.46</li><li>max: 1.0</li></ul> |
|
| 160 |
* Samples:
|
| 161 |
+
| sentence_0 | sentence_1 | label |
|
| 162 |
+
|:------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
|
| 163 |
+
| <code>NVIDIA stock price trend from February 2024 to February 2025</code> | <code>Title: "Nvidia Stockpile (Annual)"<br>Collections: Companies<br>Datasets: StandardIncomeStatement<br>Chart Type: timeseries:eav_v2<br>Canonical forms: "Stockpile"="inventory"<br>Sources: S&P Global</code> | <code>0.0</code> |
|
| 164 |
+
| <code>What is the price of Costco stock? Answer in as few words as possible.</code> | <code>Title: "Costco Quarterly Price to Earnings, Costco Stock (Annual)"<br>Collections: Companies<br>Datasets: StandardIncomeStatement, CompanyComputedRatiosV2<br>Chart Type: timeseries:eav_v2<br>Canonical forms: "Price to Earnings"="computed_ratio_last_close_price_to_earnings", "Stock"="inventory"</code> | <code>0.5</code> |
|
| 165 |
+
| <code>Who was named EY World Entrepreneur Of The Year 2024?</code> | <code>Title: "World Overview"<br>Collections: Companies<br>Chart Type: company:private<br>Sources: S&P Global</code> | <code>0.0</code> |
|
| 166 |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
|
| 167 |
```json
|
| 168 |
{
|
|
|
|
| 175 |
#### Non-Default Hyperparameters
|
| 176 |
|
| 177 |
- `eval_strategy`: steps
|
| 178 |
+
- `per_device_train_batch_size`: 32
|
| 179 |
+
- `per_device_eval_batch_size`: 32
|
| 180 |
+
- `num_train_epochs`: 5
|
| 181 |
|
| 182 |
#### All Hyperparameters
|
| 183 |
<details><summary>Click to expand</summary>
|
|
|
|
| 186 |
- `do_predict`: False
|
| 187 |
- `eval_strategy`: steps
|
| 188 |
- `prediction_loss_only`: True
|
| 189 |
+
- `per_device_train_batch_size`: 32
|
| 190 |
+
- `per_device_eval_batch_size`: 32
|
| 191 |
- `per_gpu_train_batch_size`: None
|
| 192 |
- `per_gpu_eval_batch_size`: None
|
| 193 |
- `gradient_accumulation_steps`: 1
|
|
|
|
| 199 |
- `adam_beta2`: 0.999
|
| 200 |
- `adam_epsilon`: 1e-08
|
| 201 |
- `max_grad_norm`: 1
|
| 202 |
+
- `num_train_epochs`: 5
|
| 203 |
- `max_steps`: -1
|
| 204 |
- `lr_scheduler_type`: linear
|
| 205 |
- `lr_scheduler_kwargs`: {}
|
|
|
|
| 307 |
### Training Logs
|
| 308 |
| Epoch | Step | Training Loss | validation_spearman |
|
| 309 |
|:-----:|:----:|:-------------:|:-------------------:|
|
| 310 |
+
| 0.8 | 100 | - | 0.7305 |
|
| 311 |
+
| 1.0 | 125 | - | 0.7516 |
|
| 312 |
+
| 1.6 | 200 | - | 0.7809 |
|
| 313 |
+
| 2.0 | 250 | - | 0.7922 |
|
| 314 |
+
| 2.4 | 300 | - | 0.7947 |
|
| 315 |
+
| 3.0 | 375 | - | 0.8022 |
|
| 316 |
+
| 3.2 | 400 | - | 0.7995 |
|
| 317 |
+
| 4.0 | 500 | 0.5555 | 0.8045 |
|
| 318 |
+
| 4.8 | 600 | - | 0.8053 |
|
| 319 |
|
| 320 |
|
| 321 |
### Framework Versions
|
config.json
CHANGED
|
@@ -1,11 +1,12 @@
|
|
| 1 |
{
|
| 2 |
"architectures": [
|
| 3 |
-
"
|
| 4 |
],
|
| 5 |
"attention_probs_dropout_prob": 0.1,
|
|
|
|
| 6 |
"classifier_dropout": null,
|
| 7 |
"dtype": "float32",
|
| 8 |
-
"
|
| 9 |
"hidden_act": "gelu",
|
| 10 |
"hidden_dropout_prob": 0.1,
|
| 11 |
"hidden_size": 384,
|
|
@@ -17,19 +18,19 @@
|
|
| 17 |
"label2id": {
|
| 18 |
"LABEL_0": 0
|
| 19 |
},
|
| 20 |
-
"layer_norm_eps": 1e-
|
| 21 |
-
"max_position_embeddings":
|
| 22 |
-
"model_type": "
|
| 23 |
"num_attention_heads": 12,
|
| 24 |
-
"num_hidden_layers":
|
| 25 |
-
"pad_token_id":
|
| 26 |
"position_embedding_type": "absolute",
|
| 27 |
"sentence_transformers": {
|
| 28 |
"activation_fn": "torch.nn.modules.linear.Identity",
|
| 29 |
"version": "5.1.1"
|
| 30 |
},
|
| 31 |
"transformers_version": "4.57.1",
|
| 32 |
-
"type_vocab_size":
|
| 33 |
"use_cache": true,
|
| 34 |
-
"vocab_size":
|
| 35 |
}
|
|
|
|
| 1 |
{
|
| 2 |
"architectures": [
|
| 3 |
+
"XLMRobertaForSequenceClassification"
|
| 4 |
],
|
| 5 |
"attention_probs_dropout_prob": 0.1,
|
| 6 |
+
"bos_token_id": 0,
|
| 7 |
"classifier_dropout": null,
|
| 8 |
"dtype": "float32",
|
| 9 |
+
"eos_token_id": 2,
|
| 10 |
"hidden_act": "gelu",
|
| 11 |
"hidden_dropout_prob": 0.1,
|
| 12 |
"hidden_size": 384,
|
|
|
|
| 18 |
"label2id": {
|
| 19 |
"LABEL_0": 0
|
| 20 |
},
|
| 21 |
+
"layer_norm_eps": 1e-05,
|
| 22 |
+
"max_position_embeddings": 514,
|
| 23 |
+
"model_type": "xlm-roberta",
|
| 24 |
"num_attention_heads": 12,
|
| 25 |
+
"num_hidden_layers": 12,
|
| 26 |
+
"pad_token_id": 1,
|
| 27 |
"position_embedding_type": "absolute",
|
| 28 |
"sentence_transformers": {
|
| 29 |
"activation_fn": "torch.nn.modules.linear.Identity",
|
| 30 |
"version": "5.1.1"
|
| 31 |
},
|
| 32 |
"transformers_version": "4.57.1",
|
| 33 |
+
"type_vocab_size": 1,
|
| 34 |
"use_cache": true,
|
| 35 |
+
"vocab_size": 250002
|
| 36 |
}
|
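The updated config reports `max_position_embeddings` of 514, while the model card states a maximum sequence length of 512 tokens. In RoBERTa-style models such as XLM-RoBERTa, position ids start at `pad_token_id + 1`, so the position table is slightly larger than the usable length. The sketch below reproduces that arithmetic from the values in this diff; the variable names are illustrative.

```python
import json

# Subset of the updated config.json shown in this diff.
config = json.loads("""
{
  "model_type": "xlm-roberta",
  "max_position_embeddings": 514,
  "pad_token_id": 1
}
""")

# RoBERTa-style models number positions from pad_token_id + 1, so
# pad_token_id + 1 slots of the position table are not available to real tokens.
usable_positions = config["max_position_embeddings"] - config["pad_token_id"] - 1
print(usable_positions)
```

The result agrees with the `model_max_length` of 512 in tokenizer_config.json.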
eval/CrossEncoderCorrelationEvaluator_validation_results.csv
CHANGED
|
@@ -1,4 +1,6 @@
|
|
| 1 |
epoch,steps,Pearson_Correlation,Spearman_Correlation
|
| 2 |
-
1.0,
|
| 3 |
-
2.0,
|
| 4 |
-
3.0,
|
|
|
|
|
|
|
|
|
| 1 |
epoch,steps,Pearson_Correlation,Spearman_Correlation
|
| 2 |
+
1.0,125,0.7309011622271578,0.7516436739892058
|
| 3 |
+
2.0,250,0.7588798492491784,0.7921942665271138
|
| 4 |
+
3.0,375,0.7523098419884638,0.8021607473982901
|
| 5 |
+
4.0,500,0.7556591422221105,0.8044702495085688
|
| 6 |
+
5.0,625,0.7553359223115874,0.8050527106824349
|
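To see which epoch scored best on each metric, the updated results can be parsed directly. This is a small sketch that embeds the five rows from the diff as a string instead of reading the file from the repository; the variable names are illustrative.

```python
import csv
import io

# Rows copied from the updated validation results CSV in this diff.
RESULTS_CSV = """epoch,steps,Pearson_Correlation,Spearman_Correlation
1.0,125,0.7309011622271578,0.7516436739892058
2.0,250,0.7588798492491784,0.7921942665271138
3.0,375,0.7523098419884638,0.8021607473982901
4.0,500,0.7556591422221105,0.8044702495085688
5.0,625,0.7553359223115874,0.8050527106824349
"""

rows = list(csv.DictReader(io.StringIO(RESULTS_CSV)))

# Pick the row with the highest value for each correlation metric.
best_spearman = max(rows, key=lambda r: float(r["Spearman_Correlation"]))
best_pearson = max(rows, key=lambda r: float(r["Pearson_Correlation"]))

print("best Spearman:", best_spearman["epoch"], best_spearman["steps"])
print("best Pearson:", best_pearson["epoch"], best_pearson["steps"])
```

On these rows Spearman keeps improving through epoch 5, while Pearson peaks at epoch 2.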
model.safetensors
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
-
size
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:5f41f04568b485258127e43e4fb378afcbdb017f968f848d88687ca2ba76591e
|
| 3 |
+
size 470588492
|
special_tokens_map.json
CHANGED
|
@@ -1,34 +1,48 @@
|
|
| 1 |
{
|
| 2 |
"cls_token": {
|
| 3 |
-
"content": "
|
| 4 |
"lstrip": false,
|
| 5 |
"normalized": false,
|
| 6 |
"rstrip": false,
|
| 7 |
"single_word": false
|
| 8 |
},
|
| 9 |
-
"
|
| 10 |
-
"content": "
|
| 11 |
"lstrip": false,
|
| 12 |
"normalized": false,
|
| 13 |
"rstrip": false,
|
| 14 |
"single_word": false
|
| 15 |
},
|
| 16 |
"pad_token": {
|
| 17 |
-
"content": "
|
| 18 |
"lstrip": false,
|
| 19 |
"normalized": false,
|
| 20 |
"rstrip": false,
|
| 21 |
"single_word": false
|
| 22 |
},
|
| 23 |
"sep_token": {
|
| 24 |
-
"content": "
|
| 25 |
"lstrip": false,
|
| 26 |
"normalized": false,
|
| 27 |
"rstrip": false,
|
| 28 |
"single_word": false
|
| 29 |
},
|
| 30 |
"unk_token": {
|
| 31 |
-
"content": "
|
| 32 |
"lstrip": false,
|
| 33 |
"normalized": false,
|
| 34 |
"rstrip": false,
|
|
|
|
| 1 |
{
|
| 2 |
+
"bos_token": {
|
| 3 |
+
"content": "<s>",
|
| 4 |
+
"lstrip": false,
|
| 5 |
+
"normalized": false,
|
| 6 |
+
"rstrip": false,
|
| 7 |
+
"single_word": false
|
| 8 |
+
},
|
| 9 |
"cls_token": {
|
| 10 |
+
"content": "<s>",
|
| 11 |
"lstrip": false,
|
| 12 |
"normalized": false,
|
| 13 |
"rstrip": false,
|
| 14 |
"single_word": false
|
| 15 |
},
|
| 16 |
+
"eos_token": {
|
| 17 |
+
"content": "</s>",
|
| 18 |
"lstrip": false,
|
| 19 |
"normalized": false,
|
| 20 |
"rstrip": false,
|
| 21 |
"single_word": false
|
| 22 |
},
|
| 23 |
+
"mask_token": {
|
| 24 |
+
"content": "<mask>",
|
| 25 |
+
"lstrip": true,
|
| 26 |
+
"normalized": false,
|
| 27 |
+
"rstrip": false,
|
| 28 |
+
"single_word": false
|
| 29 |
+
},
|
| 30 |
"pad_token": {
|
| 31 |
+
"content": "<pad>",
|
| 32 |
"lstrip": false,
|
| 33 |
"normalized": false,
|
| 34 |
"rstrip": false,
|
| 35 |
"single_word": false
|
| 36 |
},
|
| 37 |
"sep_token": {
|
| 38 |
+
"content": "</s>",
|
| 39 |
"lstrip": false,
|
| 40 |
"normalized": false,
|
| 41 |
"rstrip": false,
|
| 42 |
"single_word": false
|
| 43 |
},
|
| 44 |
"unk_token": {
|
| 45 |
+
"content": "<unk>",
|
| 46 |
"lstrip": false,
|
| 47 |
"normalized": false,
|
| 48 |
"rstrip": false,
|
tokenizer.json
CHANGED
|
The diff for this file is too large to render.
|
|
|
tokenizer_config.json
CHANGED
|
@@ -1,58 +1,55 @@
|
|
| 1 |
{
|
| 2 |
"added_tokens_decoder": {
|
| 3 |
"0": {
|
| 4 |
-
"content": "
|
| 5 |
"lstrip": false,
|
| 6 |
"normalized": false,
|
| 7 |
"rstrip": false,
|
| 8 |
"single_word": false,
|
| 9 |
"special": true
|
| 10 |
},
|
| 11 |
-
"
|
| 12 |
-
"content": "
|
| 13 |
"lstrip": false,
|
| 14 |
"normalized": false,
|
| 15 |
"rstrip": false,
|
| 16 |
"single_word": false,
|
| 17 |
"special": true
|
| 18 |
},
|
| 19 |
-
"
|
| 20 |
-
"content": "
|
| 21 |
"lstrip": false,
|
| 22 |
"normalized": false,
|
| 23 |
"rstrip": false,
|
| 24 |
"single_word": false,
|
| 25 |
"special": true
|
| 26 |
},
|
| 27 |
-
"
|
| 28 |
-
"content": "
|
| 29 |
"lstrip": false,
|
| 30 |
"normalized": false,
|
| 31 |
"rstrip": false,
|
| 32 |
"single_word": false,
|
| 33 |
"special": true
|
| 34 |
},
|
| 35 |
-
"
|
| 36 |
-
"content": "
|
| 37 |
-
"lstrip":
|
| 38 |
"normalized": false,
|
| 39 |
"rstrip": false,
|
| 40 |
"single_word": false,
|
| 41 |
"special": true
|
| 42 |
}
|
| 43 |
},
|
| 44 |
-
"
|
| 45 |
-
"
|
| 46 |
-
"
|
| 47 |
-
"
|
| 48 |
"extra_special_tokens": {},
|
| 49 |
-
"mask_token": "
|
| 50 |
"model_max_length": 512,
|
| 51 |
-
"
|
| 52 |
-
"
|
| 53 |
-
"
|
| 54 |
-
"
|
| 55 |
-
"tokenize_chinese_chars": true,
|
| 56 |
-
"tokenizer_class": "BertTokenizer",
|
| 57 |
-
"unk_token": "[UNK]"
|
| 58 |
}
|
|
|
|
| 1 |
{
|
| 2 |
"added_tokens_decoder": {
|
| 3 |
"0": {
|
| 4 |
+
"content": "<s>",
|
| 5 |
"lstrip": false,
|
| 6 |
"normalized": false,
|
| 7 |
"rstrip": false,
|
| 8 |
"single_word": false,
|
| 9 |
"special": true
|
| 10 |
},
|
| 11 |
+
"1": {
|
| 12 |
+
"content": "<pad>",
|
| 13 |
"lstrip": false,
|
| 14 |
"normalized": false,
|
| 15 |
"rstrip": false,
|
| 16 |
"single_word": false,
|
| 17 |
"special": true
|
| 18 |
},
|
| 19 |
+
"2": {
|
| 20 |
+
"content": "</s>",
|
| 21 |
"lstrip": false,
|
| 22 |
"normalized": false,
|
| 23 |
"rstrip": false,
|
| 24 |
"single_word": false,
|
| 25 |
"special": true
|
| 26 |
},
|
| 27 |
+
"3": {
|
| 28 |
+
"content": "<unk>",
|
| 29 |
"lstrip": false,
|
| 30 |
"normalized": false,
|
| 31 |
"rstrip": false,
|
| 32 |
"single_word": false,
|
| 33 |
"special": true
|
| 34 |
},
|
| 35 |
+
"250001": {
|
| 36 |
+
"content": "<mask>",
|
| 37 |
+
"lstrip": true,
|
| 38 |
"normalized": false,
|
| 39 |
"rstrip": false,
|
| 40 |
"single_word": false,
|
| 41 |
"special": true
|
| 42 |
}
|
| 43 |
},
|
| 44 |
+
"bos_token": "<s>",
|
| 45 |
+
"clean_up_tokenization_spaces": false,
|
| 46 |
+
"cls_token": "<s>",
|
| 47 |
+
"eos_token": "</s>",
|
| 48 |
"extra_special_tokens": {},
|
| 49 |
+
"mask_token": "<mask>",
|
| 50 |
"model_max_length": 512,
|
| 51 |
+
"pad_token": "<pad>",
|
| 52 |
+
"sep_token": "</s>",
|
| 53 |
+
"tokenizer_class": "XLMRobertaTokenizer",
|
| 54 |
+
"unk_token": "<unk>"
|
| 55 |
}
|
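The tokenizer class changes from `BertTokenizer` to `XLMRobertaTokenizer`, so `cls_token` and `sep_token` are now `<s>` and `</s>`. As a rough illustration of what that means for a query/chart pair, the sketch below lays out the RoBERTa-style pair template (`<s> A </s></s> B </s>`) using the special tokens from the new config. It is a string-level illustration only: `pair_template` is a hypothetical name, and the real tokenizer operates on subword ids, not on whitespace-joined text.

```python
# Special tokens taken from the updated tokenizer_config.json in this diff.
tokenizer_config = {
    "cls_token": "<s>",
    "sep_token": "</s>",
    "model_max_length": 512,
}


def pair_template(query: str, passage: str, cfg: dict) -> str:
    """Hypothetical helper: schematic layout of a RoBERTa-style sentence pair."""
    cls, sep = cfg["cls_token"], cfg["sep_token"]
    # RoBERTa-family tokenizers separate the two segments with a doubled separator.
    return f"{cls} {query} {sep}{sep} {passage} {sep}"


layout = pair_template("dubbed movies streaming", 'Title: "World Overview"', tokenizer_config)
print(layout)
```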
training_info.txt
CHANGED
|
@@ -1,6 +1,9 @@
|
|
| 1 |
-
Base Model: cross-encoder/
|
| 2 |
-
Training Samples:
|
| 3 |
-
Epochs:
|
| 4 |
-
Batch Size:
|
| 5 |
Learning Rate: 2e-05
|
| 6 |
Max Length: 512
|
|
|
|
| 1 |
+
Base Model: cross-encoder/mmarco-mMiniLMv2-L12-H384-v1
|
| 2 |
+
Training Samples: 3999
|
| 3 |
+
Epochs: 5
|
| 4 |
+
Batch Size: 32
|
| 5 |
Learning Rate: 2e-05
|
| 6 |
+
Weight Decay: 0.01
|
| 7 |
+
Scheduler: warmuplinear
|
| 8 |
+
Warmup Steps: 100
|
| 9 |
Max Length: 512
|
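The step counts in the evaluation CSV and training logs follow from the values in training_info.txt. The sketch below works through that arithmetic, assuming a single device, `gradient_accumulation_steps` of 1 (as listed in the hyperparameters), and that the final partial batch of each epoch is kept; the variable names are illustrative.

```python
import math

# Values from the updated training_info.txt in this diff.
training_samples = 3999
epochs = 5
batch_size = 32
warmup_steps = 100

# Assumes one device, no gradient accumulation, and that the last partial batch is kept.
steps_per_epoch = math.ceil(training_samples / batch_size)
total_steps = steps_per_epoch * epochs
warmup_fraction = warmup_steps / total_steps

print(steps_per_epoch, total_steps, warmup_fraction)
```

Under these assumptions each epoch is 125 steps, which lines up with the epoch boundaries at steps 125, 250, 375, 500, and 625 in the validation CSV and with step 100 being logged as epoch 0.8.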