noahjax committed on
Commit 5192992 · verified · 1 Parent(s): d23f1b5

Upload fine-tuned chart reranker model

README.md CHANGED
@@ -4,7 +4,7 @@ tags:
  - cross-encoder
  - reranker
  - generated_from_trainer
- - dataset_size:20347
  - loss:BinaryCrossEntropyLoss
  base_model: Alibaba-NLP/gte-multilingual-reranker-base
  pipeline_tag: text-ranking
@@ -23,10 +23,10 @@ model-index:
  type: validation
  metrics:
  - type: pearson
- value: 0.8381245620713855
  name: Pearson
  - type: spearman
- value: 0.8388188648567115
  name: Spearman
  ---
 
@@ -70,11 +70,11 @@ from sentence_transformers import CrossEncoder
  model = CrossEncoder("cross_encoder_model_id")
  # Get scores for pairs of texts
  pairs = [
- ['Thanks, now you have everything pick the most important item or 2 or three if you find it really appropriate from each group. Just simplify this list a bit, to make sure I have my micro nutrients, vitamins, whatever checked off.', 'Title: "Natural Grocers by Vitamin Cottage Overview"\nCollections: Companies\nDatasets: InstrumentClosePrice1Day\nChart Type: timeseries:eav_v3\nCanonical forms: "Natural Grocers by Vitamin Cottage"="closing_price"'],
- ['How do people feel about Nicola Sturgeon?', 'Title: "Nicola Sturgeon fame & popularity tracker (United Kingdom)"\nCollections: YouGov Trackers\nDatasets: YouGovTrackerValueV2\nChart Type: survey:timeseries\nSources: YouGov'],
- ['Create a skit about hino. It should be a horror theme and humor in the end. Without the need of driving a truck. it can be about hino genuine spareparts or technician services', 'Title: "Hino Motors Overview"\nCollections: Companies\nChart Type: company:finance\nCanonical forms: "Hino Motors"="Hino Motors, Ltd.", "Overview"="Stock Overview"\nSources: S&P Global'],
- ['no i mean talk about the trends in school', 'Title: "Should private schools be banned? (United Kingdom)"\nCollections: YouGov Trackers\nDatasets: YouGovTrackerValueV2\nChart Type: survey:timeseries\nSources: YouGov'],
- ['Exchange rate Moroccan dirham to euro 29 October 2025', 'Title: "Conversion rate from EUR to MAD"\nCollections: Foreign Exchange Rates\nDatasets: Forex\nChart Type: exchange:currency\nSources: Xignite'],
  ]
  scores = model.predict(pairs)
  print(scores.shape)
@@ -82,13 +82,13 @@ print(scores.shape)
 
  # Or rank different texts based on similarity to a single text
  ranks = model.rank(
- 'Thanks, now you have everything pick the most important item or 2 or three if you find it really appropriate from each group. Just simplify this list a bit, to make sure I have my micro nutrients, vitamins, whatever checked off.',
  [
- 'Title: "Natural Grocers by Vitamin Cottage Overview"\nCollections: Companies\nDatasets: InstrumentClosePrice1Day\nChart Type: timeseries:eav_v3\nCanonical forms: "Natural Grocers by Vitamin Cottage"="closing_price"',
- 'Title: "Nicola Sturgeon fame & popularity tracker (United Kingdom)"\nCollections: YouGov Trackers\nDatasets: YouGovTrackerValueV2\nChart Type: survey:timeseries\nSources: YouGov',
- 'Title: "Hino Motors Overview"\nCollections: Companies\nChart Type: company:finance\nCanonical forms: "Hino Motors"="Hino Motors, Ltd.", "Overview"="Stock Overview"\nSources: S&P Global',
- 'Title: "Should private schools be banned? (United Kingdom)"\nCollections: YouGov Trackers\nDatasets: YouGovTrackerValueV2\nChart Type: survey:timeseries\nSources: YouGov',
- 'Title: "Conversion rate from EUR to MAD"\nCollections: Foreign Exchange Rates\nDatasets: Forex\nChart Type: exchange:currency\nSources: Xignite',
  ]
  )
  # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
@@ -129,8 +129,8 @@ You can finetune this model on your own dataset.
 
  | Metric | Value |
  |:-------------|:-----------|
- | pearson | 0.8381 |
- | **spearman** | **0.8388** |
 
  <!--
  ## Bias, Risks and Limitations
@@ -150,19 +150,19 @@ You can finetune this model on your own dataset.
 
  #### Unnamed Dataset
 
- * Size: 20,347 training samples
  * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence_0 | sentence_1 | label |
- |:--------|:-----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|:---------------------------------------------------------------|
- | type | string | string | float |
- | details | <ul><li>min: 1 characters</li><li>mean: 84.39 characters</li><li>max: 943 characters</li></ul> | <ul><li>min: 74 characters</li><li>mean: 180.44 characters</li><li>max: 396 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
  * Samples:
- | sentence_0 | sentence_1 | label |
- |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
- | <code>Thanks, now you have everything pick the most important item or 2 or three if you find it really appropriate from each group. Just simplify this list a bit, to make sure I have my micro nutrients, vitamins, whatever checked off.</code> | <code>Title: "Natural Grocers by Vitamin Cottage Overview"<br>Collections: Companies<br>Datasets: InstrumentClosePrice1Day<br>Chart Type: timeseries:eav_v3<br>Canonical forms: "Natural Grocers by Vitamin Cottage"="closing_price"</code> | <code>0.0</code> |
- | <code>How do people feel about Nicola Sturgeon?</code> | <code>Title: "Nicola Sturgeon fame & popularity tracker (United Kingdom)"<br>Collections: YouGov Trackers<br>Datasets: YouGovTrackerValueV2<br>Chart Type: survey:timeseries<br>Sources: YouGov</code> | <code>1.0</code> |
- | <code>Create a skit about hino. It should be a horror theme and humor in the end. Without the need of driving a truck. it can be about hino genuine spareparts or technician services</code> | <code>Title: "Hino Motors Overview"<br>Collections: Companies<br>Chart Type: company:finance<br>Canonical forms: "Hino Motors"="Hino Motors, Ltd.", "Overview"="Stock Overview"<br>Sources: S&P Global</code> | <code>0.5</code> |
  * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
  ```json
  {
@@ -308,40 +308,48 @@ You can finetune this model on your own dataset.
  ### Training Logs
  | Epoch | Step | Training Loss | validation_spearman |
  |:------:|:----:|:-------------:|:-------------------:|
- | 0.1572 | 100 | - | 0.7137 |
- | 0.3145 | 200 | - | 0.7573 |
- | 0.4717 | 300 | - | 0.7748 |
- | 0.6289 | 400 | - | 0.7888 |
- | 0.7862 | 500 | 0.5153 | 0.8000 |
- | 0.9434 | 600 | - | 0.8039 |
- | 1.0 | 636 | - | 0.8044 |
- | 1.1006 | 700 | - | 0.8065 |
- | 1.2579 | 800 | - | 0.8167 |
- | 1.4151 | 900 | - | 0.8164 |
- | 1.5723 | 1000 | 0.445 | 0.8192 |
- | 1.7296 | 1100 | - | 0.8225 |
- | 1.8868 | 1200 | - | 0.8287 |
- | 2.0 | 1272 | - | 0.8284 |
- | 2.0440 | 1300 | - | 0.8281 |
- | 2.2013 | 1400 | - | 0.8255 |
- | 2.3585 | 1500 | 0.4102 | 0.8276 |
- | 2.5157 | 1600 | - | 0.8305 |
- | 2.6730 | 1700 | - | 0.8343 |
- | 2.8302 | 1800 | - | 0.8301 |
- | 2.9874 | 1900 | - | 0.8351 |
- | 3.0 | 1908 | - | 0.8355 |
- | 3.1447 | 2000 | 0.3904 | 0.8336 |
- | 3.3019 | 2100 | - | 0.8319 |
- | 3.4591 | 2200 | - | 0.8319 |
- | 3.6164 | 2300 | - | 0.8308 |
- | 3.7736 | 2400 | - | 0.8331 |
- | 3.9308 | 2500 | 0.3741 | 0.8370 |
- | 4.0 | 2544 | - | 0.8383 |
- | 4.0881 | 2600 | - | 0.8369 |
- | 4.2453 | 2700 | - | 0.8385 |
- | 4.4025 | 2800 | - | 0.8368 |
- | 4.5597 | 2900 | - | 0.8370 |
- | 4.7170 | 3000 | 0.3643 | 0.8388 |
 
 
  ### Framework Versions
 
  - cross-encoder
  - reranker
  - generated_from_trainer
+ - dataset_size:27981
  - loss:BinaryCrossEntropyLoss
  base_model: Alibaba-NLP/gte-multilingual-reranker-base
  pipeline_tag: text-ranking
 
  type: validation
  metrics:
  - type: pearson
+ value: 0.8683862942248027
  name: Pearson
  - type: spearman
+ value: 0.8672220121041904
  name: Spearman
  ---
 
 
  model = CrossEncoder("cross_encoder_model_id")
  # Get scores for pairs of texts
  pairs = [
+ ["How has Kody Clemens' batting performance changed over the last few seasons?", 'Title: "Kody Clemens"\nCollections: MLB\nDatasets: BaseballPlayers\nChart Type: athlete:baseball'],
+ ['What amount of accrued liabilities does Walmart have?', 'Title: "Walmart Balance Sheet"\nCollections: Companies\nChart Type: company:finance\nCanonical forms: "Walmart"="Walmart Inc.", "Balance Sheet"="Financials Overview"\nSources: S&P Global'],
+ ['์ด ์„ฑ์žฅ ์ „๋žต์€ ์„ธ๊ณ„ํ™”์™€ ํƒˆ๊ณต์—…ํ™” ์†์—์„œ ์ „ํ†ต์  ์ œ์กฐ์—… ๋ณดํ˜ธ์— ์ดˆ์ ์„ ๋งž์ถ”๋ฉฐ, ์ด๋Š” ํ•ด๋‹น ๋ถ€๋ฌธ\n์˜ ํ’ˆ์งˆ๊ณผ ์ƒ์‚ฐ์„ฑ์„ ์œ ์ง€ํ•˜๋ฉด์„œ ๊ฐ€๊ฒฉ์„ ๋‚ฎ๊ฒŒ ์œ ์ง€ํ•ด์•ผ ํ•จ์„ ์š”๊ตฌํ•œ๋‹ค. ํ•ต์‹ฌ์€ ์ž„๊ธˆ ์ ˆ์ œ๋ฅผ ํ†ตํ•œ\n๋…ธ๋™ ๋น„์šฉ ํ†ต์ œ์— ์žˆ๋‹ค(Johnston, ๋ณธ ์ฑ…). ์™ธ๋ถ€ ์ˆ˜์š”๊ฐ€ ๋‚ด์ˆ˜ ๋ถ€์กฑ์„ ์ƒ์‡„ํ•˜๋Š” ํ•œ, ์ž„๊ธˆ ์ ˆ์ œ๋Š” ์„ฑ\n์žฅ์„ ์ €ํ•ดํ•˜์ง€ ์•Š๋Š”๋‹ค.\n์ž„๊ธˆ ์ ˆ์ œ์™€ ์ œ์กฐ์—… ๋…ธ๋™์˜ ์งˆ์  ๋ณด์กด์€ ๋…ธ๋™์‹œ์žฅ ๋‚ด๋ถ€์ž ๋ณดํ˜ธ, ํˆฌ์ž ๋ฐ ๊ธฐ์ˆ  ๊ด€๋ จ ์ œ์กฐ์—… ๋…ธ\n์กฐ์™€์˜ ๊ธด๋ฐ€ํ•œ ํ˜‘๋ ฅ, ๊ทธ๋ฆฌ๊ณ  ์ถ”๊ฐ€ ๊ต์œก ๋˜๋Š” ํ‰์ƒ๊ต์œก ๊ธฐ๊ด€๊ณผ์˜ ์—ฐ๊ณ„๋กœ ๊ธฐ์ˆ  ํ–ฅ์ƒ์„ ํ†ตํ•ด ๋‹ฌ์„ฑ\n๋œ๋‹ค. ์ œ์กฐ์—… ํ•ต์‹ฌ ๊ทผ๋กœ์ž๋“ค์€ ์ž„๊ธˆ ์–ต์ œ์™€ ๊ธฐ์—… ๋‚ด ์ง๋ฌด ๋ณ€๊ฒฝ ์˜ํ–ฅ ๋˜๋Š” ๊ทผ๋กœ ์‹œ๊ฐ„ ๋ณ€๋™๊ณผ ๊ฐ™์€\n๋‚ด๋ถ€ ์œ ์—ฐ์„ฑ์„ ๋Œ€๊ฐ€๋กœ ๊ณ ์šฉ ๋ณดํ˜ธ๋ฅผ ์•ฝ์†๋ฐ›๋Š”๋‹ค. ๊ณต์žฅ ๋‹จ์œ„ ๋…ธ๋™ ๋Œ€ํ‘œ๋“ค์€ ๋‹จ๊ธฐ ์ž„๊ธˆ ์ธ์ƒ๋ณด๋‹ค\n์žฅ๊ธฐ ํˆฌ์ž์™€ ๊ณ ์šฉ ์•ˆ์ •์„ ์„ ํ˜ธํ•˜๋ฏ€๋กœ, ์ง€์—ญ ๊ณต์žฅ ๋‹จ์œ„ ํ˜‘์•ฝ์ด ๋ˆ„์ ๋˜์–ด ๋…ธ์กฐ์˜ ์ž„๊ธˆ ์–ต์ œ๋ผ๋Š”\n๋ถ€๋ฌธ๋ณ„ ์ •์ฑ…์„ ํ˜•์„ฑํ•œ๋‹ค.\n์ˆ˜์ถœ ์—ญ๋Ÿ‰์ด ์ด ์ „๋žต์˜ ํ•ต์‹ฌ์ด๋ฏ€๋กœ ์‹ค์งˆ ํ™˜์œจ์ด ์ค‘๋Œ€ํ•œ ๊ด€์‹ฌ์‚ฌ์ด๋‹ค. ์žฌ์ •ยทํ†ตํ™” ์ •์ฑ… ์™„ํ™”๋‚˜\n์ž„๊ธˆ ์ธ์ƒ ๋“ฑ ์‹ค์งˆ ํ™˜์œจ์— ๋ถ€์ •์  ์˜ํ–ฅ์„ ๋ฏธ์น  ์ˆ˜ ์žˆ๋Š” ์ •์ฑ…๋“ค์€ ์ œ๋„์ ยท์ •์น˜์ ์œผ๋กœ ์–ต์ œ๋œ๋‹ค.\n์ด๋Ÿฌํ•œ ์ •์ฑ… ๋Œ€์‘์€ ๊ต์œก ๋ฐ ๋ณด์œก์— ๋Œ€ํ•œ ์žฌ์ • ์ง€์ถœ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ๋…ธ๋™ ์‹œ์žฅ ์ •์ฑ…์—๋„ ํŒŒ๊ธ‰ ํšจ๊ณผ\n๋ฅผ ๋ฏธ์นœ๋‹ค.\n์ˆ˜์š” ์ž๊ทน์ด ๋ถˆ๊ฐ€๋Šฅํ•œ ์ƒํ™ฉ์—์„œ, ์ตœ์ € ์ž„๊ธˆ์„ ๋‚ฎ์ถ”๊ธฐ ์œ„ํ•œ ๊ณต๊ธ‰ ์ธก๋ฉด์˜ ์กฐ์น˜๊ฐ€ ๋„์ž…๋œ๋‹ค. ์ด\n์ „๋žต์€ ๋˜ํ•œ ๊ตญ๋‚ด ์„œ๋น„์Šค๋ฅผ ์ €๋ ดํ•˜๊ฒŒ ๋งŒ๋“œ๋Š” ์ €๋ ดํ•˜๊ณ  ์œ ์—ฐํ•œ ์„œ๋น„์Šค ๋ถ€๋ฌธ์˜ ์ถœํ˜„์— ์˜์กดํ•œ๋‹ค.\n๋”ฐ๋ผ์„œ ์ด์ค‘ํ™”์™€ ๊ณต๊ธ‰ ์ธก๋ฉด์˜ ๋…ธ๋™ ์‹œ์žฅ ์ •์ฑ…์€ ๊ฒฝ์ œ ์ „๋ฌธํ™” ํŒจํ„ด์— ์ง์ ‘์ ์œผ๋กœ ๊ธฐ์—ฌํ•œ๋‹ค\n(Palier and Thelen 2010; Hassel 2014). 
๊ธฐ์—…๋“ค์€ ์‚ฐ์—… ๊ตฌ์กฐ์กฐ์ •์„ ํ†ตํ•ด ์ƒ์‚ฐ ๊ณผ์ •์˜ ์ƒ์‚ฐ์„ฑ์ด\n๋‚ฎ์€ ์„œ๋น„์Šค ๋ถ€๋ฌธ์„ ๊ณ ์ƒ์‚ฐ์„ฑ ์ œ์กฐ ๋ถ€๋ฌธ์—์„œ ๋ถ„๋ฆฌํ•ด๋‚ธ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด ๊ธฐ์—…์€ ๋‚ด๋ถ€์ ์œผ๋กœ ๋…ธ๋™๋ ฅ\n์„ ์„ธ๋ถ„ํ™”ํ•˜๊ณ  ๋…ธ๋™ ์‹œ์žฅ ์ด์›ํ™”๋ฅผ ํ—ˆ์šฉํ•˜๋Š” ์‹œ์žฅ ๊ทœ์น™ ๋ณ€ํ™”๋ฅผ ๋„์ž…ํ•œ๋‹ค.', 'Title: "South Korea Exports"\nCollections: World Bank Indicators\nDatasets: WorldBankIndicatorsData\nChart Type: timeseries:eav_v3\nCanonical forms: "Exports"="exports_of_goods_and_services"\nSources: The World Bank'],
+ ['AYANEO Pocket Ace compact high-performance 2025', 'Title: "Mitsui High-tec Overview"\nCollections: Companies\nChart Type: company:finance\nCanonical forms: "Mitsui High-tec"="Mitsui High-tec, Inc.", "Overview"="Stock Overview"\nSources: S&P Global'],
+ ['Traoré 2024 trade exchanges China Senegal 2.5 billion 2019', 'Title: "Senegal Exports"\nCollections: World Bank Indicators\nDatasets: WorldBankIndicatorsData\nChart Type: timeseries:eav_v3\nCanonical forms: "Exports"="exports_of_goods_and_services"\nSources: The World Bank'],
  ]
  scores = model.predict(pairs)
  print(scores.shape)
 
 
  # Or rank different texts based on similarity to a single text
  ranks = model.rank(
+ "How has Kody Clemens' batting performance changed over the last few seasons?",
  [
+ 'Title: "Kody Clemens"\nCollections: MLB\nDatasets: BaseballPlayers\nChart Type: athlete:baseball',
+ 'Title: "Walmart Balance Sheet"\nCollections: Companies\nChart Type: company:finance\nCanonical forms: "Walmart"="Walmart Inc.", "Balance Sheet"="Financials Overview"\nSources: S&P Global',
+ 'Title: "South Korea Exports"\nCollections: World Bank Indicators\nDatasets: WorldBankIndicatorsData\nChart Type: timeseries:eav_v3\nCanonical forms: "Exports"="exports_of_goods_and_services"\nSources: The World Bank',
+ 'Title: "Mitsui High-tec Overview"\nCollections: Companies\nChart Type: company:finance\nCanonical forms: "Mitsui High-tec"="Mitsui High-tec, Inc.", "Overview"="Stock Overview"\nSources: S&P Global',
+ 'Title: "Senegal Exports"\nCollections: World Bank Indicators\nDatasets: WorldBankIndicatorsData\nChart Type: timeseries:eav_v3\nCanonical forms: "Exports"="exports_of_goods_and_services"\nSources: The World Bank',
  ]
  )
  # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
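The second element of each pair in the snippet above is a chart descriptor: newline-joined `Title`, `Collections`, `Datasets`, `Chart Type`, `Canonical forms`, and `Sources` fields, with optional fields omitted. A minimal sketch of building such a string is below. `format_chart_document` and its parameters are hypothetical names, not part of sentence-transformers or this repository. The field order is inferred from the sample pairs, and joining multiple values with `", "` is an assumption.

```python
from typing import Mapping, Optional, Sequence


def format_chart_document(
    title: str,
    collections: Sequence[str],
    chart_type: str,
    datasets: Optional[Sequence[str]] = None,
    canonical_forms: Optional[Mapping[str, str]] = None,
    sources: Optional[Sequence[str]] = None,
) -> str:
    """Build a chart descriptor string (hypothetical helper; layout inferred from the samples)."""
    lines = [f'Title: "{title}"', "Collections: " + ", ".join(collections)]
    if datasets:
        lines.append("Datasets: " + ", ".join(datasets))
    lines.append(f"Chart Type: {chart_type}")
    if canonical_forms:
        # Each canonical form is rendered as "surface"="canonical", as in the samples.
        forms = ", ".join(f'"{k}"="{v}"' for k, v in canonical_forms.items())
        lines.append(f"Canonical forms: {forms}")
    if sources:
        lines.append("Sources: " + ", ".join(sources))
    return "\n".join(lines)


doc = format_chart_document(
    title="Kody Clemens",
    collections=["MLB"],
    chart_type="athlete:baseball",
    datasets=["BaseballPlayers"],
)
print(doc)
```

A string built this way could be passed as the document side of `model.predict` or `model.rank`, assuming the deployed descriptors follow the same layout as the training samples.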
 
 
  | Metric | Value |
  |:-------------|:-----------|
+ | pearson | 0.8684 |
+ | **spearman** | **0.8672** |
 
  <!--
  ## Bias, Risks and Limitations
 
 
  #### Unnamed Dataset
 
+ * Size: 27,981 training samples
  * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence_0 | sentence_1 | label |
+ |:--------|:----------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|:---------------------------------------------------------------|
+ | type | string | string | float |
+ | details | <ul><li>min: 6 characters</li><li>mean: 90.6 characters</li><li>max: 993 characters</li></ul> | <ul><li>min: 72 characters</li><li>mean: 172.97 characters</li><li>max: 458 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.44</li><li>max: 1.0</li></ul> |
  * Samples:
+ | sentence_0 | sentence_1 | label |
+ |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------|
+ | <code>How has Kody Clemens' batting performance changed over the last few seasons?</code> | <code>Title: "Kody Clemens"<br>Collections: MLB<br>Datasets: BaseballPlayers<br>Chart Type: athlete:baseball</code> | <code>1.0</code> |
+ | <code>What amount of accrued liabilities does Walmart have?</code> | <code>Title: "Walmart Balance Sheet"<br>Collections: Companies<br>Chart Type: company:finance<br>Canonical forms: "Walmart"="Walmart Inc.", "Balance Sheet"="Financials Overview"<br>Sources: S&P Global</code> | <code>0.75</code> |
+ | <code>์ด ์„ฑ์žฅ ์ „๋žต์€ ์„ธ๊ณ„ํ™”์™€ ํƒˆ๊ณต์—…ํ™” ์†์—์„œ ์ „ํ†ต์  ์ œ์กฐ์—… ๋ณดํ˜ธ์— ์ดˆ์ ์„ ๋งž์ถ”๋ฉฐ, ์ด๋Š” ํ•ด๋‹น ๋ถ€๋ฌธ<br>์˜ ํ’ˆ์งˆ๊ณผ ์ƒ์‚ฐ์„ฑ์„ ์œ ์ง€ํ•˜๋ฉด์„œ ๊ฐ€๊ฒฉ์„ ๋‚ฎ๊ฒŒ ์œ ์ง€ํ•ด์•ผ ํ•จ์„ ์š”๊ตฌํ•œ๋‹ค. ํ•ต์‹ฌ์€ ์ž„๊ธˆ ์ ˆ์ œ๋ฅผ ํ†ตํ•œ<br>๋…ธ๋™ ๋น„์šฉ ํ†ต์ œ์— ์žˆ๋‹ค(Johnston, ๋ณธ ์ฑ…). ์™ธ๋ถ€ ์ˆ˜์š”๊ฐ€ ๋‚ด์ˆ˜ ๋ถ€์กฑ์„ ์ƒ์‡„ํ•˜๋Š” ํ•œ, ์ž„๊ธˆ ์ ˆ์ œ๋Š” ์„ฑ<br>์žฅ์„ ์ €ํ•ดํ•˜์ง€ ์•Š๋Š”๋‹ค.<br>์ž„๊ธˆ ์ ˆ์ œ์™€ ์ œ์กฐ์—… ๋…ธ๋™์˜ ์งˆ์  ๋ณด์กด์€ ๋…ธ๋™์‹œ์žฅ ๋‚ด๋ถ€์ž ๋ณดํ˜ธ, ํˆฌ์ž ๋ฐ ๊ธฐ์ˆ  ๊ด€๋ จ ์ œ์กฐ์—… ๋…ธ<br>์กฐ์™€์˜ ๊ธด๋ฐ€ํ•œ ํ˜‘๋ ฅ, ๊ทธ๋ฆฌ๊ณ  ์ถ”๊ฐ€ ๊ต์œก ๋˜๋Š” ํ‰์ƒ๊ต์œก ๊ธฐ๊ด€๊ณผ์˜ ์—ฐ๊ณ„๋กœ ๊ธฐ์ˆ  ํ–ฅ์ƒ์„ ํ†ตํ•ด ๋‹ฌ์„ฑ<br>๋œ๋‹ค. ์ œ์กฐ์—… ํ•ต์‹ฌ ๊ทผ๋กœ์ž๋“ค์€ ์ž„๊ธˆ ์–ต์ œ์™€ ๊ธฐ์—… ๋‚ด ์ง๋ฌด ๋ณ€๊ฒฝ ์˜ํ–ฅ ๋˜๋Š” ๊ทผ๋กœ ์‹œ๊ฐ„ ๋ณ€๋™๊ณผ ๊ฐ™์€<br>๋‚ด๋ถ€ ์œ ์—ฐ์„ฑ์„ ๋Œ€๊ฐ€๋กœ ๊ณ ์šฉ ๋ณดํ˜ธ๋ฅผ ์•ฝ์†๋ฐ›๋Š”๋‹ค. ๊ณต์žฅ ๋‹จ์œ„ ๋…ธ๋™ ๋Œ€ํ‘œ๋“ค์€ ๋‹จ๊ธฐ ์ž„๊ธˆ ์ธ์ƒ๋ณด๋‹ค<br>์žฅ๊ธฐ ํˆฌ์ž์™€ ๊ณ ์šฉ ์•ˆ์ •์„ ์„ ํ˜ธํ•˜๋ฏ€๋กœ, ์ง€์—ญ ๊ณต์žฅ ๋‹จ์œ„ ํ˜‘์•ฝ์ด ๋ˆ„์ ๋˜์–ด ๋…ธ์กฐ์˜ ์ž„๊ธˆ ์–ต์ œ๋ผ๋Š”<br>๋ถ€๋ฌธ๋ณ„ ์ •์ฑ…์„ ํ˜•์„ฑํ•œ๋‹ค.<br>์ˆ˜์ถœ ์—ญ๋Ÿ‰์ด ์ด ์ „๋žต์˜ ํ•ต์‹ฌ์ด๋ฏ€๋กœ ์‹ค์งˆ ํ™˜์œจ์ด ์ค‘๋Œ€ํ•œ ๊ด€์‹ฌ์‚ฌ์ด๋‹ค. ์žฌ์ •ยทํ†ตํ™” ์ •์ฑ… ์™„ํ™”๋‚˜<br>์ž„๊ธˆ ์ธ์ƒ ๋“ฑ ์‹ค์งˆ ํ™˜์œจ์— ๋ถ€์ •์  ์˜ํ–ฅ์„ ๋ฏธ์น  ์ˆ˜ ์žˆ๋Š” ์ •์ฑ…๋“ค์€ ์ œ๋„์ ยท์ •์น˜์ ์œผ๋กœ ์–ต์ œ๋œ๋‹ค.<br>์ด๋Ÿฌํ•œ ์ •์ฑ… ๋Œ€์‘์€ ๊ต์œก ๋ฐ ๋ณด์œก์— ๋Œ€ํ•œ ์žฌ์ • ์ง€์ถœ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ๋…ธ๋™ ์‹œ์žฅ ์ •์ฑ…์—๋„ ํŒŒ๊ธ‰ ํšจ๊ณผ<br>๋ฅผ ๋ฏธ์นœ๋‹ค.<br>์ˆ˜์š” ์ž๊ทน์ด ๋ถˆ๊ฐ€๋Šฅํ•œ ์ƒํ™ฉ์—์„œ, ์ตœ์ € ์ž„๊ธˆ์„ ๋‚ฎ์ถ”๊ธฐ ์œ„ํ•œ ๊ณต๊ธ‰ ์ธก๋ฉด์˜ ์กฐ์น˜๊ฐ€ ๋„์ž…๋œ๋‹ค. ์ด<br>์ „๋žต์€ ๋˜ํ•œ ๊ตญ๋‚ด ์„œ๋น„์Šค๋ฅผ ์ €๋ ดํ•˜๊ฒŒ ๋งŒ๋“œ๋Š” ์ €๋ ดํ•˜๊ณ  ์œ ์—ฐํ•œ ์„œ๋น„์Šค ๋ถ€๋ฌธ์˜ ์ถœํ˜„์— ์˜์กดํ•œ๋‹ค.<br>๋”ฐ๋ผ์„œ ์ด์ค‘ํ™”์™€ ๊ณต๊ธ‰ ์ธก๋ฉด์˜ ๋…ธ๋™ ์‹œ์žฅ ์ •์ฑ…์€ ๊ฒฝ์ œ ์ „๋ฌธํ™” ํŒจํ„ด์— ์ง์ ‘์ ์œผ๋กœ ๊ธฐ์—ฌํ•œ๋‹ค<br>(Palier and Thelen 2010; Hassel 2014). 
๊ธฐ์—…๋“ค์€ ์‚ฐ์—… ๊ตฌ์กฐ์กฐ์ •์„ ํ†ตํ•ด ์ƒ์‚ฐ ๊ณผ์ •์˜ ์ƒ์‚ฐ์„ฑ์ด<br>๋‚ฎ์€ ์„œ๋น„์Šค ๋ถ€๋ฌธ์„ ๊ณ ์ƒ์‚ฐ์„ฑ ์ œ์กฐ ๋ถ€๋ฌธ์—์„œ ๋ถ„๋ฆฌํ•ด๋‚ธ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด ๊ธฐ์—…์€ ๋‚ด๋ถ€์ ์œผ๋กœ ๋…ธ๋™๋ ฅ<br>์„ ์„ธ๋ถ„ํ™”ํ•˜๊ณ  ๋…ธ๋™ ์‹œ์žฅ ์ด์›ํ™”๋ฅผ ํ—ˆ์šฉํ•˜๋Š” ์‹œ์žฅ ๊ทœ์น™ ๋ณ€ํ™”๋ฅผ ๋„์ž…ํ•œ๋‹ค.</code> | <code>Title: "South Korea Exports"<br>Collections: World Bank Indicators<br>Datasets: WorldBankIndicatorsData<br>Chart Type: timeseries:eav_v3<br>Canonical forms: "Exports"="exports_of_goods_and_services"<br>Sources: The World Bank</code> | <code>0.75</code> |
  * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
  ```json
  {
 
  ### Training Logs
  | Epoch | Step | Training Loss | validation_spearman |
  |:------:|:----:|:-------------:|:-------------------:|
+ | 0.1143 | 100 | - | 0.7674 |
+ | 0.2286 | 200 | - | 0.8007 |
+ | 0.3429 | 300 | - | 0.8089 |
+ | 0.4571 | 400 | - | 0.8222 |
+ | 0.5714 | 500 | 0.4787 | 0.8286 |
+ | 0.6857 | 600 | - | 0.8312 |
+ | 0.8 | 700 | - | 0.8344 |
+ | 0.9143 | 800 | - | 0.8409 |
+ | 1.0 | 875 | - | 0.8459 |
+ | 1.0286 | 900 | - | 0.8440 |
+ | 1.1429 | 1000 | 0.4205 | 0.8414 |
+ | 1.2571 | 1100 | - | 0.8431 |
+ | 1.3714 | 1200 | - | 0.8549 |
+ | 1.4857 | 1300 | - | 0.8534 |
+ | 1.6 | 1400 | - | 0.8544 |
+ | 1.7143 | 1500 | 0.3894 | 0.8511 |
+ | 1.8286 | 1600 | - | 0.8575 |
+ | 1.9429 | 1700 | - | 0.8606 |
+ | 2.0 | 1750 | - | 0.8598 |
+ | 2.0571 | 1800 | - | 0.8613 |
+ | 2.1714 | 1900 | - | 0.8596 |
+ | 2.2857 | 2000 | 0.3693 | 0.8605 |
+ | 2.4 | 2100 | - | 0.8613 |
+ | 2.5143 | 2200 | - | 0.8621 |
+ | 2.6286 | 2300 | - | 0.8638 |
+ | 2.7429 | 2400 | - | 0.8632 |
+ | 2.8571 | 2500 | 0.3535 | 0.8630 |
+ | 2.9714 | 2600 | - | 0.8650 |
+ | 3.0 | 2625 | - | 0.8635 |
+ | 3.0857 | 2700 | - | 0.8642 |
+ | 3.2 | 2800 | - | 0.8662 |
+ | 3.3143 | 2900 | - | 0.8664 |
+ | 3.4286 | 3000 | 0.3375 | 0.8652 |
+ | 3.5429 | 3100 | - | 0.8642 |
+ | 3.6571 | 3200 | - | 0.8655 |
+ | 3.7714 | 3300 | - | 0.8645 |
+ | 3.8857 | 3400 | - | 0.8650 |
+ | 4.0 | 3500 | 0.3391 | 0.8662 |
+ | 4.1143 | 3600 | - | 0.8660 |
+ | 4.2286 | 3700 | - | 0.8654 |
+ | 4.3429 | 3800 | - | 0.8671 |
+ | 4.4571 | 3900 | - | 0.8672 |
 
 
  ### Framework Versions
eval/CrossEncoderCorrelationEvaluator_validation_results.csv CHANGED
@@ -1,6 +1,6 @@
  epoch,steps,Pearson_Correlation,Spearman_Correlation
- 1.0,636,0.8050961988795169,0.8044347672638916
- 2.0,1272,0.8267567950795853,0.8284146931811501
- 3.0,1908,0.8351882809975475,0.8355004054548
- 4.0,2544,0.8381740944766652,0.8382614031363851
- 5.0,3180,0.8368434817201468,0.8374989674723212
 
  epoch,steps,Pearson_Correlation,Spearman_Correlation
+ 1.0,875,0.8439432763505988,0.8458671064120614
+ 2.0,1750,0.8620830630332061,0.8598071837330882
+ 3.0,2625,0.8647110382297245,0.8634806082829799
+ 4.0,3500,0.8657839457819247,0.8662180172158931
+ 5.0,4375,0.8674826818176335,0.8663049346758942
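
The per-epoch results file can be read with the standard library to find the epoch with the highest correlation. The sketch below embeds the new-side rows from the diff as a string so it runs on its own. `best_row` is a hypothetical helper name, and picking the row by maximum Spearman correlation is an assumption about how a checkpoint might be chosen, not something this commit states.

```python
import csv
import io

# New-side rows of eval/CrossEncoderCorrelationEvaluator_validation_results.csv,
# copied from the diff so the example is self-contained.
CSV_TEXT = """epoch,steps,Pearson_Correlation,Spearman_Correlation
1.0,875,0.8439432763505988,0.8458671064120614
2.0,1750,0.8620830630332061,0.8598071837330882
3.0,2625,0.8647110382297245,0.8634806082829799
4.0,3500,0.8657839457819247,0.8662180172158931
5.0,4375,0.8674826818176335,0.8663049346758942
"""


def best_row(csv_text, metric="Spearman_Correlation"):
    """Return the CSV row (as a dict of strings) with the highest value for `metric`."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return max(rows, key=lambda row: float(row[metric]))


best = best_row(CSV_TEXT)
print(best["epoch"], best["steps"], best["Spearman_Correlation"])
```

On the old-side rows the same function would select epoch 4.0 rather than the final epoch, since Spearman drops slightly from 0.83826 to 0.83750 between epochs 4 and 5.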
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f6dde1675c82135fb9296d9c990693ce3373c5982f7f01cd53a72fb674e86d82
  size 1223854204
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:114b68cfdf858f07524e1430e10da39644ef568a420e07c0adab488c8841daeb
  size 1223854204
training_info.txt CHANGED
@@ -1,5 +1,5 @@
  Base Model: Alibaba-NLP/gte-multilingual-reranker-base
- Training Samples: 20347
  Epochs: 5
  Batch Size: 32
  Learning Rate: 2e-05
 
  Base Model: Alibaba-NLP/gte-multilingual-reranker-base
+ Training Samples: 27981
  Epochs: 5
  Batch Size: 32
  Learning Rate: 2e-05
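
The step counts in the training logs and the evaluation CSV are consistent with these settings. A minimal sketch of the arithmetic follows, assuming one optimizer step per batch, no gradient accumulation, a single device, and a final partial batch that is kept. None of these assumptions is stated in the commit. `steps_per_epoch` is a hypothetical helper name.

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of batches per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


BATCH_SIZE = 32
EPOCHS = 5

for samples in (20347, 27981):  # old and new training set sizes from training_info.txt
    per_epoch = steps_per_epoch(samples, BATCH_SIZE)
    print(samples, per_epoch, per_epoch * EPOCHS)
```

Under these assumptions the old run has 636 steps per epoch (3180 total) and the new run has 875 (4375 total), matching the `steps` column in the results CSV.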