LocalOptimum committed on
Commit 489b83d · verified · 1 Parent(s): f7e81b6

Upload chinese-crypto-importance v1.1

Files changed (4)
  1. README.md +21 -21
  2. model.pt +1 -1
  3. model.safetensors +1 -1
  4. news_importance_config.json +19 -19
README.md CHANGED
@@ -17,24 +17,24 @@ metrics:
17
  - accuracy
18
  - pearsonr
19
  model-index:
20
- - name: chinese-crypto-importance (v1.0)
21
  results:
22
  - task:
23
  type: text-classification
24
  name: News Importance Binning
25
  metrics:
26
  - type: mae
27
- value: 8.35
28
  name: MAE
29
  - type: accuracy
30
- value: 70.1%
31
  name: Bin Accuracy
32
  - type: pearsonr
33
- value: 0.575
34
  name: Pearson r
35
  ---
36
 
37
- # Chinese Crypto News Importance Scoring Model | 中文加密货币新闻重要性评分模型 (v1.0)
38
 
39
  ## 模型描述 | Model Description
40
 
@@ -51,11 +51,11 @@ This model is LoRA fine-tuned from [LocalOptimum/chinese-crypto-sentiment](https
51
 
52
  ## 训练数据 | Training Data
53
 
54
- - 数据量 | Size: 3364 条中文加密货币新闻样本 | 3364 Chinese crypto news samples
55
- - 数据来源 | Source: EventAlpha / WatchTower 采集的 3281 条新闻 + 83 条推文 | 3281 news articles + 83 tweets collected via EventAlpha / WatchTower
56
  - 标注方式 | Labeling: 自动四维评分管线 + 规则修正 | 4-axis automatic scoring pipeline with rule-based cleanup
57
- - 划分方式 | Split: 随机划分,训练集 2859 / 验证集 505 | Random split with 2859 train and 505 validation samples
58
- - 平均分数 | Average Score: 41.0
59
 
60
  ### 标注维度 | Scoring Axes
61
 
@@ -70,10 +70,10 @@ This model is LoRA fine-tuned from [LocalOptimum/chinese-crypto-sentiment](https
70
 
71
  | Bin | Score Range | Count | Share | 含义 / Interpretation |
72
  |---|---:|---:|---:|---|
73
- | `noise` | 0-25 | 379 | 11.3% | Low-signal, duplicate, digest, or weakly relevant content |
74
- | `low` | 25-50 | 2272 | 67.5% | Routine updates that rarely move the market on their own |
75
- | `medium` | 50-75 | 682 | 20.3% | Tradeable developments with meaningful but limited impact |
76
- | `high` | 75-100 | 31 | 0.9% | Major events that may materially change price or risk appetite |
77
 
78
  ## 性能指标 | Performance Metrics
79
 
@@ -81,10 +81,10 @@ This model is LoRA fine-tuned from [LocalOptimum/chinese-crypto-sentiment](https
81
 
82
  | 指标 Metric | 数值 Value |
83
  |---|---:|
84
- | MAE | 8.35 |
85
- | Bin Accuracy | 70.1% |
86
- | Pearson r | 0.575 |
87
- | Best Epoch | 5 |
88
 
89
  ## 分数解释 | Score Interpretation
90
 
@@ -158,7 +158,7 @@ print(pipe("比特币突破关键阻力位并创下阶段新高"))
158
  - 基础模型 | Base Model: `LocalOptimum/chinese-crypto-sentiment`
159
  - 模型结构 | Architecture: BERT backbone + classification head + regression head
160
  - 最大长度 | Max Length: 256
161
- - 训练轮数 | Epochs: 10 (early stopping patience=3, best epoch=5)
162
  - 批次大小 | Batch Size: 16
163
  - 学习率 | Learning Rate: 2e-5
164
  - LoRA: `r=16`, `alpha=32`, `dropout=0.05`
@@ -203,7 +203,7 @@ Apache-2.0
203
  author={Onefly},
204
  year={2026},
205
  howpublished={\url{https://huggingface.co/LocalOptimum/chinese-crypto-importance}},
206
- note={LoRA fine-tuned from LocalOptimum/chinese-crypto-sentiment, 3364 samples, MAE=8.35, BinAcc=70.1%}
207
  }
208
  ```
209
 
@@ -219,7 +219,7 @@ Apache-2.0
219
 
220
  - 首个公开的重要性评分模型版本 | First public release of the importance scoring model
221
  - 支持双头输出:连续重要性分数 + 4 档重要性分类 | Dual-head output: a continuous importance score plus a 4-bin importance class
222
- - 基于 3364 条中文加密货币新闻样本完成训练 | Trained on 3364 Chinese crypto news samples
223
- - 当前验证指标:MAE=8.35,Bin Accuracy=70.1%,Pearson r=0.575 | Current validation metrics: MAE=8.35, Bin Accuracy=70.1%, Pearson r=0.575
224
 
225
  如有问题或建议,欢迎提 issue 或 PR。 | Questions and suggestions are welcome via issue or PR.
 
17
  - accuracy
18
  - pearsonr
19
  model-index:
20
+ - name: chinese-crypto-importance (v1.1)
21
  results:
22
  - task:
23
  type: text-classification
24
  name: News Importance Binning
25
  metrics:
26
  - type: mae
27
+ value: 6.87
28
  name: MAE
29
  - type: accuracy
30
+ value: 61.8%
31
  name: Bin Accuracy
32
  - type: pearsonr
33
+ value: 0.532
34
  name: Pearson r
35
  ---
36
 
37
+ # Chinese Crypto News Importance Scoring Model | 中文加密货币新闻重要性评分模型 (v1.1)
38
 
39
  ## 模型描述 | Model Description
40
 
 
51
 
52
  ## 训练数据 | Training Data
53
 
54
+ - 数据量 | Size: 20286 条中文加密货币新闻样本 | 20286 Chinese crypto news samples
55
+ - 数据来源 | Source: EventAlpha / WatchTower 采集的 19729 条新闻 + 557 条推文 | 19729 news articles + 557 tweets collected via EventAlpha / WatchTower
56
  - 标注方式 | Labeling: 自动四维评分管线 + 规则修正 | 4-axis automatic scoring pipeline with rule-based cleanup
57
+ - 划分方式 | Split: 随机划分,训练集 17243 / 验证集 3043 | Random split with 17243 train and 3043 validation samples
58
+ - 平均分数 | Average Score: 41.7
59
 
60
  ### 标注维度 | Scoring Axes
61
 
 
70
 
71
  | Bin | Score Range | Count | Share | 含义 / Interpretation |
72
  |---|---:|---:|---:|---|
73
+ | `noise` | 0-25 | 1626 | 8.0% | Low-signal, duplicate, digest, or weakly relevant content |
74
+ | `low` | 25-50 | 14773 | 72.8% | Routine updates that rarely move the market on their own |
75
+ | `medium` | 50-75 | 3840 | 18.9% | Tradeable developments with meaningful but limited impact |
76
+ | `high` | 75-100 | 47 | 0.2% | Major events that may materially change price or risk appetite |
77
 
78
  ## 性能指标 | Performance Metrics
79
 
 
81
 
82
  | 指标 Metric | 数值 Value |
83
  |---|---:|
84
+ | MAE | 6.87 |
85
+ | Bin Accuracy | 61.8% |
86
+ | Pearson r | 0.532 |
87
+ | Best Epoch | 4 |
88
 
89
  ## 分数解释 | Score Interpretation
90
 
 
158
  - 基础模型 | Base Model: `LocalOptimum/chinese-crypto-sentiment`
159
  - 模型结构 | Architecture: BERT backbone + classification head + regression head
160
  - 最大长度 | Max Length: 256
161
+ - 训练轮数 | Epochs: 10 (early stopping patience=3, best epoch=4)
162
  - 批次大小 | Batch Size: 16
163
  - 学习率 | Learning Rate: 2e-5
164
  - LoRA: `r=16`, `alpha=32`, `dropout=0.05`
 
203
  author={Onefly},
204
  year={2026},
205
  howpublished={\url{https://huggingface.co/LocalOptimum/chinese-crypto-importance}},
206
+ note={LoRA fine-tuned from LocalOptimum/chinese-crypto-sentiment, 20286 samples, MAE=6.87, BinAcc=61.8%}
207
  }
208
  ```
209
 
 
219
 
220
  - 首个公开的重要性评分模型版本 | First public release of the importance scoring model
221
  - 支持双头输出:连续重要性分数 + 4 档重要性分类 | Dual-head output: a continuous importance score plus a 4-bin importance class
222
+ - 基于 20286 条中文加密货币新闻样本完成训练 | Trained on 20286 Chinese crypto news samples
223
+ - 当前验证指标:MAE=6.87,Bin Accuracy=61.8%,Pearson r=0.532 | Current validation metrics: MAE=6.87, Bin Accuracy=61.8%, Pearson r=0.532
224
 
225
  如有问题或建议,欢迎提 issue 或 PR。 | Questions and suggestions are welcome via issue or PR.
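
The README diff above reports three validation metrics: MAE, Bin Accuracy, and Pearson r. The sketch below shows one way such numbers could be computed from paired labels and predictions. It is not the project's evaluation code: the function names are hypothetical, and it assumes Bin Accuracy is the percentage of validation samples whose predicted bin equals the labeled bin, which the README does not spell out.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between labeled and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


def bin_accuracy(true_bins, pred_bins):
    """Percentage of samples whose predicted bin matches the labeled bin.

    Assumption: this is how 'Bin Accuracy' is defined; the README only
    reports the value.
    """
    if len(true_bins) != len(pred_bins) or not true_bins:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(true_bins, pred_bins))
    return 100.0 * hits / len(true_bins)
```

Under these definitions a lower MAE alongside a lower Bin Accuracy, as in v1.1 versus v1.0, is not contradictory: the first measures distance on the continuous score, the second only counts exact bin matches.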
model.pt CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:d9b8147a6586ee2872183eb9321d549727370dfe477bac46c58c1e8d7549980f
3
  size 420517423
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:269cce4fff0f4f5f398bdbd320745f5c21db1ed33826f60bf2b312c86973975e
3
  size 420517423
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:f0f95a5d410b802ffad59c5ad34e0984ec546025d90c4da0b317919ff9879307
3
  size 419828528
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:20b8f7009f4fcfa23c0a97d9d30353b1608988f7e7f03445c49ee4e89a3bd562
3
  size 419828528
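
Both weight files are stored as Git LFS pointers: a `version` line, an `oid` line carrying the SHA-256 of the real file, and a `size` line in bytes. In this commit only the `oid` changed; the sizes are identical. A downloaded file could be checked against its pointer roughly as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is copied from the new `model.safetensors` entry above.

```python
import hashlib

# Pointer text for the v1.1 model.safetensors, as shown in the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:20b8f7009f4fcfa23c0a97d9d30353b1608988f7e7f03445c49ee4e89a3bd562
size 419828528
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and hash."""
    hasher = hashlib.new(pointer["algo"])
    total = 0
    with open(path, "rb") as handle:
        # Read in chunks so a ~400 MB weight file is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

With the file downloaded to a local path (an assumption here), the check would be `matches_pointer("model.safetensors", parse_lfs_pointer(POINTER))`.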
news_importance_config.json CHANGED
@@ -10,33 +10,33 @@
10
  "high"
11
  ],
12
  "bin_edges": [
13
- 25.0,
14
- 50.0,
15
- 75.0
16
  ],
17
  "max_length": 256,
18
  "metrics": {
19
- "epoch": 5,
20
- "loss": 0.4292711278413261,
21
- "mae": 8.35,
22
- "bin_accuracy": 70.1,
23
- "pearson_r": 0.575
24
  },
25
  "dataset": {
26
- "samples": 3364,
27
- "train_samples": 2859,
28
- "eval_samples": 505,
29
- "average_score": 41.0,
30
  "bin_counts": {
31
- "noise": 379,
32
- "low": 2272,
33
- "medium": 682,
34
- "high": 31
35
  },
36
  "source_type_counts": {
37
- "news": 3281,
38
- "tweet": 83
39
  }
40
  },
41
- "version": "v1.0"
42
  }
 
10
  "high"
11
  ],
12
  "bin_edges": [
13
+ 20.0,
14
+ 35.0,
15
+ 50.0
16
  ],
17
  "max_length": 256,
18
  "metrics": {
19
+ "epoch": 4,
20
+ "loss": 0.5274954286986246,
21
+ "mae": 6.87,
22
+ "bin_accuracy": 61.8,
23
+ "pearson_r": 0.532
24
  },
25
  "dataset": {
26
+ "samples": 20286,
27
+ "train_samples": 17243,
28
+ "eval_samples": 3043,
29
+ "average_score": 41.7,
30
  "bin_counts": {
31
+ "noise": 1626,
32
+ "low": 14773,
33
+ "medium": 3840,
34
+ "high": 47
35
  },
36
  "source_type_counts": {
37
+ "news": 19729,
38
+ "tweet": 557
39
  }
40
  },
41
+ "version": "v1.1"
42
  }
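
The `bin_edges` values moved from 25/50/75 to 20/35/50 in this commit. A minimal sketch of how three edges could map a 0-100 score onto the four bins is shown below. It assumes the label order `noise`, `low`, `medium`, `high` from the README table, and it assumes a score exactly on an edge falls into the higher bin, since the config does not state which side is inclusive. `score_to_bin` is a hypothetical helper, not a function from the repository.

```python
from bisect import bisect_right

# Label order assumed from the README bin table.
BIN_LABELS = ["noise", "low", "medium", "high"]

# Edge values as they appear in news_importance_config.json.
V1_0_EDGES = [25.0, 50.0, 75.0]
V1_1_EDGES = [20.0, 35.0, 50.0]


def score_to_bin(score, edges, labels=BIN_LABELS):
    """Map a continuous score to a bin label using sorted bin edges.

    Assumption: a score equal to an edge belongs to the higher bin.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    # bisect_right counts how many edges are <= score, which is the bin index.
    return labels[bisect_right(edges, score)]


if __name__ == "__main__":
    for score in (10.0, 30.0, 60.0):
        print(score, score_to_bin(score, V1_0_EDGES), score_to_bin(score, V1_1_EDGES))
```

Under these assumptions the same score can land in different bins across versions, so bin labels from v1.0 and v1.1 are only comparable once the edges in use are known.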