LocalOptimum/chinese-crypto-sentiment

@@ -1,193 +1,193 @@
----
-language: zh
-license: apache-2.0
-tags:
-- sentiment-analysis
-- chinese
-- finance
-- finbert
-- crypto
-- text-classification
-datasets:
-- custom
-metrics:
-- accuracy
-- f1
-- precision
-- recall
-model-index:
-- name: Chinese Financial Sentiment Analysis (Crypto)
-  results:
-  - task:
-      type: text-classification
-      name: Sentiment Analysis
-    metrics:
-    - type: accuracy
-      value: 0.645
-      name: Accuracy
-    - type: f1
-      value: 0.6365
-      name: F1 Score
-    - type: precision
-      value: 0.6394
-      name: Precision
-    - type: recall
-      value: 0.645
-      name: Recall
----
-# Chinese Financial Sentiment Analysis Model (Crypto Focus)
-中文金融情感分析模型（加密货币领域）
-## 模型描述 | Model Description
-本模型基于 `yiyanghkust/finbert-tone-chinese` 微调，专门用于分析中文加密货币相关新闻和社交媒体内容的情感倾向。模型可以识别三种情感类别：正面（Positive）、中性（Neutral）和负面（Negative）。
-This model is fine-tuned from `yiyanghkust/finbert-tone-chinese` and specifically designed for sentiment analysis of Chinese cryptocurrency-related news and social media content. It can classify text into three sentiment categories: Positive, Neutral, and Negative.
-## 训练数据 | Training Data
-- **数据量 | Size**: 1000条人工标注的中文金融新闻 | 1000 manually annotated Chinese financial news articles
-- **数据来源 | Source**: 加密货币相关新闻和推文 | Cryptocurrency-related news and tweets
-- **标注方式 | Annotation**: AI辅助 + 人工修正 | AI-assisted + Manual correction
-- **数据分布 | Distribution**:
-  - Positive（正面）: 420条 (42.0%)
-  - Neutral（中性）: 420条 (42.0%)
-  - Negative（负面）: 160条 (16.0%)
-## 性能指标 | Performance Metrics
-在200条测试集上的表现 | Performance on 200 test samples:
-| 指标 Metric | 数值 Value |
-|-------------|-----------|
-| 准确率 Accuracy | 64.50% |
-| F1分数 F1 Score | 63.65% |
-| 精确率 Precision | 63.94% |
-| 召回率 Recall | 64.50% |
-## 使用方法 | Usage
-### 快速开始 | Quick Start
-```python
-from transformers import AutoTokenizer, AutoModelForSequenceClassification
-import torch
-# 加载模型和分词器 | Load model and tokenizer
-model_name = "YOUR_USERNAME/sentiment-finetuned-1000"  # 替换为你的用户名
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForSequenceClassification.from_pretrained(model_name)
-# 分析文本 | Analyze text
-text = "比特币突破10万美元创历史新高"
-inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
-# 预测 | Predict
-with torch.no_grad():
-    outputs = model(**inputs)
-    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
-    predicted_class = torch.argmax(predictions, dim=-1).item()
-# 结果映射 | Result mapping
-labels = ['positive', 'neutral', 'negative']
-sentiment = labels[predicted_class]
-confidence = predictions[0][predicted_class].item()
-print(f"情感: {sentiment}")
-print(f"置信度: {confidence:.4f}")
-```
-### 批量处理 | Batch Processing
-```python
-texts = [
-    "币安获得阿布扎比监管授权",
-    "以太坊完成Fusaka升级",
-    "某交易所遭攻击损失100万美元"
-]
-inputs = tokenizer(texts, return_tensors="pt", truncation=True,
-                   max_length=128, padding=True)
-with torch.no_grad():
-    outputs = model(**inputs)
-    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
-    predicted_classes = torch.argmax(predictions, dim=-1)
-labels = ['positive', 'neutral', 'negative']
-for text, pred in zip(texts, predicted_classes):
-    print(f"{text} -> {labels[pred]}")
-```
-## 训练参数 | Training Configuration
-- **基础模型 | Base Model**: yiyanghkust/finbert-tone-chinese
-- **训练轮数 | Epochs**: 5
-- **批次大小 | Batch Size**: 16
-- **学习率 | Learning Rate**: 2e-5
-- **最大序列长度 | Max Length**: 128
-- **训练设备 | Device**: NVIDIA GeForce RTX 3060 Laptop GPU
-- **训练时间 | Training Time**: ~5分钟 | ~5 minutes
-## 适用场景 | Use Cases
-- ✅ 加密货币新闻情感分析
-- ✅ 社交媒体舆情监控
-- ✅ 金融市场情绪指标
-- ✅ 实时新闻情感跟踪
-- ✅ 投资决策辅助参考
-## 局限性 | Limitations
-- ⚠️ 主要针对加密货币领域的金融新闻，其他金融领域可能表现不佳
-- ⚠️ 负面样本相对较少（16%），对负面情感的识别可能不够敏感
-- ⚠️ 短文本（少于10字）的分析准确率可能下降
-- ⚠️ 仅支持简体中文
-- ⚠️ 模型不能替代人工判断，仅供参考
-## 许可证 | License
-Apache-2.0
-## 引用 | Citation
-如果使用本模型，请引用：
-```bibtex
-@misc{watchtower-sentiment-2025,
-  title={Chinese Financial Sentiment Analysis Model (Crypto Focus)},
-  author={WatchTower Team},
-  year={2025},
-  howpublished={\url{https://huggingface.co/YOUR_USERNAME/sentiment-finetuned-1000}},
-  note={Fine-tuned from yiyanghkust/finbert-tone-chinese}
-}
-```
-## 基础模型 | Base Model
-本模型基于以下模型微调：
-- [yiyanghkust/finbert-tone-chinese](https://huggingface.co/yiyanghkust/finbert-tone-chinese)
-感谢原作者的贡献！
-## 更新日志 | Changelog
-### v2.0 (2025-12-09)
-- ✅ 扩充训练数据至1000条
-- ✅ 修正标注错误，提升数据质量
-- ✅ 优化类别分布，提升模型平衡性
-- ✅ F1分数提升2.01%（0.6165 → 0.6365）
-### v1.0 (Initial Release)
-- 基于500条标注数据的初始版本
-## 联系方式 | Contact
-如有问题或建议，欢迎提 issue 或 PR。
----
-**维护者 | Maintainer**: WatchTower Team
-**最后更新 | Last Updated**: 2025-12-09

+---
+language: zh
+license: apache-2.0
+tags:
+- sentiment-analysis
+- chinese
+- finance
+- finbert
+- crypto
+- text-classification
+datasets:
+- custom
+metrics:
+- accuracy
+- f1
+- precision
+- recall
+model-index:
+- name: Chinese Financial Sentiment Analysis (Crypto)
+  results:
+  - task:
+      type: text-classification
+      name: Sentiment Analysis
+    metrics:
+    - type: accuracy
+      value: 0.645
+      name: Accuracy
+    - type: f1
+      value: 0.6365
+      name: F1 Score
+    - type: precision
+      value: 0.6394
+      name: Precision
+    - type: recall
+      value: 0.645
+      name: Recall
+---
+# Chinese Financial Sentiment Analysis Model (Crypto Focus)
+中文金融情感分析模型（加密货币领域）
+## 模型描述 | Model Description
+本模型基于 `yiyanghkust/finbert-tone-chinese` 微调，专门用于分析中文加密货币相关新闻和社交媒体内容的情感倾向。模型可以识别三种情感类别：正面（Positive）、中性（Neutral）和负面（Negative）。
+This model is fine-tuned from `yiyanghkust/finbert-tone-chinese` and specifically designed for sentiment analysis of Chinese cryptocurrency-related news and social media content. It can classify text into three sentiment categories: Positive, Neutral, and Negative.
+## 训练数据 | Training Data
+- **数据量 | Size**: 1000条人工标注的中文金融新闻 | 1000 manually annotated Chinese financial news articles
+- **数据来源 | Source**: 加密货币相关新闻和推文 | Cryptocurrency-related news and tweets
+- **标注方式 | Annotation**: AI辅助 + 人工修正 | AI-assisted + Manual correction
+- **数据分布 | Distribution**:
+  - Positive（正面）: 420条 (42.0%)
+  - Neutral（中性）: 420条 (42.0%)
+  - Negative（负面）: 160条 (16.0%)
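The 42/42/16 split above leaves the negative class under-represented. One common way to compensate is inverse-frequency class weighting, and the sketch below computes such weights from the counts listed in this card. The card does not say whether any class weighting was used in training, so this is an illustration only; `balanced_class_weights` is a hypothetical helper name.

```python
def balanced_class_weights(counts):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * count) for label, count in counts.items()}


# Class counts as reported in the Distribution list of this card.
counts = {"positive": 420, "neutral": 420, "negative": 160}
weights = balanced_class_weights(counts)

for label, weight in weights.items():
    print(f"{label}: {weight:.4f}")
```

Weights of this kind are typically passed to a weighted cross-entropy loss during fine-tuning, so that errors on the rarer negative class cost more.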
+## 性能指标 | Performance Metrics
+在200条测试集上的表现 | Performance on 200 test samples:
+| 指标 Metric | 数值 Value |
+|-------------|-----------|
+| 准确率 Accuracy | 64.50% |
+| F1分数 F1 Score | 63.65% |
+| 精确率 Precision | 63.94% |
+| 召回率 Recall | 64.50% |
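The reported recall equals the reported accuracy (64.50%), which is what support-weighted averaging over classes produces. The card does not state the averaging mode, so treat that reading as an inference. The sketch below shows how weighted precision, recall, and F1 are derived from a confusion matrix; the matrix is hypothetical and is not the model's actual test output.

```python
def weighted_metrics(confusion):
    """confusion[i][j] = number of samples with true class i predicted as class j."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(n_classes)) / total

    precision = recall = f1 = 0.0
    for i in range(n_classes):
        support = sum(confusion[i])                               # true samples of class i
        predicted = sum(confusion[r][i] for r in range(n_classes))  # samples predicted as i
        p = confusion[i][i] / predicted if predicted else 0.0
        r = confusion[i][i] / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total                                  # support-weighted average
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical 200-sample confusion matrix (rows: positive, neutral, negative).
example_confusion = [
    [58, 20, 6],
    [22, 54, 8],
    [5, 11, 16],
]
metrics = weighted_metrics(example_confusion)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under weighted averaging, recall always reduces to correct predictions divided by total samples, which is why it matches accuracy.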
+## 使用方法 | Usage
+### 快速开始 | Quick Start
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import torch
+# 加载模型和分词器 | Load model and tokenizer
+model_name = "LocalOptimum/chinese-crypto-sentiment"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForSequenceClassification.from_pretrained(model_name)
+# 分析文本 | Analyze text
+text = "比特币突破10万美元创历史新高"
+inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
+# 预测 | Predict
+with torch.no_grad():
+    outputs = model(**inputs)
+    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
+    predicted_class = torch.argmax(predictions, dim=-1).item()
+# 结果映射 | Result mapping
+labels = ['positive', 'neutral', 'negative']
+sentiment = labels[predicted_class]
+confidence = predictions[0][predicted_class].item()
+print(f"情感: {sentiment}")
+print(f"置信度: {confidence:.4f}")
+```
+### 批量处理 | Batch Processing
+```python
+texts = [
+    "币安获得阿布扎比监管授权",
+    "以太坊完成Fusaka升级",
+    "某交易所遭攻击损失100万美元"
+]
+inputs = tokenizer(texts, return_tensors="pt", truncation=True,
+                   max_length=128, padding=True)
+with torch.no_grad():
+    outputs = model(**inputs)
+    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
+    predicted_classes = torch.argmax(predictions, dim=-1)
+labels = ['positive', 'neutral', 'negative']
+for text, pred in zip(texts, predicted_classes):
+    print(f"{text} -> {labels[pred]}")
+```
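Both snippets above hard-code `labels = ['positive', 'neutral', 'negative']`, which is only correct if that order matches the class indices stored in the checkpoint. A more defensive pattern is to look labels up in an explicit id-to-label mapping, for example `model.config.id2label` from `transformers`, assuming the uploaded config populates it with meaningful names. The sketch below is torch-free so it runs standalone; `decode_predictions`, `example_id2label`, and the logits are hypothetical.

```python
import math


def decode_predictions(logits_batch, id2label):
    """Map rows of raw logits to (label, confidence) pairs via an explicit mapping."""
    results = []
    for logits in logits_batch:
        # Numerically stable softmax over one row of logits.
        peak = max(logits)
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((id2label[best], probs[best]))
    return results


# Hypothetical mapping and logits, for illustration only.
example_id2label = {0: "positive", 1: "neutral", 2: "negative"}
example_logits = [[2.0, 0.5, -1.0], [-0.3, 0.1, 1.7]]

decoded = decode_predictions(example_logits, example_id2label)
for label, confidence in decoded:
    print(f"{label}: {confidence:.4f}")
```

With the real model, one would pass `outputs.logits.tolist()` and `model.config.id2label` in place of the example values.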
+## 训练参数 | Training Configuration
+- **基础模型 | Base Model**: yiyanghkust/finbert-tone-chinese
+- **训练轮数 | Epochs**: 5
+- **批次大小 | Batch Size**: 16
+- **学习率 | Learning Rate**: 2e-5
+- **最大序列长度 | Max Length**: 128
+- **训练设备 | Device**: NVIDIA GeForce RTX 3060 Laptop GPU
+- **训练时间 | Training Time**: ~5分钟 | ~5 minutes
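As a rough sanity check on the configuration above, the number of optimizer steps follows from the epoch count and batch size. The card does not state how many of the 1000 samples were used for training, so both training-set sizes below are assumptions, and the formula assumes no gradient accumulation and that the last partial batch is kept.

```python
import math


def total_optimizer_steps(n_train, batch_size=16, epochs=5):
    """Optimizer steps for plain mini-batch training without gradient accumulation."""
    steps_per_epoch = math.ceil(n_train / batch_size)  # last partial batch is kept
    return steps_per_epoch * epochs


# Assumption A: all 1000 annotated samples are used for training.
steps_if_all_1000 = total_optimizer_steps(1000)
# Assumption B: 800 samples for training, with 200 held out for testing.
steps_if_800 = total_optimizer_steps(800)

print(steps_if_all_1000, steps_if_800)
```

Either way the run is only a few hundred steps, which is consistent with the reported training time of about 5 minutes on a laptop GPU.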
+## 适用场景 | Use Cases
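+- ✅ 加密货币新闻情感分析 | Sentiment analysis of cryptocurrency news
+- ✅ 社交媒体舆情监控 | Social media opinion monitoring
+- ✅ 金融市场情绪指标 | Financial market sentiment indicators
+- ✅ 实时新闻情感跟踪 | Real-time news sentiment tracking
+- ✅ 投资决策辅助参考 | Supplementary reference for investment decisions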
+## 局限性 | Limitations
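+- ⚠️ 主要针对加密货币领域的金融新闻，其他金融领域可能表现不佳 | Tuned mainly on cryptocurrency-related financial news; may perform poorly in other financial domains
+- ⚠️ 负面样本相对较少（16%），对负面情感的识别可能不够敏感 | Negative samples are relatively scarce (16%), so negative sentiment may be under-detected
+- ⚠️ 短文本（少于10字）的分析准确率可能下降 | Accuracy may drop on short texts (fewer than 10 characters)
+- ⚠️ 仅支持简体中文 | Simplified Chinese only
+- ⚠️ 模型不能替代人工判断，仅供参考 | Not a substitute for human judgment; for reference only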
+## 许可证 | License
+Apache-2.0
+## 引用 | Citation
+如果使用本模型，请引用 | If you use this model, please cite:
+```bibtex
+@misc{watchtower-sentiment-2025,
+  title={Chinese Financial Sentiment Analysis Model (Crypto Focus)},
+  author={Onefly},
+  year={2025},
+  howpublished={\url{https://huggingface.co/LocalOptimum/chinese-crypto-sentiment}},
+  note={Fine-tuned from yiyanghkust/finbert-tone-chinese}
+}
+```
+## 基础模型 | Base Model
+本模型基于以下模型微调 | This model is fine-tuned from:
+- [yiyanghkust/finbert-tone-chinese](https://huggingface.co/yiyanghkust/finbert-tone-chinese)
+感谢原作者的贡献！ | Thanks to the original authors for their contribution!
+## 更新日志 | Changelog
+### v2.0 (2025-12-09)
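+- ✅ 扩充训练数据至1000条 | Expanded the training data to 1,000 samples
+- ✅ 修正标注错误，提升数据质量 | Corrected annotation errors to improve data quality
+- ✅ 优化类别分布，提升模型平衡性 | Rebalanced the class distribution to improve model balance
+- ✅ F1分数提升约2个百分点（0.6165 → 0.6365） | F1 score up about 2 percentage points (0.6165 → 0.6365)
+### v1.0 (Initial Release)
+- 基于500条标注数据的初始版本 | Initial version based on 500 annotated samples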
+## 联系方式 | Contact
+如有问题或建议，欢迎提 issue 或 PR。 | Questions and suggestions are welcome via issues or PRs.
+---
+**维护者 | Maintainer**: Onefly
+**最后更新 | Last Updated**: 2025-12-09