---
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- llama-factory
- full
- generated_from_trainer
- finance
- chinese
- sft
- reasoning
- financial-ai
model-index:
- name: Qwen2.5-7B-Instruct-R1-forfinance
  results: []
---

# Qwen2.5-7B-Instruct-R1-forfinance

## Model Description

**Qwen2.5-7B-Instruct-R1-forfinance** is a large language model fine-tuned specifically for the financial domain. It is a full-parameter fine-tune of Qwen2.5-7B-Instruct, trained on a combination of open-source financial Q&A datasets and high-quality chain-of-thought reasoning data.

## Training Data

### Data Sources

1. **Open-source financial Q&A datasets**
2. **Chain-of-thought data generated by DeepSeek-R1**
   - DeepSeek-R1 was run at inference time to generate chain-of-thought data
   - GPT-5 scored the quality of each generated response
   - Only high-quality responses were kept as training data

### Data Content

- **Basic financial knowledge Q&A**
- **Financial calculation problems**
- **Financial concept explanations**
- **Chain-of-thought reasoning**

Quality control: GPT-5 was used to score DeepSeek-R1's responses, and only high-quality answers were selected as SFT training data.

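The scoring-and-filtering step described above can be sketched roughly as follows. This is an illustrative sketch only: the field names, the 0–10 scale, and the cutoff of 8 are assumptions for illustration, not the actual pipeline.

```python
def filter_for_sft(records, min_score=8):
    """Keep judge-approved answers and emit them in a typical SFT chat format.

    records: dicts with 'question', 'answer' (the R1 output), and 'score'
    (the judge's quality rating). All field names are hypothetical.
    """
    kept = []
    for rec in records:
        if rec["score"] >= min_score:
            kept.append({
                "messages": [
                    {"role": "user", "content": rec["question"]},
                    {"role": "assistant", "content": rec["answer"]},
                ]
            })
    return kept

scored = [
    {"question": "What is the IS curve?", "answer": "Step by step: ...", "score": 9},
    {"question": "Define beta.", "answer": "...", "score": 5},
]
print(len(filter_for_sft(scored)))  # 1 — only the high-scoring example survives
```
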
## Training Details

### Base Model
- **Model**: Qwen2.5-7B-Instruct
- **Fine-tuning Method**: Full fine-tuning
- **Training Type**: Supervised Fine-Tuning (SFT)

### Training Environment
- **Hardware**: 8 × NVIDIA A100 GPUs
- **Distributed Training**: Multi-GPU parallel training

### Training Hyperparameters

- **Learning Rate**: 1e-05
- **Train Batch Size (per device)**: 1
- **Eval Batch Size (per device)**: 8
- **Seed**: 42
- **Distributed Type**: multi-GPU
- **Number of Devices**: 8
- **Gradient Accumulation Steps**: 16
- **Total Train Batch Size**: 128
- **Total Eval Batch Size**: 64
- **Optimizer**: AdamW (betas=(0.9, 0.999), epsilon=1e-08)
- **LR Scheduler**: Linear
- **Warmup Ratio**: 0.03
- **Epochs**: 2.0

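The total train batch size follows directly from the per-device batch size, the device count, and the gradient accumulation steps:

```python
# Effective (total) train batch size = per-device batch x devices x accumulation steps.
per_device_train_batch_size = 1
num_devices = 8
gradient_accumulation_steps = 16

total_train_batch_size = (per_device_train_batch_size
                          * num_devices
                          * gradient_accumulation_steps)
print(total_train_batch_size)  # 128, matching the value listed above
```
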
### Training Results

- **Final Training Loss**: 0.7332
- **Training Steps**: 312
- **Training Runtime**: 6450.97 seconds
- **Samples per Second**: 6.168
- **Steps per Second**: 0.048

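These figures are mutually consistent: 312 optimizer steps at an effective batch of 128 over 2 epochs imply a training set of roughly 20k examples. This is a back-of-the-envelope inference from the numbers above, not a stated dataset size:

```python
steps = 312
total_batch = 128     # effective train batch size
epochs = 2.0
runtime_s = 6450.97

samples_seen = steps * total_batch           # 39,936 samples across both epochs
approx_dataset_size = samples_seen / epochs  # ~19,968 examples (rough estimate)
print(approx_dataset_size)
print(samples_seen / runtime_s)  # ~6.19 samples/s, close to the reported 6.168
```
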
## Quick Start

### Model Inference

We provide a simple inference script, `inference.py`, for direct financial Q&A with the model.

#### Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use your local checkpoint path
model_path = "/root/Qwen2.5-7B-Instruct-R1-forfinance/"

# Load model and tokenizer
print("Loading model...")
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # matches torch_dtype in config.json
    device_map="auto",
    trust_remote_code=True  # if needed
)

tokenizer = AutoTokenizer.from_pretrained(
    model_path,
    trust_remote_code=True
)

print("Model loaded successfully!")

# Prepare the input. The prompt (in Chinese) asks: "Assume you are a financial-industry
# expert and answer the following question. In macroeconomic analysis, which curve
# describes product-market equilibrium at a given interest rate? Think step by step."
prompt = "假设你是一位金融行业专家,请回答下列问题。\n在宏观分析中,描述在既定利率水平下产品市场达到均衡状态的曲线是什么?\n请一步步思考。"

messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Apply the chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Encode the input
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a response
print("Generating response...")
with torch.no_grad():  # saves GPU memory
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=2048,
        do_sample=True,
        temperature=0.7,
        top_p=0.8,
        repetition_penalty=1.05,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens (strip the prompt)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

print("Model Response:")
print(response)
```
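
The prompt-stripping slice at the end works because `generate` returns the prompt tokens followed by the continuation. A tiny self-contained illustration with fake token ids:

```python
# model.generate returns prompt + continuation, so we drop the
# first len(prompt) ids from each output sequence.
input_ids = [[101, 102, 103]]           # fake prompt token ids
generated = [[101, 102, 103, 7, 8, 9]]  # fake prompt + newly generated tokens

new_tokens = [out[len(inp):] for inp, out in zip(input_ids, generated)]
print(new_tokens)  # [[7, 8, 9]]
```
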

#### Run Inference Script

```bash
# Make sure the model path in the script is correct
python inference.py
```

### Requirements

- **Python**: ≥ 3.8
- **PyTorch**: ≥ 2.0
- **Transformers**: ≥ 4.55.0
- **GPU**: NVIDIA GPU with CUDA support recommended
- **GPU Memory**: ≥ 16 GB recommended (24 GB+ preferred)

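A quick way to confirm the environment meets these minimums is to compare installed versions against them. This is a small helper sketch; the parsing deliberately ignores local build suffixes such as `+cu124`:

```python
from importlib.metadata import version, PackageNotFoundError

def version_tuple(v):
    """Parse '2.6.0+cu124' -> (2, 6, 0), ignoring local build suffixes."""
    core = v.split("+")[0]
    parts = []
    for piece in core.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def meets_minimum(pkg, minimum):
    """True if the installed version of pkg is at least `minimum`."""
    try:
        return version_tuple(version(pkg)) >= version_tuple(minimum)
    except PackageNotFoundError:
        return False

# e.g. meets_minimum("transformers", "4.55.0")
print(version_tuple("2.6.0+cu124"))  # (2, 6, 0)
```
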
## Future Plans

**Reinforcement Learning Training**
- Plan to use GRPO (Group Relative Policy Optimization) for reinforcement learning training
- Further improve the model's performance and safety in the financial domain

## Use Cases

- **Financial knowledge Q&A**
- **Financial calculations and analysis**
- **Investment advice consultation**
- **Financial concept explanations**
- **Risk assessment**

## Limitations and Disclaimers

⚠️ **Important Notice**:
- This model is for educational and research purposes only and does not constitute investment advice
- Use it with caution in practical applications, and combine it with professional judgment
- The model may hallucinate or make errors; fact-check its outputs

## Framework Versions

- **Transformers**: 4.55.0
- **PyTorch**: 2.6.0+cu124
- **Datasets**: 3.6.0
- **Tokenizers**: 0.21.1