---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- qwen
- qwen2.5
- sentiment-analysis
- chinese
- freeze-tuning
- llama-factory
metrics:
- accuracy
base_model:
- Qwen/Qwen2.5-Coder-1.5B-Instruct
---

# Qwen2.5-Coder-Sentiment-Freeze

This is a fine-tuned version of `Qwen/Qwen2.5-Coder-1.5B-Instruct` specialized for **Chinese sentiment analysis**.

The model was trained with the parameter-efficient **freeze training** method on the **ChnSentiCorp** dataset. Updating only the last 6 transformer layers (plus the embeddings) raised accuracy on the evaluation set from **91.6% to 97.8%**.

The model classifies Chinese text as **positive (1)** or **negative (0)** and outputs the result in a clean JSON format.

## 🚀 Full Tutorial & Repository

This model was trained as part of a comprehensive, beginner-friendly tutorial that walks through every step of the process, from data preparation to evaluation and deployment.

### 👉 [**GitHub Repository: IIIIQIIII/MSJ-Factory**](https://github.com/IIIIQIIII/MSJ-Factory)

The repository includes:
- 💻 A step-by-step Google Colab notebook.
- ⚙️ All training and evaluation scripts.
- 📊 Detailed explanations of the training method and results.

**If you find this model or the tutorial helpful, please give the repository a ⭐️ star! It helps support the author's work.**

[GitHub Repository](https://github.com/IIIIQIIII/MSJ-Factory) · [Open the Colab Notebook](https://colab.research.google.com/github/IIIIQIIII/MSJ-Factory/blob/main/Qwen2_5_Sentiment_Fine_tuning_Tutorial.ipynb)

## 💡 How to Use

This model follows a specific instruction format to ensure reliable JSON output. Use the prompt template below for the best results.

````python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model repository on Hugging Face Hub
model_name = "FutureMa/Qwen2.5-Coder-Sentiment-Freeze"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# --- Define the text for sentiment analysis ---
text = "这个酒店的服务态度非常好,房间也很干净!"  # "The service at this hotel was excellent, and the room was very clean!"

# --- Create the prompt using the required template ---
# The prompt is kept in Chinese because that is the format used during training.
# It says: "Perform sentiment analysis on the following Chinese text. Judge the
# overall sentiment and output 1 for positive or 0 for negative, in JSON."
prompt = f"""请对以下中文文本进行情感分析,判断其情感倾向。

任务说明:
- 分析文本表达的整体情感态度
- 判断是正面(1)还是负面(0)

文本内容:
```sentence
{text}
```

输出格式:
```json
{{
  "sentiment": 0 or 1
}}
```"""

messages = [{"role": "user", "content": prompt}]
text_input = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text_input], return_tensors="pt").to(model.device)

# --- Generate the response ---
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=256,
    do_sample=True,  # required for `temperature` to take effect
    temperature=0.1
)
response = tokenizer.batch_decode(generated_ids[:, model_inputs.input_ids.shape[1]:], skip_special_tokens=True)[0]

print(f"Input Text: {text}")
print(f"Model Output:\n{response}")

# Expected output for the example text:
# Input Text: 这个酒店的服务态度非常好,房间也很干净!
# Model Output:
# ```json
# {
#   "sentiment": 1
# }
# ```
````
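
The model wraps its answer in a fenced `json` block rather than emitting bare JSON, so downstream code needs a small parsing step. Below is a minimal sketch; it is not part of the original tutorial, and the regex and the `parse_sentiment` helper are illustrative assumptions:

```python
import json
import re

def parse_sentiment(response: str):
    """Extract the sentiment value from the model's fenced JSON reply."""
    # Grab the first {...} object, ignoring the surrounding ```json fence.
    match = re.search(r"\{.*?\}", response, re.DOTALL)
    if match is None:
        return None  # the model did not produce a parseable JSON object
    try:
        return json.loads(match.group(0)).get("sentiment")
    except json.JSONDecodeError:
        return None

# Using `response` from the snippet above:
# parse_sentiment(response) -> 1 (positive) or 0 (negative), None on failure
```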

## 📊 Evaluation Results

The fine-tuned model improves on every key metric compared to the base model, including perfect (100%) precision, meaning it makes no false-positive predictions on the test set.

| Model                           | Accuracy   | Precision   | Recall     | F1-Score   |
|---------------------------------|------------|-------------|------------|------------|
| Base Model (Qwen2.5-Coder-1.5B) | 91.62%     | 98.57%      | 83.13%     | 90.20%     |
| **This Model (Fine-tuned)**     | **97.77%** | **100.00%** | **95.18%** | **97.53%** |
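
The reported numbers are standard binary-classification metrics. For orientation, here is a minimal sketch of how such scores can be computed with scikit-learn; the repository ships its own evaluation scripts, which may differ, and the label lists below are placeholders:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Placeholder data: gold labels and labels parsed from the model's JSON replies.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]

print(f"Accuracy:  {accuracy_score(y_true, y_pred):.2%}")
print(f"Precision: {precision_score(y_true, y_pred):.2%}")  # positive class = 1
print(f"Recall:    {recall_score(y_true, y_pred):.2%}")
print(f"F1-Score:  {f1_score(y_true, y_pred):.2%}")
```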

## ⚙️ Training Details

- **Base Model**: [`Qwen/Qwen2.5-Coder-1.5B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct)
- **Dataset**: A balanced 3,000-sample subset of [ChnSentiCorp](https://huggingface.co/datasets/seamew/ChnSentiCorp).
- **Training Method**: Freeze training (only the last 6 layers and the embeddings were trained; see the sketch below).
- **Framework**: [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)
- **Hardware**: T4 GPU (via Google Colab)
- **Training Time**: ~20 minutes
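
LLaMA-Factory drives the freeze setup through its training configuration, but the underlying idea is simple to express in plain PyTorch. A minimal sketch of the "last 6 layers plus embeddings" recipe described above, assuming Qwen2's standard module layout (`model.model.layers`, `model.model.embed_tokens`); this is not how LLaMA-Factory implements it internally:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-1.5B-Instruct",
    torch_dtype=torch.bfloat16,
)

# Freeze every parameter first...
for param in model.parameters():
    param.requires_grad = False

# ...then unfreeze the last 6 decoder layers and the embeddings.
for layer in model.model.layers[-6:]:
    for param in layer.parameters():
        param.requires_grad = True
for param in model.model.embed_tokens.parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable:,} / {total:,} parameters ({trainable / total:.1%})")
```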

## ⚠️ Limitations and Bias

- **Task-Specific**: This model is highly specialized for Chinese sentiment analysis. It may not perform well on other NLP tasks or languages without further fine-tuning.
- **Binary Classification**: The model is designed for binary (positive/negative) classification and does not capture neutral or mixed sentiment.
- **Data Bias**: The model inherits any biases present in the ChnSentiCorp dataset, which consists primarily of hotel and product reviews. Its performance may vary on text from other domains.

## Citation

If you use this model or the associated tutorial in your work, please cite the repository:

```bibtex
@misc{msj-factory-2025,
  title={Qwen2.5-Coder Sentiment Analysis Fine-tuning Tutorial},
  author={MASHIJIAN},
  year={2025},
  howpublished={\url{https://github.com/IIIIQIIII/MSJ-Factory}}
}
```

## Acknowledgments

This work would not be possible without the incredible open-source tools and models from the community:

- [Qwen Team](https://github.com/QwenLM) for the powerful base model.
- [Hugging Face](https://huggingface.co/) for the `transformers` library and model hosting.
- [hiyouga](https://github.com/hiyouga) for the excellent `LLaMA-Factory` framework.
- Google Colab for providing accessible GPU resources.