---
base_model: fnlp/bart-large-chinese
library_name: peft
pipeline_tag: summarization
---

# Model Card for LoRA Fine-tuned Chinese BART

This is a Chinese summarization model obtained by LoRA fine-tuning of [`fnlp/bart-large-chinese`](https://huggingface.co/fnlp/bart-large-chinese). It was trained to generate Chinese news headlines and abstracts, and is suited to compressing and distilling short Chinese texts.

## Model Details

### Model Description

This model applies parameter-efficient fine-tuning to `fnlp/bart-large-chinese` with the PEFT framework, using LoRA (Low-Rank Adaptation). Only a small set of low-rank update matrices injected into the attention weights is trained, which keeps the fine-tuning process lightweight.

- **Developed by:** [Your Name or Organization]
- **Model type:** Seq2Seq (BART)
- **Language(s):** Chinese
- **License:** Same as the base model (assumed Apache 2.0; verify before redistribution)
- **Finetuned from model:** fnlp/bart-large-chinese

## Uses

### Direct Use

Chinese summarization tasks such as news headline generation and content compression.

### Out-of-Scope Use

Not suitable for multilingual summarization, multi-document summarization, or tasks with strict factual-consistency requirements.

## Bias, Risks, and Limitations

The model was trained on public Chinese datasets and may exhibit bias when handling sensitive content, discriminatory language, or references to particular social groups.

### Recommendations

Use the model only on well-cleaned Chinese text, and avoid deploying it for decision support or in sensitive domains.

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

# Load the base model, then attach the LoRA adapter on top of it.
base_model = AutoModelForSeq2SeqLM.from_pretrained("fnlp/bart-large-chinese")
peft_model = PeftModel.from_pretrained(base_model, "your-username/your-model-name")
tokenizer = AutoTokenizer.from_pretrained("fnlp/bart-large-chinese")

inputs = tokenizer("据报道,苹果将在下月发布新款iPhone。", return_tensors="pt")
summary_ids = peft_model.generate(**inputs, max_length=30)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

## Training Details

### Training Data

Fine-tuning data comes from a preprocessed Chinese news summarization dataset (e.g. LCSTS), split into training and validation sets and saved/loaded with the `datasets` library.

### Training Procedure

#### Preprocessing

* Encode source and target text with the Hugging Face tokenizer
* Truncate to `max_source_length` and `max_target_length`
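
A minimal sketch of such a preprocessing function. The column names `text` and `summary` and the default length limits are assumptions for illustration, not values confirmed by this card:

```python
def preprocess(batch, tokenizer, max_source_length=512, max_target_length=64):
    """Tokenize source/target pairs for seq2seq training, truncating to the set limits."""
    model_inputs = tokenizer(
        batch["text"], max_length=max_source_length, truncation=True
    )
    # text_target= routes the summaries through the target-side tokenizer settings.
    labels = tokenizer(
        text_target=batch["summary"], max_length=max_target_length, truncation=True
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```

A function like this is typically applied with `dataset.map(preprocess, batched=True)` before training.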

#### Training Hyperparameters

* **Epochs:** 4
* **Batch Size:** 64
* **Learning Rate:** 2e-5
* **Evaluation Steps:** 5000
* **Save Steps:** 10000
* **Precision:** fp16
* **LoRA Config:** r=8, alpha=16, dropout=0.1
* **Target Modules:** `q_proj`, `v_proj`

## Evaluation

### Testing Data

A held-out validation set drawn from the same source as the training data.

### Metrics

Summarization quality is evaluated automatically with ROUGE-1 / ROUGE-2 / ROUGE-L. For Chinese, scores can be computed at character or word granularity, using `jieba` for word segmentation.
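
As an illustration, a character-level ROUGE-1 F1 can be computed with nothing but the standard library; swapping the character split for `jieba.lcut` gives word-level scores. This is a simplified sketch, not necessarily the scoring implementation behind this card's numbers:

```python
from collections import Counter

def rouge1_f(candidate: str, reference: str) -> float:
    """Character-level ROUGE-1 F1: unigram overlap between candidate and reference."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum((cand & ref).values())  # min count per shared character
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

ROUGE-2 follows the same pattern over character bigrams, and ROUGE-L replaces the overlap count with the longest common subsequence.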

### Results

Example results:

| Metric  | Score |
| ------- | ----- |
| ROUGE-1 | 0.35  |
| ROUGE-2 | 0.19  |
| ROUGE-L | 0.31  |