---
license: mit
language:
- zh
- en
tags:
- url-classification
- list-page-detection
- detail-page-detection
- qwen
- fine-tuning
- lora
- url-parser
- peft
base_model: Qwen/Qwen2.5-1.5B
---
# URL Page Type Classifier (LoRA)
A URL page-type classification model fine-tuned from Qwen2.5-1.5B with LoRA. It classifies a URL as either a list page or a detail page.
## Model Details
| Item | Detail |
|------|--------|
| **Base model** | Qwen/Qwen2.5-1.5B |
| **Fine-tuning method** | LoRA (r=16, alpha=32) |
| **Trainable parameters** | ~18M (1.18%) |
## Evaluation
| Test set | Samples | Accuracy |
|----------|---------|----------|
| Training data | 100 | **100%** |
| Randomly generated URLs | 1000 | **100%** |
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load the base model
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B", device_map="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B", trust_remote_code=True)
# Load the LoRA adapter
model = PeftModel.from_pretrained(base_model, "windlx/url-classifier-lora")
model.eval()
# Inference
url = "https://example.com/product/12345"
# ... (inference code)
```
## Related Links
- **Merged model**: https://huggingface.co/windlx/url-classifier-model
- **GitHub**: https://github.com/xiuxiu/url-classifier