---
base_model: unsloth/DeepSeek-R1-0528-Qwen3-8B
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
license: apache-2.0
language:
- en
datasets:
- piyawudk/spam-ham-reasoning-dataset-small
pipeline_tag: text-classification
---

# Phishing Detection via Reasoning LLM

### **Why Phishing Matters**
- Phishing attacks are becoming more widespread due to the rapid growth of the internet.
- These attacks cause billions of dollars in losses every year.
- Traditional research has relied on:
  - Statistical methods
  - Transformer models
- While these methods achieve strong predictive accuracy, they lack clear justifications for their classifications.

---

### **Enter Large Language Models (LLMs)**
- LLMs show strong potential for textual analysis.
- Especially promising are reasoning-based LLMs:
  - They can break down complex problems into step-by-step reasoning.
- This study explores fine-tuning LLMs for phishing and scam detection using the DeepSeek-R1-0528-Qwen3-8B model.

---

## **Research Focus**
The author compares three design choices for improving phishing detection:
1. Training Methods
  - Supervised Fine-Tuning (SFT): learns by imitating expert-labelled examples.
  - Group Relative Policy Optimization (GRPO): explores and adapts through reinforcement-based self-improvement.

2. Model Starting Point
  - Fine-tuning a raw base model.
  - Fine-tuning an instruction-aware assistant (already aligned to follow directions).

3. Verification Layer
  - Adding a verifier to refine or correct the model’s first response.
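The verification layer can be pictured as a simple two-stage pipeline: classify first, then let a verifier confirm or override the result. The sketch below is purely illustrative (the function names, keyword triggers, and label strings are my assumptions, not the paper's implementation; in the actual system both stages would be LLM calls):

```python
def classify(text: str) -> str:
    """First-pass classifier (stub). In the real system this would be the
    fine-tuned reasoning LLM producing a reasoning trace plus a label."""
    return "phishing" if "verify your account" in text.lower() else "ham"

def verify(text: str, first_label: str) -> str:
    """Verifier (stub): re-examines the first response and either confirms
    or overrides it. Here it only catches urgent payment demands."""
    if first_label == "ham" and "wire transfer immediately" in text.lower():
        return "phishing"
    return first_label

def detect(text: str) -> str:
    """Two-stage detection: classify, then refine with the verifier."""
    return verify(text, classify(text))
```

The design choice being tested is whether this second pass corrects first-pass mistakes or merely amplifies the first model's tendencies (see Key Findings below).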

---

## **Evaluation & Dataset**
- Models were tested against:
  - ML methods (like logistic regression)
  - BERT and ModernBERT
  - Proprietary LLMs (e.g. OpenAI's GPT models and Google's Gemini) and open-source LLMs (DeepSeek R1 and Qwen3)
- A [new dataset](https://huggingface.co/datasets/piyawudk/spam-ham-reasoning-dataset-small) was created from a public scam-reporting forum to ensure recency and relevance.

---

## **Key Findings**
1. SFT vs GRPO
  - SFT: Higher recall (catches more phishing attempts).
  - GRPO: Higher precision (reduces false positives).
  - Trade-off: sensitivity vs reliability.

2. Starting Point Matters
  - Beginning with an instruction-tuned model is critical for success.

3. Verifier Effects
  - A verifier doesn’t boost accuracy overall.
  - Instead, it acts as a “specialisation amplifier”, reinforcing each model’s natural strengths and weaknesses.
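The SFT/GRPO trade-off above can be made concrete with a quick precision/recall calculation. The confusion-matrix counts below are invented purely for illustration and are not results from the paper:

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical counts over 100 phishing messages:
# an SFT-style model that flags aggressively...
sft_p, sft_r = precision_recall(tp=95, fp=20, fn=5)    # recall 0.95, precision ~0.83
# ...versus a GRPO-style model that flags conservatively.
grpo_p, grpo_r = precision_recall(tp=80, fp=5, fn=20)  # recall 0.80, precision ~0.94
```

The aggressive model misses fewer attacks (higher recall) at the cost of more false alarms; the conservative model inverts that trade-off, which is the sensitivity-vs-reliability tension described above.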

---

## **Takeaways**
- Fine-tuned open-source LLMs still trail behind simple ML models in raw performance.
- However, they excel in providing transparent, context-based justifications for their classifications.
- Proprietary LLMs outperformed all other tested models, showing that with the right methods, LLMs can:
  - Accurately detect fraudulent texts
  - Explain their reasoning
- This opens a promising direction for future phishing detection research.

---

## Results
(Read the paper for the full results and analysis.)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6796c6d8bf532f775c5b31ee/8Qju57zO1DmpQ51qJb7kH.png)

---

## Usage
After converting to GGUF, you can use this model via Ollama. See [this collection](https://huggingface.co/collections/piyawudk/phishme-6870368402b51dfe8cae622e) for the Ollama Modelfile and run instructions.
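If you build the GGUF yourself, a minimal Ollama Modelfile looks roughly like this (the file name and parameter value below are placeholders; use the Modelfile from the collection above for the exact settings):

```
FROM ./DeepSeek-R1-0528-Qwen3-8B-phishing.gguf
PARAMETER temperature 0.6
```

You would then register and run it with `ollama create phishme -f Modelfile` followed by `ollama run phishme` (the model name `phishme` is arbitrary).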

Note: this model was fine-tuned using the [Unsloth framework](https://unsloth.ai/).