| | --- |
| | base_model: unsloth/DeepSeek-R1-0528-Qwen3-8B |
| | tags: |
| | - text-generation-inference |
| | - transformers |
| | - unsloth |
| | - qwen3 |
| | license: apache-2.0 |
| | language: |
| | - en |
| | datasets: |
| | - piyawudk/spam-ham-reasoning-dataset-small |
| | pipeline_tag: text-classification |
| | --- |
| | |
| | # Phishing Detection via Reasoning LLM |
| |
|
| | ### **Why Phishing Matters**
| | - Phishing attacks are becoming more widespread due to the rapid growth of the internet. |
| | - These attacks cause billions of dollars in losses every year. |
| | - Traditional research has relied on: |
| |   - Statistical methods
| |   - Transformer models
| | - While these methods achieve strong predictive accuracy, they lack clear justifications for their classifications. |
| |
|
| | --- |
| |
|
| | ### **Enter Large Language Models (LLMs)** |
| | - LLMs show strong potential for textual analysis. |
| | - Reasoning-based LLMs are especially promising because they can break complex problems down into step-by-step reasoning.
| | - This study explores fine-tuning LLMs for phishing and scam detection using the Qwen3-8B model. |
| |
|
| | --- |
| |
|
| | ## **Research Focus** |
| | The author compares three main design choices for improving phishing detection:
| | 1. Training Methods
| |    - Supervised Fine-Tuning (SFT): learns to imitate expert-labelled data.
| |    - Group Relative Policy Optimization (GRPO): a reinforcement learning method that explores and adapts through self-improvement.
| |
|
| | 2. Model Starting Point
| |    - Fine-tuning a raw base model.
| |    - Fine-tuning an instruction-aware assistant (already aligned to follow directions).
| |
|
| | 3. Verification Layer
| |    - Adding a verifier to refine or correct the model’s first response.
| |
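The "group relative" part of GRPO can be illustrated with a minimal sketch: several completions are sampled for the same prompt, each receives a reward, and each reward is normalised against the group's mean and spread. The reward values below are hypothetical (1.0 when a sampled completion's label matches the ground truth, 0.0 otherwise), and the use of the population standard deviation and the `eps` constant are assumptions for illustration, not details taken from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalise each reward against the other completions sampled for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations may differ
    # eps avoids division by zero when every completion gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions sampled from one prompt
rewards = [1.0, 0.0, 1.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group average receive a positive advantage and are reinforced; those below it receive a negative one.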
|
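The verification layer can be pictured as a second pass over the first verdict. The sketch below only shows the control flow. Both callables are hypothetical keyword-based stand-ins rather than LLM calls, the `spam`/`ham` labels are borrowed from the dataset's name, and the paper's actual verifier prompt and logic are not reproduced here.

```python
from typing import Callable

Classifier = Callable[[str], str]
Verifier = Callable[[str, str], str]


def classify_with_verifier(message: str, classifier: Classifier, verifier: Verifier) -> str:
    """Run the classifier, then let the verifier confirm or correct its verdict."""
    first_verdict = classifier(message)
    return verifier(message, first_verdict)


def toy_classifier(message: str) -> str:
    # Hypothetical stand-in for the fine-tuned model: flags any message containing a link
    return "spam" if "http" in message.lower() else "ham"


def toy_verifier(message: str, first_verdict: str) -> str:
    # Hypothetical stand-in for the verifier: keeps a "spam" verdict only if the text is also urgent
    if first_verdict == "spam" and "urgent" not in message.lower():
        return "ham"
    return first_verdict


final = classify_with_verifier(
    "URGENT: verify your account at http://example.test",
    toy_classifier,
    toy_verifier,
)
```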
| | --- |
| |
|
| | ## **Evaluation & Dataset** |
| | - Models were tested against:
| |   - Classical ML methods (such as logistic regression)
| |   - BERT and ModernBERT
| |   - Proprietary LLMs (OpenAI models and Gemini) and open-source LLMs (DeepSeek R1 and Qwen3)
| | - A [new dataset](https://huggingface.co/datasets/piyawudk/spam-ham-reasoning-dataset-small) was created from a public scam-reporting forum to ensure recency and relevance. |
| |
|
| | --- |
| |
|
| | ## **Key Findings** |
| | 1. SFT vs GRPO
| |    - SFT: Higher recall (catches more phishing attempts).
| |    - GRPO: Higher precision (reduces false positives).
| |    - Trade-off: sensitivity (recall) versus reliability (precision).
| |
|
| | 2. Starting Point Matters
| |    - Beginning with an instruction-tuned model is critical for success.
| |
|
| | 3. Verifier Effects
| |    - A verifier does not improve overall accuracy.
| |    - Instead, it acts as a “specialisation amplifier”, reinforcing each model’s natural strengths and weaknesses.
| |
|
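The recall versus precision trade-off in the first finding can be made concrete with a small sketch. The labels and predictions below are toy values invented for illustration, not results from the paper, and `precision_recall_f1` is a hypothetical helper name.

```python
def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Compute precision, recall and F1 for the positive class from paired labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy ground truth: three spam messages followed by three ham messages
y_true = ["spam", "spam", "spam", "ham", "ham", "ham"]

# A recall-leaning model flags every spam message but also one ham message
recall_leaning = ["spam", "spam", "spam", "spam", "ham", "ham"]

# A precision-leaning model never flags ham but misses one spam message
precision_leaning = ["spam", "spam", "ham", "ham", "ham", "ham"]

p_a, r_a, f1_a = precision_recall_f1(y_true, recall_leaning)
p_b, r_b, f1_b = precision_recall_f1(y_true, precision_leaning)
```

In this toy case the first model catches every spam message at the cost of a false alarm, while the second raises no false alarms but lets one spam message through.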
| | --- |
| |
|
| | ## **Takeaways** |
| | - Fine-tuned open-source LLMs still trail simple ML models in raw performance.
| | - However, they excel in providing transparent, context-based justifications for their classifications.
| | - Proprietary LLMs outperform all other tested models, showing that with the right methods, LLMs can:
| |   - Accurately detect fraudulent texts
| |   - Explain their reasoning
| | - This opens a promising direction for future phishing detection research.
| |
|
| | --- |
| |
|
| | ## Results |
| | (Read the paper for the full results and analysis.) |
| |  |
| |
|
| | --- |
| |
|
| | ## Usage |
| | After converting the model to GGUF, you can run it with Ollama. See [this collection](https://huggingface.co/collections/piyawudk/phishme-6870368402b51dfe8cae622e) for the Ollama Modelfile and run instructions.
| |
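Once the model is registered with a local Ollama server, it can also be queried from Python through Ollama's REST API. This is a minimal sketch under stated assumptions: the model name `phishme` is a placeholder for whatever name was used with `ollama create`, the prompt wording is illustrative rather than the prompt used in training, and the server is assumed to listen on Ollama's default port.

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_NAME = "phishme"  # placeholder: use the name you passed to `ollama create`


def build_payload(message: str, model: str = MODEL_NAME) -> dict:
    """Build a non-streaming generate request; the prompt wording is an assumption."""
    prompt = (
        "Classify the following message as spam or ham and explain your reasoning.\n\n"
        f"{message}"
    )
    return {"model": model, "prompt": prompt, "stream": False}


def classify(message: str) -> str:
    """Send the message to the local Ollama server and return the generated text."""
    data = json.dumps(build_payload(message)).encode("utf-8")
    req = request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Requires a running Ollama server with the model loaded:
# print(classify("Your parcel is on hold. Pay the fee at http://example.test"))
```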
|
| | Note: this model was fine-tuned using the [Unsloth framework](https://unsloth.ai/).