---
base_model: unsloth/DeepSeek-R1-0528-Qwen3-8B
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
license: apache-2.0
language:
- en
datasets:
- piyawudk/spam-ham-reasoning-dataset-small
pipeline_tag: text-classification
---

# Phishing Detection via Reasoning LLM

### **Why Phishing Matters**

- Phishing attacks are becoming more widespread due to the rapid growth of the internet.
- These attacks cause billions of dollars in losses every year.
- Traditional research has relied on:
  - Statistical methods
  - Transformer models
- While these methods achieve strong predictive accuracy, they lack clear justifications for their classifications.

---

### **Enter Large Language Models (LLMs)**

- LLMs show strong potential for textual analysis.
- Especially promising are reasoning-based LLMs:
  - They can break down complex problems into step-by-step reasoning.
- This study explores fine-tuning LLMs for phishing and scam detection using the Qwen3-8B model.

---

## **Research Focus**

The author compares three main techniques for improving phishing detection:

1. Training Methods
   - Supervised Fine-Tuning (SFT): mimics expert-labelled data.
   - Guided Reinforcement Learning (GRPO): explores and adapts through self-improvement.
2. Model Starting Point
   - Fine-tuning a raw base model.
   - Fine-tuning an instruction-aware assistant (already aligned to follow directions).
3. Verification Layer
   - Adding a verifier to refine or correct the model’s first response.

---

## **Evaluation & Dataset**

- Models were tested against:
  - ML methods (like logistic regression)
  - BERT and ModernBERT
  - Proprietary LLMs (such as those from OpenAI and Gemini) and open-source LLMs (DeepSeek R1 and Qwen3)
- A [new dataset](https://huggingface.co/datasets/piyawudk/spam-ham-reasoning-dataset-small) was created from a public scam-reporting forum to ensure recency and relevance.

---

## **Key Findings**

1. SFT vs GRPO
   - SFT: Higher recall (catches more phishing attempts).
   - GRPO: Higher precision (reduces false positives).
   - Trade-off: sensitivity vs reliability.
2. Starting Point Matters
   - Beginning with an instruction-tuned model is critical for success.
3. Verifier Effects
   - A verifier doesn’t boost accuracy overall.
   - Instead, it acts as a “specialisation amplifier”, reinforcing each model’s natural strengths and weaknesses.

---

## **Takeaways**

- Fine-tuned open-source LLMs still trail behind simple ML models in raw performance.
- However, they excel in providing transparent, context-based justifications for their classifications.
- Proprietary LLMs outperform all tested models, showing that with the right methods, LLMs can:
  - Accurately detect fraudulent texts
  - Explain their reasoning
- This opens a promising direction for future phishing detection research.

---

## Results

(Read the paper for the full results and analysis.)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6796c6d8bf532f775c5b31ee/8Qju57zO1DmpQ51qJb7kH.png)

---

## Usage

After converting the model to GGUF, you can use it via Ollama. See [this collection](https://huggingface.co/collections/piyawudk/phishme-6870368402b51dfe8cae622e) for the Ollama makefile, then run the model.

Note: this model was fine-tuned using the [Unsloth framework](https://unsloth.ai/).
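
Once the model is served by Ollama, it can be queried from Python through Ollama's local REST endpoint (`/api/generate`). The following is a minimal sketch, not the author's pipeline: the model tag `phishme`, the prompt wording, and the `Verdict:` line format are all assumptions made for illustration. It also assumes the reply may carry an R1-style `<think>...</think>` block, and falls back to treating the whole reply as the answer when no such block is present.

```python
import json
import re
import urllib.request

# Assumptions (not from this model card): the model tag, the prompt wording,
# and the "Verdict:" label format are hypothetical and chosen for illustration.
MODEL_TAG = "phishme"  # hypothetical name chosen when creating the Ollama model
OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(message: str, model: str = MODEL_TAG) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate."""
    prompt = (
        "Decide whether the following message is phishing or legitimate. "
        "Explain your reasoning, then finish with a line of the form "
        "'Verdict: phishing' or 'Verdict: legitimate'.\n\n"
        f"Message:\n{message}"
    )
    return {"model": model, "prompt": prompt, "stream": False}


def split_reasoning(raw: str) -> tuple[str, str]:
    """Separate a <think>...</think> block (if any) from the final answer."""
    match = re.search(r"<think>(.*?)</think>", raw, flags=re.DOTALL)
    if match is None:
        return "", raw.strip()
    return match.group(1).strip(), raw[match.end():].strip()


def extract_verdict(answer: str) -> str:
    """Return 'phishing', 'legitimate', or 'unknown' if no label is found."""
    found = re.search(r"verdict:\s*(phishing|legitimate)", answer, flags=re.IGNORECASE)
    return found.group(1).lower() if found else "unknown"


def classify(message: str) -> dict:
    """Send one message to the local Ollama server and parse the reply."""
    body = json.dumps(build_payload(message)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        raw = json.loads(response.read())["response"]
    reasoning, answer = split_reasoning(raw)
    return {"verdict": extract_verdict(answer), "reasoning": reasoning, "answer": answer}
```

With an Ollama server running locally, `classify("Your account is locked, verify now")` returns a dictionary holding the parsed verdict alongside the model's reasoning, so the justification can be shown next to the label. If the model was trained on a different prompt or label format, adjust `build_payload` and `extract_verdict` to match.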