Llama Email Fraud Detector (bf16)

A fine-tuned Llama-3.2-3B-Instruct model for structured email fraud/phishing analysis. Given an email, the model outputs a detailed JSON verdict including 11 threat type labels, a 0-100 risk score, human-readable reasoning, and a suggested action.

This is the explanation layer of a dual-model anti-fraud pipeline. The discriminative model (cunxin/roberta-email-fraud-detector, 99.5% accuracy, <50ms) provides a fast binary pre-screen; its result is passed to this generative model as a [CLASSIFIER HINT] prior. The final verdict is reconciled by the backend service.

For GPUs with limited VRAM (< 12 GB), use the AWQ 4-bit quantized version: cunxin/llama-email-fraud-detector-awq.


Model Details

| Property | Value |
| --- | --- |
| Architecture | LlamaForCausalLM (decoder-only Transformer with RoPE, GQA, SwiGLU) |
| Base Model | meta-llama/Llama-3.2-3B-Instruct |
| Parameters | 3,237,063,680 (3.2B) |
| Fine-Tuning | LoRA (r=16, alpha=32, dropout=0.1) merged into base weights |
| Trainable Parameters | 24,313,856 (0.75% of total) |
| Precision | bfloat16 |
| Model Size | 6.4 GB |
| Context Window | 4,096 tokens (inference) / 2,048 tokens (training) |
| Vocabulary | 128,256 tokens |

Lineage

meta-llama/Llama-3.2-3B-Instruct (Meta AI — instruction-tuned on 3T tokens)
    │
    ├── LoRA fine-tuning (r=16, alpha=32, 7 target modules)
    │     Training: ~12K email conversations with structured JSON labels
    │     Hint injection: 75% correct / 15% adversarial / 10% no hint
    │
    ├── Merge LoRA adapters into base weights
    │
    ├─► cunxin/llama-email-fraud-detector (this model, bf16, 6.4 GB)
    │
    └─► cunxin/llama-email-fraud-detector-awq (AWQ 4-bit, 2.2 GB)

Output Format

The model outputs structured JSON with six fields:

{
  "is_fraud": true,
  "risk_score": 95,
  "confidence_level": 0.97,
  "detected_threats": ["DOMAIN_MISMATCH", "CREDENTIAL_REQUEST", "URGENCY_FEAR"],
  "reason": "The sender domain 'amaz0n-verify.com' typosquats amazon.com. The email requests account credentials via a suspicious URL and uses urgency tactics to pressure immediate action.",
  "suggestion": "Do not click any links. Do not enter any credentials. Report this email as phishing to your IT department."
}
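When consuming the verdict downstream, it can help to validate the six fields before acting on them. The following is a minimal sketch, not an official schema shipped with the model; the field names and types are taken from the example above, and the helper name is illustrative.

```python
# Minimal validator for the six-field verdict shown above (a sketch;
# the model card does not define an official schema).
REQUIRED_FIELDS = {
    "is_fraud": bool,
    "risk_score": int,
    "confidence_level": float,
    "detected_threats": list,
    "reason": str,
    "suggestion": str,
}

def validate_verdict(verdict: dict) -> list:
    """Return a list of problems; an empty list means the verdict is well-formed."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in verdict:
            problems.append(f"missing field: {field}")
        elif not isinstance(verdict[field], expected):
            problems.append(f"bad type for {field}: {type(verdict[field]).__name__}")
    if not problems and not 0 <= verdict["risk_score"] <= 100:
        problems.append("risk_score out of range 0-100")
    return problems
```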

Threat Types & Scoring

The model detects 11 threat categories. The risk score is the sum of the triggered threat points, capped at 100.

| Label | Points | Description |
| --- | --- | --- |
| CREDENTIAL_REQUEST | 35 | Asks for passwords, SSNs, credit card numbers |
| DOMAIN_MISMATCH | 30 | Sender domain does not match the claimed organization |
| URL_DISCREPANCY | 30 | Links point to suspicious or mismatched domains |
| TOO_GOOD_TO_BE_TRUE | 30 | Unrealistic promises (lottery wins, free money) |
| PROMPT_INJECTION | 30 | Attempts to manipulate the AI analysis |
| URGENCY_FEAR | 15 | Pressure tactics ("act now", "account suspended") |
| REPLY_TO_MISMATCH | 15 | Reply-To address differs from the sender |
| GENERIC_SALUTATION | 8 | Impersonal greeting ("Dear Customer") |
| ANOMALOUS_TIMING | 8 | Sent at unusual hours for the timezone |
| MISSING_SIGNATURE | 8 | No professional email signature |
| GRAMMAR_ANOMALY | 5 | Unusual grammar or spelling patterns |

RULE D: any high-weight threat (>= 30 points) forces is_fraud = true.
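The scoring rule and RULE D above can be expressed as a small helper. This is a sketch using the point weights from the table; the function name is illustrative, and the final is_fraud decision in the actual pipeline also involves the hint rules described below.

```python
# Point weights from the threat table above.
THREAT_POINTS = {
    "CREDENTIAL_REQUEST": 35, "DOMAIN_MISMATCH": 30, "URL_DISCREPANCY": 30,
    "TOO_GOOD_TO_BE_TRUE": 30, "PROMPT_INJECTION": 30, "URGENCY_FEAR": 15,
    "REPLY_TO_MISMATCH": 15, "GENERIC_SALUTATION": 8, "ANOMALOUS_TIMING": 8,
    "MISSING_SIGNATURE": 8, "GRAMMAR_ANOMALY": 5,
}

def score_threats(threats):
    """Risk score = sum of triggered threat points, capped at 100.
    RULE D: any threat worth >= 30 points forces is_fraud = True."""
    risk = min(100, sum(THREAT_POINTS[t] for t in threats))
    rule_d = any(THREAT_POINTS[t] >= 30 for t in threats)
    return risk, rule_d
```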

Dual-Model Pipeline

This model is designed to work with cunxin/roberta-email-fraud-detector in a reconciled pipeline:

Email Input
    │
    ├──► RoBERTa (discriminative, <50ms)
    │         │
    │         ▼
    │     is_fraud=True/False, confidence=0.97, risk_score=99
    │         │
    │         ▼  [CLASSIFIER HINT]
    ├──► Llama (generative, ~1-3s) ◄── this model
    │         │
    │         ▼
    │     Full JSON analysis (11 threat types, reasoning, suggestion)
    │
    ▼
Reconciliation (backend)
    │
    ▼
Final verdict

Hint Rules

When a [CLASSIFIER HINT] is provided, the model applies the following rules after completing its own independent analysis:

| Scenario | Action |
| --- | --- |
| Hint and generative verdict agree | Keep the generative result |
| Hint=FRAUD, generative=safe, risk < 40 | Follow the hint (the classifier caught something subtle) |
| Hint=FRAUD, generative=safe, RULE D triggered | Override the hint (the generative model has hard evidence) |
| Hint=NOT FRAUD, generative=fraud, RULE D triggered | Override the hint (the generative model has hard evidence) |
| Hint=NOT FRAUD, generative=fraud, risk < 60 | Follow the hint (only weak signals) |
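The hint rules above amount to a small reconciliation function. The sketch below mirrors the table's logic; the actual backend reconciliation service may differ, and the function name is illustrative.

```python
def reconcile(hint_is_fraud, gen_is_fraud, gen_risk, rule_d_triggered):
    """Apply the hint rules from the table above.
    hint_is_fraud may be None when no [CLASSIFIER HINT] was provided."""
    if hint_is_fraud is None or hint_is_fraud == gen_is_fraud:
        return gen_is_fraud                      # agreement (or no hint): keep generative result
    if rule_d_triggered:
        return gen_is_fraud                      # RULE D: hard evidence overrides the hint
    if hint_is_fraud and not gen_is_fraud and gen_risk < 40:
        return True                              # classifier caught something subtle
    if not hint_is_fraud and gen_is_fraud and gen_risk < 60:
        return False                             # only weak signals: follow hint
    return gen_is_fraud                          # otherwise trust the generative verdict
```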

Usage

With vLLM (Recommended)

# Set in .env
MODEL_PATH=cunxin/llama-email-fraud-detector

# Start service
docker compose --profile gpu up -d

# Test
curl -X POST http://localhost:8000/predict_generative \
  -H "Content-Type: application/json" \
  -d '{"sender":"security@amaz0n-verify.com","subject":"URGENT: Account locked","content":"Click to verify: http://amaz0n-secure.xyz"}'

With Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch, json

model_name = "cunxin/llama-email-fraud-detector"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")

email = json.dumps({
    "date": "2026-02-25T10:00:00Z",
    "sender": "security@amaz0n-verify.com",
    "recipient": "you@example.com",
    "subject": "URGENT: Your account has been locked",
    "content": "Click here to verify: http://amaz0n-secure.xyz/verify"
})

messages = [
    {"role": "system", "content": "You are an email fraud analyst. Analyze the email and return a JSON verdict."},
    {"role": "user", "content": f"Analyze the following email:\n{email}"}
]

# add_generation_prompt appends the assistant header so the model starts its reply
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.1)
response = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
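Because edge cases can yield malformed JSON (see Limitations), it is worth parsing the response defensively rather than calling json.loads on the raw text. The brace-matching heuristic below is an assumption for illustration, not part of the model's contract:

```python
import json

def extract_verdict(text):
    """Parse the first top-level JSON object found in the model's response.
    Returns None if no well-formed object can be recovered."""
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:  # matched the opening brace
                try:
                    return json.loads(text[start : i + 1])
                except json.JSONDecodeError:
                    return None
    return None  # unbalanced braces: likely truncated output
```

Note that braces inside JSON string values can confuse this heuristic; for production use, a stricter parser or constrained decoding would be more robust.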

Training

Fine-Tuning Method

| Setting | Value |
| --- | --- |
| Method | LoRA (Low-Rank Adaptation) via peft + SFT via trl.SFTTrainer |
| LoRA Rank | 16 |
| LoRA Alpha | 32 (effective scale = 2x) |
| LoRA Dropout | 0.1 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Optimizer | AdamW (weight_decay=0.05, label_smoothing=0.05) |
| LR Schedule | Cosine with 10% linear warmup |
| Learning Rate | 2e-4 |
| Epochs | 3 |
| Batch Size | 2 (grad_accum=8, effective=16) |
| Max Sequence Length | 2,048 tokens |
| Mixed Precision | FP16 (CUDA) |
| Loss | Causal LM (next-token prediction on the assistant turn only) |

Training Data

Approximately 12,000 email conversations were generated from the Enron corpus and synthetic sources:

| Source | Count | Role |
| --- | --- | --- |
| Fraud emails (train) | ~500 per class | Primary training |
| Normal emails (train) | ~500 per class | Primary training |
| Correction pool (optional) | ~14K fraud + ~15K normal | Extra training via --include-correction |
| AI-generated modern emails | ~800 per class | Extra training via --include-aigen |

Each email is converted into a 3-turn chat conversation (system prompt -> user email -> assistant JSON verdict) with:

  • 75% include a correct [CLASSIFIER HINT] (teach trust of accurate priors)
  • 15% include an adversarial wrong hint (teach RULE D override)
  • 10% have no hint (teach independent reasoning)
  • 60% of fraud examples include [HEURISTIC ANALYSIS] context
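The hint-injection mix above can be reproduced with a simple weighted draw when building each training conversation. The following is an illustrative sketch, assuming the hint is rendered as a `[CLASSIFIER HINT]` line; the exact hint wording used during training is not documented here.

```python
import random

# Mix from the card: 75% correct hint, 15% adversarial wrong hint, 10% no hint.
HINT_MIX = [("correct", 0.75), ("adversarial", 0.15), ("none", 0.10)]

def sample_hint(true_label, rng):
    """Return a [CLASSIFIER HINT] line to prepend to the user turn, or None."""
    kinds = [k for k, _ in HINT_MIX]
    weights = [w for _, w in HINT_MIX]
    kind = rng.choices(kinds, weights=weights)[0]
    if kind == "none":
        return None  # teach independent reasoning
    hinted = true_label if kind == "correct" else not true_label
    return f"[CLASSIFIER HINT] is_fraud={hinted}"
```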

Hardware Requirements

| Configuration | VRAM |
| --- | --- |
| bf16 (this model) | >= 12 GB (desktop RTX 3060, RTX 4070 or better) |
| AWQ 4-bit (llama-email-fraud-detector-awq) | >= 6 GB (RTX 3050, laptop RTX 3060) |

Intended Use

  • Email fraud/phishing detection with detailed threat analysis and human-readable reasoning
  • Explanation layer for automated email security systems
  • Part of a multi-model pipeline (discriminative pre-screen + generative analysis + reconciliation)
  • Microsoft Office Add-in integration for Outlook/Word

Limitations

  • Primarily trained on English emails; may underperform on other languages
  • Training data includes the Enron corpus (early 2000s); modern attack patterns are only partially covered by AI-generated synthetic data
  • Inference latency is ~1-3 s per email (use the RoBERTa model for real-time filtering)
  • Structured JSON output depends on prompt engineering; edge cases may produce malformed JSON

Related Models

| Model | Type | Size | Speed | Use Case |
| --- | --- | --- | --- | --- |
| cunxin/roberta-email-fraud-detector | Discriminative | 475 MB | <50 ms | Fast binary pre-screen |
| cunxin/llama-email-fraud-detector (this model) | Generative | 6.4 GB | ~1-3 s | Detailed threat analysis |
| cunxin/llama-email-fraud-detector-awq | Generative (quantized) | 2.2 GB | ~1-3 s | Same as above, for low-VRAM GPUs |

Citation

@misc{cunxin2025llama-email-fraud,
  title={Llama Email Fraud Detector},
  author={cunxin},
  year={2025},
  url={https://huggingface.co/cunxin/llama-email-fraud-detector}
}