---
license: gemma
language:
- ko
pipeline_tag: text-generation
tags:
- spam-detection
- explainable-ai
- on-device
- korean
datasets:
- Devocean-06/Spam_QA-Corpus
---

<p align="left">
  <img src="https://huggingface.co/Devocean-06/Spam_Filter-gemma/resolve/main/skitty.png" width="50%"/>
</p>

# Devocean-06/Spam_Filter-gemma

> Update @ 2025.10.19: First release of Spam filter XAI

<!-- Provide a quick summary of what the model is/does. -->

**Resources and Technical Documentation**:
* [Gemma3 Model](https://huggingface.co/google/gemma-3-4b-it)
* [Training Dataset](https://huggingface.co/datasets/Devocean-06/Spam_QA-Corpus)

**Model Developers**: SK Devoceon-06 On device LLM

## Model Information

- Skitty is an explainable small language model (sLLM) that classifies spam messages and provides brief reasoning for each decision.

---

## Description

- Skitty was trained on an updated 2025 spam message dataset collected through the Smart Police Big Data Platform in South Korea.  
- The model leverages deduplication, curriculum sampling, and off-policy distillation to improve both classification accuracy and interpretability.

## Data and Preprocessing

- **Data source**: 2025 Smart Police Big Data Platform spam message dataset  
- **Dataset**: [Devocean-06/Spam_QA-Corpus](https://huggingface.co/datasets/Devocean-06/Spam_QA-Corpus)
- **Format**: Alpaca instruction format (instruction, input, output)
- **Deduplication**: Performed near-duplicate removal using SimHash filtering  
- **Sampling strategy**: Applied curriculum-based sampling to control difficulty and improve generalization  
- **Labeling**: Trained using hard-label supervision after label confidence refinement

## Training and Distillation

- Utilized off-policy distillation to compress the decision process of a large teacher LLM into a smaller student model  
- Instead of directly mimicking the teacher's text generation, the model distills the reasoning trace for spam detection  
- Combined curriculum learning with hard-label distillation to balance accuracy, interpretability, and generalization

---

## Training Configuration

### Base Model
- **Base Model**: [google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it)
- **Training Framework**: [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)
- **Fine-tuning Method**: QLoRA (Quantized Low-Rank Adaptation)

### Hyperparameters

| Parameter | Value | Description |
|-----------|-------|-------------|
| **Quantization** | 4-bit | Load pretrained model in 4-bit |
| **Adapter** | QLoRA | Low-rank adaptation method |
| **LoRA Rank (r)** | 16 | Rank of low-rank matrices |
| **LoRA Alpha** | 32 | Scaling factor for LoRA |
| **LoRA Dropout** | 0.05 | Dropout rate for LoRA layers |
| **Target Modules** | attention + MLP | Applied to q,k,v,o,up,down,gate projections |
| **Sequence Length** | 1500 | Maximum input sequence length |
| **Sample Packing** | True | Pack multiple samples into one sequence |
| **Micro Batch Size** | 10 | Batch size per GPU |
| **Gradient Accumulation** | 15 | Effective batch size: 150 |
| **Number of Epochs** | 5 | Total training epochs |
| **Learning Rate** | 2e-5 | Peak learning rate |
| **LR Scheduler** | Cosine | Cosine annealing schedule |
| **Warmup Steps** | 10 | Learning rate warmup steps |
| **Optimizer** | AdamW (8-bit) | 8-bit quantized AdamW |
| **Weight Decay** | 0.0 | L2 regularization |
| **Precision** | BF16 | Brain floating point 16 |
| **Gradient Checkpointing** | True | Save memory by recomputing gradients |
| **Flash Attention** | True | Optimized attention kernel |

### Training Monitoring
- **Logging Steps**: 100
- **Evaluation Steps**: 50
- **Save Steps**: 50
- **Evaluation Strategy**: Steps-based
- **Tracking**: Weights & Biases (wandb)

---

## Running with the `vllm` API

You can initialize the model and processor for inference with `pipeline` as follows.

```sh
vllm serve Devocean-06/Spam_Filter-gemma
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="model-endpoint",
    api_key="api-key"  
)

SYSTEM_PROMPT = """당신은 스팸 문자로 판정한 근거를 생성하는 대형 언어 모델입니다.
아래 기준에 따라 스팸여부 판정의 근거를 간단명료하게 한 문장으로 작성해 주세요. 출력 포맷은 XAI 설명에 적합하도록 일관성 있게 템플릿 형식으로 고정되어야 하며, 스팸 여부 및 그 근거를 명쾌하게 제시해야 합니다.

**1. 판정 근거(한 문장, 템플릿):**
- **개인 정보 요구:** 신분증, 비밀번호, 카드 번호 등 개인 정보를 요구했기 때문입니다.
- **기타 특이사항:** 위 항목 외에 스팸으로 의심되는 다른 패턴이 있습니다.
- **발신자/수신자:** 발신 번호가 일반적이지 않거나 불분명하기 때문입니다.
- **내용의 목적:** 금융 상품, 대출, 도박, 투자, 불법 복제 등의 홍보나 권유가 포함되어 있기 때문입니다.
- **심리적 압박:** 긴급성, 공포, 호기심을 유발하여 즉각적인 행동을 유도했기 때문입니다. (예: "기간 한정", "지금 즉시", "클릭하지 않으면 불이익")
- **링크/URL:** 일반적이지 않은 짧은 URL, 단축 URL 또는 의심스러운 링크가 포함되어 있기 때문입니다.

**2. 필수 조건**
- 반드시 출력 형식에 따라서 [스팸 판정 이유] 템플릿을 사용해야 합니다.
- 스팸으로 판정한 이유에 대해서 구체적인 이유로 100자 이상으로 설명해야 합니다.
- 반드시 위 판정 근거를 먼저 언급한 뒤에 출력 형식에 맞게 스팸 판정 이유를 생성해야 합니다.
- 스팸 판정 이유 생성 시, 위 스팸 문자는 ~~ 으로 시작해야합니다.
- 그리고 전제조건은 모두 스팸 문자로 분류된 형식이니 스팸이 아니라고 언급하면 안됩니다.

### 출력 형식 예시
- 판정 근거 : 개인정보 요구
- 스팸 판정 이유: 위 스팸 문자는 개인정보를 요구하는 스팸으로 아파트 분양 및 부동산 투자 권유가 포함되어 있으며, 긴급성을 강조하여 즉각적인 행동을 유도하고 있습니다."""

response = client.chat.completions.create(
        model="Devocean-06/Spam_Filter-gemma",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message}
        ],
        temperature=0.7,
        max_tokens=2048
    )
print(response.choices[0].message.content)

```
## 🧠 Example Output
```sh
- 판정 근거: 내용의 목적  
- 스팸 판정 이유: 위 스팸 문자는 금융 상품과 대출 관련 권유 내용을 포함하고 있으며,
  ‘지금 바로’, ‘즉시 신청’과 같은 심리적 압박 어구를 사용하여 수신자의 행동을 유도하고 있습니다.
```

---

## Software

Training was conducted using the **Axolotl framework**, a flexible and efficient fine-tuning system designed for large language models.

Axolotl enables seamless configuration and execution of full fine-tuning, LoRA, and DPO pipelines through simple YAML-based workflows. It integrates with PyTorch and Hugging Face Transformers, supporting distributed strategies such as FSDP and DeepSpeed for optimized performance on multi-GPU environments.

This framework streamlines experimentation and scaling by allowing researchers to define training parameters, datasets, and model behaviors declaratively — reducing boilerplate and ensuring reproducible results across setups.

**Key Features Used:**
- QLoRA for parameter-efficient fine-tuning
- 4-bit quantization during training
- Flash Attention for faster training
- Gradient checkpointing for memory efficiency
- Alpaca dataset format support

---

## Citation

```bibtex
@misc{Devocean-06/Spam_Filter-gemma,
  author       = { {SK Devoceon-06 On device LLM} },
  title        = { Spam filter & XAI },
  year         = 2025,
  url          = { https://huggingface.co/Devocean-06/Spam_Filter-gemma },
  publisher    = { Hugging Face }
}
```

---

## License

This model is released under the Gemma license. Please refer to the original [Gemma license](https://ai.google.dev/gemma/terms) for usage terms and conditions.