---
license: gemma
language:
- ko
pipeline_tag: text-generation
tags:
- spam-detection
- explainable-ai
- on-device
- korean
datasets:
- Devocean-06/Spam_QA-Corpus
---
<p align="left">
<img src="https://huggingface.co/Devocean-06/Spam_Filter-gemma/resolve/main/skitty.png" width="50%"/>
</p>
# Devocean-06/Spam_Filter-gemma
> Update @ 2025.10.19: First release of Spam filter XAI
<!-- Provide a quick summary of what the model is/does. -->
**Resources and Technical Documentation**:
* [Gemma3 Model](https://huggingface.co/google/gemma-3-4b-it)
* [Training Dataset](https://huggingface.co/datasets/Devocean-06/Spam_QA-Corpus)
**Model Developers**: SK Devocean-06 On-device LLM
## Model Information
- Skitty is an explainable small language model (sLLM) that classifies spam messages and provides brief reasoning for each decision.
---
## Description
- Skitty was trained on an updated 2025 spam message dataset collected through the Smart Police Big Data Platform in South Korea.
- The model leverages deduplication, curriculum sampling, and off-policy distillation to improve both classification accuracy and interpretability.
## Data and Preprocessing
- **Data source**: 2025 Smart Police Big Data Platform spam message dataset
- **Dataset**: [Devocean-06/Spam_QA-Corpus](https://huggingface.co/datasets/Devocean-06/Spam_QA-Corpus)
- **Format**: Alpaca instruction format (instruction, input, output)
- **Deduplication**: Performed near-duplicate removal using SimHash filtering
- **Sampling strategy**: Applied curriculum-based sampling to control difficulty and improve generalization
- **Labeling**: Trained using hard-label supervision after label confidence refinement
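The SimHash deduplication step above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the word-level features, 64-bit fingerprint, and 3-bit Hamming threshold are assumptions for the sketch.

```python
import hashlib


def simhash(text: str, bits: int = 64) -> int:
    """Compute a SimHash fingerprint from word-level features."""
    v = [0] * bits
    for token in text.split():
        # Hash each token to a stable 64-bit integer.
        h = int.from_bytes(hashlib.md5(token.encode()).digest()[:8], "big")
        for i in range(bits):
            v[i] += 1 if (h >> i) & 1 else -1
    # The fingerprint's i-th bit is 1 where the weighted vote is positive.
    return sum(1 << i for i in range(bits) if v[i] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def dedup(messages, threshold=3):
    """Keep a message only if no kept fingerprint is within `threshold` bits."""
    kept, fingerprints = [], []
    for msg in messages:
        fp = simhash(msg)
        if all(hamming(fp, other) > threshold for other in fingerprints):
            kept.append(msg)
            fingerprints.append(fp)
    return kept
```

Near-duplicates differ in only a few fingerprint bits, so a small Hamming-distance threshold filters reworded copies of the same spam template while keeping genuinely different messages.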
## Training and Distillation
- Utilized off-policy distillation to compress the decision process of a large teacher LLM into a smaller student model
- Instead of directly mimicking the teacher's text generation, the model distills the reasoning trace for spam detection
- Combined curriculum learning with hard-label distillation to balance accuracy, interpretability, and generalization
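Off-policy distillation here means the student is trained on a fixed set of teacher-generated rationales rather than on its own rollouts. A minimal sketch of packing a teacher-labeled example into the Alpaca (instruction/input/output) format used by the dataset; the English instruction text and field contents are hypothetical, since the actual corpus uses Korean prompts:

```python
def to_alpaca_record(message: str, label: str, teacher_rationale: str) -> dict:
    """Pack one teacher-labeled example into the Alpaca instruction format.

    The hard label and the teacher's reasoning trace both go in `output`,
    so the student learns the decision and its explanation together.
    """
    return {
        "instruction": "Classify the following SMS as spam or ham and explain why.",
        "input": message,
        "output": f"label: {label}\nreason: {teacher_rationale}",
    }


record = to_alpaca_record(
    "Limited offer! Apply for a low-interest loan right now.",
    "spam",
    "The message promotes a loan and pressures the reader to act immediately.",
)
```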
---
## Training Configuration
### Base Model
- **Base Model**: [google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it)
- **Training Framework**: [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)
- **Fine-tuning Method**: QLoRA (Quantized Low-Rank Adaptation)
### Hyperparameters
| Parameter | Value | Description |
|-----------|-------|-------------|
| **Quantization** | 4-bit | Load pretrained model in 4-bit |
| **Adapter** | QLoRA | Low-rank adaptation method |
| **LoRA Rank (r)** | 16 | Rank of low-rank matrices |
| **LoRA Alpha** | 32 | Scaling factor for LoRA |
| **LoRA Dropout** | 0.05 | Dropout rate for LoRA layers |
| **Target Modules** | attention + MLP | Applied to q,k,v,o,up,down,gate projections |
| **Sequence Length** | 1500 | Maximum input sequence length |
| **Sample Packing** | True | Pack multiple samples into one sequence |
| **Micro Batch Size** | 10 | Batch size per GPU |
| **Gradient Accumulation** | 15 | Effective batch size: 150 |
| **Number of Epochs** | 5 | Total training epochs |
| **Learning Rate** | 2e-5 | Peak learning rate |
| **LR Scheduler** | Cosine | Cosine annealing schedule |
| **Warmup Steps** | 10 | Learning rate warmup steps |
| **Optimizer** | AdamW (8-bit) | 8-bit quantized AdamW |
| **Weight Decay** | 0.0 | L2 regularization |
| **Precision** | BF16 | Brain floating point 16 |
| **Gradient Checkpointing** | True | Save memory by recomputing gradients |
| **Flash Attention** | True | Optimized attention kernel |
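The table above maps onto an Axolotl YAML configuration roughly as follows. This is a sketch reconstructed from the table, not the project's actual config file; key names follow Axolotl's conventions and the exact projection-module list is an assumption.

```yaml
base_model: google/gemma-3-4b-it
load_in_4bit: true
adapter: qlora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - up_proj
  - down_proj
  - gate_proj
sequence_len: 1500
sample_packing: true
micro_batch_size: 10
gradient_accumulation_steps: 15   # effective batch size: 10 x 15 = 150
num_epochs: 5
learning_rate: 2e-5
lr_scheduler: cosine
warmup_steps: 10
optimizer: adamw_bnb_8bit
weight_decay: 0.0
bf16: true
gradient_checkpointing: true
flash_attention: true
```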
### Training Monitoring
- **Logging Steps**: 100
- **Evaluation Steps**: 50
- **Save Steps**: 50
- **Evaluation Strategy**: Steps-based
- **Tracking**: Weights & Biases (wandb)
---
## Running with vLLM
Serve the model with vLLM's OpenAI-compatible server, then query it through the `openai` client:
```sh
vllm serve Devocean-06/Spam_Filter-gemma
```
```python
from openai import OpenAI

# vLLM's OpenAI-compatible server listens on http://localhost:8000/v1 by
# default and does not require a real API key.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
)

# Korean system prompt: instructs the model to emit a fixed-template verdict
# (rationale category + explanation) suitable for XAI output.
SYSTEM_PROMPT = """당신은 스팸 문자로 판정한 근거를 생성하는 대형 언어 모델입니다.
아래 기준에 따라 스팸여부 판정의 근거를 간단명료하게 한 문장으로 작성해 주세요. 출력 포맷은 XAI 설명에 적합하도록 일관성 있게 템플릿 형식으로 고정되어야 하며, 스팸 여부 및 그 근거를 명쾌하게 제시해야 합니다.
**1. 판정 근거(한 문장, 템플릿):**
- **개인 정보 요구:** 신분증, 비밀번호, 카드 번호 등 개인 정보를 요구했기 때문입니다.
- **기타 특이사항:** 위 항목 외에 스팸으로 의심되는 다른 패턴이 있습니다.
- **발신자/수신자:** 발신 번호가 일반적이지 않거나 불분명하기 때문입니다.
- **내용의 목적:** 금융 상품, 대출, 도박, 투자, 불법 복제 등의 홍보나 권유가 포함되어 있기 때문입니다.
- **심리적 압박:** 긴급성, 공포, 호기심을 유발하여 즉각적인 행동을 유도했기 때문입니다. (예: "기간 한정", "지금 즉시", "클릭하지 않으면 불이익")
- **링크/URL:** 일반적이지 않은 짧은 URL, 단축 URL 또는 의심스러운 링크가 포함되어 있기 때문입니다.
**2. 필수 조건**
- 반드시 출력 형식에 따라서 [스팸 판정 이유] 템플릿을 사용해야 합니다.
- 스팸으로 판정한 이유에 대해서 구체적인 이유로 100자 이상으로 설명해야 합니다.
- 반드시 위 판정 근거를 먼저 언급한 뒤에 출력 형식에 맞게 스팸 판정 이유를 생성해야 합니다.
- 스팸 판정 이유 생성 시, 위 스팸 문자는 ~~ 으로 시작해야합니다.
- 그리고 전제조건은 모두 스팸 문자로 분류된 형식이니 스팸이 아니라고 언급하면 안됩니다.
### 출력 형식 예시
- 판정 근거 : 개인정보 요구
- 스팸 판정 이유: 위 스팸 문자는 개인정보를 요구하는 스팸으로 아파트 분양 및 부동산 투자 권유가 포함되어 있으며, 긴급성을 강조하여 즉각적인 행동을 유도하고 있습니다."""

# Example message to classify (hypothetical sample).
user_message = "(광고) 지금 즉시 신청! 누구나 가능한 저금리 대출 안내"

response = client.chat.completions.create(
    model="Devocean-06/Spam_Filter-gemma",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_message},
    ],
    temperature=0.7,
    max_tokens=2048,
)
print(response.choices[0].message.content)
```
```
## ๐Ÿง  Example Output
```text
- ํŒ์ • ๊ทผ๊ฑฐ: ๋‚ด์šฉ์˜ ๋ชฉ์ 
- ์ŠคํŒธ ํŒ์ • ์ด์œ : ์œ„ ์ŠคํŒธ ๋ฌธ์ž๋Š” ๊ธˆ์œต ์ƒํ’ˆ๊ณผ ๋Œ€์ถœ ๊ด€๋ จ ๊ถŒ์œ  ๋‚ด์šฉ์„ ํฌํ•จํ•˜๊ณ  ์žˆ์œผ๋ฉฐ,
โ€˜์ง€๊ธˆ ๋ฐ”๋กœโ€™, โ€˜์ฆ‰์‹œ ์‹ ์ฒญโ€™๊ณผ ๊ฐ™์€ ์‹ฌ๋ฆฌ์  ์••๋ฐ• ์–ด๊ตฌ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ˆ˜์‹ ์ž์˜ ํ–‰๋™์„ ์œ ๋„ํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.
```
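Because the response follows a fixed template, the rationale category and explanation can be pulled out with a simple parser. A minimal sketch, assuming the two Korean field labels shown above (`판정 근거`, `스팸 판정 이유`) and a single-line explanation:

```python
import re


def parse_verdict(text: str) -> dict:
    """Extract the rationale category and explanation from the fixed template.

    Expects lines of the form:
      - 판정 근거: <category>
      - 스팸 판정 이유: <explanation>
    """
    patterns = {
        "category": r"판정\s*근거\s*:\s*(.+)",
        "reason": r"스팸\s*판정\s*이유\s*:\s*(.+)",
    }
    result = {}
    for key, pat in patterns.items():
        m = re.search(pat, text)
        result[key] = m.group(1).strip() if m else None
    return result
```

On the example output above, this yields the category `내용의 목적` and the first line of the explanation, which can then feed a downstream UI or logging pipeline.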
---
## Software
Training was conducted using the **Axolotl framework**, a flexible and efficient fine-tuning system designed for large language models.
Axolotl enables seamless configuration and execution of full fine-tuning, LoRA, and DPO pipelines through simple YAML-based workflows. It integrates with PyTorch and Hugging Face Transformers and supports distributed strategies such as FSDP and DeepSpeed for efficient multi-GPU training.
The framework streamlines experimentation and scaling by letting researchers define training parameters, datasets, and model behaviors declaratively, reducing boilerplate and ensuring reproducible results across setups.
**Key Features Used:**
- QLoRA for parameter-efficient fine-tuning
- 4-bit quantization during training
- Flash Attention for faster training
- Gradient checkpointing for memory efficiency
- Alpaca dataset format support
---
## Citation
```bibtex
@misc{devocean06_spam_filter_gemma_2025,
  author    = {{SK Devocean-06 On-device LLM}},
  title     = {Spam Filter \& XAI},
  year      = {2025},
  url       = {https://huggingface.co/Devocean-06/Spam_Filter-gemma},
  publisher = {Hugging Face}
}
```
---
## License
This model is released under the Gemma license. Please refer to the original [Gemma license](https://ai.google.dev/gemma/terms) for usage terms and conditions.