---
language: vi
tags:
- spam-detection
- vietnamese
- transformer
license: apache-2.0
datasets:
- visolex/ViSpamReviews
metrics:
- accuracy
- f1
model-index:
- name: visobert-spam-classification
  results:
  - task:
      type: text-classification
      name: Spam Detection (Multi-Class)
    dataset:
      name: ViSpamReviews
      type: custom
    metrics:
    - name: Accuracy
      type: accuracy
      value:
    - name: F1 Score
      type: f1
      value:
base_model:
- uitnlp/visobert
pipeline_tag: text-classification
---

# ViSoBERT-Spam-MultiClass

Fine-tuned from [`uitnlp/visobert`](https://huggingface.co/uitnlp/visobert) on **ViSpamReviews** for **multi-class** spam classification.

* **Task**: 4-way classification (`SpamLabel`: 0=NO-SPAM, 1=SPAM-1, 2=SPAM-2, 3=SPAM-3)
* **Dataset**: [ViSpamReviews](https://huggingface.co/datasets/visolex/ViSpamReviews)
* **Hyperparameters**
  * Batch size: 32
  * Learning rate: 3e-5
  * Epochs: 100
  * Max sequence length: 256

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and classifier from the Hub.
tokenizer = AutoTokenizer.from_pretrained("visolex/visobert-spam-classification")
model = AutoModelForSequenceClassification.from_pretrained("visolex/visobert-spam-classification")

# Sample Vietnamese review ("Only talks about the brand.").
text = "Chỉ nói về thương hiệu thôi."

# Truncate to the same maximum sequence length used in training.
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

# Pick the class with the highest logit and map it to its label name.
pred = model(**inputs).logits.argmax(dim=-1).item()
label_map = {0: "NO-SPAM", 1: "SPAM-1", 2: "SPAM-2", 3: "SPAM-3"}
print(label_map[pred])
```
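
The usage snippet above reports only the argmax label. If a confidence score is also wanted, one option is to apply a softmax to the logits before decoding. The sketch below does this in plain Python so it runs without downloading the model. `softmax` and `decode_logits` are hypothetical helpers that are not part of this repository, and the logits are made-up values standing in for real model output.

```python
import math

# Label ids as listed in the Task bullet of this card.
LABEL_MAP = {0: "NO-SPAM", 1: "SPAM-1", 2: "SPAM-2", 3: "SPAM-3"}


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_logits(rows):
    """Hypothetical helper: map each logits row to (label, probability)."""
    decoded = []
    for row in rows:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        decoded.append((LABEL_MAP[best], probs[best]))
    return decoded


# Made-up logits for two reviews (assumption: 4 classes, one row per review).
example_logits = [[2.0, 0.1, -1.0, 0.3], [-0.5, 0.2, 3.1, 0.0]]
for label, prob in decode_logits(example_logits):
    print(f"{label}: {prob:.3f}")
```

With the real model, the rows could come from `model(**inputs).logits.detach().tolist()`, assuming the tokenizer was called on a list of texts with `padding=True`.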