MisileLab committed
Commit d36049b · verified · 1 Parent(s): bbaccc1

Update README.md

Files changed (1):
  1. README.md +204 -209
README.md CHANGED
@@ -13,240 +13,235 @@ tags:
  base_model:
  - beomi/KcELECTRA-base
  ---
- # noMoreSpamYT - YouTube Bot Comment Detector
-
- This model detects bot comments on YouTube videos using a fine-tuned KcELECTRA model with custom classification layers.
-
- ## Model Description
-
- noMoreSpamYT is a specialized model for identifying bot-generated comments on YouTube. It leverages the KcELECTRA base model with a custom architecture optimized for handling the class imbalance inherent in bot detection tasks.
-
- ### Model Architecture
-
- - **Base Model**: [beomi/KcELECTRA-base](https://huggingface.co/beomi/KcELECTRA-base) - a Korean-focused ELECTRA model
- - **Modifications**:
-   - Frozen initial transformer layers to prevent overfitting
-   - Custom classification layers with dropout for regularization
-   - Combined CLS token and mean pooling for improved feature representation
-   - Focal Loss implementation to handle class imbalance
-
- ### Key Features
-
- - Effective on Korean YouTube comments
- - Robust against class imbalance (few bot comments vs. many human comments)
- - Optimized for both precision and recall in bot detection
-
- ## Intended Uses
-
- This model is designed for:
- - Content moderation on YouTube videos
- - Automated filtering of bot comments
- - Research on bot behavior in social media
-
- ## Training Data
-
- The model was trained on the [MisileLab/youtube-bot-comments](https://huggingface.co/datasets/MisileLab/youtube-bot-comments) dataset, which contains:
- - YouTube comments collected from popular Korean videos
- - Manual annotations for bot vs. human comments
- - A 70/20/10 train/test/validation split
-
- ## Performance
-
- The model achieves:
- - High precision in bot detection to minimize false positives
- - Good recall to catch the majority of bot comments
- - Balanced performance across different comment lengths and styles
-
- ## Usage
-
  ```python
- from transformers import AutoTokenizer, ElectraModel
  import torch
- import torch.nn as nn
-
- # Load the tokenizer
- tokenizer = AutoTokenizer.from_pretrained("beomi/KcELECTRA-base")
-
- # Define the model architecture (same as in training)
- class SpamUserClassificationLayer(nn.Module):
-     def __init__(self, encoder: ElectraModel):
-         super().__init__()
-
-         self.encoder = encoder
-
-         # Classification network optimized for imbalanced datasets
-         # Changed input dimension from 768 to 1536 (CLS + mean pooling)
-         self.dense1 = nn.Linear(1536, 512)
-         self.layernorm1 = nn.LayerNorm(512)
-         self.gelu1 = nn.GELU()
-         self.dropout1 = nn.Dropout(0.4)
-
-         self.dense2 = nn.Linear(512, 256)
-         self.layernorm2 = nn.LayerNorm(256)
-         self.gelu2 = nn.GELU()
-         self.dropout2 = nn.Dropout(0.3)
-
-     def forward(self, input_ids, attention_mask=None, token_type_ids=None):
-         # Get encoder outputs
-         outputs = self.encoder(
-             input_ids=input_ids,
-             attention_mask=attention_mask,
-             token_type_ids=token_type_ids,
-             output_attentions=True
-         )
-
-         # CLS token representation
-         cls_output = outputs.last_hidden_state[:, 0, :]  # [batch, 768]
-
-         # Mean pooling with proper attention masking
-         token_embeddings = outputs.last_hidden_state  # [batch, seq_len, 768]
-         input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
-         sum_embeddings = torch.sum(token_embeddings * input_mask_expanded, 1)
-         sum_mask = torch.clamp(input_mask_expanded.sum(1), min=1e-9)
-         mean_pooled = sum_embeddings / sum_mask  # [batch, 768]
-
-         # Concatenate CLS + mean pooling
-         combined_output = torch.cat([cls_output, mean_pooled], dim=1)  # [batch, 1536]
-
-         # Pass through classification network
-         x = self.dense1(combined_output)
-         x = self.layernorm1(x)
-         x = self.gelu1(x)
-         x = self.dropout1(x)
-
-         x = self.dense2(x)
-         x = self.layernorm2(x)
-         x = self.gelu2(x)
-         x = self.dropout2(x)
-
-         return x
-
- class SpamUserClassifier(nn.Module):
-     def __init__(self, pretrained_model_name="beomi/KcELECTRA-base"):
-         super().__init__()
-
-         self.encoder = ElectraModel.from_pretrained(pretrained_model_name)
-
-         # Freeze first 2 layers for imbalanced dataset scenario
-         for i, layer in enumerate(self.encoder.encoder.layer):
-             if i < 2:
-                 for param in layer.parameters():
-                     param.requires_grad = False
-
-         self.nameLayer = SpamUserClassificationLayer(self.encoder)
-         self.contentLayer = SpamUserClassificationLayer(self.encoder)
-
-         self.dense = nn.Linear(512, 256)
-         self.layernorm = nn.LayerNorm(256)
-         self.gelu = nn.GELU()
-         self.dropout = nn.Dropout(0.3)
-
-         self.output_layer = nn.Linear(256, 1)
-         self.sigmoid = nn.Sigmoid()
-
-     def forward(self, name_input_ids, content_input_ids, name_attention_mask=None, name_token_type_ids=None,
-                 content_attention_mask=None, content_token_type_ids=None, return_logits=False, return_probs=True):
-
-         namePrediction = self.nameLayer(name_input_ids, name_attention_mask, name_token_type_ids)
-         contentPrediction = self.contentLayer(content_input_ids, content_attention_mask, content_token_type_ids)
-
-         # Pass both 256-dim representations through the final classification network
-         x = self.dense(torch.cat([namePrediction, contentPrediction], dim=1))
-         x = self.layernorm(x)
-         x = self.gelu(x)
-         x = self.dropout(x)
-
-         logits = self.output_layer(x)
-
-         if return_logits:
-             return logits
-         else:
-             # Apply sigmoid and return probabilities or hard predictions
-             probs = self.sigmoid(logits)
-             # Class predictions: 0 (not bot) or 1 (bot)
-             return probs if return_probs else (probs > 0.9).long().squeeze(-1)
-
- # Load the model
- device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
- model = SpamUserClassifier()
- model.load_state_dict(torch.load("model.pth", map_location=device))
- model.to(device)
- model.eval()
-
- # Example inference
- def classify_comment(author_name, comment_text, threshold=0.9):
-     # Tokenize author name
-     name_encoding = tokenizer(
-         author_name,
-         truncation=True,
-         padding="max_length",
-         max_length=128,
-         return_tensors="pt"
-     )
-     name_input_ids = name_encoding["input_ids"].to(device)
-     name_attention_mask = name_encoding["attention_mask"].to(device)
-
-     # Tokenize content
-     content_encoding = tokenizer(
-         comment_text,
-         truncation=True,
-         padding="max_length",
-         max_length=128,
-         return_tensors="pt"
-     )
-     content_input_ids = content_encoding["input_ids"].to(device)
-     content_attention_mask = content_encoding["attention_mask"].to(device)
-
-     # Get prediction
-     with torch.no_grad():
-         probs = model(
-             name_input_ids=name_input_ids,
-             content_input_ids=content_input_ids,
-             name_attention_mask=name_attention_mask,
-             content_attention_mask=content_attention_mask,
-             return_logits=False,
-             return_probs=True
-         )
-
-     # Get probability and prediction
-     probability = probs.item()
-     is_bot = probability > threshold
-
-     return {
-         "probability": probability,
-         "is_bot": is_bot
-     }
-
- # Example usage
- result = classify_comment(
-     author_name="SpamBot2023",
-     comment_text="Check out my channel for free gift cards!"
- )
- print(f"Bot probability: {result['probability']:.4f}")
- print(f"Is bot comment: {result['is_bot']}")
  ```

- ## Limitations
-
- - Primarily optimized for Korean YouTube comments
- - May have reduced performance on other languages or platforms
- - Cannot detect sophisticated bots that closely mimic human writing patterns
- - Limited to text-based features (doesn't consider user history or behavior patterns)
-
- ## Citation
-
- If you use this model in your research, please cite:
-
- ```
- @misc{noMoreSpamYT,
-   author = {MisileLab},
-   title = {noMoreSpamYT: YouTube Bot Comment Detection System},
-   year = {2025},
-   publisher = {Hugging Face},
-   howpublished = {\url{https://huggingface.co/MisileLab/noMoreSpamYT}}
  }
  ```
-
- ## Contact
-
- For questions, issues, or feedback, please open an issue on the [GitHub repository](https://github.com/MisileLab/noMoreSpam).
 
+ # Model Card for MisileLab/noMoreSpam
+
+ A transformer-based model for detecting bot-generated spam comments on YouTube, with a focus on Korean comments promoting adult content and gambling websites.
+
+ ## Model Details
+
+ ### Model Description
+
+ noMoreSpam is a fine-tuned KcELECTRA model designed to identify and filter bot comments on YouTube videos. It specifically targets automated comments that promote adult content or gambling websites using repetitive patterns and specific Korean keywords. The model combines CLS-token and mean-pooling representations, followed by custom classification layers, to distinguish human comments from bot-generated ones with high accuracy.
+
+ - **Developed by:** MisileLab
+ - **Model type:** Fine-tuned KcELECTRA for sequence classification
+ - **Language(s) (NLP):** Korean (ko)
+ - **License:** MIT
+ - **Finetuned from model:** [beomi/KcELECTRA-base](https://huggingface.co/beomi/KcELECTRA-base)
+
+ ### Model Sources
+
+ - **Repository:** https://github.com/misilelab/noMoreSpam
+ - **Training code & results:** https://static.marimo.app/static/nomorespam-zvfn
+
+ ## Uses
+
+ ### Direct Use
+
+ This model is suitable for:
+ - Detecting spam bot comments in Korean YouTube content
+ - Filtering promotional comments for adult content and gambling websites
+ - Content moderation systems for Korean social media platforms
+ - Research on automated spam detection in Korean text
+
+ ### Downstream Use
+
+ The model can be integrated into:
+ - YouTube comment moderation systems
+ - Content filtering pipelines for Korean platforms
+ - Research frameworks studying bot behavior and spam patterns
+ - Social media monitoring tools
+
+ ### Out-of-Scope Use
+
+ This model should not be used for:
+ - General text classification tasks unrelated to spam detection
+ - Detection of sophisticated bots beyond the patterns it was trained on
+ - Applications requiring high precision in non-Korean languages
+ - Making decisions about content without human review
+ - Censorship of legitimate speech or opinions
+
+ ## Bias, Risks, and Limitations
+
+ - **Pattern dependency:** The model relies on specific keywords and patterns that may become outdated
+ - **Language specificity:** Optimized for Korean and may not work well for other languages
+ - **Bot type limitation:** Focuses specifically on adult/gambling promotion bots, not all spam types
+ - **Temporal relevance:** Bot patterns evolve over time, potentially reducing long-term effectiveness
+ - **False positives:** Legitimate comments containing flagged keywords may be misclassified
+ - **Domain specificity:** Trained on YouTube comments, which may not transfer well to other platforms
+
+ ### Recommendations
+
+ Users (both direct and downstream) should:
+ - Regularly update the model as spam techniques evolve
+ - Use it in combination with other detection methods for robust spam filtering
+ - Consider both precision and recall when evaluating performance
+ - Ensure human review of flagged content before taking action
+ - Monitor for evolving bot patterns and retrain the model periodically
+ - Be aware of potential biases in the training data
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
  ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
  import torch
+
+ # Load model and tokenizer
+ tokenizer = AutoTokenizer.from_pretrained("MisileLab/noMoreSpam")
+ model = AutoModelForSequenceClassification.from_pretrained("MisileLab/noMoreSpam")
+
+ # Prepare input
+ comment = "여기 방문하세요 19금 즐거움이 가득합니다"  # Example spam comment: "Visit here, full of adult (19+) fun"
+ inputs = tokenizer(comment, return_tensors="pt", padding=True, truncation=True, max_length=512)
+
+ # Make prediction
+ with torch.no_grad():
+     outputs = model(**inputs)
+     predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
+
+ # Index 1 is the "bot" class probability
+ probability = predictions[0][1].item()
+ is_bot = probability > 0.5
+
+ print(f"Is bot comment: {is_bot}, Probability: {probability:.4f}")
  ```
+
+ ## Training Details
+
+ ### Training Data
+
+ The model was trained on the [youtube-bot-comments-v2](https://huggingface.co/datasets/MisileLab/youtube-bot-comments-v2) dataset, which contains:
+ - 50% human comments and 50% bot comments (a balanced dataset)
+ - Comments collected from top South Korean YouTube videos
+ - Manual and regex-based classification
+ - A focus on repetitive promotional patterns for adult and gambling websites
+
+ ### Training Procedure
+
+ #### Preprocessing
+
+ - Comments were tokenized using the KcELECTRA tokenizer
+ - Texts were truncated to a maximum length of 512 tokens
+ - Data was split into training (80%) and validation (20%) sets
+ - Special tokens were added for classification
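+
+ A minimal sketch of this preprocessing under the Hugging Face `datasets` API; the split seed and the `content` column name are illustrative assumptions, not values taken from the training code:
+
+ ```python
+ from datasets import load_dataset
+ from transformers import AutoTokenizer
+
+ # Load the dataset and reproduce an 80/20 train/validation split
+ dataset = load_dataset("MisileLab/youtube-bot-comments-v2", split="train")
+ split = dataset.train_test_split(test_size=0.2, seed=42)  # seed is illustrative
+
+ tokenizer = AutoTokenizer.from_pretrained("beomi/KcELECTRA-base")
+
+ def tokenize(batch):
+     # Truncate to the 512-token maximum used during training
+     return tokenizer(batch["content"], truncation=True, max_length=512)
+
+ tokenized = split.map(tokenize, batched=True)
+ ```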
+
+ #### Training Hyperparameters
+
+ - **Training regime:** fp16 mixed precision
+ - **Optimizer:** AdamW
+ - **Learning rate:** 2e-5
+ - **Batch size:** 16
+ - **Epochs:** 3
+ - **Loss function:** Focal Loss (to handle class imbalance; see the sketch below)
+ - **Early stopping:** Based on validation F1 score
+ - **Weight decay:** 0.01
+ - **Warmup steps:** 500
163
+ ## Evaluation
164
+
165
+ <!-- This section describes the evaluation protocols and provides the results. -->
166
+
167
+ ### Testing Data, Factors & Metrics
168
+
169
+ #### Testing Data
170
+
171
+ <!-- This should link to a Dataset Card if possible. -->
172
+
173
+ The model was evaluated on a held-out test set (20%) from the [youtube-bot-comments-v2](https://huggingface.co/datasets/MisileLab/youtube-bot-comments-v2) dataset.
174
+
175
+ #### Metrics
176
+
177
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
178
+
179
+ - **Precision:** Measures the proportion of predicted bot comments that are actually bots
180
+ - **Recall:** Measures the proportion of actual bot comments that were correctly identified
181
+ - **F1 Score:** Harmonic mean of precision and recall
182
+ - **Accuracy:** Overall proportion of correct predictions
183
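+
+ These metrics can be computed with scikit-learn once test-set predictions are collected, for example (the labels below are placeholders):
+
+ ```python
+ from sklearn.metrics import accuracy_score, precision_recall_fscore_support
+
+ # y_true / y_pred are 0 (human) or 1 (bot) labels over the held-out test set
+ y_true = [0, 1, 1, 0, 1]  # placeholder labels
+ y_pred = [0, 1, 1, 0, 1]  # placeholder predictions
+
+ precision, recall, f1, _ = precision_recall_fscore_support(
+     y_true, y_pred, average="binary", pos_label=1
+ )
+ print(f"precision={precision:.2f} recall={recall:.2f} "
+       f"f1={f1:.2f} accuracy={accuracy_score(y_true, y_pred):.2f}")
+ ```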
+
+ ### Results
+
+ #### Summary
+
+ - **Precision:** 1.0
+ - **Recall:** 1.0
+ - **F1 score:** 1.0
+ - **Accuracy:** 1.0
+
+ The model performs well on Korean YouTube comments, particularly for detecting common spam patterns promoting adult content and gambling websites.
+
+ ## Model Examination
+
+ Attention visualization shows that the model focuses heavily on specific Korean keywords and patterns associated with spam content, such as "19금" (an adult-content indicator), gambling-related terms, and URL patterns.
+
+ ## Technical Specifications
+
+ ### Model Architecture and Objective
+
+ The model architecture includes:
+ - The base KcELECTRA transformer with frozen initial layers
+ - A custom classification head with:
+   - Dropout layers (rate=0.1) for regularization
+   - A combined CLS-token and mean-pooling strategy
+   - Two fully connected layers with GELU activation
+   - A binary classification output with sigmoid activation
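+
+ A minimal sketch of the described head; the 512/256 hidden sizes are illustrative assumptions (768 is KcELECTRA-base's hidden size), and the full training-time implementation appears in the previous version of this card above:
+
+ ```python
+ import torch
+ import torch.nn as nn
+
+ class ClassificationHead(nn.Module):
+     """CLS + mean pooling -> two FC layers with GELU -> sigmoid bot probability."""
+     def __init__(self, hidden=768):
+         super().__init__()
+         self.fc1 = nn.Sequential(nn.Linear(2 * hidden, 512), nn.GELU(), nn.Dropout(0.1))
+         self.fc2 = nn.Sequential(nn.Linear(512, 256), nn.GELU(), nn.Dropout(0.1))
+         self.out = nn.Linear(256, 1)
+
+     def forward(self, last_hidden_state, attention_mask):
+         cls = last_hidden_state[:, 0]                             # CLS token vector
+         mask = attention_mask.unsqueeze(-1).float()
+         mean = (last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
+         x = self.fc2(self.fc1(torch.cat([cls, mean], dim=-1)))   # combined pooling
+         return torch.sigmoid(self.out(x))                        # bot probability
+ ```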
+
213
+ ## Citation [optional]
214
+
215
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
216
+
217
+ **BibTeX:**
218
+
219
+ ```bibtex
220
+ @misc{misile2025nomorespam,
221
+ title={noMoreSpam: Korean YouTube Bot Comment Detection Model},
222
+ author={MisileLab},
223
+ year={2025},
224
+ howpublished={\url{https://huggingface.co/MisileLab/noMoreSpamYT}}
225
  }
226
  ```
227
 
228
+ **APA:**
229
+
230
+ MisileLab. (2025). noMoreSpam: Korean YouTube Bot Comment Detection Model. https://huggingface.co/MisileLab/noMoreSpamYT
231
+
232
+ ## Glossary [optional]
233
+
234
+ - **KcELECTRA:** Korean-centric ELECTRA model, a transformer-based model pre-trained on Korean text
235
+ - **Bot comment:** Automated comment typically promoting adult content or gambling websites
236
+ - **Focal Loss:** A loss function that addresses class imbalance by focusing on hard examples
237
+ - **CLS token:** Special classification token used in transformer models for sequence classification
238
+
239
+ ## More Information [optional]
240
+
241
+ For more information about the project and its development, visit:
242
+ - [Project GitHub Repository](https://github.com/misilelab/noMoreSpam)
243
+ - [Marimo Notebook Demo](https://static.marimo.app/static/nomorespam-zvfn)
244
+
245
+ ## Model Card Contact
246
 
247
+ For questions or issues regarding this model, please contact Misile (misile@duck.com).