Upload README.md with huggingface_hub
README.md
CHANGED
|
@@ -20,6 +20,7 @@ A text classification model for content moderation with age-appropriate filterin
|
|
| 20 |
- **Dual-mode filtering:** <13 (strict) vs 13+ (relaxed)
|
| 21 |
- **6 content categories:** Safe, Harassment, Swearing (reaction), Swearing (aggressive), Hate Speech, Spam
|
| 22 |
- **PII Detection:** Emails, phones, addresses, credit cards, SSN
|
|
|
|
| 23 |
- **Social Media Protection:**
|
| 24 |
- <13: Block all social media sharing
|
| 25 |
- 13+: Allow, block only if grooming detected
|
|
@@ -49,16 +50,18 @@ result = filter.check("DM me privately, don't tell parents", age=14)
|
|
| 49 |
# -> BLOCKED (grooming detected)
|
| 50 |
```
|
| 51 |
|
| 52 |
-
##
|
| 53 |
|
| 54 |
-
|
| 55 |
-
|
| 56 |
-
|
|
| 57 |
-
|
| 58 |
-
|
|
| 59 |
-
|
|
| 60 |
-
|
|
| 61 |
-
|
|
|
|
|
|
|
|
| 62 |
|
| 63 |
## Social Media Rules
|
| 64 |
|
|
|
|
| 20 |
- **Dual-mode filtering:** <13 (strict) vs 13+ (relaxed)
|
| 21 |
- **6 content categories:** Safe, Harassment, Swearing (reaction), Swearing (aggressive), Hate Speech, Spam
|
| 22 |
- **PII Detection:** Emails, phones, addresses, credit cards, SSN
|
| 23 |
+
- **Unicode Deobfuscation:** Detects circled letters (ⓕ), double-struck (ℂ), fullwidth, mathematical symbols
|
| 24 |
- **Social Media Protection:**
|
| 25 |
- <13: Block all social media sharing
|
| 26 |
- 13+: Allow, block only if grooming detected
|
|
|
|
| 50 |
# -> BLOCKED (grooming detected)
|
| 51 |
```
|
| 52 |
|
| 53 |
+
## Unicode Deobfuscation
|
| 54 |
|
| 55 |
+
Automatically detects and normalizes Unicode bypass attempts:
|
| 56 |
+
|
| 57 |
+
| Technique | Example | Normalized |
|
| 58 |
+
|-----------|---------|------------|
|
| 59 |
+
| Circled letters | `ⓕⓤⓒⓚ` | `fuck` |
|
| 60 |
+
| Double-struck | `ℂℍ` | `CH` |
|
| 61 |
+
| Fullwidth | `Ｆ` | `F` |
|
| 62 |
+
| Mathematical | `𝐟` | `f` |
|
| 63 |
+
|
| 64 |
+
**All obfuscated text is normalized before moderation checks.**
|
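The README does not say how the normalization is implemented. As a minimal sketch of the idea, assuming Unicode NFKC compatibility folding is an acceptable stand-in, the four techniques in the table can be reduced to plain letters with Python's standard library. The function name `deobfuscate` is hypothetical and is not part of this model's API.

```python
import unicodedata


def deobfuscate(text: str) -> str:
    """Fold compatibility characters to plain equivalents using NFKC.

    Assumption: NFKC is used here only for illustration; the model's
    actual deobfuscation step may work differently.
    """
    return unicodedata.normalize("NFKC", text)


# One sample per technique, written as escapes so the file encoding
# cannot corrupt them.
samples = {
    "circled": "\u24d7\u24d8",        # circled small h, i
    "double_struck": "\u2102\u210d",  # double-struck capital C, H
    "fullwidth": "\uff26",            # fullwidth capital F
    "mathematical": "\U0001d41f",     # mathematical bold small f
}

for technique, obfuscated in samples.items():
    print(technique, "->", deobfuscate(obfuscated))
```

NFKC only covers characters that carry a Unicode compatibility mapping. Look-alike letters from other scripts (for example Cyrillic `а` in place of Latin `a`) pass through unchanged and would need a separate confusables table.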
| 65 |
|
| 66 |
## Social Media Rules
|
| 67 |
|