Commit 18b366c
1 Parent(s): 4b7d0f7
Update README.md
README.md CHANGED
@@ -2,4 +2,66 @@
license: other
license_name: govtech-singapore
license_link: LICENSE
language:
- en
- ms
- ta
- zh
pipeline_tag: text-classification
tags:
- classifier
- safety
- moderation
- multilingual
---

# LionGuard 2 Lite
LionGuard 2 Lite is a multilingual content moderation classifier tuned for English/Singlish, Chinese, Malay, and Tamil in the Singapore context.

It leverages Google's `embeddinggemma-300m` (768-dimensional embeddings) with a multi-head classifier to return fine-grained scores for the following categories:
- Overall safety (`binary`)
- Hate (`hateful_l1`, `hateful_l2`)
- Insults (`insults`)
- Sexual content (`sexual_l1`, `sexual_l2`)
- Physical violence (`physical_violence`)
- Self-harm (`self_harm_l1`, `self_harm_l2`)
- Other misconduct (`all_other_misconduct_l1`, `all_other_misconduct_l2`)
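
The score names listed above can be grouped back into their categories when post-processing. The following is a minimal sketch, not the model's documented output handling: it assumes the scores for one text are available as a plain dict of floats keyed by those names, and both the `0.5` threshold and the example numbers are made up for illustration.

```python
# Score names grouped by the categories listed above.
CATEGORY_KEYS = {
    "overall_safety": ["binary"],
    "hate": ["hateful_l1", "hateful_l2"],
    "insults": ["insults"],
    "sexual": ["sexual_l1", "sexual_l2"],
    "physical_violence": ["physical_violence"],
    "self_harm": ["self_harm_l1", "self_harm_l2"],
    "other_misconduct": ["all_other_misconduct_l1", "all_other_misconduct_l2"],
}

# Hypothetical cut-off; the source does not recommend a threshold.
THRESHOLD = 0.5


def flagged_categories(scores, threshold=THRESHOLD):
    """Return the categories with at least one score at or above the threshold."""
    return [
        category
        for category, keys in CATEGORY_KEYS.items()
        if any(scores.get(key, 0.0) >= threshold for key in keys)
    ]


# Made-up scores for a single text, for illustration only.
example_scores = {
    "binary": 0.91,
    "hateful_l1": 0.72,
    "hateful_l2": 0.08,
    "insults": 0.35,
    "sexual_l1": 0.02,
    "sexual_l2": 0.01,
    "physical_violence": 0.03,
    "self_harm_l1": 0.01,
    "self_harm_l2": 0.01,
    "all_other_misconduct_l1": 0.04,
    "all_other_misconduct_l2": 0.02,
}

print(flagged_categories(example_scores))
```

With these made-up numbers, only the overall safety and hate categories clear the threshold. In practice the cut-off would be tuned per category against labelled data.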

---

# Taxonomy

| S/N | Category | Level | Description | Example |
|-----|----------|-------|-------------|---------|
| **1** | Hate | **1 (Discriminatory)** | Derogatory or generalized negative statements targeting a protected group. | “All Chinese people are so scheming.”<br>“Malays are always late and lazy.”<br>“Indians are too noisy; I avoid them whenever possible.” |
| | | **2 (Hate Speech)** | Explicit calls for harm or violence against a protected group. | “We should kick all Chinese people out of the country.”<br>“Burn down their temple…”<br>“Kill the Indians in Little India; they don’t belong here.” |
| **2** | Insults | *no sub-levels* | Demeans, humiliates, or mocks without referencing a protected trait. | “You’re so dumb, you probably failed your PSLE.”<br>“Your singing is worse than karaoke at a dodgy KTV.” |
| **3** | Sexual | **1 (Not appropriate for minors)** | Mild-to-moderate sexual content. | “Let’s talk about how condoms work.”<br>“We had a one-night stand last weekend.” |
| | | **2 (Not appropriate for all ages)** | Explicit or graphic sexual content. | “Always think they very yandao … but actually is just one ** bin.” |
| **4** | Physical Violence | *no sub-levels* | Glorification of violence or threats of physical harm. | “If I see him at Clarke Quay, I’m going to whack him.”<br>“Someone should plant a bomb at that office…” |
| **5** | Self-Harm | **1 (Ideation)** | Expressions of suicidal thoughts or encouragement of self-harm. | “I’m so stressed … I feel like ending it all.”<br>“Failing my poly exams made me want to cut myself.” |
| | | **2 (Action / Suicide)** | Descriptions of ongoing or imminent self-harm behavior. | “I’ve locked myself in my room and taken a bunch of pills.”<br>“I’m on the rooftop at Toa Payoh, ready to jump.” |
| **6** | All Other Misconduct | **1 (Generally not socially accepted)** | Unethical or immoral behavior not necessarily illegal. | “Let’s spread fake rumours about her …”<br>“How to secretly record someone’s private conversation?” |
| | | **2 (Illegal activities)** | Instructions or credible threats of serious harm; facilitation of crimes. | “Anyone know where to buy illegal knives in Geylang?”<br>“Let’s hack that e-commerce site to get credit card details.” |
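
For the four categories with two levels, the `_l1` and `_l2` scores can be mapped back to the level names in the table. The sketch below is one possible reading, not the documented behaviour: it assumes the scores for one text are a dict of floats, reports the highest level whose score meets a hypothetical `0.5` threshold, and uses made-up numbers. The helper `highest_level` is introduced here for illustration only.

```python
# Level names copied from the Taxonomy table; keys are the score-name prefixes.
LEVEL_NAMES = {
    "hateful": {1: "Discriminatory", 2: "Hate Speech"},
    "sexual": {1: "Not appropriate for minors", 2: "Not appropriate for all ages"},
    "self_harm": {1: "Ideation", 2: "Action / Suicide"},
    "all_other_misconduct": {
        1: "Generally not socially accepted",
        2: "Illegal activities",
    },
}


def highest_level(prefix, scores, threshold=0.5):
    """Return (level, name) for the highest level whose score meets the threshold.

    Returns (0, None) when neither level does. The default threshold is an
    assumption, not a value taken from the source.
    """
    for level in (2, 1):  # check the more severe level first
        if scores.get(f"{prefix}_l{level}", 0.0) >= threshold:
            return level, LEVEL_NAMES[prefix][level]
    return 0, None


# Made-up scores for a single text, for illustration only.
made_up_scores = {"self_harm_l1": 0.83, "self_harm_l2": 0.12}
print(highest_level("self_harm", made_up_scores))
```

Checking level 2 before level 1 means a text that clears both thresholds is reported once, at its more severe level.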

---

# Usage

```python
from sentence_transformers import SentenceTransformer
from transformers import AutoModel

# Load the LionGuard 2 Lite classifier from the Hub.
# trust_remote_code=True runs the custom model code shipped in the repository.
model = AutoModel.from_pretrained("govtech/lionguard-2-lite", trust_remote_code=True)

# Load the embedding model from the 🤗 Hub
embedding_model = SentenceTransformer("google/embeddinggemma-300m")

# Example input (taken from the taxonomy table); replace with your own texts
texts = ["You’re so dumb, you probably failed your PSLE."]

# Prefix each text with the classification prompt so the embeddings are optimized for classifying texts against preset labels
formatted_texts = [f"task: classification | query: {c}" for c in texts]
embeddings = embedding_model.encode(formatted_texts)  # NOTE: use encode(), not encode_document()

# Run inference
results = model.predict(embeddings)
```
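
The prompt-formatting step in the Usage snippet can be wrapped in a small helper so that a single string is not accidentally iterated character by character. This is a sketch under stated assumptions: `format_for_classification` is a hypothetical name, not part of the model's or the embedding library's API, and only the prompt prefix itself comes from the snippet above.

```python
# Prompt prefix taken from the Usage snippet.
CLASSIFICATION_PROMPT = "task: classification | query: "


def format_for_classification(texts):
    """Prefix each text with the classification prompt.

    Accepts a single string or an iterable of strings and always returns a list.
    """
    if isinstance(texts, str):
        texts = [texts]  # treat a lone string as a batch of one
    formatted = []
    for text in texts:
        if not isinstance(text, str):
            raise TypeError(f"expected str, got {type(text).__name__}")
        formatted.append(CLASSIFICATION_PROMPT + text)
    return formatted


print(format_for_classification("Your singing is worse than karaoke at a dodgy KTV."))
```

The returned list can be passed to `embedding_model.encode(...)` in place of `formatted_texts`.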