	---
	language:
	- he
	license: cc-by-sa-4.0
	tags:
	- text-classification
	- profanity-detection
	- hebrew
	- bert
	- alephbert
	library_name: transformers
	base_model: onlplab/alephbert-base
	datasets:
	- custom
	metrics:
	- accuracy
	- precision
	- recall
	- f1
	---

	# OpenCensor-Hebrew

	This is a fine-tuned AlephBERT model that detects profanity in Hebrew text.

	You give the model a Hebrew sentence.
	It returns:
	- a score between 0 and 1
	- a yes/no flag (based on a cutoff you choose)

	Meaning of the score:
	- 0 = clean, 1 = has profanity
	- Recommended cutoff: 0.49, chosen on the validation set (you can change it)

	![Validation F1 per Epoch](valf1perepoch.png)
	![Final Test Metrics](testmetrics.png)
	![Best Threshold](bestthreshold.png)

	## How to use

	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForSequenceClassification

	KModel = "LikoKIko/OpenCensor-Hebrew"
	KCutoff = 0.49  # best threshold found during training
	KMaxLen = 512  # number of tokens (not characters)

	tokenizer = AutoTokenizer.from_pretrained(KModel)
	model = AutoModelForSequenceClassification.from_pretrained(KModel, num_labels=1).eval()

	text = "some hebrew text here"
	inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=KMaxLen)

	# The model outputs a single logit; sigmoid maps it to a score in [0, 1].
	with torch.inference_mode():
	    score = torch.sigmoid(model(**inputs).logits).item()

	# Apply the cutoff to turn the score into a 0/1 flag.
	KHasProfanity = int(score >= KCutoff)

	print({"score": round(score, 4), "KHasProfanity": KHasProfanity})
	```

	Note: If the text is very long, it is cut at `KMaxLen` tokens.
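
Truncation means that anything past `KMaxLen` tokens is never scored. One possible workaround, which is not part of this model card's training or evaluation setup, is to split the token IDs into overlapping windows, score each window separately, and keep the highest score. The sketch below covers only the windowing and the aggregation; `split_into_windows`, `aggregate_scores`, and the default `stride` are hypothetical names and choices made for illustration, and each window would still be scored with the model call shown above.

```python
from typing import List


def split_into_windows(token_ids: List[int], max_len: int = 512, stride: int = 256) -> List[List[int]]:
    """Split token IDs into overlapping windows of at most max_len tokens.

    Assumption: special tokens are added afterwards, so callers should pass
    a max_len that leaves room for them.
    """
    if max_len <= 0 or not 0 < stride <= max_len:
        raise ValueError("need max_len > 0 and 0 < stride <= max_len")
    if len(token_ids) <= max_len:
        return [list(token_ids)]
    windows = []
    start = 0
    while True:
        windows.append(list(token_ids[start:start + max_len]))
        # Stop once the current window reaches the end of the text.
        if start + max_len >= len(token_ids):
            break
        start += stride
    return windows


def aggregate_scores(window_scores: List[float]) -> float:
    """Flag the text if any window looks profane: keep the highest score."""
    return max(window_scores)
```

Taking the maximum is a deliberate choice here: a single profane window should be enough to flag the whole text. The cutoff of 0.49 was tuned on single inputs, so it may need re-checking if you score long texts this way.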

	## About this model

	- Base: `onlplab/alephbert-base`
	- Task: binary classification (clean / profanity)
	- Language: Hebrew
	- Max length: 512 tokens
	- Training:
	  - Batch size: 16
	  - Epochs: 10
	  - Learning rate: 0.00002
	  - Loss: binary cross-entropy with logits (`BCEWithLogitsLoss`). We use `pos_weight` so the model pays more attention to the rare class. This helps when the dataset is imbalanced.
	  - Scheduler: linear warmup (10%)
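
To make the effect of `pos_weight` concrete, here is a minimal pure-Python sketch of the per-example loss, following the formula PyTorch documents for `BCEWithLogitsLoss`. The model card does not say how `pos_weight` was chosen; the negative-to-positive ratio used in `pos_weight_from_counts` is a common heuristic and only an assumption here, and both function names are made up for this example.

```python
import math


def pos_weight_from_counts(n_negative: int, n_positive: int) -> float:
    """Assumed heuristic: weight positives by the negative/positive ratio."""
    return n_negative / n_positive


def _softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def weighted_bce_with_logits(logit: float, label: float, pos_weight: float = 1.0) -> float:
    """Per-example loss:
    -[pos_weight * y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]

    Uses log(sigmoid(x)) = -softplus(-x) and log(1 - sigmoid(x)) = -softplus(x).
    """
    positive_term = pos_weight * label * _softplus(-logit)
    negative_term = (1.0 - label) * _softplus(logit)
    return positive_term + negative_term
```

With `pos_weight > 1`, a missed positive example costs proportionally more than a missed negative one, while the loss on negative examples is unchanged.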

	### Results

	- Test Accuracy: 0.9826
	- Test Precision: 0.9812
	- Test Recall: 0.9835
	- Test F1: 0.9823
	- Best threshold: 0.49
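
As a quick consistency check, F1 is the harmonic mean of precision and recall, so it can be recomputed from the two numbers above. A small sketch (the helper name `f1_from_precision_recall` is only for illustration):

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported test precision and recall from the list above.
f1 = f1_from_precision_recall(0.9812, 0.9835)
print(round(f1, 3))
```

The recomputed value agrees with the reported test F1 of 0.9823 up to rounding of the inputs.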

	## Reproduce (training code)

	This model was trained with a script that:

	- Loads `onlplab/alephbert-base` with `num_labels=1`
	- Tokenizes with `max_length=512` and pads to the max length
	- Trains with AdamW, linear warmup, and mixed precision
	- Tries cutoffs from `0.1` to `0.9` on the validation set and picks the best F1
	- Saves the best checkpoint by validation F1, then reports test metrics
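
The cutoff search in the list above can be sketched as follows. This is not the original training script: the function names are hypothetical, and the step size of `0.01` is an assumption (the card only gives the range `0.1` to `0.9`). It takes validation scores and 0/1 labels and returns the cutoff with the highest F1.

```python
from typing import Sequence, Tuple


def f1_at_cutoff(scores: Sequence[float], labels: Sequence[int], cutoff: float) -> float:
    """F1 of the positive class when predicting 1 for score >= cutoff."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted = int(score >= cutoff)
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def best_cutoff(
    scores: Sequence[float],
    labels: Sequence[int],
    start: float = 0.1,
    stop: float = 0.9,
    step: float = 0.01,  # assumed step size
) -> Tuple[float, float]:
    """Return (cutoff, f1) for the best cutoff in [start, stop]; ties keep the lowest cutoff."""
    n_steps = int(round((stop - start) / step))
    best = (start, -1.0)
    for i in range(n_steps + 1):
        cutoff = round(start + i * step, 10)  # avoid float drift
        f1 = f1_at_cutoff(scores, labels, cutoff)
        if f1 > best[1]:
            best = (cutoff, f1)
    return best
```

Run the search on validation data only, so the test metrics stay an honest estimate of how the chosen cutoff generalizes.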

	## License

	CC-BY-SA-4.0

	## How to cite

	```bibtex
	@misc{opencensor-hebrew,
	  title = {OpenCensor-Hebrew: Hebrew Profanity Detection Model},
	  author = {LikoKIko},
	  year = {2025},
	  url = {https://huggingface.co/LikoKIko/OpenCensor-Hebrew}
	}
	```