---
license: mit
datasets:
- Overfit-GM/turkish-toxic-language
language:
- tr
base_model:
- dbmdz/bert-base-turkish-cased
pipeline_tag: text-classification
library_name: transformers
tags:
- text-classification
- toxicity-detection
- turkish
- bert
- nlp
- content-moderation
---

# MeowML/ToxicBERT - Turkish Toxic Language Detection

## Model Description

ToxicBERT is a fine-tuned BERT model specifically designed for detecting toxic language in Turkish text. Built upon the `dbmdz/bert-base-turkish-cased` foundation model, this classifier can identify potentially harmful, offensive, or toxic content in Turkish social media posts, comments, and general text.

## Model Details

- **Model Type**: Text Classification (Binary)
- **Language**: Turkish (tr)
- **Base Model**: `dbmdz/bert-base-turkish-cased`
- **License**: MIT
- **Library**: Transformers
- **Task**: Toxicity Detection

## Intended Use

### Primary Use Cases

- Content moderation for Turkish social media platforms
- Automated filtering of user-generated content
- Research in Turkish NLP and toxicity detection
- Educational purposes for understanding toxic language patterns

### Out-of-Scope Use

- This model should not be used as the sole decision-maker for content moderation without human oversight
- Not suitable for languages other than Turkish
- Should not be used for sensitive applications without proper validation and testing

## Training Data

The model was trained on the `Overfit-GM/turkish-toxic-language` dataset, which contains Turkish text samples labeled for toxicity. The dataset includes various forms of toxic content commonly found in online Turkish communications.
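
The human-oversight guidance under Out-of-Scope Use can be made concrete with a small routing rule. The sketch below is illustrative only and is not part of this model or its repository: the `route` function, the `ModerationDecision` type, the action names, and both threshold values are hypothetical, and the thresholds would need to be calibrated on your own labeled data. It takes the toxic probability produced by the classifier and decides whether a text is allowed, queued for a human moderator, or held until a human confirms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModerationDecision:
    action: str  # "allow", "review", or "hold_for_review"
    toxic_probability: float


def route(toxic_probability: float,
          review_threshold: float = 0.3,
          hold_threshold: float = 0.9) -> ModerationDecision:
    """Map a toxic probability to a moderation action.

    Both thresholds are hypothetical placeholders, not values tuned for
    MeowML/ToxicBERT. No branch removes content automatically; the two
    non-"allow" actions both end with a human decision.
    """
    if not 0.0 <= toxic_probability <= 1.0:
        raise ValueError("toxic_probability must be in [0, 1]")
    if review_threshold > hold_threshold:
        raise ValueError("review_threshold must not exceed hold_threshold")

    if toxic_probability >= hold_threshold:
        # Very likely toxic: hide the text until a moderator confirms.
        action = "hold_for_review"
    elif toxic_probability >= review_threshold:
        # Uncertain region: keep the text visible but queue it for review.
        action = "review"
    else:
        action = "allow"
    return ModerationDecision(action, toxic_probability)


if __name__ == "__main__":
    for p in (0.05, 0.50, 0.95):
        print(p, route(p).action)
```

Using two thresholds rather than one keeps the uncertain middle band, where false positives and false negatives are most likely, in front of a person instead of resolving it automatically.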

## Model Performance

The model outputs:

- **Binary Classification**: 0 (Non-toxic) or 1 (Toxic)
- **Confidence Score**: Probability of the predicted class (the larger of the two class probabilities)
- **Toxic Probability**: Probability assigned to class 1 (Toxic)

## Usage

### Quick Start

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer from the base model and the fine-tuned classifier
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
model = AutoModelForSequenceClassification.from_pretrained("MeowML/ToxicBERT")

# Prepare text
text = "Merhaba, nasılsın?"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=256)

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
    probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
    prediction = torch.argmax(probabilities, dim=-1)

# Index 1 is the toxic class
toxic_probability = probabilities[0][1].item()
is_toxic = bool(prediction.item())

print(f"Is toxic: {is_toxic}")
print(f"Toxic probability: {toxic_probability:.4f}")
```

### Advanced Usage with Custom Class

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification


class ToxicLanguageDetector:
    def __init__(self, model_name="MeowML/ToxicBERT"):
        self.tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
        self.model = AutoModelForSequenceClassification.from_pretrained(model_name)
        # Use the GPU when one is available, otherwise fall back to CPU
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.model.to(self.device)
        self.model.eval()

    def predict(self, text):
        """Classify a single text and return the label with its probabilities."""
        inputs = self.tokenizer(
            text,
            truncation=True,
            padding='max_length',
            max_length=256,
            return_tensors='pt'
        ).to(self.device)

        with torch.no_grad():
            outputs = self.model(**inputs)
            probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
            prediction = torch.argmax(probabilities, dim=-1)

        return {
            'text': text,
            'is_toxic': bool(prediction.item()),
            'toxic_probability': probabilities[0][1].item(),
            # Probability of whichever class was predicted
            'confidence': probabilities[0].max().item()
        }


# Usage
detector = ToxicLanguageDetector()
result = detector.predict("Merhaba, nasılsın?")
print(result)
```

## Limitations and Biases

### Limitations

- The model's performance depends heavily on the training data quality and coverage
- May have difficulty with context-dependent toxicity (sarcasm, irony)
- Performance may vary across different Turkish dialects or informal language
- Shorter texts might be more challenging to classify accurately

### Potential Biases

- The model may reflect biases present in the training dataset
- Certain topics, demographics, or linguistic patterns might be over- or under-represented
- Regular evaluation and bias testing are recommended for production use

## Ethical Considerations

- This model should be used responsibly with human oversight
- False positives and negatives are expected and should be accounted for
- Consider the impact on freedom of expression when implementing automated moderation
- Regular auditing and updating are recommended to maintain fairness

## Technical Specifications

- **Input**: Text strings (max 256 tokens)
- **Output**: Binary classification with probability scores
- **Model Size**: Based on BERT-base architecture
- **Inference Speed**: Optimized for both CPU and GPU inference
- **Memory Requirements**: Suitable for standard hardware configurations

## Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{meowml_toxicbert_2024,
  title={ToxicBERT: Turkish Toxic Language Detection},
  author={MeowML},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/MeowML/ToxicBERT}
}
```

## Acknowledgments

- Base model: `dbmdz/bert-base-turkish-cased`
- Training dataset: `Overfit-GM/turkish-toxic-language`
- Built with Hugging Face Transformers library

## Contact

For questions, issues, or suggestions, please open an issue in the model repository or contact the MeowML team.

---

**Disclaimer**: This model is provided for research and educational purposes.
Users are responsible for ensuring appropriate and ethical use in their applications.