---
language:
- fa
metrics:
- f1
- accuracy
- precision
- recall
base_model:
- sbunlp/fabert
pipeline_tag: text-classification
tags:
- code
---

# **Fine-Tuned FaBERT Model for Formality Classification**

This repository contains a version of **FaBERT** (`sbunlp/fabert`), a pre-trained language model, fine-tuned for **formality classification**. The model classifies text as **formal** or **informal**, which can be useful in content moderation, social media monitoring, and customer support automation.

## **Model Overview**

- **Architecture:** Built on **FaBERT**, a transformer-based language model.
- **Task:** **Formality Classification**: distinguishing between formal and informal language in text.
- **Fine-Tuning:** The model has been fine-tuned on a custom dataset containing a variety of formal and informal text.

## **Key Features**

- **Language Coverage:** The model card metadata lists Persian (`fa`) as the supported language, matching the FaBERT base model. Accuracy on other languages is not reported here.
- **Evaluation:** Fine-tuned for formal vs. informal text classification. The metadata names F1, accuracy, precision, and recall as the evaluation metrics.
- **Straightforward Deployment:** Loads through the standard `transformers` sequence-classification API and runs on CPU or GPU, which suits social media platforms, content moderation tools, and communication systems.

## **How to Use the Model**

You can use this model in your Python code with the Hugging Face `transformers` library and PyTorch. The following code snippet demonstrates how to tokenize text, run the model, and read off whether the text is classified as formal or informal.
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("faimlab/fabert_formality_classifier")
model = AutoModelForSequenceClassification.from_pretrained("faimlab/fabert_formality_classifier")

# Run the model on GPU if available, otherwise on CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Example input text
input_text = "Please find attached the report for your review."

# Tokenize the input
inputs = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True, max_length=512)

# Move the input tensors to the same device as the model
inputs = {key: value.to(device) for key, value in inputs.items()}

# Make predictions without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

# Get the index of the highest-scoring class
predicted_label = logits.argmax(dim=1).item()
print(f"Predicted Label: {predicted_label}")
```
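
The snippet above prints only an integer class index. As a minimal sketch of how that index could be turned into a label name with a confidence score, the helper below applies a softmax to a list of logits and looks up the winning class in a label mapping. The function names `softmax` and `describe_prediction`, the example logits, and the placeholder label names are all hypothetical and used only for illustration; this model card does not state which index corresponds to formal or informal.

```python
import math


def softmax(scores):
    """Convert raw logits (a list of floats) into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def describe_prediction(scores, id2label):
    """Return (label_name, probability) for the highest-scoring class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    # Fall back to a generic name when the mapping has no entry for this id.
    return id2label.get(best, f"LABEL_{best}"), probs[best]


# Hypothetical logits and placeholder label names, for illustration only.
example_logits = [2.0, 0.0]
example_labels = {0: "class_0", 1: "class_1"}
name, prob = describe_prediction(example_logits, example_labels)
print(f"{name}: {prob:.3f}")
```

With the real model, the same helper could be called as `describe_prediction(logits[0].tolist(), model.config.id2label)`. The `id2label` attribute is the standard label mapping stored in a `transformers` model config; check its contents for this checkpoint before relying on the label names, since they may be generic (`LABEL_0`, `LABEL_1`).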