Tags: Text Classification, Transformers, Safetensors, deberta-v2, formal or informal classification, sentiment-analysis, text-embeddings-inference
formality-classifier-mdeberta-v3-base
This model classifies text by formality. It assigns each input to one of three classes, ["formal", "informal", "neutral"], where "neutral" covers texts without a clear formality, such as passive statements.
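As a rough illustration of how a three-class decision like this is usually made, the sketch below turns raw class scores (logits) into probabilities with a softmax and picks the highest one. The label order and the logits are assumptions for illustration only; the actual mapping is stored in the model's `config.id2label` and may differ.

```python
import math

# Assumed label order, for illustration only; the real mapping lives in
# model.config.id2label and may differ.
LABELS = ["formal", "informal", "neutral"]

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(logits, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical logits for one input, not real model output.
label, prob = predict_label([0.3, 2.1, -0.5])
print(label, round(prob, 3))
```

The `text-classification` pipeline performs this step internally and reports only the winning label with its score.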
Training data was selected and generated with a focus on languages that have a distinct formal form of address, including French, German, Italian, Portuguese and Spanish. Some samples from osyvokon/pavlick-formality-scores were also used to teach the model to classify English inputs.
Results
Accuracy on the test set:
| Language | Accuracy |
|---|---|
| all | 88.93% |
| English | 79.20% |
| French | 100% |
| German | 97.73% |
| Italian | 97.83% |
| Portuguese | 100% |
| Spanish | 98.53% |
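A table like this can be produced by grouping test predictions by language and counting matches. The following is a minimal sketch of that computation, not the author's evaluation script; the function name and the records are hypothetical, and the "all" row is assumed here to be pooled over all samples.

```python
from collections import defaultdict

def accuracy_by_language(records):
    """records: iterable of (language, gold_label, predicted_label)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for lang, gold, pred in records:
        # Count each sample once for its language and once for the pooled row.
        for key in (lang, "all"):
            total[key] += 1
            correct[key] += int(gold == pred)
    return {key: correct[key] / total[key] for key in total}

# Made-up records for illustration; not the model's test data.
records = [
    ("German", "formal", "formal"),
    ("German", "informal", "informal"),
    ("English", "neutral", "formal"),
    ("English", "informal", "informal"),
]
result = accuracy_by_language(records)
for lang, acc in result.items():
    print(f"{lang}: {acc:.2%}")
```

With pooling, a language that contributes many samples pulls the "all" figure toward its own accuracy, so the pooled value need not equal the mean of the per-language rows.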
[Figure: Confusion Matrix]
[Figure: By Language]
Usage example
from transformers import pipeline

# Load the classifier once and reuse it for every input.
pipe = pipeline("text-classification", model="LenDigLearn/formality-classifier-mdeberta-v3-base")

# German inputs
print("DE:")
texts_de = [
    "Verschwinde", "Nein", "Ja", "vielleicht", "Warum bist du so?",
    "Können Sie mir spontan dabei helfen?", "Bitte senden Sie uns die nötigen Unterlagen zu.", "Dies müssen Sie selbst entscheiden, wenn Sie den entsprechenden Punkt erreicht haben.", "Sie sind also Herr Müller.", "Bitte helfen Sie mir!",
    "Man muss schon wissen, was dann passiert.", "Als nächstes kommen 4g Champignons und 500g Mehl dazu.", "Bananen sind krumm.", "Das ist eine Tatsache, die unumstößlich ist.", "Hilfestellungen sind unter \"Hilfe\" zu finden."
]
for text in texts_de:
    # Each call returns a list with one {"label": ..., "score": ...} dict.
    print(pipe(text))

# English inputs
print("-----------\nEN:")
texts_en = [
    "Piss off", "No", "Yes", "maybe", "Why are you like this?",
    "Could you help me spontaneously?", "Please send me the necessary documents.", "You will have to decide this individually as soon as you have reached the relevant point.", "I presume you are Mr. Müller?", "Please offer me your support!",
    "One would have to know what happens then.", "Then, we add 4g Mushrooms and 500g flour.", "Bananas are usually curved.", "That is an irrefutable fact.", "You can find helpful tutorials under \"help\"."
]
for text in texts_en:
    print(pipe(text))
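The loop above calls the pipeline once per text. A pipeline also accepts a list of strings and returns one result dict per input, so the calls can be combined. The sketch below shows one way to pair each text with its prediction; `classify_texts` and `fake_classifier` are hypothetical helpers, and the stand-in classifier returns made-up outputs so the example runs without downloading the model.

```python
def classify_texts(classifier, texts):
    """Run a text-classification callable over texts and pair each text
    with its predicted label and score."""
    texts = list(texts)
    results = classifier(texts)  # expected: one {"label", "score"} dict per text
    return [(text, r["label"], r["score"]) for text, r in zip(texts, results)]

def fake_classifier(texts):
    """Stand-in with made-up outputs; replace with the real pipeline object."""
    return [
        {"label": "formal" if "Sie" in t else "neutral", "score": 0.9}
        for t in texts
    ]

rows = classify_texts(fake_classifier, ["Bitte helfen Sie mir!", "Bananen sind krumm."])
for text, label, score in rows:
    print(f"{label:<8} {score:.2f}  {text}")
```

To use the real model, pass the `pipe` object created in the usage example in place of `fake_classifier`.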
Model tree for LenDigLearn/formality-classifier-mdeberta-v3-base
Base model
microsoft/mdeberta-v3-base