identrics/wasper_propaganda_classifier_en


Nikola299 committed on Aug 20, 2024

Commit

da173ea


1 Parent(s): 4f103f1

Update README.md

Files changed (1)

README.md +17 -2


@@ -56,6 +56,9 @@ To be used as a multilabel classifier to identify if the sample text contains on
 ### Example
 First install direct dependencies:
 ```
 pip install transformers torch accelerate
@@ -65,8 +68,8 @@ Then the model can be downloaded and used for inference:
 ```py
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
-model = AutoModelForSequenceClassification.from_pretrained("identrics/EN_propaganda_detector", num_labels=2)
-tokenizer = AutoTokenizer.from_pretrained("identrics/BG_propaganda_detector")
 tokens = tokenizer("Our country is the most powerful country in the world!", return_tensors="pt")
 output = model(**tokens)
@@ -74,6 +77,18 @@ print(output.logits)
 ```
 ## Training Details
 The training datasets for the model consist of a balanced set totaling 734 Bulgarian examples that include both propaganda and non-propaganda content. These examples are collected from a variety of traditional media and social media sources, ensuring a diverse range of content. Additionally, the training dataset is enriched with AI-generated samples. The total distribution of the training data is shown in the table below:

 ### Example
 First install direct dependencies:
 ```
 pip install transformers torch accelerate
 ```py
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
+model = AutoModelForSequenceClassification.from_pretrained("identrics/BG_propaganda_classifier", num_labels=5)
+tokenizer = AutoTokenizer.from_pretrained("identrics/BG_propaganda_classifier")
 tokens = tokenizer("Our country is the most powerful country in the world!", return_tensors="pt")
 output = model(**tokens)
 ```
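
The example above stops at raw logits. Since the README describes the model as a multilabel classifier, one plausible way to read those logits is to apply a sigmoid to each one independently and keep the labels above a cutoff. The sketch below illustrates that step only; the helper names, the 0.5 threshold, and the sample logits are assumptions made for illustration and are not taken from the model card.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def active_labels(logits, threshold=0.5):
    """Return (index, probability) pairs for labels whose sigmoid score meets the threshold.

    The 0.5 default is an assumed cutoff, not a value documented for this model.
    """
    probs = [sigmoid(v) for v in logits]
    return [(i, p) for i, p in enumerate(probs) if p >= threshold]


# Hypothetical logits for one sample with 5 labels (not real model output).
example_logits = [2.0, -1.5, 0.0, -3.0, 0.7]
selected = active_labels(example_logits)
print(selected)
```

With the inference snippet above, `output.logits[0].tolist()` could be passed in place of `example_logits`, and `model.config.id2label` could map the returned indices to label names, assuming the checkpoint defines them.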
 ## Training Details
 The training datasets for the model consist of a balanced set totaling 734 Bulgarian examples that include both propaganda and non-propaganda content. These examples are collected from a variety of traditional media and social media sources, ensuring a diverse range of content. Additionally, the training dataset is enriched with AI-generated samples. The total distribution of the training data is shown in the table below: