Sami92 committed · verified
Commit cf5a1f8 · 1 Parent(s): 7c9e74d

Update README.md

Files changed (1):
  1. README.md (+4 −14)
README.md CHANGED

````diff
@@ -33,6 +33,7 @@ checkpoint = "Sami92/XLM-R-Large-Polarization-Classifier"
 tokenizer_kwargs = {'padding':True,'truncation':True,'max_length':512}
 polarization_classifier = pipeline("text-classification", model = checkpoint, tokenizer =checkpoint, **tokenizer_kwargs, device="cuda")
 polarization_classifier(texts)
+```
 
 ## Training Details
 
@@ -40,15 +41,9 @@ polarization_classifier(texts)
 
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
-[More Information Needed]
-
-### Training Procedure
-
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
-#### Preprocessing [optional]
+The training data for the weakly supervised training was taken from Telegram, specifically from a set of about 200 channels that have been subject to a fact-check by Correctiv, dpa, Faktenfuchs, or AFP. A sample of 5000 posts was chosen.
 
-[More Information Needed]
+In a second step, the model was fine-tuned on the train split from Ashraf et al. 2024.
 
 
 #### Training Hyperparameters
@@ -71,15 +66,10 @@ Supervised Training on Ashraf et al. 2024
 
 ## Evaluation
 
-<!-- This section describes the evaluation protocols and provides the results. -->
-
-### Testing Data, Factors & Metrics
 
 #### Testing Data
 
-<!-- This should link to a Dataset Card if possible. -->
-
-[More Information Needed]
+Evaluation was performed on the test split from Ashraf et al. 2024.
 
 
 ### Results
````
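The `+``` ` line in the first hunk closes a previously unterminated code fence around the README's pipeline example. That example can be sketched end-to-end as follows; the checkpoint name and `tokenizer_kwargs` are taken from the diff's context lines, while the `classify` wrapper and the lazy import are illustrative additions, not part of the model card:

```python
# Sketch of the model card's usage example. Assumes the `transformers`
# library is installed; `classify` is a hypothetical helper added here
# for illustration.
checkpoint = "Sami92/XLM-R-Large-Polarization-Classifier"
tokenizer_kwargs = {"padding": True, "truncation": True, "max_length": 512}


def classify(texts, device="cuda"):
    """Run the polarization classifier on a list of texts."""
    # Imported lazily so this module loads even without transformers/GPU.
    from transformers import pipeline

    clf = pipeline(
        "text-classification",
        model=checkpoint,
        tokenizer=checkpoint,
        device=device,  # pass device="cpu" if no CUDA device is available
        **tokenizer_kwargs,
    )
    return clf(texts)


# Example call (commented out: downloads the large checkpoint on first use):
# classify(["example post to classify"])
```

Truncation to 512 tokens matters here because XLM-R-Large cannot accept longer sequences, and Telegram posts can exceed that limit.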