Sami92 committed · Commit 49e9598 · verified · Parent(s): 22b1557

Update README.md

Files changed (1): README.md (+45 −142)
# Model Card for XLM-R-Large-ClaimDetection

Fine-tuned [XLM-R Large](https://huggingface.co/FacebookAI/xlm-roberta-large) for the task of classifying sentences as factual or non-factual. The taxonomy for factual claims follows Wilms et al. (2021). The model was first trained on a Telegram dataset annotated by GPT-4o with this [prompt](https://huggingface.co/Sami92/XLM-R-Large-ClaimDetection/blob/main/FactualityPrompt_GPT.txt); in a second step it was trained on the data from Risch et al. (2021). It was tested on a sample of Telegram posts annotated by four trained coders.

### Model Description

This model is a fine-tuned version of [XLM-R Large](https://huggingface.co/FacebookAI/xlm-roberta-large), trained to detect factual claims, a core task in automated fact-checking. Training proceeded in a weakly supervised fashion: first on a Telegram dataset weakly annotated with GPT-4o, then on the manually annotated dataset from Risch et al. (2021). The training data is German; the underlying model is multilingual, but its performance in other languages has not been tested. For evaluation, a sample of Telegram posts was annotated by four trained coders and the majority label was taken; the model achieves an accuracy of 0.90 on this set. On the test split of Risch et al. (2021), which is drawn from Facebook comments, it achieves an accuracy of 0.79.

## Bias, Risks, and Limitations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

## How to Get Started with the Model

```python
from transformers import pipeline

# Three example Telegram posts (German)
texts = [
    'WTH Riesige giftige Flugspinnen mit 4-Zoll-Beinen auf dem Weg in die Gegend von New York, während sie sich über die Ostküste ausbreiten. Zuerst kamen die gefleckten Laternenfliegen, dann die Zikaden und jetzt die Spinnen. Der Nordosten der USA bereitet sich auf eine Invasion riesiger giftiger Spinnen vor, deren Beine nur einen halben Zoll lang sind und mit dem Fallschirm durch die Luft fliegen können. cbsnews.com/news/joro-spid…',
    'Es ist Ihnen halt nicht genug was zerstört wurde, Ermittlungen eingestellt und dann kommt die nächste Katastrophe... Wer hier an Zufälle glaubt hat nichts verstanden... <URL>',
    'IMPFUNG MACHT FREI!!! Schickt das Video an alle eure Kontakte! Abonniert bitte unseren Kanal: <URL> Folgt unserem Chat: <URL> Verbreitet unsere Inhalte und Wissen für den Frieden',
]

checkpoint = "Sami92/XLM-R-Large-ClaimDetection"
# Pad and truncate inputs to the model's 512-token limit
tokenizer_kwargs = {"padding": True, "truncation": True, "max_length": 512}
claimdetection = pipeline(
    "text-classification",
    model=checkpoint,
    tokenizer=checkpoint,
    device="cuda",  # omit or use device=-1 to run on CPU
    **tokenizer_kwargs,
)
claimdetection(texts)
# [{'label': 'factual', 'score': 0.9999344348907471},
#  {'label': 'non-factual', 'score': 0.9990422129631042},
#  {'label': 'non-factual', 'score': 0.9990965127944946}]
```

## Training Details

### Training Data

The training proceeded in two steps: first on a weakly annotated Telegram dataset, then on the dataset published by Risch et al. (2021); for details on the latter, see the publication.

The weak annotation was performed with GPT-4o; the labeling prompt can be found [here](https://huggingface.co/Sami92/XLM-R-Large-ClaimDetection/blob/main/FactualityPrompt_GPT.txt). The data was taken from Telegram, specifically from a set of about 200 channels that have been subject to a fact-check by Correctiv, dpa, Faktenfuchs, or AFP.
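As an illustration, here is a minimal sketch of how such GPT-4o weak labeling can be scripted with the OpenAI Python client. The local prompt path, the `label_post` helper, and the request parameters are assumptions for the sketch, not the authors' actual annotation pipeline.

```python
# Hypothetical weak-labeling sketch; not the authors' actual annotation script.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# The factuality prompt linked above, saved locally (path is an assumption)
with open("FactualityPrompt_GPT.txt", encoding="utf-8") as f:
    system_prompt = f.read()

def label_post(post: str) -> str:
    """Ask GPT-4o whether a Telegram post contains a factual claim."""
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0,  # deterministic labels for annotation
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": post},
        ],
    )
    return response.choices[0].message.content  # e.g. "factual" / "non-factual"

label = label_post("Der Nordosten der USA bereitet sich auf eine Invasion riesiger Spinnen vor.")
```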
 
 
The test data consists of 149 Telegram posts; performance on it is as follows.

|                  | precision | recall | f1-score | support |
|------------------|-----------|--------|----------|---------|
| **factual**      | 0.88      | 0.92   | 0.90     | 71      |
| **non-factual**  | 0.92      | 0.88   | 0.90     | 78      |
| **accuracy**     |           |        | 0.90     | 149     |
| **macro avg**    | 0.90      | 0.90   | 0.90     | 149     |
| **weighted avg** | 0.90      | 0.90   | 0.90     | 149     |
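The table follows the layout of scikit-learn's `classification_report`. Below is a sketch of how such an evaluation can be reproduced with the released model; the two-item test set is a placeholder, since the 149 annotated posts are not published here.

```python
from sklearn.metrics import classification_report
from transformers import pipeline

claimdetection = pipeline(
    "text-classification",
    model="Sami92/XLM-R-Large-ClaimDetection",
    truncation=True,
    max_length=512,
)

# Placeholder gold data; replace with the real annotated test set.
test_texts = ["Die Inflation lag 2022 bei 7,9 Prozent.", "Was für ein Unsinn!"]
gold_labels = ["factual", "non-factual"]

predicted = [result["label"] for result in claimdetection(test_texts)]
print(classification_report(gold_labels, predicted, digits=2))
```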

#### Training Hyperparameters

Weakly supervised training on Telegram data:

- Epochs: 10
- Batch size: 16
- learning_rate: 2e-5
- weight_decay: 0.01
- fp16: True

Supervised training on the Risch et al. (2021) data:

- Epochs: 10
- Batch size: 16
- learning_rate: 2e-5
- weight_decay: 0.01
- fp16: True
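Both stages use the same settings. A minimal sketch of one fine-tuning stage with these hyperparameters follows; the toy dataset, label mapping, and output directory are placeholders rather than the authors' training script.

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

base = "FacebookAI/xlm-roberta-large"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

# Toy stand-in for the Telegram / Risch et al. (2021) data (label mapping assumed: 1 = factual).
train_dataset = Dataset.from_dict(
    {"text": ["Die Inflation lag 2022 bei 7,9 Prozent.", "Was für ein Unsinn!"], "label": [1, 0]}
).map(lambda batch: tokenizer(batch["text"], truncation=True, max_length=512), batched=True)

# Hyperparameters as reported above; everything else is left at defaults.
args = TrainingArguments(
    output_dir="xlmr-claim-detection",  # placeholder
    num_train_epochs=10,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    weight_decay=0.01,
    fp16=True,  # requires a GPU
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```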
 
 
 
 
 
 
 
 
 

## Citation

**BibTeX:**

@misc{wilms_annotation_2021,
  title = {Annotation {Guidelines} for {GermEval} 2021 {Shared} {Task} on the {Identification} of {Toxic}, {Engaging}, and {Fact}-{Claiming} {Comments}. {Excerpt} of an unpublished codebook of the {DEDIS} research group at {Heinrich}-{Heine}-{University} {Düsseldorf} (full version available on request)},
  author = {Wilms, L. and Heinbach, D. and Ziegele, M.},
  year = {2021},
}

@inproceedings{risch_germeval_2021,
  author = {Risch, Julian and Stoll, Anke and Wilms, Lena and Wiegand, Michael},
  year = {2021},
}