model-index:
  - name: poltextlab/xlm-roberta-large-illframes-migration
    results:
      - task:
          type: text-classification
        metrics:
          - name: Accuracy
            type: accuracy
            value: 54%
          - name: F1-Score
            type: f1
            value: 57%
tags:
  - text-classification
  - pytorch
metrics:
  - precision
  - recall
  - f1-score
language:
  - en
base_model:
  - xlm-roberta-large
pipeline_tag: text-classification
library_name: transformers
license: mit
extra_gated_prompt: >-
  Our models are intended for academic use only. If you are not affiliated with
  an academic institution, please provide a rationale for using our models.
  Please allow us a few business days to manually review subscriptions.
extra_gated_fields:
  Name: text
  Country: country
  Institution: text
  Institution Email: text
  Please specify your academic use case: text

xlm-roberta-large-illframes-migration

Illframes Project - Migration

The goal of the ILLFRAMES (illiberal policy frames) project is to identify illiberal policy frames in texts. The codes are mutually exclusive and unequivocal. The codebook currently covers only the policy domains of migration and COVID; other policy domains, such as climate, public health, and democracy, will be added in the near future.

The project's migration domain examines negative portrayals of migration. It categorizes frames that depict immigrants as threats to societal values, economic stability, and national security, and that often present migration as a burden or challenge to be addressed through stringent policies and actions.

Codebook:

  • 901: This label depicts immigration negatively as changing the social, cultural, religious, and demographic makeup of society. References to the 'Great Replacement,' 'white genocide,' 'Islamisation,' 'Arabisation' and similar conspiracy theories and processes belong to this category.
  • 902: This label depicts migrants as representing increased competition on the labour and housing markets. Policy proposals advocating for harsh conditions for immigrants to work belong to this category. The terms 'economic migrant' and 'economic refugee' should be coded here.
  • 903: This label depicts migrants as illegal and/or unlawful. Obtaining status through criminal activity, such as corruption and document forgery, or through lies also belongs to this category. Statements demanding differentiation between legal and illegal immigration should NOT be coded in this category.
  • 904: This label depicts strong state action as the desirable way to address migration-related issues. Calls for and positive appraisals of policies granting new powers and resources to authorities, establishing strict border control procedures, building walls/fences, returning immigrants, and creating external hotspots belong to this category. Supporting keeping migrants in extraterritorial camps, sending signals to keep migrants away, and demanding zero tolerance for migration also belong to this category. Advocating for time-limited refugee status and for the deportation of refugees where the situation in the home country has seemingly improved also belongs to this category.
  • 905: This label depicts migration as posing a threat to national sovereignty and/or state authority. Negative appraisals of and opposition to the activities and plans of supranational (e.g., UN, EU) and subnational organisations (e.g., NGOs, media providing migrants with info on how to circumvent controls) belong in this category.
  • 906: This label depicts immigrants as burdening the state/local administrations, usually in the context of high workloads and increased fiscal costs. A decrease in available funding due to the financing needs of migrants also belongs to this category.
  • 907: This label describes migration or migration policy as a problem or challenge to be tackled. Instances when the speaker decries someone for not believing that migration is an issue, problem, or challenge also belong here. The category also includes general opposition to or fighting against migration. This category should generally only be used if the text does NOT refer to what kind of problem or challenge migration poses. Otherwise it should be coded according to the relevant specific category.
  • 908: This label depicts migrants and their supporters as hostile or as the enemy. The frame presents migrants as a threat to the physical safety and security of society. It associates the "invading" migrants with terrorism, drug trafficking, human trafficking, or organised crime, threatening national security.
  • 909: This label depicts migrants as representing a threat to the physical safety and security of society. It associates migrants with all kinds of physical abuse and crime, such as murder, rape, fighting, terrorism, drug trafficking, human trafficking, and robbery, creating an atmosphere of suspicion and prejudice. Linkages between migration and the spread of illnesses also belong to this category (biosecurity).
  • 910: This label depicts immigrants as a burden on the welfare system, usually in the context of abusing social benefits without working for them. Complaints about migrants using social services without paying taxes belong here. Proposals advocating for foreigners' restricted access to social benefits also belong to this category.
  • 999: None of them
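For post-processing, it can help to keep the codes in a small lookup table. The sketch below is an illustration only: the short frame names are copied from the per-class table in the classification report of this card, while `ILLFRAMES_MIGRATION_CODES` and `describe_code` are hypothetical names that are not part of the model or of any library.

```python
# Hypothetical lookup table: codebook code -> short frame name
# (names as they appear in the classification report of this card).
ILLFRAMES_MIGRATION_CODES = {
    901: "Culture Under Attack",
    902: "Economic Burden",
    903: "Illegals and Fraudsters",
    904: "Extradition Necessity",
    905: "Nation State Should Decide",
    906: "Administrative Burden",
    907: "General System Failure",
    908: "Security Threat",
    909: "Criminals",
    910: "Welfare State Overload",
    999: "None of Them",
}


def describe_code(code):
    """Return 'code: name' for a codebook entry; raises KeyError for unknown codes."""
    code = int(code)  # accept "901" as well as 901
    return f"{code}: {ILLFRAMES_MIGRATION_CODES[code]}"


print(describe_code(904))
```

Because the codes are mutually exclusive, each text maps to exactly one key of this table.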

How to use the model

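from transformers import AutoTokenizer, pipeline

# The model is fine-tuned from xlm-roberta-large, so the base tokenizer is used.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large")
pipe = pipeline(
    model="poltextlab/xlm-roberta-large-illframes-migration",
    task="text-classification",
    tokenizer=tokenizer,
    use_fast=False,
    truncation=True,  # cut inputs that exceed max_length
    max_length=512,
    token="<your_hf_read_only_token>"  # the model is gated, so a read token is needed
)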
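Once the pipeline is created, a single text can be classified as sketched below. This is a minimal illustration, not an official usage pattern: `top_prediction` is a hypothetical helper, the example sentence is invented, and the exact label strings returned (for instance code numbers or generic `LABEL_n` names) depend on the model configuration, which is not documented here.

```python
def top_prediction(pipe, text):
    """Return (label, score) for the highest-scoring class of a single text."""
    # A text-classification pipeline returns a list of dicts,
    # each with a "label" and a "score" key.
    result = pipe(text)
    best = result[0]
    return best["label"], float(best["score"])


# Usage with the pipeline defined above (requires access to the gated model):
# label, score = top_prediction(pipe, "Migrants are taking our jobs.")
# print(label, round(score, 3))
```

Passing the pipeline as an argument keeps the helper independent of how the model was loaded.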

Classification Report

Overall Performance:

  • Accuracy: 54%
  • Macro Avg: Precision: 0.36, Recall: 0.54, F1-score: 0.41
  • Weighted Avg: Precision: 0.73, Recall: 0.54, F1-score: 0.57

Per-Class Metrics:

| Label | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| 901: Culture Under Attack | 0.33 | 0.43 | 0.38 | 7 |
| 902: Economic Burden | 0.50 | 0.89 | 0.64 | 47 |
| 903: Illegals and Fraudsters | 0.34 | 0.80 | 0.48 | 41 |
| 904: Extradition Necessity | 0.34 | 0.69 | 0.45 | 62 |
| 905: Nation State Should Decide | 0.14 | 0.18 | 0.16 | 11 |
| 906: Administrative Burden | 0.33 | 0.31 | 0.32 | 16 |
| 907: General System Failure | 0.24 | 0.67 | 0.36 | 104 |
| 908: Security Threat | 0.33 | 0.79 | 0.47 | 24 |
| 909: Criminals | 0.30 | 0.35 | 0.33 | 20 |
| 910: Welfare State Overload | 0.25 | 0.29 | 0.27 | 7 |
| 999: None of Them | 0.89 | 0.48 | 0.63 | 839 |
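The gap between the macro and weighted averages follows from the class imbalance: the macro average is the plain mean over the 11 labels, while the weighted average weights each label by its support, so the 839 "None of Them" examples dominate it. The sketch below recomputes both from the per-class rows; the variable and function names are illustrative, and small deviations from the reported figures are expected because the inputs are already rounded to two decimals.

```python
# (precision, recall, f1, support) per label, copied from the per-class metrics
rows = {
    "901": (0.33, 0.43, 0.38, 7),
    "902": (0.50, 0.89, 0.64, 47),
    "903": (0.34, 0.80, 0.48, 41),
    "904": (0.34, 0.69, 0.45, 62),
    "905": (0.14, 0.18, 0.16, 11),
    "906": (0.33, 0.31, 0.32, 16),
    "907": (0.24, 0.67, 0.36, 104),
    "908": (0.33, 0.79, 0.47, 24),
    "909": (0.30, 0.35, 0.33, 20),
    "910": (0.25, 0.29, 0.27, 7),
    "999": (0.89, 0.48, 0.63, 839),
}

total_support = sum(r[3] for r in rows.values())


def macro(i):
    """Unweighted mean of metric column i over all labels."""
    return sum(r[i] for r in rows.values()) / len(rows)


def weighted(i):
    """Support-weighted mean of metric column i over all labels."""
    return sum(r[i] * r[3] for r in rows.values()) / total_support


for name, i in (("precision", 0), ("recall", 1), ("f1", 2)):
    print(f"{name}: macro={macro(i):.2f} weighted={weighted(i):.2f}")
```

The recomputed values agree with the reported averages to within about 0.01.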

Inference platform

This model is used by the CAP Babel Machine, an open-source and free natural language processing tool, designed to simplify and speed up projects for comparative research.

Cooperation

Model performance can be significantly improved by extending our training sets. We appreciate every submission of CAP-coded corpora (of any domain and language) at poltextlab{at}poltextlab{dot}com or by using the CAP Babel Machine.

Debugging and issues

This architecture uses the sentencepiece tokenizer. To run the model with transformers versions earlier than 4.27, you need to install sentencepiece manually.
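A quick way to see whether sentencepiece is available before loading the model is sketched below. `missing_packages` is a hypothetical helper written for this example; it only checks whether a package can be found for import and does not install anything.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


missing = missing_packages(["sentencepiece"])
if missing:
    # Printed as a hint only; run the command yourself in your environment.
    print("Install first: pip install " + " ".join(missing))
```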

If you encounter a RuntimeError when loading the model using the from_pretrained() method, adding ignore_mismatched_sizes=True should solve the issue.
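As a sketch of that workaround, the model can be loaded directly with `AutoModelForSequenceClassification`. `load_model` is a hypothetical wrapper, the model ID is taken from the metadata of this card, and the `token` argument assumes a transformers version that accepts it in `from_pretrained()`.

```python
def load_model(
    model_id="poltextlab/xlm-roberta-large-illframes-migration",
    token="<your_hf_read_only_token>",
):
    """Load the classifier while tolerating shape mismatches in the checkpoint."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForSequenceClassification

    return AutoModelForSequenceClassification.from_pretrained(
        model_id,
        ignore_mismatched_sizes=True,  # skip weights whose shapes do not match
        token=token,
    )


# model = load_model()  # requires access to the gated model
```

Weights skipped because of a shape mismatch are newly initialized rather than loaded, so it is worth reading the loading warnings to see which layers were affected.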