NCI Technique Classifier v2

Multi-label propaganda technique classifier for the NCI (News Content Intelligence) Protocol.

Model Description

This model classifies text into 18 propaganda techniques as part of a two-stage pipeline:

  • Stage 1: Binary detection (synapti/nci-binary-detector-v2) determines if propaganda exists
  • Stage 2: This model identifies which specific techniques are used

Techniques Detected

ID Technique Description
0 Loaded_Language Using words with strong emotional implications
1 Appeal_to_fear-prejudice Seeking to build support by instilling fear
2 Exaggeration,Minimisation Overstating or understating aspects of issues
3 Repetition Repeating the same message multiple times
4 Flag-Waving Appeals to patriotism or group identity
5 Name_Calling,Labeling Giving a subject a name with negative connotations
6 Reductio_ad_hitlerum Comparing to Hitler or Nazis to discredit
7 Black-and-White_Fallacy Presenting only two options when more exist
8 Causal_Oversimplification Assuming a single cause for complex issues
9 Whataboutism,Straw_Men,Red_Herring Deflection and misrepresentation tactics
10 Straw_Man Misrepresenting someone's argument
11 Red_Herring Introducing irrelevant information
12 Doubt Questioning credibility of sources
13 Appeal_to_Authority Citing authorities to support claims
14 Thought-terminating_Cliches Using clichés to end discussion
15 Bandwagon Appeal to popularity
16 Slogans Brief, striking phrases
17 Obfuscation,Intentional_Vagueness,Confusion Being deliberately unclear

Training

  • Base Model: answerdotai/ModernBERT-base
  • Dataset: synapti/nci-propaganda-production (19,581 train, 1,727 val, 1,729 test)
  • Loss: Focal Loss (gamma=2.0) with class weights for imbalanced techniques
  • Epochs: 5
  • Batch Size: 16
  • Learning Rate: 2e-5
  • Hardware: NVIDIA A10G GPU
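The focal-loss objective listed above can be sketched as follows. This is a minimal illustration of multi-label focal loss (gamma=2.0) built on BCE-with-logits, not the exact training code; the optional `pos_weight` stands in for the per-class weights mentioned above.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, pos_weight=None):
    # Per-element BCE, optionally weighted per class for imbalance
    bce = F.binary_cross_entropy_with_logits(
        logits, targets, pos_weight=pos_weight, reduction="none"
    )
    p = torch.sigmoid(logits)
    # p_t is the model's probability for the true label of each entry
    p_t = p * targets + (1 - p) * (1 - targets)
    # (1 - p_t)^gamma down-weights easy examples so rare, hard
    # techniques contribute more to the gradient
    return ((1 - p_t) ** gamma * bce).mean()

# Toy batch: 2 examples x 18 technique labels
logits = torch.randn(2, 18)
targets = torch.randint(0, 2, (2, 18)).float()
loss = focal_loss(logits, targets, gamma=2.0)
```

Because the modulating factor is at most 1, focal loss never exceeds the plain BCE loss on the same batch.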

Performance

Metric Score
Micro F1 80.2%
Macro F1 63.9%
Micro Precision 83.4%
Micro Recall 77.4%
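The gap between micro and macro F1 reflects class imbalance: micro F1 pools true/false positives across all labels, so frequent techniques dominate, while macro F1 averages per-label F1 so rare techniques count equally. A small numpy illustration (toy data, not the model's actual predictions):

```python
import numpy as np

def f1_scores(y_true, y_pred):
    # Per-label TP/FP/FN over a (n_samples, n_labels) binary matrix
    tp = ((y_true == 1) & (y_pred == 1)).sum(axis=0)
    fp = ((y_true == 0) & (y_pred == 1)).sum(axis=0)
    fn = ((y_true == 1) & (y_pred == 0)).sum(axis=0)
    # Micro: pool counts across labels before computing F1
    micro = 2 * tp.sum() / (2 * tp.sum() + fp.sum() + fn.sum())
    # Macro: compute F1 per label, then average
    per_label = 2 * tp / np.maximum(2 * tp + fp + fn, 1)
    macro = per_label.mean()
    return micro, macro

# A frequent label predicted perfectly, two rare labels missed entirely
y_true = np.array([[1, 0, 1], [1, 0, 0], [1, 1, 0]])
y_pred = np.array([[1, 0, 0], [1, 0, 0], [1, 0, 0]])
micro, macro = f1_scores(y_true, y_pred)  # micro = 0.75, macro ≈ 0.33
```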

Per-Technique Performance (selected)

Technique F1 Score
Loaded_Language 97.0%
Appeal_to_fear-prejudice 89.7%
Name_Calling,Labeling 84.3%
Flag-Waving 82.1%

Usage

With Transformers

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained("synapti/nci-technique-classifier-v2")
tokenizer = AutoTokenizer.from_pretrained("synapti/nci-technique-classifier-v2")
model.eval()  # disable dropout for inference

text = "The radical left is DESTROYING our great nation!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    # Multi-label: apply sigmoid per logit, not softmax across labels
    probs = torch.sigmoid(outputs.logits)[0]

# Keep techniques whose probability exceeds the threshold
threshold = 0.5
id2label = model.config.id2label
detected = [(id2label[i], probs[i].item()) for i in range(len(id2label)) if probs[i].item() > threshold]
print(detected)

With NCI Protocol

from nci.transformers.two_stage_pipeline import TwoStagePipeline

pipeline = TwoStagePipeline.from_pretrained(
    binary_model="synapti/nci-binary-detector-v2",
    technique_model="synapti/nci-technique-classifier-v2",
)

result = pipeline.analyze("The radical left is DESTROYING our great nation!")
print(f"Has propaganda: {result.has_propaganda}")
print(f"Techniques: {[t.name for t in result.techniques if t.above_threshold]}")

ONNX Inference

An ONNX export is available at onnx/model.onnx for faster inference (~1.25x speedup).

import onnxruntime as ort
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("synapti/nci-technique-classifier-v2")
session = ort.InferenceSession("onnx/model.onnx")

text = "WAKE UP AMERICA!"
inputs = tokenizer(text, return_tensors="np", truncation=True, max_length=512)

outputs = session.run(None, {
    "input_ids": inputs["input_ids"],
    "attention_mask": inputs["attention_mask"]
})
probs = 1 / (1 + np.exp(-outputs[0]))  # sigmoid

Limitations

  • Trained primarily on English news articles
  • May not generalize well to social media or other domains
  • Threshold of 0.5 may need adjustment for specific use cases
  • Outputs are independent per-technique probabilities (multi-label), so scores do not sum to 1 and several techniques may be detected in the same text
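Per the threshold note above, one common way to adapt the 0.5 default is a per-label sweep on held-out data. This is a hypothetical sketch, not part of the NCI toolkit; `val_probs` and `val_labels` are assumed to be arrays of shape (n_samples, n_labels) of sigmoid outputs and gold labels.

```python
import numpy as np

def tune_thresholds(val_probs, val_labels, candidates=None):
    """Pick, per label, the candidate threshold that maximises validation F1."""
    if candidates is None:
        candidates = np.arange(0.1, 0.9, 0.05)
    n_labels = val_labels.shape[1]
    thresholds = np.full(n_labels, 0.5)
    for j in range(n_labels):
        best_f1 = -1.0
        for t in candidates:
            pred = (val_probs[:, j] >= t).astype(int)
            tp = int(((val_labels[:, j] == 1) & (pred == 1)).sum())
            fp = int(((val_labels[:, j] == 0) & (pred == 1)).sum())
            fn = int(((val_labels[:, j] == 1) & (pred == 0)).sum())
            f1 = 2 * tp / max(2 * tp + fp + fn, 1)
            if f1 > best_f1:
                best_f1, thresholds[j] = f1, t
    return thresholds

# Toy validation set: 3 samples, 2 labels
val_probs = np.array([[0.9, 0.4], [0.8, 0.6], [0.2, 0.3]])
val_labels = np.array([[1, 0], [1, 1], [0, 0]])
thresholds = tune_thresholds(val_probs, val_labels)
```

Tuning F1 directly can overfit small validation sets; a coarse candidate grid, as above, is a reasonable hedge.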

Citation

@misc{nci-technique-classifier-v2,
  author = {Synapti},
  title = {NCI Technique Classifier v2},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/synapti/nci-technique-classifier-v2}
}

License

MIT License

Model size: 0.1B parameters (F32, Safetensors)