NCI Binary Propaganda Detector v2

This model is Stage 1 of the NCI (Narrative Control Index) two-stage propaganda detection pipeline. It performs binary classification to detect whether a text contains any propaganda technique; identifying which techniques are present is left to Stage 2.

Model Description

Model Type: Binary text classifier
Base Model: answerdotai/ModernBERT-base
Training Data: synapti/nci-binary-classification (24,517 train, 1,727 validation, 1,729 test)
Language: English
License: Apache 2.0

Performance

Metric	Value
Accuracy	99.4%
Precision	98.9%
Recall	100.0%
F1 Score	99.4%
False Positive Rate	1.47%
False Negative Rate	0.00%

Confusion Matrix (Test Set, n=1,729)

                  Predicted
                  No Prop | Has Prop
Actual No Prop:      736  |     11
Actual Has Prop:       0  |    982
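
As a sanity check, the headline metrics in the Performance table can be recomputed from the four confusion-matrix counts above. The sketch below uses only those counts; the variable names are illustrative.

```python
# Confusion-matrix counts from the test set (n = 1,729).
tn, fp = 736, 11   # actual no_propaganda: predicted no / predicted has
fn, tp = 0, 982    # actual has_propaganda: predicted no / predicted has

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
fpr = fp / (fp + tn)   # false positive rate
fnr = fn / (fn + tp)   # false negative rate

print(f"accuracy={accuracy:.1%} precision={precision:.1%} recall={recall:.1%}")
print(f"f1={f1:.1%} fpr={fpr:.2%} fnr={fnr:.2%}")
```

The printed values round to the figures reported in the table (99.4%, 98.9%, 100.0%, 99.4%, 1.47%, 0.00%).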

Threshold Analysis

Threshold	Accuracy	Precision	Recall	F1
0.3	99.2%	98.6%	100%	99.3%
0.4	99.2%	98.7%	100%	99.3%
0.5	99.4%	98.9%	100%	99.4%
0.6	99.7%	99.4%	100%	99.7%
0.7	99.7%	99.5%	100%	99.7%

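Recommended threshold: 0.5 (default), or 0.6 to reduce false positives. On the test set, recall stays at 100% at every threshold listed, so moving from 0.5 to 0.6 raises precision without losing recall there.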
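
A minimal sketch of applying a custom threshold to a top-1 pipeline result follows. It assumes the pipeline score is a softmax probability over the two labels, so that the probability of the other label is one minus the score. The helper names `propaganda_probability` and `is_propaganda` are hypothetical and not part of any library.

```python
def propaganda_probability(result: dict) -> float:
    """Convert a top-1 result {"label", "score"} into P(has_propaganda).

    Assumes a two-class softmax, so the two class scores sum to 1.
    """
    score = result["score"]
    return score if result["label"] == "has_propaganda" else 1.0 - score


def is_propaganda(result: dict, threshold: float = 0.5) -> bool:
    """Flag the text when P(has_propaganda) reaches the threshold."""
    return propaganda_probability(result) >= threshold


# Hand-written results in the shape the pipeline returns (not real model output).
borderline = {"label": "has_propaganda", "score": 0.55}
print(is_propaganda(borderline, threshold=0.5))  # flagged at the default
print(is_propaganda(borderline, threshold=0.6))  # not flagged at the stricter setting
```

Working with the probability of has_propaganda rather than the top-label score keeps the comparison meaningful for thresholds below 0.5 as well.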

Training Details

Loss Function: Focal Loss (gamma=2.0, alpha=0.25) to address class imbalance
Optimizer: AdamW with weight decay 0.01
Learning Rate: 2e-5 with warmup ratio 0.1
Batch Size: 16 (effective 32 with gradient accumulation)
Epochs: 5 with early stopping (patience=3)
Best Model Selection: Based on F1 score on validation set
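
For readers unfamiliar with focal loss, here is a minimal pure-Python sketch of the common binary formulation, FL = -alpha_t * (1 - p_t)^gamma * log(p_t), using the gamma and alpha values listed above. The actual training code is not shown in this card; in particular, which class alpha weights is an assumption here (alpha for the positive class, 1 - alpha for the negative class).

```python
import math


def focal_loss(p_pos: float, label: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss for a single example.

    p_pos: predicted probability of the positive class (has_propaganda).
    label: 1 for has_propaganda, 0 for no_propaganda.
    Assumption: alpha weights the positive class, (1 - alpha) the negative.
    """
    p_t = p_pos if label == 1 else 1.0 - p_pos
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confident correct prediction contributes far less than an uncertain one.
easy = focal_loss(0.95, label=1)
hard = focal_loss(0.30, label=1)
print(f"easy example: {easy:.6f}, hard example: {hard:.6f}")
```

The (1 - p_t)^gamma factor shrinks the loss on examples the model already classifies confidently, so training concentrates on harder cases; with gamma=0 the expression reduces to alpha-weighted cross-entropy.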

Usage

With Transformers Pipeline

from transformers import pipeline

detector = pipeline(
    "text-classification",
    model="synapti/nci-binary-detector-v2"
)

result = detector("The radical left is DESTROYING our country!")
# [{"label": "has_propaganda", "score": 0.99}]

result = detector("The Federal Reserve announced a 0.25% rate increase.")
# [{"label": "no_propaganda", "score": 0.98}]

With AutoModel

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained("synapti/nci-binary-detector-v2")
tokenizer = AutoTokenizer.from_pretrained("synapti/nci-binary-detector-v2")

text = "Wake up, people! They are hiding the truth from you!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    # Softmax over the two classes; index 1 is has_propaganda (see Labels)
    probs = torch.softmax(outputs.logits, dim=1)
    propaganda_prob = probs[0, 1].item()

print(f"Propaganda probability: {propaganda_prob:.2%}")

Two-Stage Pipeline (Recommended)

For full propaganda analysis with technique identification:

from transformers import pipeline

# Stage 1: Binary detection
binary_detector = pipeline(
    "text-classification",
    model="synapti/nci-binary-detector-v2"
)

# Stage 2: Technique classification
technique_classifier = pipeline(
    "text-classification",
    model="synapti/nci-technique-classifier-v2",
    top_k=None
)

text = "Some text to analyze..."

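# Run Stage 1. The pipeline returns the top label and its score;
# raise 0.5 to 0.6 here to reduce false positives.
binary_result = binary_detector(text)[0]
if binary_result["label"] == "has_propaganda" and binary_result["score"] >= 0.5: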
    # Run Stage 2 only if propaganda detected
    techniques = technique_classifier(text)[0]
    detected = [t for t in techniques if t["score"] >= 0.3]
    print(f"Detected techniques: {[t['label'] for t in detected]}")
else:
    print("No propaganda detected")
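
The gating logic above can also be factored into a function that takes both stages as callables, which makes the thresholds configurable and lets the flow be exercised without downloading either model. This is a sketch under assumptions: the function name `analyze`, the stub detectors, and the technique label names are hypothetical and only mimic the output shapes used in the example above.

```python
from typing import Callable, List


def analyze(
    text: str,
    binary_detector: Callable,
    technique_classifier: Callable,
    binary_threshold: float = 0.5,
    technique_threshold: float = 0.3,
) -> List[str]:
    """Return detected technique labels, or an empty list if Stage 1 does not fire."""
    binary_result = binary_detector(text)[0]
    flagged = (
        binary_result["label"] == "has_propaganda"
        and binary_result["score"] >= binary_threshold
    )
    if not flagged:
        return []  # Stage 2 is skipped entirely
    techniques = technique_classifier(text)[0]
    return [t["label"] for t in techniques if t["score"] >= technique_threshold]


# Stub stages with made-up scores and hypothetical label names.
def fake_binary(text):
    return [{"label": "has_propaganda", "score": 0.97}]


def fake_techniques(text):
    return [[
        {"label": "loaded_language", "score": 0.81},
        {"label": "name_calling", "score": 0.12},
    ]]


print(analyze("Some text to analyze...", fake_binary, fake_techniques))
```

With real pipelines, the two objects created above (`binary_detector` and `technique_classifier`) would be passed in place of the stubs.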

Labels

Label ID	Label Name	Description
0	no_propaganda	Text does not contain propaganda techniques
1	has_propaganda	Text contains one or more propaganda techniques

Intended Use

Primary Use Cases

Media literacy tools and browser extensions
Content moderation assistance
Research on information manipulation
Educational platforms for critical thinking

Out of Scope

Censorship or automated content removal
Political targeting or surveillance
Single-source truth determination

Limitations

Optimized for English text
May have reduced performance on very short texts (<10 words)
Trained primarily on political/news content; domain shift may affect performance
Should be used as one signal among many, not as the sole arbiter

Related Models

Stage 2: synapti/nci-technique-classifier-v2 - Multi-label technique classification
Dataset: synapti/nci-binary-classification

Citation

If you use this model, please cite:

@misc{nci-binary-detector-v2,
  author = {Synapti},
  title = {NCI Binary Propaganda Detector v2},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/synapti/nci-binary-detector-v2}
}

Model Size: 0.1B params
Tensor Type: F32
Weights Format: Safetensors
