NCI Binary Propaganda Detector v2

This model is Stage 1 of the NCI (Narrative Control Index) two-stage propaganda detection pipeline. It performs binary classification to detect whether text contains ANY propaganda techniques.

Model Description

Performance

Metric                 Value
Accuracy               99.4%
Precision              98.9%
Recall                 100.0%
F1 Score               99.4%
False Positive Rate    1.47%
False Negative Rate    0.00%

Confusion Matrix (Test Set, n=1,729)

                  Predicted
                  No Prop | Has Prop
Actual No Prop:      736  |     11
Actual Has Prop:       0  |    982
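
The headline metrics follow directly from these four counts. As a quick check, the short sketch below recomputes them in plain Python; the variable names are illustrative and not part of the model's code.

```python
# Counts copied from the confusion matrix above.
tn, fp = 736, 11   # actual no_propaganda: predicted no / predicted has
fn, tp = 0, 982    # actual has_propaganda: predicted no / predicted has

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
fpr = fp / (fp + tn)   # false positive rate
fnr = fn / (fn + tp)   # false negative rate

print(f"n={total} accuracy={accuracy:.1%} precision={precision:.1%} "
      f"recall={recall:.1%} f1={f1:.1%} fpr={fpr:.2%} fnr={fnr:.2%}")
# n=1729 accuracy=99.4% precision=98.9% recall=100.0% f1=99.4% fpr=1.47% fnr=0.00%
```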

Threshold Analysis

Threshold   Accuracy   Precision   Recall   F1
0.3         99.2%      98.6%       100%     99.3%
0.4         99.2%      98.7%       100%     99.3%
0.5         99.4%      98.9%       100%     99.4%
0.6         99.7%      99.4%       100%     99.7%
0.7         99.7%      99.5%       100%     99.7%

Recommended threshold: 0.5 (default), or 0.6 to reduce false positives. Recall stays at 100% on the test set at every threshold listed.
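
The text-classification pipeline reports only the top label and its score, so applying a non-default threshold such as 0.6 means first recovering the has_propaganda probability. The following is a minimal sketch. It assumes the two class scores sum to 1 (a two-class softmax, as in the AutoModel example under Usage). The helper names propaganda_probability and is_propaganda are hypothetical and not part of the model's API.

```python
def propaganda_probability(result):
    """Return P(has_propaganda) from one pipeline result dict."""
    if result["label"] == "has_propaganda":
        return result["score"]
    # Assumption: two-class softmax, so the other class gets the remainder.
    return 1.0 - result["score"]


def is_propaganda(result, threshold=0.5):
    """Apply a custom decision threshold to P(has_propaganda)."""
    return propaganda_probability(result) >= threshold


# Hand-written result dicts in the pipeline's output format (illustrative scores).
confident = {"label": "has_propaganda", "score": 0.99}
borderline = {"label": "has_propaganda", "score": 0.55}
clean = {"label": "no_propaganda", "score": 0.98}

print(is_propaganda(confident, threshold=0.6))   # True
print(is_propaganda(borderline, threshold=0.5))  # True
print(is_propaganda(borderline, threshold=0.6))  # False
print(is_propaganda(clean, threshold=0.6))       # False
```

Raising the threshold only changes the decision for borderline scores, which is where the false positives in the table above would be expected to sit.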

Training Details

  • Loss Function: Focal Loss (gamma=2.0, alpha=0.25) for class imbalance
  • Optimizer: AdamW with weight decay 0.01
  • Learning Rate: 2e-5 with warmup ratio 0.1
  • Batch Size: 16 (effective 32 with gradient accumulation)
  • Epochs: 5 with early stopping (patience=3)
  • Best Model Selection: Based on F1 score on validation set
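
To illustrate what the focal loss settings do, here is a minimal single-example sketch in plain Python using gamma=2.0 and alpha=0.25 from the list above. It follows the standard formulation FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The function name focal_loss and the choice to apply alpha to the has_propaganda class are assumptions, and the actual training code may differ.

```python
import math


def focal_loss(p_positive, target, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single example.

    p_positive: predicted probability of class 1 (has_propaganda).
    target:     true label, 0 or 1.
    """
    # Probability assigned to the true class.
    p_t = p_positive if target == 1 else 1.0 - p_positive
    # Assumption: alpha weights class 1 and (1 - alpha) weights class 0.
    alpha_t = alpha if target == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, target=1)  # confident and correct
hard = focal_loss(0.1, target=1)  # confident and wrong
print(f"easy={easy:.6f} hard={hard:.6f}")
```

With gamma=0 and alpha=0.5 the expression reduces to half the usual cross-entropy, which makes the effect of the modulating factor easy to compare.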

Usage

With Transformers Pipeline

from transformers import pipeline

detector = pipeline(
    "text-classification",
    model="synapti/nci-binary-detector-v2"
)

result = detector("The radical left is DESTROYING our country!")
# [{"label": "has_propaganda", "score": 0.99}]

result = detector("The Federal Reserve announced a 0.25% rate increase.")
# [{"label": "no_propaganda", "score": 0.98}]

With AutoModel

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained("synapti/nci-binary-detector-v2")
tokenizer = AutoTokenizer.from_pretrained("synapti/nci-binary-detector-v2")

text = "Wake up, people! They are hiding the truth from you!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    propaganda_prob = probs[0, 1].item()

print(f"Propaganda probability: {propaganda_prob:.2%}")

Two-Stage Pipeline (Recommended)

For full propaganda analysis with technique identification:

from transformers import pipeline

# Stage 1: Binary detection
binary_detector = pipeline(
    "text-classification",
    model="synapti/nci-binary-detector-v2"
)

# Stage 2: Technique classification
technique_classifier = pipeline(
    "text-classification",
    model="synapti/nci-technique-classifier-v2",
    top_k=None
)

text = "Some text to analyze..."

# Run Stage 1
binary_result = binary_detector(text)[0]
if binary_result["label"] == "has_propaganda" and binary_result["score"] >= 0.5:
    # Run Stage 2 only if propaganda detected
    techniques = technique_classifier(text)[0]
    detected = [t for t in techniques if t["score"] >= 0.3]
    print(f"Detected techniques: {[t['label'] for t in detected]}")
else:
    print("No propaganda detected")

Labels

Label ID   Label Name       Description
0          no_propaganda    Text does not contain propaganda techniques
1          has_propaganda   Text contains one or more propaganda techniques

Intended Use

Primary Use Cases

  • Media literacy tools and browser extensions
  • Content moderation assistance
  • Research on information manipulation
  • Educational platforms for critical thinking

Out of Scope

  • Censorship or automated content removal
  • Political targeting or surveillance
  • Single-source truth determination

Limitations

  • Optimized for English text
  • Performance may degrade on very short texts (fewer than 10 words)
  • Trained primarily on political/news content; domain shift may affect performance
  • Should be used as one signal among many, not as the sole arbiter
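
One way to act on the short-text limitation is to flag such inputs so that downstream code can treat the prediction with extra caution. The sketch below is illustrative only: the helper names is_short_text and annotate are hypothetical, and counting whitespace-separated tokens as "words" is an assumption.

```python
def is_short_text(text, min_words=10):
    """True when the text has fewer than min_words whitespace-separated tokens."""
    return len(text.split()) < min_words


def annotate(text, result, min_words=10):
    """Return a copy of a pipeline result dict with a low_confidence flag added."""
    return {**result, "low_confidence": is_short_text(text, min_words)}


# Illustrative result dict in the pipeline's output format.
text = "The radical left is DESTROYING our country!"
result = {"label": "has_propaganda", "score": 0.99}

print(annotate(text, result))
# {'label': 'has_propaganda', 'score': 0.99, 'low_confidence': True}
```

The flag does not change the model's prediction; it only marks results that fall in the regime where reduced performance is expected.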

Related Models

  • synapti/nci-technique-classifier-v2: Stage 2 technique classifier used in the two-stage pipeline

Citation

If you use this model, please cite:

@misc{nci-binary-detector-v2,
  author = {Synapti},
  title = {NCI Binary Propaganda Detector v2},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/synapti/nci-binary-detector-v2}
}