VERITASNET: Temporal Credibility Monitoring for RPP Attack Detection

A complete system for detecting Reputation Pre-Positioning (RPP) attacks on web credibility graphs.

RPP attacks are a class of adversarial manipulation where state-sponsored information operations systematically build artificial credibility for low-quality domains over 6–18 months before deploying them as disinformation vectors.

Architecture

VERITASNET combines three detection subsystems:

1. Hawkes Credibility Kernel (`veritasnet/core/hawkes_kernel.py`)

- Self-exciting point-process model of citation dynamics
- Pure PyTorch MLE fitting (no external Hawkes library required)
- Detects anomalous citation bursts via deviation in negative log-likelihood (NLL)
- α-collapse detection: the RPP signature is high self-excitation (α) during pre-positioning, followed by a sudden drop at activation
- Branching-ratio analysis: domains near criticality (η ≈ 1) are flagged as suspicious
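
To make these quantities concrete, here is a minimal sketch of the negative log-likelihood and branching ratio of a Hawkes process with an exponential kernel. It is written in plain Python rather than PyTorch so that it runs without dependencies. The intensity form λ(t) = μ + α Σ exp(-β (t - tᵢ)), the resulting η = α/β, and the function names are assumptions made for illustration; they are not taken from `hawkes_kernel.py`.

```python
import math


def hawkes_nll(times, mu, alpha, beta, horizon):
    """Negative log-likelihood of sorted event times observed on [0, horizon].

    Assumed intensity: lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i)).
    """
    decay_sum = 0.0      # running sum of exp(-beta * (t - t_j)) over earlier events
    log_intensity = 0.0  # accumulates log lambda(t_i)
    previous = None
    for t in times:
        if previous is not None:
            # Recursive update avoids the O(n^2) double sum over event pairs.
            decay_sum = math.exp(-beta * (t - previous)) * (1.0 + decay_sum)
        log_intensity += math.log(mu + alpha * decay_sum)
        previous = t
    # Compensator: integral of lambda(t) over [0, horizon].
    compensator = mu * horizon + (alpha / beta) * sum(
        1.0 - math.exp(-beta * (horizon - t)) for t in times
    )
    return compensator - log_intensity


def branching_ratio(alpha, beta):
    """Expected number of events triggered by one event under the assumed kernel."""
    return alpha / beta


# Illustrative data: a burst of citations followed by a long quiet period.
burst = [1.0, 1.1, 1.2, 1.3, 9.0]
print(hawkes_nll(burst, mu=0.5, alpha=0.8, beta=1.0, horizon=10.0))
print(branching_ratio(alpha=0.8, beta=1.0))
```

Under this parameterization, an MLE fit would minimize `hawkes_nll` over (μ, α, β), and a fitted η close to 1 would correspond to the near-critical regime described above.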

2. Three-Signal RPP Detector (`veritasnet/core/rpp_detector.py`)

- Signal 1, Citation Velocity Anomaly (CVA): super-linear growth detection via power-law exponent estimation
- Signal 2, Hyperlink Fan-In Homogeneity (HFH): entropy-based diversity analysis of linking domains
- Signal 3, Domain Age–Authority Gap (DAAG): "too good, too fast" detection via a logistic growth model
- Signal weights learned via logistic regression: w₁=0.42, w₂=0.31, w₃=0.27
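
The three signals can be sketched as follows. This is an illustrative approximation, not the code in `rpp_detector.py`: the function names, the log-log least-squares estimator, the normalization of entropy by log(n), the logistic-curve parameters, and the use of a plain weighted sum are all assumptions. Only the three weights come from the text above.

```python
import math
from collections import Counter


def power_law_exponent(ages, counts):
    """Least-squares slope of log(count) against log(age); above 1 is super-linear."""
    xs = [math.log(a) for a in ages]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def fan_in_homogeneity(labels):
    """1 - normalized Shannon entropy of a categorical attribute of linking domains.

    `labels` holds one (hypothetical) attribute per inbound link, e.g. a hosting
    provider. 1.0 means all links share one value; 0.0 means all values differ.
    """
    n = len(labels)
    if n < 2:
        return 0.0
    entropy = -sum((c / n) * math.log(c / n) for c in Counter(labels).values())
    return 1.0 - entropy / math.log(n)


def age_authority_gap(age_days, authority, ceiling=1.0, rate=0.01, midpoint=365.0):
    """Excess of observed authority over a logistic growth curve (assumed parameters)."""
    expected = ceiling / (1.0 + math.exp(-rate * (age_days - midpoint)))
    return max(0.0, authority - expected)


def rpp_score(cva, hfh, daag, weights=(0.42, 0.31, 0.27)):
    """Weighted sum of the three signals, each assumed to be scaled to [0, 1]."""
    w1, w2, w3 = weights
    return w1 * cva + w2 * hfh + w3 * daag


# Illustrative use: quadratic citation growth, identical hosting, young domain.
exponent = power_law_exponent([30, 60, 90, 120], [9, 36, 81, 144])
cva = min(1.0, max(0.0, exponent - 1.0))  # assumed mapping of the exponent to [0, 1]
hfh = fan_in_homogeneity(["host-a"] * 8)
daag = age_authority_gap(age_days=120, authority=0.7)
print(rpp_score(cva, hfh, daag))
```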

3. Temporal GNN with Adversarial Hardening (`veritasnet/models/`)

- TGN-based architecture (Rossi et al., 2020) with dual heads:
  - Credibility regression head (CrediBench-compatible)
  - RPP binary classification head
- GRU node memory for temporal state tracking
- Adversarial training via PGD graph perturbation (Madry et al., 2018)
- RPP-specific perturbations that mimic realistic campaign patterns
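
The idea behind PGD adversarial training can be illustrated on a toy problem. The sketch below perturbs a feature vector inside an L∞ ball to maximize the loss of a linear logistic scorer. It is a feature-space stand-in chosen for brevity, not the graph perturbation implemented in `veritasnet/models/adversarial.py`; the scorer, function names, and step sizes are assumptions, and the random start used by Madry et al. is omitted to keep the example deterministic.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_loss(weights, x, label):
    """Loss of a linear scorer; label is +1 (RPP) or -1 (benign)."""
    margin = label * sum(w * xi for w, xi in zip(weights, x))
    return math.log(1.0 + math.exp(-margin))


def pgd_attack(weights, x, label, eps=0.1, step=0.02, n_steps=10):
    """Projected gradient ascent on the loss within an L-infinity ball of radius eps."""
    x_adv = list(x)
    for _ in range(n_steps):
        margin = label * sum(w * xi for w, xi in zip(weights, x_adv))
        # d(loss)/d(x_i) = -label * sigmoid(-margin) * w_i
        coeff = -label * sigmoid(-margin)
        grad = [coeff * w for w in weights]
        # Ascent step along the sign of the gradient.
        x_adv = [xi + step * ((g > 0) - (g < 0)) for xi, g in zip(x_adv, grad)]
        # Projection back onto the eps-ball around the clean input.
        x_adv = [min(max(xi, x0 - eps), x0 + eps) for xi, x0 in zip(x_adv, x)]
    return x_adv


weights = [1.0, -2.0]
clean = [0.5, 0.5]
adversarial = pgd_attack(weights, clean, label=1)
print(logistic_loss(weights, clean, 1), logistic_loss(weights, adversarial, 1))
```

In adversarial training, a model would be updated on such perturbed inputs alongside the clean ones, so that small worst-case changes no longer flip its prediction.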

MiroFish Live Integration (`veritasnet/mirofish/live_simulation.py`)

Runs a live, LLM-driven multi-agent simulation against any open-source LLM served through an OpenAI-compatible API. No paid API keys are needed.

from veritasnet.mirofish.live_simulation import LiveRPPSimulation
from veritasnet.mirofish.integration import RPPAgentProfileGenerator

# Generate RPP campaign agent profiles
gen = RPPAgentProfileGenerator()
profiles = gen.generate_campaign_profiles(n_operators=5, n_amplifiers=15, n_organic=50)

# Configure the live simulation against an open-source LLM backend (enable exactly one option)
sim = LiveRPPSimulation(
    # Option A: Ollama (local, free)
    llm_base_url="http://localhost:11434/v1",
    llm_model="qwen2.5:7b",
    llm_api_key="ollama",
    
    # Option B: vLLM (local, fast)
    # llm_base_url="http://localhost:8000/v1",
    # llm_model="Qwen/Qwen2.5-7B-Instruct",
    
    # Option C: HuggingFace Inference API  
    # llm_base_url="https://api-inference.huggingface.co/v1",
    # llm_model="Qwen/Qwen2.5-72B-Instruct",
    # llm_api_key="hf_YOUR_TOKEN",
)

# Run the simulation; each agent queries the LLM to decide its actions
profile_dicts = [{'user_id': p.user_id, 'username': p.username, 
                   'name': p.name, 'persona': p.persona} for p in profiles]
actions_path = sim.run(profile_dicts, n_rounds=50, agents_per_round=10)

# Get citation graph for VERITASNET detection
src, dst, timestamps = sim.get_citation_graph()
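
As a hedged illustration of how the returned arrays might feed the detectors, the sketch below groups inbound citation times by destination domain. It assumes `src`, `dst`, and `timestamps` are parallel sequences of node ids and numeric times, which the example above does not state; if they are tensors, converting them first with `.tolist()` would be needed. The helper name `inbound_event_times` is hypothetical.

```python
from collections import defaultdict


def inbound_event_times(src, dst, timestamps):
    """Map each destination id to the sorted times at which it was cited."""
    if not (len(src) == len(dst) == len(timestamps)):
        raise ValueError("src, dst and timestamps must have the same length")
    events = defaultdict(list)
    for s, d, t in zip(src, dst, timestamps):
        if s == d:
            continue  # ignore self-citations
        events[d].append(float(t))
    return {d: sorted(ts) for d, ts in events.items()}


# Stand-in data in place of sim.get_citation_graph()
example = inbound_event_times([0, 1, 2, 2], [2, 2, 2, 0], [5.0, 1.0, 3.0, 4.0])
print(example)
```

Each per-domain list is a sorted event sequence, which is the usual input shape for fitting a point-process model such as the Hawkes kernel.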

Supported Open-Source LLM Backends

Backend	Setup	Model	Notes
Ollama	`ollama serve` then `ollama pull qwen2.5:7b`	qwen2.5:7b, llama3.1:8b, mistral:7b	Easiest local setup
vLLM	`vllm serve Qwen/Qwen2.5-7B-Instruct`	Any HF model	Fastest inference
llama.cpp	`./llama-server -m model.gguf`	GGUF quantized models	Low RAM usage
HF TGI	`docker run ghcr.io/huggingface/text-generation-inference`	Any HF model	Production ready
HF Inference API	No local setup (requires an HF token)	Qwen2.5-72B, Llama-3.1-70B	Free tier available
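
One way to switch between backends without editing the constructor call is a small lookup of keyword arguments. This is a sketch only: the `BACKENDS` table and `backend_kwargs` helper are hypothetical and not part of VERITASNET, and the URLs and model names are copied from the example above for the three backends it shows.

```python
BACKENDS = {
    "ollama": {
        "llm_base_url": "http://localhost:11434/v1",
        "llm_model": "qwen2.5:7b",
        "llm_api_key": "ollama",  # placeholder key, as in the example above
    },
    "vllm": {
        "llm_base_url": "http://localhost:8000/v1",
        "llm_model": "Qwen/Qwen2.5-7B-Instruct",
    },
    "hf_inference": {
        "llm_base_url": "https://api-inference.huggingface.co/v1",
        "llm_model": "Qwen/Qwen2.5-72B-Instruct",
    },
}


def backend_kwargs(name, api_key=None):
    """Return keyword arguments for the chosen backend (hypothetical helper)."""
    if name not in BACKENDS:
        raise ValueError(f"unknown backend: {name!r}")
    kwargs = dict(BACKENDS[name])  # copy so the table is never mutated
    if api_key is not None:
        kwargs["llm_api_key"] = api_key
    if name == "hf_inference" and "llm_api_key" not in kwargs:
        raise ValueError("the HF Inference API needs a token")
    return kwargs


print(backend_kwargs("ollama"))
```

The result could then be unpacked into the constructor, for example `LiveRPPSimulation(**backend_kwargs("ollama"))`.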

Quick Start (Full Pipeline)

from veritasnet.pipeline import VeritasNetPipeline

pipeline = VeritasNetPipeline(
    output_dir='./output',
    device='cpu',  # or 'cuda'
    config={
        'n_domains': 10000,
        'n_edges': 100000,
        'n_snapshots': 6,
        'tgn_epochs': 10,
        'adv_epochs': 5,
    }
)

# Run complete detection pipeline
reports = pipeline.run_full_pipeline()

# Also generate MiroFish-compatible configs
pipeline.generate_mirofish_config()

Results (Synthetic Benchmark)

System	Precision	Recall	F1	Detection Lag
Static baseline (CrediBench-like)	0.000	0.000	0.000	N/A (blind)
3-Signal RPP Detector	0.968	0.750	0.845	140 ± 45 days
VERITASNET (full)	0.793	0.767	0.780	142 ± 45 days
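
The F1 column is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short check below recomputes it from the precision and recall values in the table; the function name is chosen for illustration, and F1 is taken to be 0 when both inputs are 0.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


rows = [
    ("Static baseline (CrediBench-like)", 0.000, 0.000),
    ("3-Signal RPP Detector", 0.968, 0.750),
    ("VERITASNET (full)", 0.793, 0.767),
]
for name, precision, recall in rows:
    print(f"{name}: F1 = {f1_score(precision, recall):.3f}")
```

To three decimals this reproduces the F1 values reported above (0.000, 0.845, 0.780).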

File Structure

veritasnet/
├── core/
│   ├── hawkes_kernel.py        # Hawkes process credibility kernel (pure PyTorch)
│   └── rpp_detector.py         # Three-signal RPP detector (CVA + HFH + DAAG)
├── models/
│   ├── temporal_gnn.py         # TGN with dual credibility/RPP heads
│   └── adversarial.py          # PGD adversarial training + defenses
├── simulation/
│   └── rpp_simulator.py        # Synthetic RPP campaign generator
├── data/
│   └── web_graph.py            # CommonCrawl WAT extraction + graph builder
├── evaluation/
│   └── metrics.py              # Full evaluation suite
├── mirofish/
│   ├── integration.py          # Profile generation + config + output conversion
│   └── live_simulation.py      # ★ Live LLM-driven agent simulation (camel-ai)
├── pipeline.py                 # End-to-end orchestrator
└── utils/

Dependencies

torch>=2.0
torch_geometric>=2.5
scipy
scikit-learn
numpy
pandas
camel-ai                 # For live agent simulation
openai                   # OpenAI-compatible API client
warcio (optional)        # For WAT file processing

How MiroFish Is Used

MiroFish is a multi-agent swarm intelligence engine (by Shanda Group) powered by OASIS. VERITASNET integrates with it via:

- `live_simulation.py`: Runs real LLM-driven agent simulations using camel-ai with any open-source LLM (Qwen2.5, Llama3, Mistral). Each agent gets a persona and autonomously decides to post, like, repost, or follow, which produces realistic social media dynamics. Output is a MiroFish-compatible `actions.jsonl`.
- `integration.py`: Generates MiroFish/OASIS-compatible agent profiles (Twitter CSV and Reddit JSON) and simulation configs, and converts simulation output to VERITASNET citation graphs.
- Pipeline integration: The full VERITASNET pipeline can use either synthetic RPP campaigns (fast, no LLM needed) or a live MiroFish simulation (realistic, needs an LLM endpoint) as the attack generation backend.

License

CC BY 4.0
