VERITASNET: Temporal Credibility Monitoring for RPP Attack Detection
A complete system for detecting Reputation Pre-Positioning (RPP) attacks on web credibility graphs.
RPP attacks are a class of adversarial manipulation where state-sponsored information operations systematically build artificial credibility for low-quality domains over 6–18 months before deploying them as disinformation vectors.
Architecture
VERITASNET combines three detection subsystems:
1. Hawkes Credibility Kernel (veritasnet/core/hawkes_kernel.py)
- Self-exciting point process model of citation dynamics
- Pure PyTorch MLE fitting (no external Hawkes library required)
- Detects anomalous citation bursts via NLL deviation
- α-collapse detection: the RPP signature is high self-excitation (α) during pre-positioning, followed by a sudden drop during activation
- Branching ratio analysis: domains near criticality (branching ratio η close to 1) are suspicious
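The quantities named in the bullets above can be illustrated with a minimal sketch of the negative log-likelihood (NLL) of a Hawkes process. It assumes an exponential kernel, λ(t) = μ + α Σ exp(−β (t − tᵢ)), for which the branching ratio is η = α/β. That parameterization, the function names, and the use of plain Python instead of PyTorch are assumptions made for illustration; the repository's kernel may be defined differently.

```python
import math


def hawkes_nll(times, mu, alpha, beta, horizon):
    """NLL of an exponential-kernel Hawkes process observed on [0, horizon].

    `times` must be sorted ascending. Assumed intensity:
    lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i)).
    """
    log_term = 0.0
    decay_sum = 0.0  # sum over earlier events of exp(-beta * (t - t_j))
    prev = None
    for t in times:
        if prev is not None:
            # Recursive update avoids the O(n^2) double sum.
            decay_sum = math.exp(-beta * (t - prev)) * (1.0 + decay_sum)
        log_term += math.log(mu + alpha * decay_sum)
        prev = t
    # Integral of the intensity over the observation window.
    compensator = mu * horizon + (alpha / beta) * sum(
        1.0 - math.exp(-beta * (horizon - t)) for t in times
    )
    return compensator - log_term


def branching_ratio(alpha, beta):
    """Expected number of follow-on events per event (eta) for this kernel."""
    return alpha / beta
```

Under these assumptions, a burst detector could compare the NLL of a recent window under previously fitted parameters against its historical range, and an α-collapse check could track the fitted α across consecutive windows.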
2. Three-Signal RPP Detector (veritasnet/core/rpp_detector.py)
- Signal 1, Citation Velocity Anomaly (CVA): Super-linear growth detection via power-law exponent estimation
- Signal 2, Hyperlink Fan-In Homogeneity (HFH): Entropy-based diversity analysis of linking domains
- Signal 3, Domain Age–Authority Gap (DAAG): 'Too good too fast' detection via logistic growth model
- Weights learned via logistic regression: w₁=0.42, w₂=0.31, w₃=0.27
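As a rough illustration of how the three signals might be computed and combined, the sketch below estimates a power-law growth exponent by log-log least squares, measures fan-in homogeneity as one minus normalized Shannon entropy, and forms a weighted sum with the weights listed above. The function names, the normalization choices, and the plain linear combination (no bias term or sigmoid) are assumptions; the DAAG value is treated as a precomputed input.

```python
import math
from collections import Counter

WEIGHTS = (0.42, 0.31, 0.27)  # w1 (CVA), w2 (HFH), w3 (DAAG) from the list above


def growth_exponent(cumulative_counts):
    """Slope of log(count) against log(t) for t = 1..n.

    Needs at least two strictly positive counts. A slope above 1
    indicates super-linear citation growth.
    """
    xs = [math.log(t) for t in range(1, len(cumulative_counts) + 1)]
    ys = [math.log(c) for c in cumulative_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def fan_in_homogeneity(linking_sources):
    """1 - normalized Shannon entropy of who links in (1.0 = a single source)."""
    counts = Counter(linking_sources)
    if len(counts) < 2:
        return 1.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return 1.0 - entropy / math.log(len(counts))


def rpp_score(cva, hfh, daag):
    """Weighted sum of the three signals, each assumed pre-scaled to [0, 1]."""
    return sum(w * s for w, s in zip(WEIGHTS, (cva, hfh, daag)))
```

Because the weights sum to 1, the combined score stays in [0, 1] whenever each input signal does.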
3. Temporal GNN with Adversarial Hardening (veritasnet/models/)
- TGN-based architecture (Rossi et al., 2020) with dual heads:
- Credibility regression head (CrediBench-compatible)
- RPP binary classification head
- GRU node memory for temporal state tracking
- Adversarial training via PGD graph perturbation (Madry et al., 2018)
- RPP-specific perturbations that mimic realistic campaign patterns
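The PGD inner loop used in adversarial training can be sketched on a deliberately simplified stand-in: a logistic scorer over a feature vector, attacked under an L∞ budget. This is not the repository's graph perturbation, which acts on the temporal graph and is not shown in this README; the model, the names, and the default step sizes here are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def _sign(v):
    return (v > 0) - (v < 0)


def pgd_perturb(x, y, w, b, eps=0.1, step=0.02, n_steps=10):
    """L-infinity PGD against the scorer p = sigmoid(w . x + b).

    Each iteration takes a signed-gradient ascent step on the binary
    cross-entropy for label y, then projects back into the eps-ball
    around the clean input x.
    """
    x_adv = list(x)
    for _ in range(n_steps):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x_adv)) + b)
        # For a logistic model, d(BCE)/dx_i = (p - y) * w_i.
        grad = [(p - y) * wi for wi in w]
        x_adv = [xi + step * _sign(g) for xi, g in zip(x_adv, grad)]
        # Projection step: clip each coordinate to [x_i - eps, x_i + eps].
        x_adv = [min(max(xa, xc - eps), xc + eps) for xa, xc in zip(x_adv, x)]
    return x_adv
```

In adversarial training, the perturbed inputs returned by such a loop are fed back into the training loss so the model learns to classify them correctly.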
MiroFish Live Integration (veritasnet/mirofish/live_simulation.py)
Real LLM-driven multi-agent simulation using any open-source LLM via an OpenAI-compatible API. No paid API keys needed.
from veritasnet.mirofish.live_simulation import LiveRPPSimulation
from veritasnet.mirofish.integration import RPPAgentProfileGenerator

# Generate RPP campaign agent profiles
gen = RPPAgentProfileGenerator()
profiles = gen.generate_campaign_profiles(n_operators=5, n_amplifiers=15, n_organic=50)

# Run live simulation with open-source LLM
sim = LiveRPPSimulation(
    # Option A: Ollama (local, free)
    llm_base_url="http://localhost:11434/v1",
    llm_model="qwen2.5:7b",
    llm_api_key="ollama",
    # Option B: vLLM (local, fast)
    # llm_base_url="http://localhost:8000/v1",
    # llm_model="Qwen/Qwen2.5-7B-Instruct",
    # Option C: HuggingFace Inference API
    # llm_base_url="https://api-inference.huggingface.co/v1",
    # llm_model="Qwen/Qwen2.5-72B-Instruct",
    # llm_api_key="hf_YOUR_TOKEN",
)

# Run simulation: each agent uses the LLM to decide its actions
profile_dicts = [
    {'user_id': p.user_id, 'username': p.username,
     'name': p.name, 'persona': p.persona}
    for p in profiles
]
actions_path = sim.run(profile_dicts, n_rounds=50, agents_per_round=10)

# Get citation graph for VERITASNET detection
src, dst, timestamps = sim.get_citation_graph()
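One possible next step is to turn the edge list into per-domain event times, the input a point-process model such as the Hawkes kernel expects. The helper below is a hypothetical sketch: it assumes src, dst, and timestamps are parallel sequences of equal length, that dst holds the cited node, and that node IDs are hashable (tensors or arrays would first need converting, for example with .tolist()).

```python
from collections import defaultdict


def events_by_target(src, dst, timestamps):
    """Group citation timestamps by cited node, sorted in ascending order."""
    events = defaultdict(list)
    for _source, target, t in zip(src, dst, timestamps):
        events[target].append(t)
    return {target: sorted(ts) for target, ts in events.items()}
```

Each value in the returned dictionary is the citation time series for one domain, which can then be scored independently.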
Supported Open-Source LLM Backends
| Backend | Setup | Model | Notes |
|---|---|---|---|
| Ollama | `ollama serve`, then `ollama pull qwen2.5:7b` | qwen2.5:7b, llama3.1:8b, mistral:7b | Easiest local setup |
| vLLM | `vllm serve Qwen/Qwen2.5-7B-Instruct` | Any HF model | Fastest inference |
| llama.cpp | `./llama-server -m model.gguf` | GGUF quantized models | Low RAM usage |
| HF TGI | `docker run ghcr.io/huggingface/text-generation-inference` | Any HF model | Production ready |
| HF Inference API | No setup needed | Qwen2.5-72B, Llama-3.1-70B | Free tier available |
Quick Start (Full Pipeline)
from veritasnet.pipeline import VeritasNetPipeline

pipeline = VeritasNetPipeline(
    output_dir='./output',
    device='cpu',  # or 'cuda'
    config={
        'n_domains': 10000,
        'n_edges': 100000,
        'n_snapshots': 6,
        'tgn_epochs': 10,
        'adv_epochs': 5,
    }
)

# Run complete detection pipeline
reports = pipeline.run_full_pipeline()

# Also generate MiroFish-compatible configs
pipeline.generate_mirofish_config()
Results (Synthetic Benchmark)
| System | Precision | Recall | F1 | Detection Lag |
|---|---|---|---|---|
| Static baseline (CrediBench-like) | 0.000 | 0.000 | 0.000 | N/A (blind) |
| 3-Signal RPP Detector | 0.968 | 0.750 | 0.845 | 140 ± 45 days |
| VERITASNET (full) | 0.793 | 0.767 | 0.780 | 142 ± 45 days |
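The F1 column is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short check below recomputes it from the table's precision and recall values; the zero-division guard for the static baseline is a convention assumed here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


rows = [
    ("Static baseline", 0.000, 0.000),
    ("3-Signal RPP Detector", 0.968, 0.750),
    ("VERITASNET (full)", 0.793, 0.767),
]
for name, p, r in rows:
    print(f"{name}: F1 = {f1_score(p, r):.3f}")
# Static baseline: F1 = 0.000
# 3-Signal RPP Detector: F1 = 0.845
# VERITASNET (full): F1 = 0.780
```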
File Structure
veritasnet/
├── core/
│   ├── hawkes_kernel.py      # Hawkes process credibility kernel (pure PyTorch)
│   └── rpp_detector.py       # Three-signal RPP detector (CVA + HFH + DAAG)
├── models/
│   ├── temporal_gnn.py       # TGN with dual credibility/RPP heads
│   └── adversarial.py        # PGD adversarial training + defenses
├── simulation/
│   └── rpp_simulator.py      # Synthetic RPP campaign generator
├── data/
│   └── web_graph.py          # CommonCrawl WAT extraction + graph builder
├── evaluation/
│   └── metrics.py            # Full evaluation suite
├── mirofish/
│   ├── integration.py        # Profile generation + config + output conversion
│   └── live_simulation.py    # Live LLM-driven agent simulation (camel-ai)
├── pipeline.py               # End-to-end orchestrator
└── utils/
Dependencies
torch>=2.0
torch_geometric>=2.5
scipy
scikit-learn
numpy
pandas
camel-ai # For live agent simulation
openai # OpenAI-compatible API client
warcio (optional) # For WAT file processing
How MiroFish Is Used
MiroFish is a multi-agent swarm intelligence engine (by Shanda Group) powered by OASIS. VERITASNET integrates with it via:
- live_simulation.py: Runs real LLM-driven agent simulations using camel-ai and any open-source LLM (Qwen2.5, Llama3, Mistral). Each agent gets a persona and autonomously decides to post, like, repost, or follow, producing realistic social media dynamics. Output is a MiroFish-compatible actions.jsonl file.
- integration.py: Generates MiroFish/OASIS-compatible agent profiles (Twitter CSV + Reddit JSON) and simulation configs, and converts simulation output to VERITASNET citation graphs.
- Pipeline integration: The full VERITASNET pipeline can use either synthetic RPP campaigns (fast, no LLM needed) or live MiroFish simulation (realistic, needs an LLM endpoint) as the attack generation backend.
License
CC BY 4.0