metadata
license: cc-by-nc-sa-4.0
tags:
  - web-privacy
  - tracker-detection
  - entity-attribution
  - feedforward
  - safetensors
  - webassembly
datasets:
  - olafuraron/tracker-radar-ml

Tracking Entity Classifier

Predicts which company owns a third-party tracking domain based on behavioral patterns from DuckDuckGo's Tracker Radar dataset. No ownership metadata is used as input — the model learns to identify entities from API usage, cookie behavior, resource types, and prevalence patterns.

Live Preview

Live preview

Labels

13 tracking-related entities:

Adobe Inc., ByteDance Ltd., Comcast Corporation, Conversant LLC, Google LLC, HubSpot Inc., Impact, Leven Labs Inc. DBA Admiral, Microsoft Corporation, Oracle Corporation, Salesforce.com Inc., Yahoo Inc., Yandex LLC

Performance

  • Accuracy: 58.5%
  • Weighted F1: 0.604
  • Training data: 731 domains from the Tracker Radar US region
  • Features: 164 behavioral features

Per-entity results are strongest for entities with distinctive behavior: Leven Labs (F1 0.93), Google (F1 0.75), and Microsoft (F1 0.65). Predictions are less reliable for smaller entities with few training samples.
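The weighted F1 reported above is higher than the accuracy, which can look surprising at first. The sketch below shows how that can happen, assuming "weighted" means per-class F1 averaged by true-class support (the common convention; the card does not spell it out). The labels "A" and "B" are toy data for illustration only, not results from this model.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's true support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        score += f1 * n_true / total
    return score


# Toy labels (hypothetical): three samples of class "A", one of class "B".
y_true = ["A", "A", "A", "B"]
y_pred = ["A", "A", "B", "B"]

print(accuracy(y_true, y_pred))               # 0.75
print(round(weighted_f1(y_true, y_pred), 3))  # 0.767
```

In the toy case the majority class keeps perfect precision, so its F1 (0.8) pulls the weighted average above the raw accuracy.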

Architecture

Feedforward neural network: 164 → 128 → 64 → 13 with ReLU activations and dropout (0.2). Model size: 118.5 KB.
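As a quick sanity check on the stated size, the parameter count can be worked out from the layer widths. This is a sketch that assumes every layer has a bias vector and that weights are stored as 32-bit floats; neither detail is stated in the card.

```python
# Layer widths as stated above: 164 -> 128 -> 64 -> 13.
LAYER_SIZES = [164, 128, 64, 13]
BYTES_PER_PARAM = 4  # assumption: float32 weights


def count_parameters(sizes):
    """Count weights plus biases for a stack of fully connected layers."""
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total


n_params = count_parameters(LAYER_SIZES)
size_kib = n_params * BYTES_PER_PARAM / 1024

print(n_params)            # 30221
print(round(size_kib, 2))  # 118.05
```

Under those assumptions the raw tensors come to about 118.05 KiB, close to the 118.5 KB reported; the small remainder would plausibly be file header and metadata.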

Designed for on-device inference via the Kjarni WebAssembly runtime with SIMD128 acceleration.

Usage

Features must be standardized using the provided scaler (mean and scale in tracking_entity_classifier_scaler.json) before inference. This model is most meaningful when applied to domains already identified as ad tech by the entity cluster classifier.
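A minimal sketch of the preprocessing and decoding steps in plain Python. Several details here are assumptions rather than documented behavior: that the scaler JSON has top-level `mean` and `scale` arrays, that the model emits one raw score (logit) per entity, and that output indices follow the alphabetical label order listed above. The helper names are hypothetical.

```python
import json
import math

# Entity names from the Labels section.
# ASSUMPTION: output index i corresponds to LABELS[i]; verify against the repo.
LABELS = [
    "Adobe Inc.", "ByteDance Ltd.", "Comcast Corporation", "Conversant LLC",
    "Google LLC", "HubSpot Inc.", "Impact", "Leven Labs Inc. DBA Admiral",
    "Microsoft Corporation", "Oracle Corporation", "Salesforce.com Inc.",
    "Yahoo Inc.", "Yandex LLC",
]


def load_scaler(json_text):
    """Parse scaler JSON. ASSUMPTION: keys are named 'mean' and 'scale'."""
    scaler = json.loads(json_text)
    return scaler["mean"], scaler["scale"]


def standardize(features, mean, scale):
    """Apply z = (x - mean) / scale element-wise."""
    if not (len(features) == len(mean) == len(scale)):
        raise ValueError("feature vector and scaler lengths must match")
    return [(x - m) / s for x, m, s in zip(features, mean, scale)]


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits):
    """Return the most likely entity name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Toy 3-feature scaler for illustration only; the real file covers 164 features.
toy_scaler = '{"mean": [1.0, 2.0, 3.0], "scale": [2.0, 4.0, 0.5]}'
mean, scale = load_scaler(toy_scaler)
print(standardize([3.0, 2.0, 2.0], mean, scale))  # [1.0, 0.0, -2.0]
```

In real use, the standardized 164-value vector would be passed to the model, and `top_label` would be applied to its 13 outputs.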

Context

This model demonstrates that tracking companies have identifiable behavioral fingerprints — their domains exhibit characteristic patterns of API usage, cookie behavior, and web presence that distinguish them from other entities. See TrackerML for the full project.

Links

Kjarni

License

CC-BY-NC-SA 4.0 (derived from DuckDuckGo Tracker Radar).