metadata
license: cc-by-nc-sa-4.0
tags:
  - web-privacy
  - tracker-detection
  - feedforward
  - safetensors
  - webassembly
datasets:
  - olafuraron/tracker-radar-ml

Entity Cluster Classifier

Classifies third-party web domains into four behavioral entity types based on metadata and API usage patterns from DuckDuckGo's Tracker Radar dataset.

Live Preview

Live preview

Labels

| Label | Description |
|---|---|
| ad_tech | Advertising, analytics, and tracking companies (Google, Microsoft, Adobe, etc.) |
| cdn_infra | CDN and infrastructure providers (Amazon, Akamai, Fastly, etc.) |
| platform | Hosting and platform services (Shopify, GitHub, etc.) |
| ad_management | Ad blocking and ad management tools |

Performance

  • Accuracy: 75.2%
  • Weighted F1: 0.767
  • Training data: 4,973 domains from Tracker Radar US region
  • Features: 164 behavioral features (API usage, cookie behavior, prevalence, resource types)

Architecture

Feedforward neural network with layer sizes 164 → 64 → 32 → 4, ReLU activations, and dropout (rate 0.2). Model size: 50.3 KB.
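The layer layout above can be sketched as a plain-Python forward pass. This is only an illustration of the stated shape, not the released model: the weights are random placeholders, the presence of bias terms is an assumption, and the helper names (`count_parameters`, `init_layers`, `forward`) are hypothetical. Dropout is left out because it is normally active only during training.

```python
import random

# Layer sizes as stated in the model card: 164 inputs, two hidden layers, 4 outputs.
LAYER_SIZES = [164, 64, 32, 4]


def count_parameters(sizes):
    """Count weights plus biases for a fully connected stack (biases are an assumption)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def init_layers(sizes, seed=0):
    """Build placeholder layers with small random weights and zero biases."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Run one feature vector through the network and return raw logits."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]  # ReLU on hidden layers only
    return x


layers = init_layers(LAYER_SIZES)
logits = forward(layers, [0.0] * LAYER_SIZES[0])
print(count_parameters(LAYER_SIZES), len(logits))
```

Under these assumptions the network has 12,772 parameters, which at 4 bytes each comes to roughly 51,000 bytes, in line with the stated model size if the weights are stored as 32-bit floats.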

Designed for on-device inference via the Kjarni WebAssembly runtime with SIMD128 acceleration.

Usage

Before inference, features must be standardized with the provided scaler: subtract the mean and divide by the scale stored in entity_cluster_classifier_scaler.json.
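A minimal sketch of that standardization step follows. It assumes the scaler file is a JSON object with two equal-length arrays under the keys `mean` and `scale`; those key names are inferred from the description above and should be checked against the actual file. The function names are hypothetical.

```python
import json


def load_scaler(path):
    """Read per-feature mean and scale arrays (key names are an assumption)."""
    with open(path, encoding="utf-8") as f:
        scaler = json.load(f)
    return scaler["mean"], scaler["scale"]


def standardize(features, mean, scale):
    """Apply (x - mean) / scale to each feature, in order."""
    if not (len(features) == len(mean) == len(scale)):
        raise ValueError("features, mean, and scale must have the same length")
    return [(x - m) / s for x, m, s in zip(features, mean, scale)]
```

For the real model, `features` would hold the 164 raw values in the same order used at training time, for example `standardize(raw, *load_scaler("entity_cluster_classifier_scaler.json"))`.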

Context

In Tracker Radar, 57% of domains have no ownership information. This model predicts the type of entity a domain belongs to from behavioral signals alone; no ownership metadata is used as input. See TrackerML for the full project.

Links

Kjarni

License

CC-BY-NC-SA 4.0 (derived from DuckDuckGo Tracker Radar).