license: cc-by-nc-sa-4.0
tags:
- web-privacy
- tracker-detection
- feedforward
- safetensors
- webassembly
datasets:
- olafuraron/tracker-radar-ml
Entity Cluster Classifier
Classifies third-party web domains into four behavioral entity types based on metadata and API usage patterns from DuckDuckGo's Tracker Radar dataset.
Labels
| Label | Description |
|---|---|
| ad_tech | Advertising, analytics, and tracking companies (Google, Microsoft, Adobe, etc.) |
| cdn_infra | CDN and infrastructure providers (Amazon, Akamai, Fastly, etc.) |
| platform | Hosting and platform services (Shopify, GitHub, etc.) |
| ad_management | Ad blocking and ad management tools |
Performance
- Accuracy: 75.2%
- Weighted F1: 0.767
- Training data: 4,973 domains from Tracker Radar US region
- Features: 164 behavioral features (API usage, cookie behavior, prevalence, resource types)
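Weighted F1 can exceed accuracy when classes are imbalanced, as it does in the figures above. The sketch below assumes the usual support-weighted definition (the one scikit-learn uses for `average="weighted"`); the evaluation script itself is not part of this card, and the toy labels are invented purely for illustration.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        total += f1 * n
    return total / len(y_true)

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy data (hypothetical, not from Tracker Radar): one ad_tech domain is
# misclassified as cdn_infra.
y_true = ["ad_tech"] * 4 + ["cdn_infra"] * 2
y_pred = ["ad_tech"] * 3 + ["cdn_infra"] * 3

acc = accuracy(y_true, y_pred)
wf1 = weighted_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} weighted_f1={wf1:.3f}")
```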
Architecture
Feedforward neural network: 164 → 64 → 32 → 4 with ReLU activations and dropout (0.2). Model size: 50.3 KB.
Designed for on-device inference via the Kjarni WebAssembly runtime with SIMD128 acceleration.
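As a rough illustration of the layer layout, the following pure-Python sketch builds a 164 → 64 → 32 → 4 network with random placeholder weights, counts its parameters, and runs one forward pass. The weight initialization, the softmax on the output, and the label order are assumptions made for the example; the real weights ship in the safetensors file. Dropout is omitted because it is inactive at inference time.

```python
import math
import random

LAYER_SIZES = [164, 64, 32, 4]
# Label order is an assumption; check the model's config for the real mapping.
LABELS = ["ad_tech", "cdn_infra", "platform", "ad_management"]

def init_params(sizes, seed=0):
    """Random placeholder weights and zero biases (not the trained values)."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params

def count_params(params):
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)

def forward(params, x):
    """Dense layers with ReLU on the hidden layers only; returns raw logits."""
    h = x
    for i, (w, b) in enumerate(params):
        h = [sum(wi * hi for wi, hi in zip(row, h)) + bi for row, bi in zip(w, b)]
        if i < len(params) - 1:
            h = [max(0.0, v) for v in h]
    return h

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

params = init_params(LAYER_SIZES)
n_params = count_params(params)
probs = softmax(forward(params, [0.0] * LAYER_SIZES[0]))

print("parameters:", n_params)            # weights plus biases across 3 layers
print("float32 bytes:", n_params * 4)     # assuming 4 bytes per parameter
print(dict(zip(LABELS, probs)))
```

Assuming float32 storage, the 12,772 parameters occupy 51,088 bytes, which is in the same range as the stated model size once file-format overhead is included.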
Usage
Features must be standardized using the provided scaler (mean and scale in entity_cluster_classifier_scaler.json) before inference.
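Standardization here means subtracting the per-feature mean and dividing by the per-feature scale. The sketch below assumes the scaler JSON exposes two lists under the keys `mean` and `scale`; the exact key names and file layout should be checked against the actual file. The three-element scaler is a hypothetical stand-in for the real 164-element one.

```python
import json

# Hypothetical stand-in for entity_cluster_classifier_scaler.json.
# The real file is expected to hold 164 values per list.
scaler_json = '{"mean": [0.5, 10.0, 2.0], "scale": [0.25, 5.0, 1.0]}'

def standardize(features, scaler):
    """Apply (x - mean) / scale element-wise to one feature vector."""
    mean, scale = scaler["mean"], scaler["scale"]
    if not (len(features) == len(mean) == len(scale)):
        raise ValueError("feature vector and scaler lengths differ")
    return [(x - m) / s for x, m, s in zip(features, mean, scale)]

scaler = json.loads(scaler_json)
standardized = standardize([1.0, 0.0, 2.0], scaler)
print(standardized)  # [2.0, -2.0, 0.0]
```

The length check guards against passing a feature vector whose ordering or size does not match the scaler, which would otherwise silently produce wrong inputs.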
Context
57% of domains in Tracker Radar have no ownership information. This model predicts what type of entity a domain belongs to based purely on behavioral signals; no ownership metadata is used as input. See TrackerML for the full project.
License
CC-BY-NC-SA 4.0 (derived from DuckDuckGo Tracker Radar).