Entity Cluster Classifier

Classifies third-party web domains into four behavioral entity types based on metadata and API usage patterns from DuckDuckGo's Tracker Radar dataset.

Live Preview

Live preview

Labels

Label          Description
ad_tech        Advertising, analytics, and tracking companies (Google, Microsoft, Adobe, etc.)
cdn_infra      CDN and infrastructure providers (Amazon, Akamai, Fastly, etc.)
platform       Hosting and platform services (Shopify, GitHub, etc.)
ad_management  Ad blocking and ad management tools

Performance

  • Accuracy: 75.2%
  • Weighted F1: 0.767
  • Training data: 4,973 domains from the Tracker Radar US region
  • Features: 164 behavioral features (API usage, cookie behavior, prevalence, resource types)
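
For readers who want to reproduce the two metrics on their own predictions, the sketch below computes accuracy and a support-weighted F1 in plain Python. It assumes "Weighted F1" means the mean of per-class F1 scores weighted by each class's share of the true labels (the usual convention); the card does not spell out the definition. The function names and the toy labels are illustrative only.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * count / total
    return score


# Toy labels, not taken from the model's evaluation set.
y_true = ["ad_tech", "ad_tech", "cdn_infra", "cdn_infra"]
y_pred = ["ad_tech", "cdn_infra", "cdn_infra", "cdn_infra"]
print(accuracy(y_true, y_pred), weighted_f1(y_true, y_pred))
```

Because the weights follow class support, large classes dominate the weighted score; a per-class breakdown may be more informative for the smaller labels.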

Architecture

Feedforward neural network: 164 → 64 → 32 → 4 with ReLU activations and dropout (0.2). Model size: 50.3 KB.
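
As a rough illustration of the layer layout, the following plain-Python sketch runs an inference-time forward pass through a 164 → 64 → 32 → 4 network. The weights are random placeholders, not the released model; the softmax on the output and the helper names are assumptions, since the card does not describe the output layer or any loading API. Dropout is omitted because it is normally disabled at inference.

```python
import math
import random

LAYER_SIZES = [164, 64, 32, 4]


def init_params(sizes, seed=0):
    """Random placeholder weights (NOT the trained model) with zero biases."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params


def count_params(sizes):
    """Weights plus biases over all dense layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def forward(x, params):
    """Dense layers with ReLU on the hidden layers; returns raw logits."""
    h = x
    for i, (weights, biases) in enumerate(params):
        h = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(weights, biases)]
        if i < len(params) - 1:
            h = [max(0.0, v) for v in h]  # ReLU
    return h


def softmax(logits):
    """Assumed output normalization; numerically stabilized."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


params = init_params(LAYER_SIZES)
probs = softmax(forward([0.0] * LAYER_SIZES[0], params))
predicted_index = probs.index(max(probs))
print(count_params(LAYER_SIZES), predicted_index)
```

This layout has 12,772 parameters, which would be roughly 50 KB if stored as 32-bit floats (an assumption about the storage format), in the same range as the stated model size. The mapping from output index to label is not given in the card, so the sketch stops at the index.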

Designed for on-device inference via the Kjarni WebAssembly runtime with SIMD128 acceleration.

Usage

Features must be standardized using the provided scaler (mean and scale in entity_cluster_classifier_scaler.json) before inference.
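
The standardization step can be sketched as follows. This is a minimal example, assuming the scaler file is a JSON object with top-level "mean" and "scale" arrays of length 164 and that each feature is transformed as (x - mean) / scale; the key names and the helper functions are assumptions and should be checked against the actual file.

```python
import json


def load_scaler(path):
    """Read per-feature mean and scale lists; the key names are assumed."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return data["mean"], data["scale"]


def standardize(features, mean, scale):
    """Apply (x - mean) / scale element-wise; scale values are assumed nonzero."""
    if not (len(features) == len(mean) == len(scale)):
        raise ValueError("features, mean, and scale must have the same length")
    return [(x - m) / s for x, m, s in zip(features, mean, scale)]


# Tiny illustrative vectors; the real model expects 164 features.
print(standardize([3.0, 10.0], [1.0, 10.0], [2.0, 5.0]))
```

With the real file, the call would look like `mean, scale = load_scaler("entity_cluster_classifier_scaler.json")` followed by `standardize(raw_features, mean, scale)`, where `raw_features` is a hypothetical list holding the 164 values in the order used during training.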

Context

57% of domains in Tracker Radar have no ownership information. This model predicts what type of entity a domain belongs to based purely on behavioral signals; no ownership metadata is used as input. See TrackerML for the full project.

Links

Kjarni

License

CC-BY-NC-SA 4.0 (derived from DuckDuckGo Tracker Radar).
