---
license: cc-by-nc-sa-4.0
tags:
  - web-privacy
  - tracker-detection
  - feedforward
  - safetensors
  - webassembly
datasets:
  - olafuraron/tracker-radar-ml
---

# Entity Cluster Classifier

Classifies third-party web domains into four behavioral entity types based on metadata and API usage patterns from DuckDuckGo's [Tracker Radar](https://github.com/duckduckgo/tracker-radar) dataset.

## Live Preview

[Live preview](https://olafurjohannsson.github.io/tracker-ml/)

## Labels

| Label | Description |
|---|---|
| `ad_tech` | Advertising, analytics, and tracking companies (Google, Microsoft, Adobe, etc.) |
| `cdn_infra` | CDN and infrastructure providers (Amazon, Akamai, Fastly, etc.) |
| `platform` | Hosting and platform services (Shopify, GitHub, etc.) |
| `ad_management` | Ad blocking and ad management tools |

## Performance

- **Accuracy:** 75.2%
- **Weighted F1:** 0.767
- **Training data:** 4,973 domains from Tracker Radar US region
- **Features:** 164 behavioral features (API usage, cookie behavior, prevalence, resource types)
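
For readers unfamiliar with the metric, weighted F1 averages the per-class F1 scores, weighting each class by its number of true examples. The sketch below shows the calculation in plain Python; the `weighted_f1` helper and the toy labels are illustrative assumptions, not the evaluation code or data behind the figures above.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight by class support
    return score


# Toy labels invented for illustration only (not model output).
y_true = ["ad_tech", "ad_tech", "ad_tech", "cdn_infra", "platform", "platform"]
y_pred = ["ad_tech", "ad_tech", "cdn_infra", "cdn_infra", "platform", "ad_tech"]
print(round(weighted_f1(y_true, y_pred), 3))
```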

## Architecture

Feedforward neural network: 164 → 64 → 32 → 4 with ReLU activations and dropout (0.2). Model size: 50.3 KB.
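
As a rough illustration of the layer layout, the following pure-Python sketch runs an inference-time forward pass through a 164 → 64 → 32 → 4 network and counts its parameters. The weights here are random placeholders (the real ones ship in the safetensors file), the function names are hypothetical, and dropout is omitted because it is only active during training.

```python
import random

LAYER_SIZES = [164, 64, 32, 4]  # taken from the architecture description


def init_params(sizes, seed=0):
    """Random placeholder weights; NOT the trained model's weights."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        params.append((w, b))
    return params


def forward(x, params):
    """ReLU on the hidden layers, raw logits at the output layer."""
    for i, (w, b) in enumerate(params):
        x = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
        if i < len(params) - 1:
            x = [max(0.0, v) for v in x]
    return x


def count_params(sizes):
    """Weights plus biases for each fully connected layer."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


params = init_params(LAYER_SIZES)
logits = forward([0.0] * LAYER_SIZES[0], params)
n_params = count_params(LAYER_SIZES)
print(len(logits), n_params, n_params * 4)
```

The layout gives 12,772 parameters. Assuming 32-bit floats, that is 51,088 bytes of weights, which is in the same range as the stated model size.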

Designed for on-device inference via the [Kjarni](https://github.com/olafurjohannsson/kjarni) WebAssembly runtime with SIMD128 acceleration.

## Usage

Features must be standardized using the provided scaler (mean and scale in `entity_cluster_classifier_scaler.json`) before inference.
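
A minimal sketch of the standardization step, assuming the scaler file is a JSON object with `mean` and `scale` arrays (one entry per feature) and that standardization means `(x - mean) / scale`. The key names, the `standardize` helper, and the three-feature toy values are assumptions for illustration; check the actual file before relying on them.

```python
import json


def standardize(features, mean, scale):
    """Apply (x - mean) / scale element-wise."""
    if not (len(features) == len(mean) == len(scale)):
        raise ValueError("features, mean, and scale must have the same length")
    return [(x - m) / s for x, m, s in zip(features, mean, scale)]


# Hypothetical 3-feature scaler standing in for the real 164-feature file.
# With the real file, load it with json.load(open(path)) instead.
scaler = json.loads('{"mean": [1.0, 10.0, 0.5], "scale": [2.0, 5.0, 0.25]}')

raw = [3.0, 0.0, 0.5]
scaled = standardize(raw, scaler["mean"], scaler["scale"])
print(scaled)
```

The length check is there because a feature vector in the wrong order or of the wrong size would otherwise produce plausible-looking but meaningless inputs.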

## Context

In Tracker Radar, 57% of domains have no ownership information. This model predicts the type of entity a domain belongs to from behavioral signals alone; no ownership metadata is used as input. See [TrackerML](https://github.com/olafurjohannsson/tracker-ml) for the full project.

## Links

[Kjarni](https://kjarni.ai)

## License

CC-BY-NC-SA 4.0 (derived from DuckDuckGo Tracker Radar).