---
license: cc-by-nc-sa-4.0
tags:
  - web-privacy
  - tracker-detection
  - entity-attribution
  - feedforward
  - safetensors
  - webassembly
datasets:
  - olafuraron/tracker-radar-ml
---

# Tracking Entity Classifier

Predicts which company owns a third-party tracking domain based on behavioral patterns from DuckDuckGo's [Tracker Radar](https://github.com/duckduckgo/tracker-radar) dataset. No ownership metadata is used as input — the model learns to identify entities from API usage, cookie behavior, resource types, and prevalence patterns.

## Live Preview

[Live preview](https://olafurjohannsson.github.io/tracker-ml/)

## Labels

13 tracking-related entities:

Adobe Inc., ByteDance Ltd., Comcast Corporation, Conversant LLC, Google LLC, HubSpot Inc., Impact, Leven Labs Inc. DBA Admiral, Microsoft Corporation, Oracle Corporation, Salesforce.com Inc., Yahoo Inc., Yandex LLC

## Performance

- **Accuracy:** 58.5%
- **Weighted F1:** 0.604
- **Training data:** 731 domains from Tracker Radar US region
- **Features:** 164 behavioral features

Per-entity results are strongest for entities with distinctive behavior: Leven Labs (F1 0.93), Google (F1 0.75), Microsoft (F1 0.65). Predictions are less reliable for smaller entities with few training samples.
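
This card does not say how the weighted F1 was computed. As a minimal sketch, assuming the usual support-weighted definition (per-class F1 averaged with weights proportional to each class's number of true samples), the metric can be reproduced as follows. The function name and the toy labels are hypothetical and are not the evaluation data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n / total  # weight each class by its support
    return score


# Toy example with two classes (not real Tracker Radar labels).
y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Because large classes dominate the average, a weighted F1 can sit above plain accuracy when the well-represented entities are also the well-predicted ones.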

## Architecture

Feedforward neural network: 164 → 128 → 64 → 13 with ReLU activations and dropout (0.2). Model size: 118.5 KB.
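
The layer sizes above can be sketched as a plain forward pass. This is an illustration only: it assumes each layer has a bias vector, uses random placeholder weights rather than the released ones, and all function names are hypothetical. Dropout is a training-time regularizer, so it is omitted at inference.

```python
import random

LAYER_SIZES = [164, 128, 64, 13]  # input -> hidden -> hidden -> 13 entity logits


def count_parameters(sizes):
    """Weights plus biases for each fully connected layer (biases are assumed)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def random_layers(sizes, seed=0):
    """Placeholder weights for illustration; NOT the trained model."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        bias = [0.0] * n_out
        layers.append((weights, bias))
    return layers


def dense(x, weights, bias):
    """One fully connected layer: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def forward(x, layers):
    """ReLU after every layer except the last, which returns raw logits."""
    for i, (weights, bias) in enumerate(layers):
        x = dense(x, weights, bias)
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


n_params = count_parameters(LAYER_SIZES)
print(n_params, n_params * 4 / 1024)  # parameter count and KiB if stored as float32
logits = forward([0.0] * 164, random_layers(LAYER_SIZES))
print(len(logits))
```

Under these assumptions the network has 30,221 parameters, which is roughly 118 KiB if stored as 32-bit floats; that is in line with the stated model size, although the storage dtype is not confirmed here.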

Designed for on-device inference via the [Kjarni](https://github.com/olafurjohannsson/kjarni) WebAssembly runtime with SIMD128 acceleration.

## Usage

Features must be standardized using the provided scaler (mean and scale in `tracking_entity_classifier_scaler.json`) before inference. This model is most meaningful when applied to domains already identified as ad tech by the [entity cluster classifier](https://huggingface.co/olafuraron/entity-cluster-classifier).
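
The standardization step can be sketched as below. This is a minimal example under stated assumptions: the JSON file is assumed to hold two equal-length lists under the keys `mean` and `scale` (the key names are a guess based on the description above), and the helper names are hypothetical. Check the actual file layout before relying on it.

```python
import json


def load_scaler(path):
    """Read per-feature mean and scale lists (key names are assumed)."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return data["mean"], data["scale"]


def standardize(features, mean, scale):
    """Apply z = (x - mean) / scale to each feature."""
    if not (len(features) == len(mean) == len(scale)):
        raise ValueError("features, mean and scale must have the same length")
    return [(x - m) / s for x, m, s in zip(features, mean, scale)]


# Toy two-feature example; the real model expects 164 features.
print(standardize([3.0, 10.0], [1.0, 10.0], [2.0, 5.0]))
```

With the real scaler, the call would look like `mean, scale = load_scaler("tracking_entity_classifier_scaler.json")` followed by `standardize(raw_features, mean, scale)`, where `raw_features` is the 164-value feature vector in the same order used during training.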

## Context

This model demonstrates that tracking companies have identifiable behavioral fingerprints — their domains exhibit characteristic patterns of API usage, cookie behavior, and web presence that distinguish them from other entities. See [TrackerML](https://github.com/olafurjohannsson/tracker-ml) for the full project.

## Links

[Kjarni](https://kjarni.ai)

## License

CC-BY-NC-SA 4.0 (derived from DuckDuckGo Tracker Radar).