---
license: cc-by-nc-sa-4.0
tags:
- web-privacy
- tracker-detection
- entity-attribution
- feedforward
- safetensors
- webassembly
datasets:
- olafuraron/tracker-radar-ml
---
# Tracking Entity Classifier
Predicts which company owns a third-party tracking domain from behavioral patterns in DuckDuckGo's [Tracker Radar](https://github.com/duckduckgo/tracker-radar) dataset. No ownership metadata is used as input; the model learns to identify entities from API usage, cookie behavior, resource types, and prevalence patterns.
## Live Preview
[Live preview](https://olafurjohannsson.github.io/tracker-ml/)
## Labels
13 tracking-related entities:
Adobe Inc., ByteDance Ltd., Comcast Corporation, Conversant LLC, Google LLC, HubSpot Inc., Impact, Leven Labs Inc. DBA Admiral, Microsoft Corporation, Oracle Corporation, Salesforce.com Inc., Yahoo Inc., Yandex LLC
## Performance
- **Accuracy:** 58.5%
- **Weighted F1:** 0.604
- **Training data:** 731 domains from the Tracker Radar US region
- **Features:** 164 behavioral features
Per-entity results are strongest for entities with distinctive behavior: Leven Labs (F1 0.93), Google (F1 0.75), Microsoft (F1 0.65). Predictions are less reliable for smaller entities with few training samples.
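For readers unfamiliar with the metric, the following is a minimal sketch of how a support-weighted F1 score is typically computed: each entity's F1 is weighted by its share of the true labels. The function name and the toy labels are illustrative assumptions, not the evaluation code used for this model.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative helper)."""
    support = Counter(y_true)          # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / n
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n / total        # weight by class share of true labels
    return score

# Toy data, not taken from the real evaluation set.
y_true = ["Google LLC", "Google LLC", "Google LLC", "Yahoo Inc."]
y_pred = ["Google LLC", "Google LLC", "Yahoo Inc.", "Yahoo Inc."]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, round(weighted_f1(y_true, y_pred), 3))
```

In this toy case the weighted F1 comes out slightly above the accuracy, the same direction as the figures reported above, because the majority class keeps perfect precision.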
## Architecture
Feedforward neural network: 164 → 128 → 64 → 13 with ReLU activations and dropout (0.2). Model size: 118.5 KB.
Designed for on-device inference via the [Kjarni](https://github.com/olafurjohannsson/kjarni) WebAssembly runtime with SIMD128 acceleration.
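As a rough sanity check on the layer sizes, the sketch below counts the parameters of a 164 → 128 → 64 → 13 network and runs an inference-time forward pass in plain Python. It assumes every layer is fully connected with a bias term and that weights are stored as float32; the random weights are placeholders, not the released ones, and this is not the Kjarni implementation.

```python
import random

LAYER_SIZES = [164, 128, 64, 13]  # features -> hidden -> hidden -> entity logits

def count_parameters(sizes):
    """Weights plus biases for each fully connected layer (biases assumed)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

def init_layers(sizes, seed=0):
    """Placeholder random weights; the real ones ship in the safetensors file."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(layers, x):
    """ReLU after each hidden layer, raw logits at the output.
    Dropout only applies during training, so it is omitted here."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x

n_params = count_parameters(LAYER_SIZES)
print(n_params)                      # 30221
print(round(n_params * 4 / 1024, 2)) # float32 size in KiB
logits = forward(init_layers(LAYER_SIZES), [0.0] * LAYER_SIZES[0])
print(len(logits))                   # 13
```

Under these assumptions the weights alone come to about 118 KiB, which is in line with the stated model size once file metadata is included.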
## Usage
Features must be standardized using the provided scaler (mean and scale in `tracking_entity_classifier_scaler.json`) before inference. This model is most meaningful when applied to domains already identified as ad tech by the [entity cluster classifier](https://huggingface.co/olafuraron/entity-cluster-classifier).
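A minimal sketch of the standardization step follows. It assumes the scaler file is a JSON object with `mean` and `scale` arrays of length 164, matching the description above; the exact key names and the helper functions are assumptions, so check them against the actual file.

```python
import json

def load_scaler(path):
    """Read per-feature mean and scale; the key names are assumed."""
    with open(path, encoding="utf-8") as f:
        scaler = json.load(f)
    return scaler["mean"], scaler["scale"]

def standardize(features, mean, scale):
    """Apply (x - mean) / scale to each feature, in the training feature order."""
    if not (len(features) == len(mean) == len(scale)):
        raise ValueError("features, mean, and scale must have the same length")
    return [(x - m) / s for x, m, s in zip(features, mean, scale)]

# Toy three-feature example; the real model expects 164 features.
toy_mean = [1.0, 10.0, 0.5]
toy_scale = [2.0, 5.0, 0.25]
print(standardize([3.0, 0.0, 0.5], toy_mean, toy_scale))  # [1.0, -2.0, 0.0]
```

With the real file, the call would look like `mean, scale = load_scaler("tracking_entity_classifier_scaler.json")` followed by `standardize(raw_features, mean, scale)`, where `raw_features` is a hypothetical list holding the 164 behavioral features for one domain.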
## Context
This model demonstrates that tracking companies have identifiable behavioral fingerprints — their domains exhibit characteristic patterns of API usage, cookie behavior, and web presence that distinguish them from other entities. See [TrackerML](https://github.com/olafurjohannsson/tracker-ml) for the full project.
## Links
[Kjarni](https://kjarni.ai)
## License
CC-BY-NC-SA 4.0 (derived from DuckDuckGo Tracker Radar).