---
language: en
tags:
- dga
- cybersecurity
- domain-generation-algorithm
- text-classification
- pytorch
license: mit
metrics:
- accuracy
- f1
---

# DGA-CNN: Character-level CNN for DGA Detection

Character-level Convolutional Neural Network trained to detect Domain Generation Algorithm (DGA) domains. Part of the **DGA Multi-Family Benchmark** (Reynier et al., 2026).

## Model Description

- **Architecture:** Single Conv1d layer (64 filters, kernel=3) + MaxPool + FC
- **Input:** Character-level encoding of domain name (max 75 chars)
- **Output:** Binary classification: `legit` (0) or `dga` (1)
- **Framework:** PyTorch

## Performance (54 DGA families, 30 runs each)

| Metric     | Value                 |
|------------|-----------------------|
| Accuracy   | 0.9200                |
| F1         | 0.9000                |
| Precision  | 0.9400                |
| Recall     | 0.8900                |
| FPR        | 0.0400                |
| Query Time | 0.490 ms/domain (CPU) |

## Usage

```python
import importlib.util

import torch
from huggingface_hub import hf_hub_download

# Download model files
weights = hf_hub_download("Reynier/dga-cnn", "dga_cnn_model_1M.pth")
model_py = hf_hub_download("Reynier/dga-cnn", "model.py")

# Load module
spec = importlib.util.spec_from_file_location("cnn_model", model_py)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

# Load model
model = mod.load_model(weights)

# Predict
results = mod.predict(model, ["google.com", "xkr3f9mq.ru"])
print(results)
# [{"domain": "google.com", "label": "legit", "score": 0.02},
#  {"domain": "xkr3f9mq.ru", "label": "dga", "score": 0.98}]
```

## Training Data

Trained on `train_1M.csv`: ~845K samples across 54 DGA families plus legitimate domains.

## Citation

```bibtex
@article{reynier2026dga,
  title={DGA Multi-Family Benchmark: Comparing Classical and Transformer-based Detectors},
  author={Reynier et al.},
  year={2026}
}
```
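
## Architecture Sketch

The architecture listed under Model Description (one Conv1d layer with 64 filters and kernel size 3, max pooling, and a fully connected layer over domains of at most 75 characters) could look roughly like the sketch below. This is not the code shipped in `model.py`. The alphabet, the embedding size, the use of global max pooling, and the names `encode` and `CharCNNSketch` are all assumptions made for illustration, and the weights here are random, so the scores carry no meaning.

```python
import torch
import torch.nn as nn

MAX_LEN = 75  # from the model card: inputs are capped at 75 characters

# Assumed alphabet; the real vocabulary in model.py may differ.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-._"
CHAR_TO_IDX = {c: i + 1 for i, c in enumerate(ALPHABET)}  # index 0 = padding/unknown


def encode(domain: str) -> torch.Tensor:
    """Map a domain to a fixed-length tensor of character indices."""
    ids = [CHAR_TO_IDX.get(c, 0) for c in domain.lower()[:MAX_LEN]]
    ids += [0] * (MAX_LEN - len(ids))  # right-pad with zeros
    return torch.tensor(ids, dtype=torch.long)


class CharCNNSketch(nn.Module):
    """Hypothetical layout: embedding -> Conv1d(64, k=3) -> max pool -> FC."""

    def __init__(self, vocab_size: int = len(ALPHABET) + 1, embed_dim: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, 64, kernel_size=3)
        self.pool = nn.AdaptiveMaxPool1d(1)  # assumed: global max over positions
        self.fc = nn.Linear(64, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.embed(x).transpose(1, 2)  # (batch, embed_dim, MAX_LEN)
        h = torch.relu(self.conv(h))       # (batch, 64, MAX_LEN - 2)
        h = self.pool(h).squeeze(-1)       # (batch, 64)
        return self.fc(h).squeeze(-1)      # one logit per domain


model = CharCNNSketch().eval()
batch = torch.stack([encode("google.com"), encode("xkr3f9mq.ru")])
with torch.no_grad():
    scores = torch.sigmoid(model(batch))  # probability-like score for the dga class
print(scores.shape)
```

A single sigmoid output matches the binary `legit` (0) / `dga` (1) labelling described above; a two-logit softmax head would be an equally plausible design.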