---
language: en
tags:
  - dga
  - cybersecurity
  - domain-generation-algorithm
  - text-classification
  - pytorch
license: mit
metrics:
  - accuracy
  - f1
---

# DGA-CNN: Character-level CNN for DGA Detection

A character-level convolutional neural network (CNN) trained to detect domains produced by Domain Generation Algorithms (DGAs).
Part of the **DGA Multi-Family Benchmark** (Reynier et al., 2026).

## Model Description

- **Architecture:** Single Conv1d layer (64 filters, kernel=3) + MaxPool + FC
- **Input:** Character-level encoding of the domain name (max 75 characters)
- **Output:** Binary classification: `legit` (0) or `dga` (1)
- **Framework:** PyTorch
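
The description above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the shipped `model.py`: the vocabulary, the embedding size (32), the use of global max pooling, and the names `encode` and `CharCNNSketch` are all assumptions. Only the 64 filters, the kernel size of 3, the 75-character limit, and the two output classes come from this card.

```python
import string

import torch
import torch.nn as nn

# Assumed vocabulary (not taken from model.py); index 0 is reserved for padding.
VOCAB = string.ascii_lowercase + string.digits + "-._"
CHAR_TO_IDX = {ch: i + 1 for i, ch in enumerate(VOCAB)}
MAX_LEN = 75  # maximum input length stated in this card


def encode(domain: str) -> torch.Tensor:
    """Map a domain to a fixed-length tensor of character indices."""
    # Characters outside the assumed vocabulary fall back to index 0.
    ids = [CHAR_TO_IDX.get(ch, 0) for ch in domain.lower()[:MAX_LEN]]
    ids += [0] * (MAX_LEN - len(ids))  # right-pad to MAX_LEN
    return torch.tensor(ids, dtype=torch.long)


class CharCNNSketch(nn.Module):
    """Hypothetical layout: embedding -> Conv1d(64, k=3) -> max pool -> FC."""

    def __init__(self, vocab_size: int = len(VOCAB) + 1, embed_dim: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, 64, kernel_size=3)
        self.pool = nn.AdaptiveMaxPool1d(1)  # assumed: global max pooling
        self.fc = nn.Linear(64, 2)           # logits for legit (0) and dga (1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.embed(x).transpose(1, 2)  # (batch, embed_dim, 75)
        x = torch.relu(self.conv(x))       # (batch, 64, 73)
        x = self.pool(x).squeeze(-1)       # (batch, 64)
        return self.fc(x)                  # (batch, 2)


batch = torch.stack([encode("google.com"), encode("xkr3f9mq.ru")])
logits = CharCNNSketch()(batch)
print(logits.shape)  # torch.Size([2, 2])
```

The sketch is untrained, so its logits are meaningless; it only shows how tensor shapes flow through a single-convolution character model.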

## Performance (54 DGA families, 30 runs each)

| Metric    | Value  |
|-----------|--------|
| Accuracy  | 0.9200 |
| F1        | 0.9000 |
| Precision | 0.9400 |
| Recall    | 0.8900 |
| FPR       | 0.0400 |
| Query Time | 0.490 ms/domain (CPU) |
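
For reference, the metrics in the table can be derived from confusion-matrix counts as sketched below, assuming the standard definitions with `dga` as the positive class. The function name `binary_metrics` and the counts are hypothetical and are not the benchmark's data. Because the reported values are aggregated over families and runs, they need not satisfy these identities exactly.

```python
def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary metrics with `dga` as the positive class."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "fpr": safe_div(fp, fp + tn),  # legit domains wrongly flagged as dga
    }


# Made-up counts for illustration only.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```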

## Usage

```python
import importlib.util

import torch
from huggingface_hub import hf_hub_download

# Download model files
weights = hf_hub_download("Reynier/dga-cnn", "dga_cnn_model_1M.pth")
model_py = hf_hub_download("Reynier/dga-cnn", "model.py")

# Load model.py as a module (note: this executes code downloaded from the Hub)
spec = importlib.util.spec_from_file_location("cnn_model", model_py)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

# Load model
model = mod.load_model(weights)

# Predict
results = mod.predict(model, ["google.com", "xkr3f9mq.ru"])
print(results)
# Expected output format:
# [{"domain": "google.com", "label": "legit", "score": 0.02},
#  {"domain": "xkr3f9mq.ru", "label": "dga", "score": 0.98}]
```
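
If a stricter or looser decision rule is needed, the returned records can be post-filtered by score. The sketch below assumes that `score` is the predicted probability of the `dga` class, which is consistent with the example output above but not confirmed by this card. The helper `flag_dga` and its default threshold are hypothetical.

```python
def flag_dga(results: list, threshold: float = 0.5) -> list:
    """Return the domains whose score meets or exceeds the threshold."""
    return [r["domain"] for r in results if r["score"] >= threshold]


# Records in the same shape as the output shown in the usage example.
sample = [
    {"domain": "google.com", "label": "legit", "score": 0.02},
    {"domain": "xkr3f9mq.ru", "label": "dga", "score": 0.98},
]
print(flag_dga(sample, threshold=0.9))  # ['xkr3f9mq.ru']
```

Raising the threshold generally lowers the false positive rate at the cost of recall, so the value should be tuned on held-out data.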

## Training Data

Trained on `train_1M.csv`: roughly 845K samples covering 54 DGA families plus legitimate domains.

## Citation

```bibtex
@article{reynier2026dga,
  title={DGA Multi-Family Benchmark: Comparing Classical and Transformer-based Detectors},
  author={Reynier and others},
  year={2026}
}
```