---
language: en
tags:
- dga
- cybersecurity
- domain-generation-algorithm
- text-classification
- pytorch
license: mit
metrics:
- accuracy
- f1
---
# DGA-CNN: Character-level CNN for DGA Detection
A character-level convolutional neural network (CNN) that classifies a domain name as legitimate or as generated by a Domain Generation Algorithm (DGA).
It is part of the **DGA Multi-Family Benchmark** (Reynier et al., 2026).
## Model Description
- **Architecture:** A single `Conv1d` layer (64 filters, kernel size 3), followed by max pooling and a fully connected (FC) layer
- **Input:** Character-level encoding of the domain name (at most 75 characters)
- **Output:** Binary classification: `legit` (0) or `dga` (1)
- **Framework:** PyTorch
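
The bullets above can be sketched in PyTorch as follows. This is an illustrative reconstruction, not the contents of the repository's `model.py`: the alphabet, the embedding layer and its size, the use of global max pooling, and the names `encode` and `CharCNNSketch` are all assumptions. Only the 64 filters, the kernel size of 3, and the 75-character limit come from this card.

```python
import torch
import torch.nn as nn

# Assumed alphabet; index 0 is reserved for padding and unknown characters.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-._"
CHAR_TO_IDX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
MAX_LEN = 75  # maximum input length stated in the model card


def encode(domain, max_len=MAX_LEN):
    """Map a domain to a fixed-length list of character indices (truncate, then zero-pad)."""
    ids = [CHAR_TO_IDX.get(ch, 0) for ch in domain.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))


class CharCNNSketch(nn.Module):
    """Embedding -> Conv1d(64 filters, kernel 3) -> global max pool -> FC (hypothetical layout)."""

    def __init__(self, vocab_size=len(ALPHABET) + 1, embed_dim=32,
                 n_filters=64, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, n_filters, kernel_size)
        self.pool = nn.AdaptiveMaxPool1d(1)  # max over the sequence dimension
        self.fc = nn.Linear(n_filters, 1)    # one logit for the binary decision

    def forward(self, x):
        h = self.embed(x).transpose(1, 2)    # (batch, embed_dim, seq_len)
        h = torch.relu(self.conv(h))         # (batch, n_filters, seq_len - 2)
        h = self.pool(h).squeeze(-1)         # (batch, n_filters)
        return self.fc(h).squeeze(-1)        # (batch,) raw logits


batch = torch.tensor([encode(d) for d in ["google.com", "xkr3f9mq.ru"]])
model = CharCNNSketch().eval()
with torch.no_grad():
    scores = torch.sigmoid(model(batch))     # untrained weights, so values are meaningless
print(batch.shape, scores.shape)
```

The sketch uses randomly initialised weights, so it only demonstrates tensor shapes; use the published weights and `model.py` for real predictions.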
## Performance (54 DGA families, 30 runs each)
| Metric     | Value                 |
|------------|-----------------------|
| Accuracy   | 0.9200                |
| F1         | 0.9000                |
| Precision  | 0.9400                |
| Recall     | 0.8900                |
| FPR        | 0.0400                |
| Query time | 0.490 ms/domain (CPU) |
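
The classification metrics in the table all derive from the four confusion-matrix counts. A minimal sketch of those definitions, assuming `dga` (1) is the positive class; the `binary_metrics` helper and the toy labels are hypothetical and do not reproduce the benchmark numbers:

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, F1 and FPR with 1 (dga) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    fpr = fp / (fp + tn) if fp + tn else 0.0  # legitimate domains flagged as DGA
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "fpr": fpr}


# Toy labels: 4 DGA domains and 6 legitimate ones.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

If the table values are means over families and runs, the reported F1 need not equal the harmonic mean of the reported precision and recall.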
## Usage
```python
import importlib.util

import torch  # the model is a PyTorch model, so torch must be installed
from huggingface_hub import hf_hub_download

# Download the weights and the model definition from the Hub
weights = hf_hub_download("Reynier/dga-cnn", "dga_cnn_model_1M.pth")
model_py = hf_hub_download("Reynier/dga-cnn", "model.py")

# Import model.py as a module.
# Note: this executes code downloaded from the Hub, so review model.py first.
spec = importlib.util.spec_from_file_location("cnn_model", model_py)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

# Build the model and load the trained weights
model = mod.load_model(weights)

# Classify a batch of domains
results = mod.predict(model, ["google.com", "xkr3f9mq.ru"])
print(results)
# Example output:
# [{"domain": "google.com", "label": "legit", "score": 0.02},
#  {"domain": "xkr3f9mq.ru", "label": "dga", "score": 0.98}]
```
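
Downstream code often needs only the domains flagged as DGA. A minimal sketch, assuming `predict` returns a list of dicts shaped like the example output above (`domain`, `label`, `score`) and that `score` is the DGA probability; the `flag_dga` helper and the 0.5 threshold are hypothetical and not part of the repository:

```python
def flag_dga(results, threshold=0.5):
    """Return (domain, score) pairs whose score meets the threshold, highest score first."""
    flagged = [(r["domain"], r["score"]) for r in results if r["score"] >= threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Sample input copied from the example output in the usage snippet.
sample_results = [
    {"domain": "google.com", "label": "legit", "score": 0.02},
    {"domain": "xkr3f9mq.ru", "label": "dga", "score": 0.98},
]
flagged = flag_dga(sample_results)
print(flagged)
```

Raising the threshold trades recall for a lower false positive rate, so it is worth tuning on your own traffic.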
## Training Data
Trained on `train_1M.csv`: roughly 845K samples covering 54 DGA families plus legitimate domains.
## Citation
```bibtex
@article{reynier2026dga,
title={DGA Multi-Family Benchmark: Comparing Classical and Transformer-based Detectors},
author={Reynier et al.},
year={2026}
}
```