---
license: mit
tags:
- log-analysis
- anomaly-detection
- bert
- cybersecurity
- multiclass-classification
language:
- en
datasets:
- custom-log-dataset
metrics:
- f1
- accuracy
pipeline_tag: text-classification
---

# DANN-BERT-Log-Anomaly-Detection - Log Anomaly Detection

This model is part of the **Log Anomaly Detection System**, which classifies system logs into 7 anomaly categories.

## Model Description

DANN-BERT-Log-Anomaly-Detection is a Domain-Adversarial Neural Network (DANN) BERT model fine-tuned for multi-class log anomaly detection. It classifies logs from 16+ different sources (Apache, SSH, Hadoop, etc.) into 7 categories:

1. **Normal** (0): Benign operations
2. **Security Anomaly** (1): Authentication failures, unauthorized access
3. **System Failure** (2): Crashes, kernel panics
4. **Performance Issue** (3): Timeouts, slow responses
5. **Network Anomaly** (4): Connection errors, packet loss
6. **Config Error** (5): Misconfigurations, invalid settings
7. **Hardware Issue** (6): Disk failures, memory errors

## Performance Metrics

- **F1-Score (Macro)**: 0.903
- **Accuracy**: 0.921
- **Model Type**: Domain-Adversarial Neural Network BERT
- **Classes**: 7 (normal, security_anomaly, system_failure, performance_issue, network_anomaly, config_error, hardware_issue)

## Usage

```python
import torch
from transformers import AutoTokenizer

# Load the model. This assumes model.pt holds the full pickled model object,
# so the model's class definition must be importable in this environment.
# weights_only=False unpickles arbitrary objects: only load files you trust.
model = torch.load('model.pt', map_location='cpu', weights_only=False)
model.eval()  # disable dropout for inference
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Example usage
log_text = "Apr 15 12:34:56 server sshd[1234]: Failed password for admin"
inputs = tokenizer(log_text, return_tensors='pt', max_length=128, truncation=True, padding=True)

with torch.no_grad():
    outputs = model(**inputs)  # assumes the output exposes a .logits attribute
    predictions = torch.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1)  # index 0-6, see Model Description
```

## Training Data

- **Sources**: 16 log types (Apache, SSH, Hadoop, HDFS, Linux, Windows, etc.)
- **Size**: ~32,000 labeled logs
- **Classes**: 7 anomaly categories
- **Features**: BERT embeddings + template features + statistical features

## Citation

```bibtex
@misc{log-anomaly-detection-2024,
  title={Log Anomaly Detection System},
  author={Krishna Sharma},
  year={2024},
  url={https://github.com/krishnasharma4415/log-anomaly-detection}
}
```

## License

MIT License - see LICENSE file for details.
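
## Decoding Predictions

The usage snippet stops at a class index. As a minimal sketch, the index can be turned into one of the seven class names listed in this card along with a softmax probability. The helper names `ID2LABEL`, `softmax`, and `decode_prediction` are hypothetical and not part of the released code, and the example logits are made up for illustration. The sketch uses only the standard library so it runs without the model file.

```python
import math

# Class indices and names as listed in this model card.
ID2LABEL = {
    0: "normal",
    1: "security_anomaly",
    2: "system_failure",
    3: "performance_issue",
    4: "network_anomaly",
    5: "config_error",
    6: "hardware_issue",
}


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(logits):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for illustration only; they are not real model output.
example_logits = [0.2, 3.1, -0.5, 0.0, 0.4, -1.2, -0.8]
label, prob = decode_prediction(example_logits)
print(label, round(prob, 3))
```

With the usage snippet, and assuming the model output exposes `.logits` as shown there, the input would be `outputs.logits[0].tolist()`.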