	---
	license: mit
	tags:
	- log-analysis
	- anomaly-detection
	- bert
	- cybersecurity
	- multiclass-classification
	language:
	- en
	datasets:
	- custom-log-dataset
	metrics:
	- f1
	- accuracy
	pipeline_tag: text-classification
	---

	# Log Anomaly Detection Models

	This repository contains the trained models for the Log Anomaly Detection System, which classifies system logs into 7 categories: one normal class and six anomaly types.

	## 🤖 Available Models

	### BERT-based Models
	- DANN-BERT (`models/DANN-BERT-Log-Anomaly-Detection/`) - Domain-Adversarial Neural Network
	- LoRA-BERT (`models/LoRA-BERT-Log-Anomaly-Detection/`) - Low-Rank Adaptation
	- Hybrid-BERT (`models/Hybrid-BERT-Log-Anomaly-Detection/`) - BERT + Template Features
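
To give a feel for why a LoRA variant trains far fewer parameters than a fully fine-tuned BERT, here is a minimal sketch of the adapter parameter arithmetic. The layer count and hidden size are those of `bert-base-uncased`; the rank and the number of adapted matrices per layer are assumptions picked for illustration, not this repository's configuration, and `lora_trainable_params` is a hypothetical helper.

```python
def lora_trainable_params(hidden_size, rank, num_layers, matrices_per_layer):
    """Count parameters added by LoRA adapters on square hidden_size x hidden_size weights.

    Each adapted weight gets two low-rank factors, A (rank x hidden_size) and
    B (hidden_size x rank), which together add 2 * hidden_size * rank parameters.
    """
    per_matrix = 2 * hidden_size * rank
    return per_matrix * matrices_per_layer * num_layers


# bert-base-uncased has 12 encoder layers with hidden size 768.
# rank=8 and 2 adapted matrices per layer (e.g. query and value) are ASSUMED
# values for illustration only; they are not taken from this repository.
adapter_params = lora_trainable_params(
    hidden_size=768, rank=8, num_layers=12, matrices_per_layer=2
)
print(adapter_params)  # 294912
```

The reported trainable count for a given model also depends on the actual rank, on which layers are adapted, and on any classification head that is trained alongside the adapters.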

	### Traditional ML Models
	- XGBoost (`models/XGBoost-Log-Anomaly-Detection/`) - Gradient Boosting Classifier

	## 📊 Model Performance

	| Model | F1-Score (Macro) | Accuracy | Parameters |
	|-------|------------------|----------|------------|
	| Hybrid-BERT | 92.8% | 94.3% | 110M |
	| DANN-BERT | 90.3% | 92.1% | 110M |
	| LoRA-BERT | 88.7% | 90.5% | 1.5M (trainable) |
	| XGBoost | 88.5% | 91.2% | - |
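
The table reports macro F1 alongside accuracy. As a rough illustration of how the two can differ on imbalanced data, the sketch below computes both in plain Python. The labels are made up for the example and the helper names are hypothetical; this is not the project's evaluation script.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in y_true or y_pred."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Made-up labels for illustration only (0 = Normal, 1 = Security Anomaly).
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1]
print(accuracy(y_true, y_pred))  # 5 of 6 correct
print(macro_f1(y_true, y_pred))  # mean of F1 = 8/9 (class 0) and 2/3 (class 1)
```

Because macro F1 weights every class equally, a single missed example in a rare class pulls it down more than it pulls down accuracy.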

	## 🎯 Classification Categories

	1. Normal (0): Benign operations
	2. Security Anomaly (1): Authentication failures, unauthorized access
	3. System Failure (2): Crashes, kernel panics
	4. Performance Issue (3): Timeouts, slow responses
	5. Network Anomaly (4): Connection errors, packet loss
	6. Config Error (5): Misconfigurations, invalid settings
	7. Hardware Issue (6): Disk failures, memory errors
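
For convenience, the class IDs above can be kept in a small lookup table. This is a minimal sketch; the names `ID2LABEL` and `LABEL2ID` are hypothetical and are not defined by the repository.

```python
# Class ID -> category name, copied from the list above.
ID2LABEL = {
    0: "Normal",
    1: "Security Anomaly",
    2: "System Failure",
    3: "Performance Issue",
    4: "Network Anomaly",
    5: "Config Error",
    6: "Hardware Issue",
}

# Reverse lookup: category name -> class ID.
LABEL2ID = {name: class_id for class_id, name in ID2LABEL.items()}

print(ID2LABEL[1])               # Security Anomaly
print(LABEL2ID["Config Error"])  # 5
```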

	## 🚀 Usage

	### Download Models

	```python
	from huggingface_hub import hf_hub_download

	# Download BERT model
	model_path = hf_hub_download(
	    repo_id="krishnas4415/log-anomaly-detection-models",
	    filename="models/Hybrid-BERT-Log-Anomaly-Detection/pytorch_model.pt"
	)

	# Download XGBoost model
	xgb_path = hf_hub_download(
	    repo_id="krishnas4415/log-anomaly-detection-models",
	    filename="models/XGBoost-Log-Anomaly-Detection/best_mod.pkl"
	)
	```
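
If you would rather fetch a whole model folder than a single file, `snapshot_download` with `allow_patterns` is one option. The sketch below assumes the folder naming scheme listed under Available Models; `model_folder_pattern` and `download_model_folder` are hypothetical helpers, and nothing is downloaded until `download_model_folder` is called.

```python
REPO_ID = "krishnas4415/log-anomaly-detection-models"


def model_folder_pattern(model_name):
    """Build a glob pattern for one model folder, following the README's naming scheme."""
    return f"models/{model_name}-Log-Anomaly-Detection/*"


def download_model_folder(model_name, repo_id=REPO_ID):
    """Download every file in one model folder and return the local snapshot directory."""
    # Imported lazily so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=[model_folder_pattern(model_name)],
    )


print(model_folder_pattern("Hybrid-BERT"))  # models/Hybrid-BERT-Log-Anomaly-Detection/*
```

Calling `download_model_folder("XGBoost")` would return the local snapshot directory, with the files under `models/XGBoost-Log-Anomaly-Detection/` inside it.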

	### Load and Use Models

	```python
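	import pickle

	import torch
	from transformers import AutoTokenizer

	# Load the BERT model. This assumes the .pt file holds a fully pickled model
	# object (not just a state_dict). Unpickling can execute arbitrary code, so
	# only load files you trust; recent PyTorch versions need weights_only=False here.
	model = torch.load(model_path, map_location='cpu', weights_only=False)
	model.eval()  # disable dropout for inference
	tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

	# Load the XGBoost model (its expected feature input is not shown here)
	with open(xgb_path, 'rb') as f:
	    xgb_model = pickle.load(f)

	# Example prediction
	log_text = "Apr 15 12:34:56 server sshd[1234]: Failed password for admin"
	inputs = tokenizer(log_text, return_tensors='pt', max_length=128, truncation=True, padding=True)

	with torch.no_grad():
	    outputs = model(**inputs)
	    # Assumes a Hugging Face-style output with a .logits attribute
	    predictions = torch.softmax(outputs.logits, dim=-1)
	    predicted_class = torch.argmax(predictions, dim=-1)  # class ID in 0-6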
	```
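
Once you have logits for a log line, you may want a confidence score and a way to flag uncertain predictions. The following is a minimal sketch in plain Python. The logits are made up, `decode` is a hypothetical helper, and the 0.5 threshold is an arbitrary assumption rather than a value used by the project.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, min_confidence=0.5):
    """Return (class_id, confidence, is_confident) for one log line."""
    probs = softmax(logits)
    class_id = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[class_id]
    return class_id, confidence, confidence >= min_confidence


# Hypothetical logits for one log line over the 7 classes.
logits = [0.1, 3.2, 0.3, -0.5, 0.0, -1.0, 0.2]
class_id, confidence, is_confident = decode(logits)
print(class_id, round(confidence, 3), is_confident)
```

With a real model output, the list of logits for the first input could be obtained with something like `outputs.logits[0].tolist()`.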

	## 📚 Training Data

	- Sources: 16 log types (Apache, SSH, Hadoop, HDFS, Linux, Windows, etc.)
	- Size: ~32,000 labeled logs
	- Classes: 7 categories (Normal plus 6 anomaly types)
	- Features: BERT embeddings + template features + statistical features
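
To make "template features" more concrete: a common way to derive a log template is to mask the variable parts of a line (IP addresses, numbers, hex values) so that lines with the same structure map to the same string. The sketch below illustrates that general idea with regular expressions. It is an assumption about what such a step can look like, not the feature extraction code used to train these models; `MASKS` and `to_template` are hypothetical names.

```python
import re

# Order matters: mask the most specific patterns first, plain digits last.
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def to_template(log_line):
    """Replace variable fields with placeholder tokens to expose the line's template."""
    for pattern, token in MASKS:
        log_line = pattern.sub(token, log_line)
    return log_line


print(to_template("Failed password for admin from 10.0.0.5 port 22"))
# Failed password for admin from <IP> port <NUM>
```

Counts of each template, or a template ID per line, can then serve as extra features next to text embeddings.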

	## 🔗 Related Links

	- Main Project: [Log Anomaly Detection System](https://github.com/krishnasharma4415/log-anomaly-detection)
	- Live Demo: [Frontend Application](https://log-anomaly-frontend.vercel.app)
	- API: [Backend API](https://log-anomaly-api.onrender.com)

	## 📄 Citation

	```bibtex
	@misc{log-anomaly-detection-2024,
	title={Log Anomaly Detection System},
	author={Krishna Sharma},
	year={2024},
	url={https://github.com/krishnasharma4415/log-anomaly-detection}
	}
	```

	## 📝 License

	MIT License - see LICENSE file for details.