Shoriful025/cyber_threat_log_classifier


	---
	language: en
	license: apache-2.0
	tags:
	- cybersecurity
	- log-analysis
	- threat-detection
	- roberta
	---

	# cyber_threat_log_classifier

	## Overview
	This model is a fine-tuned RoBERTa-base classifier that analyzes raw HTTP server logs and system audit trails for malicious patterns. It is designed to identify common web-based attacks such as SQL Injection and Cross-Site Scripting (XSS) with high precision, so that its labels can drive real-time security orchestration.

	## Model Architecture
	The model uses the RoBERTa Transformer encoder architecture:

	- Encoder: 12-layer Transformer with 768 hidden units and 12 attention heads.
	- Input: Tokenized raw log strings (up to 512 tokens).
	- Classification Head: Linear layer applied to the final hidden state of the first token (`<s>` in RoBERTa, the counterpart of BERT's `[CLS]`), mapping it to 5 threat categories.
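
To make the shapes above concrete, here is a minimal pure-Python sketch of the classification step as the card describes it: one linear layer that maps a 768-dimensional first-token state to 5 logits, followed by a softmax. The weights and the hidden vector are random stand-ins, not the trained parameters, and the `LABEL_n` names are placeholders because the card does not name its five classes. The stock `RobertaForSequenceClassification` head in Transformers places a dense layer and tanh before the projection; this sketch follows the single-layer description given here instead.

```python
import math
import random

HIDDEN_SIZE = 768  # RoBERTa-base hidden size, as stated in the card
NUM_LABELS = 5     # five threat categories, as stated in the card


def linear_head(hidden, weights, bias):
    """Map one pooled hidden vector to one logit per class."""
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Random stand-ins for the trained parameters and the encoder output.
rng = random.Random(0)
weights = [[rng.gauss(0.0, 0.02) for _ in range(HIDDEN_SIZE)]
           for _ in range(NUM_LABELS)]
bias = [0.0] * NUM_LABELS
first_token_state = [rng.gauss(0.0, 1.0) for _ in range(HIDDEN_SIZE)]

logits = linear_head(first_token_state, weights, bias)
probs = softmax(logits)
predicted = max(range(NUM_LABELS), key=probs.__getitem__)

# Parameters in the head alone: one weight per (class, hidden unit) plus biases.
head_params = HIDDEN_SIZE * NUM_LABELS + NUM_LABELS
print(f"LABEL_{predicted}", round(probs[predicted], 3), head_params)
```

Under this description the head adds only 3,845 parameters on top of the encoder, so nearly all of the model's capacity sits in the 12 Transformer layers.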

	## Intended Use
	- SIEM Integration: Automated labeling of incoming logs in Security Information and Event Management systems.
	- Incident Response: Prioritizing security alerts based on the classified threat type.
	- Log Cleaning: Filtering out high-volume benign noise from security dashboards.
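
The three uses above share one post-processing pattern: take the classifier's label and score for each log line, drop confidently benign lines, and order the rest for an analyst. The sketch below shows one way that could look. The label names (`BENIGN`, `SQL_INJECTION`, `XSS`), the priority table, the threshold, and the scores in the sample data are all hypothetical, since the card does not list its classes or any output values.

```python
from dataclasses import dataclass

# Hypothetical labels and priorities; the card does not name its five classes.
PRIORITY = {"SQL_INJECTION": 1, "XSS": 2}
BENIGN_LABEL = "BENIGN"
DEFAULT_PRIORITY = 9  # anything without an explicit priority sorts last


@dataclass
class ClassifiedLog:
    line: str
    label: str
    score: float


def triage(events, benign_threshold=0.9):
    """Drop confidently benign lines, then sort by priority and descending score."""
    kept = [e for e in events
            if not (e.label == BENIGN_LABEL and e.score >= benign_threshold)]
    return sorted(
        kept,
        key=lambda e: (PRIORITY.get(e.label, DEFAULT_PRIORITY), -e.score),
    )


# Illustrative inputs only; these scores are made up, not model output.
events = [
    ClassifiedLog("GET /index.html HTTP/1.1", "BENIGN", 0.99),
    ClassifiedLog("GET /search?q=<script>alert(1)</script>", "XSS", 0.93),
    ClassifiedLog("GET /item?id=1' OR '1'='1", "SQL_INJECTION", 0.88),
    ClassifiedLog("GET /health HTTP/1.1", "BENIGN", 0.55),
]
queue = triage(events)
for event in queue:
    print(event.label, event.score, event.line)
```

Low-confidence benign lines are deliberately kept at the end of the queue rather than dropped, so a borderline prediction still reaches a human reviewer.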

	## Limitations
	- Obfuscated Payloads: Heavily encoded or polymorphic attack payloads may bypass detection if they are not represented in the training distribution.
	- Context Window: Extremely long request bodies or multi-line log events exceeding 512 tokens will be truncated, so any payload located past the cutoff is never seen by the model.
	- Adversarial Examples: Sophisticated attackers may craft "log-injection" payloads specifically designed to mislead the classifier.
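
Two of these limitations can be partly mitigated before a log line reaches the model. The sketch below is one possible preprocessing step, not part of the published model: `decode_layers` undoes nested percent-encoding with the standard library's `urllib.parse.unquote`, and `windows` splits an over-long token sequence into overlapping windows instead of truncating it. The function names, the round limit, and the overlap of 64 are assumptions chosen for illustration.

```python
from urllib.parse import unquote

MAX_TOKENS = 512  # context limit stated in the card


def decode_layers(text, max_rounds=3):
    """Undo nested percent-encoding, stopping once the text stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(text)
        if decoded == text:
            break
        text = decoded
    return text


def windows(token_ids, size=MAX_TOKENS, overlap=64):
    """Split a token sequence into overlapping windows of at most `size` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    out = []
    start = 0
    while True:
        out.append(token_ids[start:start + size])
        if start + size >= len(token_ids):
            break
        start += step
    return out


print(decode_layers("%253Cscript%253E"))
print([len(w) for w in windows(list(range(1000)))])
```

In practice the window size would need to leave room for the tokenizer's special tokens, and each window would be classified separately, with the caller deciding how to combine the per-window predictions (for example, by keeping the most severe one). Percent-decoding covers only one obfuscation scheme and does not address the adversarial case.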