Shoriful025/cyber_threat_log_classifier

threat-detection


Shoriful025 committed on Dec 26, 2025

Commit 8608862 · verified · 1 Parent(s): 02aa420

Create README.md

Files changed (1)

README.md +33 -0

README.md ADDED

	@@ -0,0 +1,33 @@

+---
+language: en
+license: apache-2.0
+tags:
+- cybersecurity
+- log-analysis
+- threat-detection
+- roberta
+---
+# cyber_threat_log_classifier
+## Overview
+This model is a fine-tuned RoBERTa-base classifier that analyzes raw HTTP server logs and system audit trails for malicious patterns. It targets common web-based attacks such as SQL Injection and Cross-Site Scripting (XSS), and its labels are intended to feed real-time security orchestration.
+## Model Architecture
+The model uses the RoBERTa-base Transformer encoder.
+- **Encoder:** 12-layer Transformer with a hidden size of 768 and 12 attention heads.
+- **Input:** Tokenized raw log strings (up to 512 tokens).
+- **Classification Head:** Linear layer on top of the `<s>` token representation (RoBERTa's equivalent of BERT's `[CLS]`), mapping the hidden state to 5 threat categories.
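As a rough illustration of the classification head described above, the following pure-Python sketch applies one linear layer and a softmax to a pooled `<s>` vector. The small hidden size, the weights, the input vector, and the function names are hypothetical stand-ins chosen for readability; the actual head would operate on 768-dimensional states with learned parameters.

```python
import math

HIDDEN_SIZE = 4   # stand-in for 768, kept small for readability (assumption)
NUM_LABELS = 5    # the README states 5 threat categories


def linear_head(pooled, weights, bias):
    """Map one pooled <s> vector to one logit per threat category."""
    return [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weights, bias)
    ]


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical parameters: row i holds the weights for category i.
weights = [[0.1 * (i + 1)] * HIDDEN_SIZE for i in range(NUM_LABELS)]
bias = [0.0] * NUM_LABELS
pooled = [1.0, 0.5, -0.5, 1.0]  # made-up pooled representation

probs = softmax(linear_head(pooled, weights, bias))
predicted = max(range(NUM_LABELS), key=probs.__getitem__)
```

If the head really is a single linear layer at the sizes stated above, it would hold 768 × 5 + 5 = 3,845 parameters.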
+## Intended Use
+- **SIEM Integration:** Automated labeling of incoming logs in Security Information and Event Management systems.
+- **Incident Response:** Prioritizing security alerts based on the classified threat type.
+- **Log Cleaning:** Filtering out high-volume benign noise from security dashboards.
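The incident-response and log-cleaning uses above could look roughly like the sketch below, which drops benign events and orders the rest by threat type. The label strings, the priority values, and the sample events are all hypothetical: the README names SQL Injection and XSS but does not list the 5 categories or any priority scheme.

```python
# Hypothetical label names and priorities (not taken from the README).
BENIGN_LABEL = "benign"
PRIORITY = {"sql_injection": 1, "xss": 2}  # 1 = most urgent
UNKNOWN_PRIORITY = 3  # labels without a configured priority sort last


def triage(events):
    """Drop benign events, then sort the remaining alerts by priority."""
    alerts = [e for e in events if e["label"] != BENIGN_LABEL]
    return sorted(
        alerts,
        key=lambda e: PRIORITY.get(e["label"], UNKNOWN_PRIORITY),
    )


# Made-up classified events, standing in for model output.
events = [
    {"log": "GET /index.html", "label": "benign"},
    {"log": "GET /search?q=<script>alert(1)</script>", "label": "xss"},
    {"log": "GET /item?id=1' OR '1'='1", "label": "sql_injection"},
]

queue = triage(events)
```

Events with an unrecognized label are kept rather than dropped, so a label missing from the priority table cannot silently hide an alert.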
+## Limitations
+- **Obfuscated Payloads:** Highly encoded or polymorphic attack payloads may bypass detection if not represented in the training distribution.
+- **Context Window:** Inputs longer than 512 tokens, such as very long request bodies or multi-line log events, are truncated, so content beyond the limit is not seen by the model.
+- **Adversarial Examples:** Sophisticated attackers may craft "log-injection" payloads specifically designed to mislead the classifier.
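One possible mitigation for the obfuscated-payload limitation is to normalize logs before classification. The sketch below repeatedly percent-decodes a log line with the standard library's `urllib.parse.unquote`. This is an assumption about preprocessing, not something the README describes, and it would likely help only if the training logs were normalized the same way; `decode_layers` and the sample line are hypothetical.

```python
from urllib.parse import unquote


def decode_layers(text, max_rounds=3):
    """Percent-decode repeatedly until the text stops changing.

    max_rounds bounds the work spent on deeply nested encodings.
    """
    for _ in range(max_rounds):
        decoded = unquote(text)
        if decoded == text:
            break
        text = decoded
    return text


# Made-up, double-encoded request line.
raw = "GET /item?id=1%2527%2520OR%25201%253D1"
normalized = decode_layers(raw)
```

Decoding can also shorten a line, which leaves slightly more of it inside the 512-token window.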