viitheone committed on
Commit cc5fa64 · verified · 1 Parent(s): b6fcbb2

Upload 3 files

Files changed (3)
  1. README.md +48 -3
  2. tokenizer.json +0 -0
  3. tokenizer_config.json +14 -0
README.md CHANGED
@@ -1,3 +1,48 @@
- ---
- license: mit
- ---
+ ---
+ language:
+ - en
+ pipeline_tag: text-classification
+ tags:
+ - web-application-firewall
+ - waf
+ - security
+ ---
+
+ # 4thwall WAF Model
+
+ This model is a custom Web Application Firewall (WAF) classifier built by fine-tuning the `distilbert` (DistilBertForSequenceClassification) architecture. It is designed to identify and classify HTTP requests as either safe or potentially malicious (similarly to ModSecurity).
+
+ ## Model Details
+
+ - **Model Type:** Text Classification (DistilBERT)
+ - **Task:** Identifying Malicious HTTP Requests (Web Application Firewall)
+ - **Use Case:** Can be used as a standalone classifier or inline ML-based proxy to analyze real-time HTTP traffic and reject high-risk requests (e.g., 403 Forbidden).
+
+ ## Intended Uses & Limitations
+
+ - **Intended Use:** Inspecting HTTP paths, headers, and payloads for malicious intent (e.g., SQL Injection, XSS, etc.). Ideal for use within an ML pipeline integrating with services like Nginx or a customized inline WAF proxy.
+ - **Limitations:** The model acts as a learning proxy and can still result in False Positives or False Negatives. Continuous learning and manual feedback over time can help improve model confidence.
+
+ ## Metrics
+
+ During evaluation, the model achieved the following metrics:
+ - **Accuracy:** 94.23%
+ - **Precision:** 92.50%
+ - **Recall:** 93.10%
+ - **F1 Score:** 92.80%
+
+ ## How to Get Started with the Model
+
+ ```python
+ from transformers import pipeline
+
+ # Load the WAF classifier
+ waf_classifier = pipeline("text-classification", model="your-username/my-waf-model")
+
+ # Example request payload
+ payload = "GET /index.php?id=1 UNION SELECT 1,2,3-- HTTP/1.1"
+
+ # Predict if malicious or benign
+ result = waf_classifier(payload)
+ print(result)
+ ```
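The README snippet stops at printing the classifier result, while the "Use Case" bullet mentions rejecting high-risk requests with 403 Forbidden. A minimal sketch of that decision step follows. It relies only on the usual `text-classification` pipeline output shape (a list of dicts with `label` and `score`). The label name `LABEL_1` for malicious traffic and the 0.90 threshold are assumptions for illustration; neither is stated in the model card, so check the model's `id2label` mapping before relying on them.

```python
from typing import Dict, List, Union

# Hypothetical values, not taken from the model card:
MALICIOUS_LABEL = "LABEL_1"  # assumed name of the "malicious" class
BLOCK_THRESHOLD = 0.90       # assumed minimum confidence before blocking


def status_for(result: List[Dict[str, Union[str, float]]]) -> int:
    """Map a text-classification pipeline result to an HTTP status code.

    Returns 403 only when the top prediction is the assumed malicious
    label and its score reaches the threshold; otherwise returns 200.
    """
    top = result[0]
    is_malicious = top["label"] == MALICIOUS_LABEL
    if is_malicious and float(top["score"]) >= BLOCK_THRESHOLD:
        return 403
    return 200


# Hand-written results in the pipeline's output shape (no model call here).
print(status_for([{"label": "LABEL_1", "score": 0.97}]))
print(status_for([{"label": "LABEL_1", "score": 0.55}]))
print(status_for([{"label": "LABEL_0", "score": 0.99}]))
```

Keeping the threshold separate from the label check makes it possible to trade false positives against false negatives, which the "Limitations" bullet names as the main risk, without retraining the model.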
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,14 @@
+ {
+ "backend": "tokenizers",
+ "cls_token": "[CLS]",
+ "do_lower_case": true,
+ "is_local": false,
+ "mask_token": "[MASK]",
+ "model_max_length": 512,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "strip_accents": null,
+ "tokenize_chinese_chars": true,
+ "tokenizer_class": "BertTokenizer",
+ "unk_token": "[UNK]"
+ }
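Two settings in this config affect what the classifier can see: `do_lower_case: true` and `strip_accents: null`. In the BERT tokenizer, an unset `strip_accents` follows the lower-casing flag. The sketch below approximates only those two normalization steps with the standard library to show that case variants of a payload collapse to the same text before tokenization. `approximate_normalize` is a hypothetical helper for illustration, not the tokenizer's actual implementation, which also cleans control characters and handles Chinese characters.

```python
import json
import unicodedata

# The three relevant keys, copied from the tokenizer_config.json diff.
TOKENIZER_CONFIG = json.loads("""
{
  "do_lower_case": true,
  "model_max_length": 512,
  "strip_accents": null
}
""")


def approximate_normalize(text: str, cfg: dict) -> str:
    """Approximate BERT-style lower-casing and accent stripping.

    Illustration only: the real tokenizer performs further cleanup.
    """
    lower = bool(cfg.get("do_lower_case"))
    strip = cfg.get("strip_accents")
    if strip is None:
        # An unset strip_accents follows the lower-casing setting.
        strip = lower
    if lower:
        text = text.lower()
    if strip:
        # Decompose characters, then drop combining marks (category Mn).
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(
            ch for ch in decomposed if unicodedata.category(ch) != "Mn"
        )
    return text


upper = approximate_normalize("id=1 UNION SELECT 1,2,3--", TOKENIZER_CONFIG)
mixed = approximate_normalize("id=1 uNiOn SeLeCt 1,2,3--", TOKENIZER_CONFIG)
print(upper == mixed)
```

One consequence is that mixed-case evasion attempts reach the model in the same form as their lower-case equivalents. Separately, `model_max_length: 512` means requests longer than 512 tokens would need to be truncated or split before classification.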