---
title: Log Classification System
emoji: π
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.23.0
app_file: app.py
pinned: false
license: mit
---
# Log Classification System
A **production-inspired hybrid log classification pipeline** that routes enterprise logs through three tiers (Regex → BERT + Logistic Regression → LLM), based on pattern confidence and source system.
## Architecture
```
Input Log
    │
    ├──► [Tier 1] Regex Classifier      → Fixed patterns (sub-ms latency)
    │         │ No match?
    │         ▼
    ├──► [Tier 2] BERT + LogReg         → High-confidence ML (conf > 0.5)
    │         │ Low confidence?
    │         ▼
    └──► [Tier 3] LLM (HF Inference)    → LegacyCRM / rare patterns
```
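The fallback order in the diagram can be sketched roughly as below. This is a minimal illustration, not the code in `app.py`: the regex patterns, the function names, and the `fake_ml_tier` / `fake_llm_tier` stand-ins are all hypothetical, and only the `conf > 0.5` threshold comes from the diagram.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical patterns for illustration only; the project's real rules are not listed here.
REGEX_RULES = [
    (re.compile(r"User \w+ logged (in|out)", re.IGNORECASE), "User Action"),
    (re.compile(r"Backup (started|completed)", re.IGNORECASE), "System Notification"),
]

# From the diagram: a Tier 2 prediction is accepted only when conf > 0.5.
CONFIDENCE_THRESHOLD = 0.5


def classify_with_regex(message: str) -> Optional[str]:
    """Tier 1: return the label of the first matching pattern, or None."""
    for pattern, label in REGEX_RULES:
        if pattern.search(message):
            return label
    return None


def classify(
    message: str,
    ml_tier: Callable[[str], Tuple[str, float]],
    llm_tier: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (category, tier_used), trying the cheapest tier first."""
    label = classify_with_regex(message)
    if label is not None:
        return label, "Regex"

    # Tier 2: the ML tier returns a label plus a confidence score.
    label, confidence = ml_tier(message)
    if confidence > CONFIDENCE_THRESHOLD:
        return label, "BERT"

    # Tier 3: low-confidence messages fall through to the LLM.
    return llm_tier(message), "LLM"


# Stand-ins (assumptions) so the sketch runs without models or network access.
def fake_ml_tier(message: str) -> Tuple[str, float]:
    return ("HTTP Status", 0.9) if "HTTP" in message else ("Error", 0.2)


def fake_llm_tier(message: str) -> str:
    return "Workflow Error"


if __name__ == "__main__":
    for msg in [
        "User User123 logged in.",
        "GET /api/v2 HTTP/1.1 status: 200",
        "Lead conversion failed for prospect 7842",
    ]:
        print(msg, "->", classify(msg, fake_ml_tier, fake_llm_tier))
```

Passing the ML and LLM tiers in as callables keeps the cascade testable without loading an embedding model or calling the inference API.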
## Categories
| Category | Tier Used |
|---|---|
| User Action | Regex |
| System Notification | Regex |
| HTTP Status | BERT |
| Security Alert | BERT |
| Critical Error | BERT |
| Error | BERT |
| Resource Usage | BERT |
| Workflow Error | LLM |
| Deprecation Warning | LLM |
## Setup
### HuggingFace Spaces Secrets Required
- `HF_TOKEN`: your Hugging Face token (used for LLM inference on LegacyCRM logs)
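Spaces exposes secrets to the app as environment variables, so the token can be read at startup. The helper below is a hedged sketch; `get_hf_token` is a hypothetical name and the app's actual loading code may differ.

```python
import os
from typing import Mapping


def get_hf_token(env: Mapping[str, str] = os.environ) -> str:
    """Read HF_TOKEN from the environment and fail early if it is missing."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; add it as a Space secret or export it locally."
        )
    return token
```

For a local run, exporting the variable in the shell before `python app.py` has the same effect as the Space secret.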
### Local Setup
```bash
pip install -r requirements.txt
python app.py
```
## Source Systems
- `ModernCRM`, `ModernHR`, `BillingSystem`, `AnalyticsEngine`, `ThirdPartyAPI` → Regex → BERT
- `LegacyCRM` → LLM directly (too few training samples for ML)
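The source-based routing above can be expressed as a small lookup. This is an illustrative sketch: `tiers_for_source` is a hypothetical helper, and rejecting unknown sources with a `ValueError` is an assumption, since this README does not say how unlisted systems are handled.

```python
from typing import List

# Source systems listed in this README.
ML_SOURCES = {"ModernCRM", "ModernHR", "BillingSystem", "AnalyticsEngine", "ThirdPartyAPI"}
LLM_ONLY_SOURCES = {"LegacyCRM"}


def tiers_for_source(source: str) -> List[str]:
    """Return the ordered tiers a log from `source` is sent through."""
    if source in LLM_ONLY_SOURCES:
        # Too few training samples for ML, so skip straight to the LLM.
        return ["LLM"]
    if source in ML_SOURCES:
        return ["Regex", "BERT"]
    # Assumption: unknown sources are rejected rather than silently routed.
    raise ValueError(f"Unknown source system: {source!r}")
```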
## Tech Stack
`sentence-transformers` · `scikit-learn` · `huggingface-hub` · `gradio` · `pandas`