| --- |
| title: Log Classification System |
| emoji: π |
| colorFrom: blue |
| colorTo: indigo |
| sdk: gradio |
| sdk_version: 5.23.0 |
| app_file: app.py |
| pinned: false |
| license: mit |
| --- |
| |
| # π Log Classification System |
|
|
| A **production-inspired hybrid log classification pipeline** that routes enterprise logs through 3 tiers β Regex β BERT + Logistic Regression β LLM β based on pattern confidence and source system. |
|
|
| ## Architecture |
|
|
| ``` |
| Input Log |
| β |
| βββΊ [Tier 1] Regex Classifier β Fixed patterns (sub-ms latency) |
| β β No match? |
| β βΌ |
| βββΊ [Tier 2] BERT + LogReg β High-confidence ML (conf > 0.5) |
| β β Low confidence? |
| β βΌ |
| βββΊ [Tier 3] LLM (HF Inference) β LegacyCRM / rare patterns |
| ``` |
|
|
| ## Categories |
|
|
| | Category | Tier Used | |
| |---|---| |
| | User Action | Regex | |
| | System Notification | Regex | |
| | HTTP Status | BERT | |
| | Security Alert | BERT | |
| | Critical Error | BERT | |
| | Error | BERT | |
| | Resource Usage | BERT | |
| | Workflow Error | LLM | |
| | Deprecation Warning | LLM | |
|
|
| ## Setup |
|
|
| ### HuggingFace Spaces Secrets Required |
| - `HF_TOKEN` β your HuggingFace token (for LLM inference on LegacyCRM logs) |
|
|
| ### Local Setup |
| ```bash |
| pip install -r requirements.txt |
| python app.py |
| ``` |
|
|
| ## Source Systems |
| - `ModernCRM`, `ModernHR`, `BillingSystem`, `AnalyticsEngine`, `ThirdPartyAPI` β Regex β BERT |
| - `LegacyCRM` β LLM directly (too few training samples for ML) |
|
|
| ## Tech Stack |
| `sentence-transformers` Β· `scikit-learn` Β· `huggingface-hub` Β· `gradio` Β· `pandas` |