Business Issue Allocation Classifier
A text classifier that maps a natural-language description of a business problem to the most likely data engineering solution category.
Model Details
- Classifier: SVM (Support Vector Machine)
- Embedding model: sentence-transformers/all-mpnet-base-v2
- Classes: 9 (stream_processing, etl_pipeline, data_warehouse, data_lake, api_integration, ml_feature_store, data_caching, data_governance, data_quality)
- Accuracy: 88.2%
- Macro F1: 88.4%
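For readers comparing the two metrics above: accuracy is the share of correct predictions, while macro F1 averages the per-class F1 scores with equal weight, so a rare class counts as much as a common one. The following is a minimal pure-Python sketch of both definitions, not the project's evaluation code. The function names and the toy labels are illustrative assumptions.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    f1_scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels (made up for illustration, not taken from the dataset)
y_true = ["etl_pipeline", "etl_pipeline", "data_lake",
          "data_lake", "data_quality", "data_quality"]
y_pred = ["etl_pipeline", "data_lake", "data_lake",
          "data_lake", "data_quality", "etl_pipeline"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Because each class contributes equally to the average, macro F1 can sit slightly above or below accuracy depending on how errors are spread across the nine classes.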
How to Use
Clone the full project from GitHub and run:
from src.inference import Predictor
predictor = Predictor()
result = predictor.predict("We need to detect fraud before transactions are approved.")
print(result["predicted_label"])
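To classify several descriptions at once, a small helper can group them by predicted category. The sketch below relies only on the `predict(text)["predicted_label"]` interface shown above. The helper `route_issues` and the `KeywordStubPredictor` class are hypothetical names introduced here so the example runs without the cloned project; they are not part of the repository.

```python
from collections import defaultdict


def route_issues(predictor, descriptions):
    """Group problem descriptions by the label the predictor assigns.

    `predictor` is any object whose predict(text) returns a dict
    containing a "predicted_label" key, as in the usage above.
    """
    routed = defaultdict(list)
    for text in descriptions:
        label = predictor.predict(text)["predicted_label"]
        routed[label].append(text)
    return dict(routed)


class KeywordStubPredictor:
    """Hypothetical stand-in for the project's Predictor (illustration only).

    It applies a trivial keyword rule, not the SVM model.
    """

    def predict(self, text):
        if "real time" in text.lower():
            label = "stream_processing"
        else:
            label = "etl_pipeline"
        return {"predicted_label": label}


issues = [
    "We need to flag suspicious payments in real time.",
    "Nightly sales files must be loaded into reporting tables.",
]
for label, texts in route_issues(KeywordStubPredictor(), issues).items():
    print(label, len(texts))
```

With the full project cloned, passing `Predictor()` from `src.inference` in place of the stub should work the same way, assuming its return value keeps the `predicted_label` key.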
Dataset
dianamikova/business-issue-allocation