# Finance Classifier Model

This directory contains the fine-tuned mBERT model (`bert-base-multilingual-cased`) that classifies conversation text as financial or non-financial.

## Model Files

The model directory should contain:

- `config.json` - Model configuration
- `tokenizer_config.json` - Tokenizer configuration
- `special_tokens_map.json` - Special tokens mapping
- `pytorch_model.bin` - Trained model weights (generated by training)
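
Before loading the model, it can help to confirm that the directory is complete. The following is a minimal sketch using only the standard library; the helper name `missing_model_files` is hypothetical, and the file list simply mirrors the one above.

```python
from pathlib import Path

# File names copied from the list above; adjust if your export differs.
EXPECTED_FILES = (
    "config.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "pytorch_model.bin",
)


def missing_model_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means all four files are present. If only `pytorch_model.bin` is reported, training has probably not been run yet.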

## Training

To generate the trained model, run:

```bash
cd nlp/
python train_classifier.py
```

This will:

1. Load training data from `../classifier_training.json`
2. Fine-tune `bert-base-multilingual-cased` to classify text as financial or non-financial
3. Save the trained model to this directory
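
Step 1 above could look roughly like the sketch below. The schema of `classifier_training.json` is not documented here, so this assumes a top-level JSON list of objects with a `text` string and a `label` of 0 or 1. Both the field names and the helper `load_examples` are hypothetical and may not match what `train_classifier.py` actually does.

```python
import json


def load_examples(path):
    """Load (texts, labels) from a JSON file.

    Assumes a top-level list of {"text": str, "label": 0 or 1} objects.
    This layout is an assumption, not the confirmed training-file format.
    """
    with open(path, encoding="utf-8") as f:
        records = json.load(f)

    texts, labels = [], []
    for i, rec in enumerate(records):
        label = rec["label"]
        # Reject anything outside the binary label set early.
        if label not in (0, 1):
            raise ValueError(f"record {i}: label must be 0 or 1, got {label!r}")
        texts.append(rec["text"])
        labels.append(label)
    return texts, labels
```

Validating labels at load time surfaces malformed records before any fine-tuning time is spent.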

## Model Details

- **Base Model**: `bert-base-multilingual-cased`
- **Task**: Binary classification (financial: 1, non-financial: 0)
- **Input**: Plain-text sentences
- **Languages**: Multilingual (the base model was pre-trained on 104 languages)
- **Training File**: `classifier_training.json`
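
Given the label mapping above, the two raw scores from a classification head can be turned into a label and a confidence value with a softmax. This sketch assumes the model emits one logit per class in index order (0 = non-financial, 1 = financial) and that confidence means the softmax probability of the winning class. The helper `decode_logits` and its return shape are illustrative, not taken from the actual classifier code.

```python
import math

# Label ids as listed under "Task" above.
ID2LABEL = {0: "non-financial", 1: "financial"}


def decode_logits(logits):
    """Map [logit_non_financial, logit_financial] to a label and confidence."""
    # Subtract the max before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pick the class with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"prediction": ID2LABEL[best], "confidence": probs[best]}
```

For example, `decode_logits([-1.0, 2.0])` yields the `financial` label, since the second logit is larger.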

## Usage

```python
from nlp.classifier import FinanceClassifier

clf = FinanceClassifier()
# Romanized Hindi, roughly "I need to take a loan"
result = clf.predict("Loan lena chahiye")
print(result)  # e.g. {'prediction': 'financial', 'confidence': 0.95}
```