# Contract Sentiment Classifier (BERT)
A fine-tuned BERT model for contract sentiment analysis. It classifies legal or contractual text as positive, negative, or neutral.
## Model Details
- **Base Model**: `bert-base-uncased`
- **Task**: Sentiment Classification (Contractual Text)
- **Labels**: `Negative (0)`, `Neutral (1)`, `Positive (2)`
- **Quantized version**: Available for faster inference
- **Framework**: PyTorch, Transformers (Hugging Face)
## Intended Uses
- Classifying product feedback and user reviews
- Sentiment analysis for e-commerce platforms
- Social media monitoring and customer opinion mining

---
## Limitations
- Designed for English text only.
- Needs further tuning and evaluation on larger, more diverse contract datasets.
- Not suitable for production use without robustness checks.

---
## Training Details
- **Base Model**: `bert-base-uncased`
- **Dataset**: Custom labeled Contract Sentiment dataset
- **Epochs**: 3
- **Batch Size**: 5
- **Optimizer**: AdamW
- **Hardware**: Trained on NVIDIA GPU (CUDA-enabled)
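
The hyperparameters above could be wired into a plain PyTorch loop roughly as follows. This is only a sketch: a small linear classifier and random features stand in for BERT and the contract dataset, and the learning rate is an assumed value because it is not stated in this card.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

EPOCHS = 3            # from Training Details
BATCH_SIZE = 5        # from Training Details
LEARNING_RATE = 2e-5  # assumption: the learning rate is not stated in this card

torch.manual_seed(0)

# Stand-in data: 20 random 16-dimensional feature vectors with labels in {0, 1, 2}
features = torch.randn(20, 16)
labels = torch.randint(0, 3, (20,))
loader = DataLoader(TensorDataset(features, labels), batch_size=BATCH_SIZE, shuffle=True)

# Stand-in classifier; the real model is BERT with a 3-way classification head
model = nn.Linear(16, 3)
optimizer = torch.optim.AdamW(model.parameters(), lr=LEARNING_RATE)
loss_fn = nn.CrossEntropyLoss()

steps = 0
model.train()
for epoch in range(EPOCHS):
    for batch_features, batch_labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(batch_features), batch_labels)
        loss.backward()
        optimizer.step()
        steps += 1

print(f"finished {steps} optimizer steps, last loss {loss.item():.4f}")
```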

---
## Evaluation Metrics
| Metric    | Score |
|-----------|-------|
| Accuracy  | 0.98  |
| F1        | 0.99  |
| Precision | 0.99  |
| Recall    | 0.97  |
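
For readers who want to reproduce scores of this kind on their own data, here is a minimal pure-Python sketch. It assumes macro averaging over the three classes (the averaging mode is not stated here), and the labels and predictions are toy values, not results from this model. The helper name `classification_scores` is hypothetical.

```python
def classification_scores(y_true, y_pred, labels=(0, 1, 2)):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

# Toy labels and predictions (0 = Negative, 1 = Neutral, 2 = Positive)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2]
scores = classification_scores(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```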

---
## Label Mapping
| Label ID | Sentiment |
|----------|-----------|
| 0        | Negative  |
| 1        | Neutral   |
| 2        | Positive  |
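
The mapping above can be applied to raw model scores as sketched below. The logits are made-up values for illustration, not output from this model, and the helper names `softmax` and `decode` are hypothetical.

```python
import math

# Label IDs taken from the Label Mapping table
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

# Hypothetical logits for one sentence, in label-ID order
label, confidence = decode([-1.2, 0.3, 2.5])
print(label, round(confidence, 3))
```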

---
## Usage Example
```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Load the tokenizer (base model vocabulary) and the fine-tuned classifier
model_name = "AventIQ-AI/Sentiment-Analysis-for-Contract-Sentiment"
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=3)
model.eval()

# Label IDs as listed in the Label Mapping table
id2label = {0: "Negative", 1: "Neutral", 2: "Positive"}

# Inference
def predict_sentiment(user_text):
    # Ensure input is a list for batch processing
    if isinstance(user_text, str):
        user_text = [user_text]
    # Tokenize input text
    inputs = tokenizer(user_text, return_tensors="pt", padding=True, truncation=True)
    # Predict using the model
    with torch.no_grad():
        outputs = model(**inputs)
    preds = torch.argmax(outputs.logits, dim=1)
    # Decode predictions back to sentiment labels
    decoded_preds = [id2label[i] for i in preds.tolist()]
    # Print each prediction
    for text, sentiment in zip(user_text, decoded_preds):
        print(f"Text: '{text}' => Sentiment: {sentiment}")
    return decoded_preds

# Example
predict_sentiment("The delivery scheduled")
```

---
## Quantization
- Applied **post-training dynamic quantization** using PyTorch to reduce model size and speed up inference.
- The quantized model supports CPU-based deployments.
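
As a rough illustration of post-training dynamic quantization in PyTorch, the sketch below quantizes the `Linear` layers of a small stand-in network to int8. The stand-in network and its sizes are assumptions made to keep the example light. The exact settings used for this model are not documented here.

```python
import torch
from torch import nn
from torch.ao.quantization import quantize_dynamic

# Stand-in for the fine-tuned classifier; a real run would load the BERT model instead
float_model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 3))
float_model.eval()

# Replace Linear layers with dynamically quantized int8 versions (CPU inference)
quantized_model = quantize_dynamic(float_model, {nn.Linear}, dtype=torch.qint8)

# Run a batch of two random inputs through the quantized model
with torch.no_grad():
    logits = quantized_model(torch.randn(2, 16))

print(type(quantized_model[0]).__name__, tuple(logits.shape))
```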

---
## Repository Structure
```
.
├── model/               # Quantized model files
├── tokenizer/           # Tokenizer config and vocabulary
├── model.safetensors/   # Fine-tuned full-precision model
└── README.md            # Model documentation
```

---
## Contributing
We welcome contributions! Please raise an issue or submit a pull request if you find a bug or have a suggestion.