# 🔬 Sentiment-Analysis-for-Product-Release-Sentiment

A **BERT-based sentiment analysis model** fine-tuned on a product review dataset. It predicts the sentiment of a text as **Positive**, **Neutral**, or **Negative** with a confidence score. This model is ideal for analyzing customer feedback, reviews, or user comments.
|
|
---

## ✨ Model Highlights

- 🚀 **Architecture**: Based on [`bert-base-uncased`](https://huggingface.co/bert-base-uncased) by Google
- 🧠 **Fine-tuned** on labeled product review data
- 📊 **3-way sentiment classification**: `Negative (0)`, `Neutral (1)`, `Positive (2)`
- 💾 **Quantized version available** for faster inference
|
|
---

## 🧠 Intended Uses

- ✅ Classifying product feedback and user reviews
- ✅ Sentiment analysis for e-commerce platforms
- ✅ Social media monitoring and customer opinion mining
|
|
---

## 🚫 Limitations

- ❌ Designed for English texts only
- ❌ May not perform well on sarcastic or ironic inputs
- ❌ May struggle with domains very different from product reviews (e.g., medical or legal text)
- ❌ Input texts longer than 128 tokens are truncated
- ❌ The quantized version may show a slight drop in accuracy compared to the full-precision model
|
|
---

## 🏋️‍♀️ Training Details

- **Base Model**: `bert-base-uncased`
- **Dataset**: Custom-labeled product review dataset
- **Epochs**: 5
- **Batch Size**: 8
- **Max Length**: 128 tokens
- **Optimizer**: AdamW
- **Loss Function**: CrossEntropyLoss (with class balancing)
- **Hardware**: Trained on an NVIDIA GPU (CUDA-enabled)
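
The exact training script is not part of this repository; the following is a minimal sketch of what the setup above might look like. The example texts, the inverse-frequency class weighting, and the learning rate of `2e-5` are illustrative assumptions, not values taken from the actual training run.

```python
import torch
from collections import Counter
from torch.utils.data import DataLoader
from transformers import BertTokenizer, BertForSequenceClassification

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3
).to(device)

# Placeholder examples standing in for the custom-labeled review dataset
train_data = [
    ("Terrible battery life.", 0),         # Negative
    ("It's okay, nothing special.", 1),    # Neutral
    ("Absolutely love this product!", 2),  # Positive
]

# Inverse-frequency class weights for the "class balancing" mentioned above
counts = Counter(label for _, label in train_data)
weights = torch.tensor(
    [len(train_data) / (3 * counts[i]) for i in range(3)],
    dtype=torch.float, device=device,
)
loss_fn = torch.nn.CrossEntropyLoss(weight=weights)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # assumed lr

def collate(batch):
    texts, labels = zip(*batch)
    enc = tokenizer(list(texts), truncation=True, max_length=128,
                    padding=True, return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

loader = DataLoader(train_data, batch_size=8, shuffle=True, collate_fn=collate)

model.train()
for epoch in range(5):
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        labels = batch.pop("labels")
        loss = loss_fn(model(**batch).logits, labels)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```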
|
|
---

## 📊 Evaluation Metrics
|
|
| | Metric | Score | |
| |------------|-------| |
| | Accuracy | 0.90 | |
| | F1 | 0.90 | |
| | Precision | 0.90 | |
| | Recall | 0.90 | |
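
The card does not state how precision, recall, and F1 were averaged across the three classes. A sketch of one plausible evaluation, assuming weighted averaging with scikit-learn (`y_true` and `y_pred` are placeholders, not real evaluation data):

```python
# Illustrative evaluation sketch; "weighted" averaging is an assumption.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 2, 1, 2, 0]  # gold labels from a held-out test split (placeholder)
y_pred = [0, 2, 1, 1, 0]  # model predictions (placeholder)

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)
print(f"Accuracy={accuracy:.2f}  Precision={precision:.2f}  "
      f"Recall={recall:.2f}  F1={f1:.2f}")
```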
|
|
---

## 🔖 Label Mapping
|
|
| | Label ID | Sentiment | |
| |----------|-----------| |
| | 0 | Negative | |
| | 1 | Neutral | |
| | 2 | Positive | |
|
|
---

## 🚀 Usage Example
|
|
```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch
import torch.nn.functional as F

# Load model and tokenizer
model_name = "AventIQ-AI/Sentiment-Analysis-for-Product-Release-Sentiment"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
model.eval()

# Inference
def predict_sentiment(text):
    # Tokenize; inputs longer than 128 tokens are truncated
    inputs = tokenizer(text, return_tensors="pt", truncation=True,
                       max_length=128, padding=True)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = F.softmax(logits, dim=1)

    predicted_class_id = torch.argmax(probs, dim=1).item()
    confidence = probs[0][predicted_class_id].item()

    label_map = {0: "Negative", 1: "Neutral", 2: "Positive"}
    label = label_map[predicted_class_id]
    confidence_str = f"confidence: {confidence * 100:.1f}%"

    return label, confidence_str

# Example
print(predict_sentiment(
    "The service was excellent and the staff was friendly.")
)
```
|
|
---

## 🧪 Quantization

- Applied **post-training dynamic quantization** using PyTorch to reduce model size and speed up inference.
- The quantized model supports CPU-based deployments.
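
The quantized weights ship with the repository, but for reference, a minimal sketch of how PyTorch post-training dynamic quantization can be applied to the full-precision model (int8 `Linear` layers, CPU inference only); this is not necessarily the exact export script used:

```python
# Quantizes the Linear layers to int8 at load time; activations stay float.
import torch
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained(
    "AventIQ-AI/Sentiment-Analysis-for-Product-Release-Sentiment"
)
model.eval()

quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```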
|
|
---

## 📁 Repository Structure

```
.
├── model/               # Quantized model files
├── tokenizer/           # Tokenizer config and vocabulary
├── model.safetensors    # Fine-tuned full-precision model weights
└── README.md            # Model documentation
```
|
|
---
|
|
## 🤝 Contributing

We welcome contributions! Please feel free to raise an issue or submit a pull request if you find a bug or have a suggestion.