# Sentiment Analysis Models
This repository contains two logistic regression models trained to predict sentiment scores.
## Model Details
- Base embedding model: BAAI/bge-large-en-v1.5
- Architecture: LogisticRegression (scikit-learn)
- Training data: Custom sentiment dataset with dual expert annotations
- Data split: 70% training, 15% development, 15% test
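
A 70/15/15 split like the one listed above could be produced along these lines. This is only a sketch: the function name `split_70_15_15`, the shuffling step, and the seed are assumptions for illustration, and the repository does not state how its split was actually drawn.

```python
import random


def split_70_15_15(items, seed=0):
    """Shuffle items and split them into train/dev/test at 70/15/15.

    The seed and the use of a plain random shuffle are assumptions;
    the original split procedure is not documented.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    # Integer arithmetic avoids floating-point rounding on the boundaries.
    n_train = len(items) * 70 // 100
    n_dev = len(items) * 15 // 100
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]  # remainder goes to the test set
    return train, dev, test


train, dev, test = split_70_15_15(range(1000))
print(len(train), len(dev), len(test))
```

Fixing the seed keeps the three subsets stable across runs, which matters when development and test scores are reported separately.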
## Performance Metrics
### Development Set
#### Against Expert 1:
- Exact match: 50.95%
- Within 1 level: 95.17%
#### Against Expert 2:
- Exact match: 37.92%
- Within 1 level: 92.31%
### Test Set
#### Against Expert 1:
- Exact match: 50.21%
- Within 1 level: 95.27%
#### Against Expert 2:
- Exact match: 41.23%
- Within 1 level: 92.26%
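
The two metrics reported above can be computed as in the following sketch. It assumes that sentiment levels are integers on an ordinal scale and that "within 1 level" means the absolute difference between prediction and annotation is at most 1. The function name `agreement_metrics` and the toy labels are hypothetical and are not taken from the repository.

```python
def agreement_metrics(predicted, reference):
    """Return exact-match and within-one-level agreement as percentages.

    Assumes integer ordinal labels; "within 1 level" is read here as
    abs(prediction - annotation) <= 1, which is an assumption.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    exact = sum(p == r for p, r in zip(predicted, reference))
    within_one = sum(abs(p - r) <= 1 for p, r in zip(predicted, reference))
    return {
        "exact_match": 100 * exact / n,
        "within_1_level": 100 * within_one / n,
    }


# Toy labels for illustration only, not drawn from the dataset.
preds = [1, 2, 3, 4, 5]
gold = [1, 3, 5, 4, 2]
result = agreement_metrics(preds, gold)
print(result)
```

Every exact match also counts as within one level, so the second figure can never be lower than the first.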
## Usage
See `inference.py` for an example of how to use these models to predict sentiment for new text.
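
For orientation, inference might look roughly like the sketch below; `inference.py` remains the authoritative example. The helper names `load_defaults` and `predict_sentiment` are hypothetical, and the sketch assumes the classifiers were trained on default `SentenceTransformer.encode` output with no extra preprocessing or normalization, which should be checked against `inference.py` before relying on the predictions.

```python
def load_defaults():
    """Load the embedding function and both classifiers.

    Requires the joblib and sentence-transformers packages and the
    .joblib files in the working directory. Using the default encode()
    settings is an assumption about how the models were trained.
    """
    import joblib
    from sentence_transformers import SentenceTransformer

    embedder = SentenceTransformer("BAAI/bge-large-en-v1.5")
    models = {
        "expert1": joblib.load("model1.joblib"),
        "expert2": joblib.load("model2.joblib"),
    }
    return embedder.encode, models


def predict_sentiment(texts, embed, models):
    """Embed the texts once and return each model's predicted levels."""
    features = embed(list(texts))
    return {name: list(model.predict(features)) for name, model in models.items()}


# Example call (downloads the embedding model on first use):
# embed, models = load_defaults()
# print(predict_sentiment(["The service was excellent."], embed, models))
```

Passing the embedding function and models in as arguments keeps the prediction step easy to test with stand-ins and avoids reloading the embedding model on every call.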
## Model Files
- `model1.joblib`: Model trained on Expert 1 annotations
- `model2.joblib`: Model trained on Expert 2 annotations
## Data Files
- `dev_results.csv`: Complete predictions on development set
- `test_results.csv`: Complete predictions on test set