# Sentiment Analysis Models

This repository contains two logistic regression classifiers that predict sentiment levels from text embeddings. Each model is trained on the annotations of one of two expert annotators.

## Model Details
- Base embedding model: BAAI/bge-large-en-v1.5
- Architecture: LogisticRegression (scikit-learn)
- Training data: Custom sentiment dataset with dual expert annotations
- Data split: 70% training, 15% development, 15% test
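
The 70/15/15 split above can be sketched as follows. This is only an illustration: the README does not say whether the original split was random, stratified, or seeded, so the shuffling, the `seed` parameter, and the function name `split_70_15_15` are all assumptions.

```python
import random


def split_70_15_15(items, seed=0):
    """Shuffle items and split them into train/dev/test (70%/15%/15%).

    Hypothetical helper: the actual split procedure is not documented.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible

    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = len(items) * 70 // 100
    n_dev = len(items) * 15 // 100

    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]  # remainder goes to the test set
    return train, dev, test


train, dev, test = split_70_15_15(range(100))
print(len(train), len(dev), len(test))  # 70 15 15
```

Any rounding remainder lands in the test set, so the three parts always cover the full dataset without overlap.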

## Performance Metrics

### Development Set
#### Against Expert 1:
- Exact match: 50.95%
- Within 1 level: 95.17%

#### Against Expert 2:
- Exact match: 37.92%
- Within 1 level: 92.31%

### Test Set
#### Against Expert 1:
- Exact match: 50.21%
- Within 1 level: 95.27%

#### Against Expert 2:
- Exact match: 41.23%
- Within 1 level: 92.26%
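
As a rough guide to reading these numbers, the sketch below shows one plausible way to compute the two metrics. It assumes that sentiment labels are integer levels on an ordinal scale, that "exact match" means the prediction equals the expert label, and that "within 1 level" means the two differ by at most one. The function names and the sample labels are made up for illustration.

```python
def exact_match(pred, gold):
    """Fraction of predictions that equal the expert label."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


def within_one(pred, gold):
    """Fraction of predictions at most one level away from the expert label."""
    return sum(abs(p - g) <= 1 for p, g in zip(pred, gold)) / len(gold)


# Made-up labels, not taken from the dataset.
pred = [0, 1, 2, 3, 4]
gold = [0, 2, 4, 3, 3]

print(exact_match(pred, gold))  # 0.4
print(within_one(pred, gold))   # 0.8
```

Multiplying these fractions by 100 gives percentages comparable to the figures reported above.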

## Usage

See `inference.py` for an example of predicting sentiment for new text with these models.
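
For orientation, a minimal inference sketch might look like the following. It is not a copy of `inference.py`: the function names `predict_sentiment` and `load_components` are hypothetical, and it assumes the embeddings come from the `sentence-transformers` package with default settings and that the `.joblib` files sit in the working directory. Whether the original pipeline normalizes embeddings or applies other preprocessing is not stated here.

```python
def predict_sentiment(texts, encoder, model1, model2):
    """Embed the texts once, then score them with both expert-specific models.

    `encoder` needs an `encode(list_of_str)` method and each model needs a
    scikit-learn style `predict(embeddings)` method.
    """
    embeddings = encoder.encode(texts)
    return {
        "expert1": list(model1.predict(embeddings)),
        "expert2": list(model2.predict(embeddings)),
    }


def load_components():
    """Load the embedding model and both classifiers (assumed file locations)."""
    # Imported lazily so the heavy dependencies are only needed at load time.
    import joblib
    from sentence_transformers import SentenceTransformer

    encoder = SentenceTransformer("BAAI/bge-large-en-v1.5")
    model1 = joblib.load("model1.joblib")
    model2 = joblib.load("model2.joblib")
    return encoder, model1, model2
```

Calling `predict_sentiment(["The service was excellent."], *load_components())` would then return one predicted level per text for each expert model.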

## Model Files
- `model1.joblib`: Model trained on Expert 1 annotations
- `model2.joblib`: Model trained on Expert 2 annotations

## Data Files
- `dev_results.csv`: Complete predictions on the development set
- `test_results.csv`: Complete predictions on the test set