language: en
license: mit
library_name: scikit-learn
tags:
  - sentiment-analysis
  - text-classification
  - scikit-learn
  - sentence-transformers
datasets:
  - custom_sentiment_dataset
metrics:
  - accuracy

Sentiment Analysis Model

This model predicts a sentiment score for a piece of input text. It encodes the text with the BAAI/bge-large-en-v1.5 sentence-embedding model and passes the embedding to logistic regression classifiers.

Model Description

This repository contains two logistic regression models trained to predict sentiment scores from text embeddings. Both were trained on a custom dataset in which every sample was annotated by two different experts.

Model Architecture

Base embedding model: BAAI/bge-large-en-v1.5
Classifier: LogisticRegression (scikit-learn)
Final prediction: Average of both model predictions, rounded to the nearest integer
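
The averaging step has one detail worth seeing in isolation: when the two classifiers disagree by an odd amount, the average ends in .5, and `np.round` resolves such ties to the nearest even integer rather than always rounding up. The sketch below assumes the sentiment scores are integers (the card does not state the label scale), and `combine` is a hypothetical helper name used only for illustration.

```python
import numpy as np

def combine(pred1, pred2):
    """Average two integer predictions and round to the nearest integer."""
    return int(np.round((pred1 + pred2) / 2))

print(combine(3, 3))  # 3: both models agree
print(combine(2, 4))  # 3: the average is exactly 3.0
print(combine(2, 3))  # 2: 2.5 is a tie and goes to the even neighbour
print(combine(3, 4))  # 4: 3.5 is a tie and goes to the even neighbour
```

If ties should always round up instead, `int(np.floor(x + 0.5))` is one common alternative. Which behaviour is preferable depends on how the scores are used downstream.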

Intended Use and Limitations

The model is designed for sentiment analysis and works best with English text similar to the training data.

Training and Evaluation Data

The custom dataset was split into:

70% training data
15% development data
15% test data

Each sample has annotations from two human experts.
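
The card does not say how the 70/15/15 split was produced. One plausible way, shown here as a sketch only, is to shuffle the sample indices once and slice them, so that the same indices can select both experts' labels for each sample. The function name `split_indices`, the seed, and the sample count are all assumptions for illustration.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.70, dev_frac=0.15, seed=0):
    """Shuffle sample indices and cut them into train/dev/test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_frac * n_samples))
    n_dev = int(round(dev_frac * n_samples))
    train = order[:n_train]
    dev = order[n_train:n_train + n_dev]
    test = order[n_train + n_dev:]  # whatever remains, roughly 15%
    return train, dev, test

# Hypothetical dataset size, chosen only to make the proportions easy to read
train_idx, dev_idx, test_idx = split_indices(1000)
print(len(train_idx), len(dev_idx), len(test_idx))  # 700 150 150
```

scikit-learn's `train_test_split`, applied twice, would give an equivalent result and can also stratify by label.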

Evaluation Results

See README.md for detailed performance metrics on both development and test sets.

Using the Models

import joblib
import numpy as np
from sentence_transformers import SentenceTransformer

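# Load both classifiers and the embedding model.
# The .joblib paths are relative to the current working directory.
model1 = joblib.load('model1.joblib')
model2 = joblib.load('model2.joblib')
embedder = SentenceTransformer('BAAI/bge-large-en-v1.5')

def predict_sentiment(text):
    """Return the rounded average of both classifiers' predictions for one text."""
    # encode() takes a list of texts and returns one embedding row per text
    embedding = embedder.encode([text])
    pred1 = model1.predict(embedding)[0]
    pred2 = model2.predict(embedding)[0]
    # Average the two predictions; np.round sends exact halves to the even integer
    return int(np.round((pred1 + pred2) / 2))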
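
When scoring many texts, encoding them in a single call is usually faster than calling `predict_sentiment` in a loop, because the embedding model can batch the inputs. The following is a minimal sketch of that variant. `predict_sentiment_batch` is a hypothetical helper, not part of this repository, and it assumes the classifiers return numeric labels.

```python
import numpy as np

def predict_sentiment_batch(texts, embedder, model1, model2):
    """Return one rounded, averaged prediction per input text."""
    # Encode all texts at once: result has one embedding row per text
    embeddings = embedder.encode(list(texts))
    preds1 = np.asarray(model1.predict(embeddings), dtype=float)
    preds2 = np.asarray(model2.predict(embeddings), dtype=float)
    # Element-wise average, then round (exact halves go to the even integer)
    return np.round((preds1 + preds2) / 2).astype(int)
```

With the objects loaded as above, a call would look like `predict_sentiment_batch(["Great service", "Never again"], embedder, model1, model2)` and return an integer array with one entry per text.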