Sentiment Analysis Model

This model is designed for sentiment analysis of English text. It predicts the sentiment of a given text as one of three classes: positive, neutral, or negative. The model was trained on a combination of datasets from Kaggle and Sentiment140.

Model Description

Two approaches were trained and compared:

  1. Baseline Model: A classical machine learning pipeline using TF-IDF vectorization and Logistic Regression.
  2. CNN Model: A lightweight Convolutional Neural Network (CNN) implemented in Keras.

The best-performing model (based on validation macro-F1 score) is selected for inference.
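As a rough illustration of the selection rule, the sketch below computes macro-F1 (the unweighted mean of per-class F1 scores) in plain Python and picks the candidate with the higher value. The names `macro_f1`, `select_best`, and the toy predictions are hypothetical and only serve the example; they are not taken from the training code.

```python
def macro_f1(y_true, y_pred, labels=("negative", "neutral", "positive")):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def select_best(val_scores):
    """Return the name of the candidate with the highest validation macro-F1."""
    return max(val_scores, key=val_scores.get)


# Toy validation labels and made-up predictions for each candidate.
y_val = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
preds = {
    "baseline": ["positive", "negative", "neutral", "negative", "negative", "neutral"],
    "cnn": ["positive", "negative", "positive", "positive", "neutral", "neutral"],
}
val_scores = {name: macro_f1(y_val, p) for name, p in preds.items()}
best_name = select_best(val_scores)
print(best_name, val_scores)
```

Macro-F1 weights all three classes equally, so a model cannot score well by favouring the most frequent class.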

Baseline Model

  • Vectorizer: TF-IDF (word + character n-grams)
  • Classifier: Logistic Regression
  • Features: at most 200,000 features, n-gram range (1, 2)
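A minimal sketch of such a pipeline in scikit-learn is shown here. Several details are assumptions, since the card does not state them: that the (1, 2) range and the 200,000 cap apply to the word-level vectorizer, that character n-grams use `char_wb` with range (3, 5), and that the two feature sets are combined with a `FeatureUnion`. The helper name `build_baseline` and the toy data are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline


def build_baseline():
    """TF-IDF (word + character n-grams) followed by Logistic Regression."""
    features = FeatureUnion([
        # Word unigrams and bigrams, capped at 200,000 features (assumed mapping).
        ("word", TfidfVectorizer(analyzer="word", ngram_range=(1, 2),
                                 max_features=200_000)),
        # Character n-grams within word boundaries; the range is an assumption.
        ("char", TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5),
                                 max_features=200_000)),
    ])
    return Pipeline([
        ("tfidf", features),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


# Tiny made-up dataset, only to show the fit/predict flow.
texts = [
    "I love this, what a great day",
    "absolutely wonderful experience",
    "this is terrible and I hate it",
    "worst service ever, awful",
    "the meeting is at noon",
    "it is a report about traffic",
]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = build_baseline().fit(texts, labels)
pred = model.predict(["what a great day"])
proba = model.predict_proba(["what a great day"])
print(pred[0], proba.round(3))
```

Character n-grams tend to help with misspellings and elongated words, which are common in tweets, while word n-grams capture short phrases such as negations.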

CNN Model

  • Tokenizer: Keras Tokenizer
  • Architecture: Embedding layer -> 1D Convolution -> Global Max Pooling -> Dense layers
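The architecture above could be written in Keras roughly as follows. All hyperparameters (vocabulary size, embedding width, filter count, kernel size, dense width) are placeholder assumptions because the card does not list them, and `build_cnn` is a hypothetical helper name. The input is assumed to be padded sequences of integer token ids already produced by the tokenizer.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_cnn(vocab_size=20_000, embed_dim=64, num_classes=3):
    """Embedding -> 1D convolution -> global max pooling -> dense layers."""
    model = keras.Sequential([
        layers.Embedding(input_dim=vocab_size, output_dim=embed_dim),
        layers.Conv1D(filters=128, kernel_size=5, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(64, activation="relu"),
        # One probability per class: negative, neutral, positive.
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class ids
        metrics=["accuracy"],
    )
    return model


cnn = build_cnn()
# Two padded sequences of 50 token ids each (all zeros, for shape checking only).
batch = np.zeros((2, 50), dtype="int32")
probs = cnn.predict(batch, verbose=0)
print(probs.shape)
```

Global max pooling keeps the strongest activation of each filter regardless of its position, so the network can handle texts of varying length with a small number of parameters.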

Training Data

The following datasets were used:

  • Kaggle Train: 27,477 samples
  • Sentiment140 Train: 300,000 balanced samples
  • Sentiment140 Manual Test: 516 samples

The datasets were cleaned and unified into a common schema with text and sentiment columns.
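One way to perform this unification with pandas is sketched below. The source column names and the cleaning steps (trimming whitespace, dropping empty and duplicate texts) are assumptions. The mapping of Sentiment140 polarity codes 0, 2, and 4 to negative, neutral, and positive follows that dataset's published encoding. `unify` is a hypothetical helper and the frames are toy data.

```python
import pandas as pd

# Sentiment140 encodes polarity as 0 (negative), 2 (neutral), 4 (positive).
S140_LABELS = {0: "negative", 2: "neutral", 4: "positive"}


def unify(df, text_col, label_col, label_map=None):
    """Return a frame with exactly two columns: text and sentiment."""
    text = df[text_col].fillna("").astype(str).str.strip()
    if label_map:
        sentiment = df[label_col].map(label_map)  # unmapped codes become NaN
    else:
        sentiment = df[label_col].astype(str).str.lower()
    out = pd.DataFrame({"text": text, "sentiment": sentiment})
    out = out.dropna(subset=["sentiment"])
    out = out[out["text"] != ""]
    return out.drop_duplicates(subset="text").reset_index(drop=True)


# Toy stand-ins for the two sources (column names are assumed).
kaggle = pd.DataFrame({
    "text": ["Love this!", "  meh  ", ""],
    "sentiment": ["Positive", "neutral", "negative"],
})
s140 = pd.DataFrame({
    "target": [0, 4, 2],
    "text": ["so sad", "great news", "it is tuesday"],
})

combined = pd.concat(
    [unify(kaggle, "text", "sentiment"), unify(s140, "text", "target", S140_LABELS)],
    ignore_index=True,
)
print(combined)
```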

Evaluation

Both models were evaluated on a stratified validation split (15% of the training data), and the one with the higher macro-F1 score was selected.
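A stratified 15% hold-out can be produced with scikit-learn's `train_test_split`, as in the sketch below. The random seed and the synthetic data are assumptions made for illustration; the card does not say how the split was implemented.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Synthetic, perfectly balanced data: 100 samples per class.
texts = [f"sample {i}" for i in range(300)]
labels = ["negative", "neutral", "positive"] * 100

X_train, X_val, y_train, y_val = train_test_split(
    texts,
    labels,
    test_size=0.15,      # 15% of the data is held out for validation
    stratify=labels,     # keep the class proportions equal in both parts
    random_state=42,     # assumed seed, for reproducibility only
)

print(len(X_train), len(X_val))
print(Counter(y_val))
```

Stratifying keeps the class distribution of the validation set close to that of the training set, which makes the macro-F1 estimate less sensitive to how the split happened to fall.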
