IMDB Sentiment Analysis MLP Model

This model classifies movie reviews from the IMDB Balanced 10k dataset as positive or negative, using bag-of-words (BoW) or TF-IDF features and a small MLP classifier.

Model Architecture

  • Features: BoW / TF-IDF
  • Feature Cap: 20000
  • N-grams: 1-2
  • Model: 2-layer MLP with Dropout
  • Hidden Dimensions: 256
  • Output: Binary classification (Positive/Negative)
  • Training Framework: PyTorch
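
The settings above can be sketched roughly as follows. This is a minimal illustration, not the code in train.py: build_vectorizer and SketchMLP are hypothetical names, scikit-learn is assumed for the vectorizers, the ReLU activation is assumed, and the min_df=2, max_df=0.95 and dropout=0.2 defaults are guessed from the positional arguments in the Usage section.

```python
import torch
from torch import nn
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer


def build_vectorizer(kind="tfidf", max_features=20000, ngram_range=(1, 2),
                     min_df=2, max_df=0.95):
    """Return a BoW or TF-IDF vectorizer capped at max_features unigrams and bigrams."""
    cls = TfidfVectorizer if kind == "tfidf" else CountVectorizer
    return cls(max_features=max_features, ngram_range=ngram_range,
               min_df=min_df, max_df=max_df)


class SketchMLP(nn.Module):
    """Two linear layers with an (assumed) ReLU and dropout in between."""

    def __init__(self, input_dim, hidden_dim=256, output_dim=1, dropout=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, output_dim),
        )

    def forward(self, x):
        # Raw logits of shape (batch, output_dim); sigmoid is applied later.
        return self.net(x)


# Toy corpus only to make the sketch runnable; this is not the IMDB data.
toy_texts = [
    "great movie loved it",
    "great acting loved it",
    "terrible movie hated it",
    "terrible acting hated it",
]
vectorizer = build_vectorizer()
X = vectorizer.fit_transform(toy_texts)  # sparse matrix, one row per text
features = torch.tensor(X.toarray(), dtype=torch.float32)

# On real data the input dimension would be at most the 20000 feature cap.
model = SketchMLP(input_dim=features.shape[1])
model.eval()
with torch.no_grad():
    logits = model(features)
```

Note that the input dimension of the first layer has to equal the number of features the fitted vectorizer actually produces, which can be smaller than the cap on a small corpus.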

Training Details

  • Dataset: IMDB Balanced 10k
  • Batch Size: 64
  • Epochs: 12
  • Learning Rate: 0.0005
  • Optimizer: Adam
  • Loss Function: Binary Cross Entropy with Logits
  • Device: CPU/GPU (automatically detected)
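
A training loop using these hyperparameters might look like the sketch below. It is an assumption-laden illustration rather than the contents of train.py: the function name train_sketch, the toy model, and the random data are all hypothetical, and details such as shuffling and validation are not described in this README.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_sketch(model, features, labels, epochs=12, batch_size=64, lr=5e-4):
    """Train with Adam and BCE-with-logits; return the mean loss per epoch."""
    # Automatic device detection: use the GPU when one is available.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    loader = DataLoader(TensorDataset(features, labels),
                        batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.BCEWithLogitsLoss()  # expects raw logits and float targets

    history = []
    for _ in range(epochs):
        model.train()
        running = 0.0
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            logits = model(x).squeeze(-1)  # (batch, 1) -> (batch,)
            loss = criterion(logits, y)
            loss.backward()
            optimizer.step()
            running += loss.item() * x.size(0)
        history.append(running / len(loader.dataset))
    return history


# Random stand-in data so the sketch runs; not the IMDB features.
torch.manual_seed(0)
toy_features = torch.rand(200, 10)
toy_labels = torch.randint(0, 2, (200,)).float()
toy_model = nn.Sequential(nn.Linear(10, 4), nn.ReLU(),
                          nn.Dropout(0.2), nn.Linear(4, 1))
history = train_sketch(toy_model, toy_features, toy_labels)
```

BCEWithLogitsLoss applies the sigmoid internally, which is why the model outputs raw logits and the sigmoid only appears at prediction time.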

Usage

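import torch
from train import SentimentMLP, TextFeatureExtractor

# Load model (arguments must match the values used in training)
model = SentimentMLP(20000, 256, 1, 0.2)
# map_location lets weights saved on a GPU load on a CPU-only machine
model.load_state_dict(torch.load('best_model.pt', map_location='cpu'))
model.eval()

# Build the same feature extractor used in training. It must be fitted on the
# training texts before transform() is called; train_texts is not defined here.
feature_extractor = TextFeatureExtractor('tfidf', 20000, (1, 2), 2, 0.95)
# feature_extractor.fit(train_texts)

# Predict
text = "This movie is amazing!"
features = feature_extractor.transform([text])
features = torch.FloatTensor(features)

with torch.no_grad():
    logits = model(features)
    probability = torch.sigmoid(logits).item()  # P(positive)
    sentiment = "Positive" if probability > 0.5 else "Negative"
    # Confidence in the predicted class, not just P(positive)
    confidence = probability if sentiment == "Positive" else 1 - probability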
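
For more than one review at a time, the same decoding step can be applied to a whole batch of logits. The helper below is a hypothetical sketch (decode_logits is not part of train.py) that reports, for each row, the predicted label and the probability of that predicted class.

```python
import torch


def decode_logits(logits, threshold=0.5):
    """Map raw logits to (labels, confidences) for a batch."""
    probs = torch.sigmoid(logits.reshape(-1))  # P(positive) per example
    positive = probs > threshold
    # Confidence is the probability of whichever class was predicted.
    confidence = torch.where(positive, probs, 1 - probs)
    labels = ["Positive" if p else "Negative" for p in positive.tolist()]
    return labels, confidence.tolist()


# Hand-picked logits for illustration; not outputs of the trained model.
labels, confidence = decode_logits(torch.tensor([[2.0], [-2.0], [0.0]]))
```

A logit of exactly 0 gives a probability of 0.5, which the strict comparison labels as Negative.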

Performance Metrics

See training_results.png and confusion_matrix.png for detailed visualizations.
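
For readers who want the numbers behind a confusion matrix plot, the counts and accuracy can be computed from true and predicted labels as sketched below. The function names are hypothetical and the labels are made-up toy values, not results from this model.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(counts):
    """Fraction of examples that were classified correctly."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


# Toy labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
acc = accuracy(counts)
```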

Files

  • best_model.pt - Best trained model weights
  • train.py - Training script
  • requirements.txt - Python dependencies
  • training_results.png - Loss and accuracy curves
  • confusion_matrix.png - Test set confusion matrix
  • README.md - This file

License

MIT

Author

Generated by an automated training pipeline.
