---
language:
- en
metrics:
- accuracy
---
# Custom Transformer for Amazon Sentiment Analysis
This repository contains a custom-built Transformer Encoder model for binary sentiment classification, trained on the Amazon Polarity dataset.
## Model Overview
Unlike standard pre-trained models (like BERT), this architecture was built from scratch to demonstrate the implementation of Self-Attention and Positional Encodings in PyTorch.
- Architecture: 4-Layer Transformer Encoder
- Task: Binary Sentiment Analysis (Positive/Negative)
- Accuracy: 89.73% on the test set
- Parameters: Optimized for efficient inference on edge devices
## Technical Specifications
- Embedding Dimension: 128
- Attention Heads: 8
- Feed-Forward Dimension: 512
- Sequence Length: 300 tokens
- Optimizer: AdamW with Linear Learning Rate Warmup
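Given these specifications, the encoder can be sketched in PyTorch as follows. This is an illustrative reconstruction only: the class name, argument names, learned positional embeddings, and mean-pooling head are assumptions, not the repository's actual code.

```python
import torch
import torch.nn as nn

class TransformerSentimentModel(nn.Module):
    """Illustrative 4-layer Transformer encoder matching the specs above."""

    def __init__(self, vocab_size, embed_dim=128, num_heads=8,
                 ff_dim=512, num_layers=4, output_dim=2, max_len=300):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Learned positional embeddings for sequences up to max_len tokens
        # (an assumption; sinusoidal encodings would also fit the description)
        self.pos_embedding = nn.Embedding(max_len, embed_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads,
            dim_feedforward=ff_dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.classifier = nn.Linear(embed_dim, output_dim)

    def forward(self, input_ids, attention_mask=None):
        positions = torch.arange(input_ids.size(1), device=input_ids.device)
        x = self.embedding(input_ids) + self.pos_embedding(positions)
        # Convert a HF-style attention mask (1 = keep) into a padding mask (True = pad)
        pad_mask = attention_mask == 0 if attention_mask is not None else None
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        # Mean-pool token representations, then classify
        return self.classifier(x.mean(dim=1))
```

With `embed_dim=128`, an input batch of shape `(batch, seq_len)` yields logits of shape `(batch, 2)`.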
## Training Environment
This model was trained locally on an Apple Mac mini M4 with 24GB of Unified Memory.
- Accelerator: Metal Performance Shaders (MPS)
- Training Time: ~1.5 hours
- Dataset: Subset of 500,000 samples from Amazon Polarity
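Training on Apple Silicon via MPS requires only a device switch in PyTorch; a minimal sketch (assuming a PyTorch build with MPS support) looks like this:

```python
import torch

# Prefer the Metal Performance Shaders backend on Apple Silicon,
# falling back to CPU when MPS is unavailable.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Model and batches are then moved to this device before training,
# e.g. model.to(device) and batch.to(device).
```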
## Performance & Insights
During development, the model was benchmarked against a Bidirectional LSTM. The Transformer architecture achieved a ~5% improvement in accuracy, demonstrating its superior ability to capture long-range dependencies in product reviews.
## How to Use
To use this model, ensure you have `torch` and `transformers` installed.
```python
import torch
from transformers import DistilBertTokenizer

# TransformerSentimentModel is the custom encoder class from this repository;
# import or copy its definition before loading the weights.

# 1. Initialize the tokenizer
tokenizer = DistilBertTokenizer.from_pretrained('Nefflymicn/amazon-sentiment-transformer')

# 2. Instantiate the model (the architecture must match the trained weights)
model = TransformerSentimentModel(
    vocab_size=tokenizer.vocab_size,
    embed_dim=128,
    num_heads=8,
    ff_dim=512,
    num_layers=4,
    output_dim=2,
)

# 3. Load the trained weights and switch to inference mode
model.load_state_dict(torch.load("pytorch_model.bin", map_location='cpu'))
model.eval()
```
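Once the model and tokenizer are loaded, classifying a review might look like the sketch below. `predict_sentiment` is a hypothetical helper, not part of the repository; it assumes the model accepts `input_ids` and `attention_mask` and returns two logits (Negative, Positive).

```python
import torch

def predict_sentiment(text, model, tokenizer, max_len=300):
    """Hypothetical helper: returns (label, confidence) for one review."""
    enc = tokenizer(text, truncation=True, max_length=max_len,
                    padding="max_length", return_tensors="pt")
    with torch.no_grad():
        logits = model(enc["input_ids"], enc["attention_mask"])
    probs = torch.softmax(logits, dim=-1)
    label = "Positive" if probs[0, 1] > probs[0, 0] else "Negative"
    return label, probs.max().item()
```

For example, `predict_sentiment("Great product, works perfectly!", model, tokenizer)` would return a label and a probability between 0.5 and 1.0.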