---
language:
- en
metrics:
- accuracy
---
# Custom Transformer for Amazon Sentiment Analysis
This repository contains a custom-built Transformer Encoder model for binary sentiment classification, trained on the Amazon Polarity dataset.
## Model Overview
Unlike standard pre-trained models (like BERT), this architecture was built from scratch to demonstrate the implementation of Self-Attention and Positional Encodings in PyTorch.
- Architecture: 4-Layer Transformer Encoder
- Task: Binary Sentiment Analysis (Positive/Negative)
- Accuracy: 89.73% on the test set
- Parameters: Optimized for efficient inference on edge devices
## Technical Specifications
- Embedding Dimension: 128
- Attention Heads: 8
- Feed-Forward Dimension: 512
- Sequence Length: 300 tokens
- Optimizer: AdamW with Linear Learning Rate Warmup
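Given these specifications, the encoder can be sketched in PyTorch as follows. This is an illustrative reconstruction only: the class name, argument names, learned positional embeddings, and mean-pooling head are assumptions, not the repository's actual code.

```python
import torch
import torch.nn as nn

class TransformerSentimentModel(nn.Module):
    """Illustrative 4-layer Transformer encoder matching the specs above."""

    def __init__(self, vocab_size, embed_dim=128, num_heads=8,
                 ff_dim=512, num_layers=4, output_dim=2, max_len=300):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Learned positional embeddings for sequences up to max_len tokens
        # (an assumption; sinusoidal encodings would also fit the description)
        self.pos_embedding = nn.Embedding(max_len, embed_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads,
            dim_feedforward=ff_dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.classifier = nn.Linear(embed_dim, output_dim)

    def forward(self, input_ids, attention_mask=None):
        positions = torch.arange(input_ids.size(1), device=input_ids.device)
        x = self.embedding(input_ids) + self.pos_embedding(positions)
        # Convert a HF-style attention mask (1 = keep) into a padding mask (True = pad)
        pad_mask = attention_mask == 0 if attention_mask is not None else None
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        # Mean-pool token representations, then classify
        return self.classifier(x.mean(dim=1))
```

With `embed_dim=128`, an input batch of shape `(batch, seq_len)` yields logits of shape `(batch, 2)`.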
## Training Environment
This model was trained locally on an Apple Mac mini M4 with 24GB of Unified Memory.
- Accelerator: Metal Performance Shaders (MPS)
- Training Time: ~1.5 hours
- Dataset: Subset of 500,000 samples from Amazon Polarity
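Training on Apple Silicon via MPS requires only a device switch in PyTorch; a minimal sketch (assuming a PyTorch build with MPS support) looks like this:

```python
import torch

# Prefer the Metal Performance Shaders backend on Apple Silicon,
# falling back to CPU when MPS is unavailable.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Model and batches are then moved to this device before training,
# e.g. model.to(device) and batch.to(device).
```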
## Performance & Insights
During development, the model was benchmarked against a Bidirectional LSTM. The Transformer architecture achieved a ~5% improvement in accuracy, demonstrating its superior ability to capture long-range dependencies in product reviews.
## How to Use
To use this model, ensure you have `torch` and `transformers` installed.
```python
import torch
from transformers import DistilBertTokenizer

# TransformerSentimentModel is the custom encoder class from this repository;
# import or copy its definition before loading the weights.

# 1. Initialize the tokenizer
tokenizer = DistilBertTokenizer.from_pretrained('Nefflymicn/amazon-sentiment-transformer')

# 2. Instantiate the model (the architecture must match the trained weights)
model = TransformerSentimentModel(
    vocab_size=tokenizer.vocab_size,
    embed_dim=128,
    num_heads=8,
    ff_dim=512,
    num_layers=4,
    output_dim=2,
)

# 3. Load the trained weights and switch to inference mode
model.load_state_dict(torch.load("pytorch_model.bin", map_location='cpu'))
model.eval()
```
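Once the model and tokenizer are loaded, classifying a review might look like the sketch below. `predict_sentiment` is a hypothetical helper, not part of the repository; it assumes the model accepts `input_ids` and `attention_mask` and returns two logits (Negative, Positive).

```python
import torch

def predict_sentiment(text, model, tokenizer, max_len=300):
    """Hypothetical helper: returns (label, confidence) for one review."""
    enc = tokenizer(text, truncation=True, max_length=max_len,
                    padding="max_length", return_tensors="pt")
    with torch.no_grad():
        logits = model(enc["input_ids"], enc["attention_mask"])
    probs = torch.softmax(logits, dim=-1)
    label = "Positive" if probs[0, 1] > probs[0, 0] else "Negative"
    return label, probs.max().item()
```

For example, `predict_sentiment("Great product, works perfectly!", model, tokenizer)` would return a label and a probability between 0.5 and 1.0.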