---
language:
- en
metrics:
- accuracy
---
|
|
# Custom Transformer for Amazon Sentiment Analysis

This repository contains a **custom-built Transformer Encoder** model for binary sentiment classification, trained on the **Amazon Polarity** dataset.
|
|
|
|
|
## Model Overview

Unlike standard pre-trained models (like BERT), this architecture was built from scratch to demonstrate the implementation of **Self-Attention** and **Positional Encodings** in PyTorch.
|
|
|
|
|
* **Architecture**: 4-layer Transformer Encoder
* **Task**: Binary sentiment analysis (positive/negative)
* **Accuracy**: 89.73% on the test set
* **Parameters**: Optimized for efficient inference on edge devices
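
For reference, the sinusoidal positional encodings mentioned above can be sketched in a few lines of PyTorch. This is a minimal illustration of the standard formulation; the exact implementation in this repository may differ:

```python
import math
import torch

def sinusoidal_positional_encoding(max_len: int, embed_dim: int) -> torch.Tensor:
    """Standard sinusoidal position encodings ("Attention Is All You Need")."""
    position = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)  # (max_len, 1)
    div_term = torch.exp(
        torch.arange(0, embed_dim, 2, dtype=torch.float32)
        * (-math.log(10000.0) / embed_dim)
    )                                                # (embed_dim / 2,)
    pe = torch.zeros(max_len, embed_dim)
    pe[:, 0::2] = torch.sin(position * div_term)     # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)     # odd dimensions
    return pe
```

Each position gets a unique pattern of sines and cosines, which lets the attention layers distinguish token order without any learned parameters.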
|
|
|
|
|
## Technical Specifications
|
|
* **Embedding Dimension**: 128
* **Attention Heads**: 8
* **Feed-Forward Dimension**: 512
* **Sequence Length**: 300 tokens
* **Optimizer**: AdamW with linear learning-rate warmup
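
Taken together, these hyperparameters correspond to an encoder along the following lines. This is only a sketch built on PyTorch's `nn.TransformerEncoder`, using a learned positional embedding as a stand-in; the repository's actual `TransformerSentimentModel` may differ in detail:

```python
import torch
import torch.nn as nn

class TransformerSentimentModel(nn.Module):
    """Sketch of a 4-layer encoder matching the listed hyperparameters."""
    def __init__(self, vocab_size, embed_dim=128, num_heads=8,
                 ff_dim=512, num_layers=4, output_dim=2, max_len=300):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Learned positional embedding (a placeholder for the repo's
        # positional-encoding scheme)
        self.pos_embedding = nn.Parameter(torch.zeros(1, max_len, embed_dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads,
            dim_feedforward=ff_dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.classifier = nn.Linear(embed_dim, output_dim)

    def forward(self, input_ids):
        x = self.embedding(input_ids) + self.pos_embedding[:, :input_ids.size(1)]
        x = self.encoder(x)        # (batch, seq, embed_dim)
        x = x.mean(dim=1)          # mean-pool over tokens
        return self.classifier(x)  # (batch, 2) logits
```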
|
|
|
|
|
## Training Environment

This model was trained locally on an **Apple Mac mini M4** with **24GB of Unified Memory**.

* **Accelerator**: Metal Performance Shaders (MPS)
* **Training Time**: ~1.5 hours
* **Dataset**: Subset of 500,000 samples from Amazon Polarity
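
On Apple Silicon, PyTorch exposes the Metal backend through the `mps` device. A typical device-selection snippet looks like this:

```python
import torch

# Prefer Apple's Metal backend (MPS) when available,
# then CUDA, then fall back to CPU.
if torch.backends.mps.is_available():
    device = torch.device("mps")
elif torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

print(f"Using device: {device}")
```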
|
|
|
|
|
## Performance & Insights

During development, the model was benchmarked against a Bidirectional LSTM. The Transformer architecture achieved a **~5% improvement in accuracy**, demonstrating its superior ability to capture long-range dependencies in product reviews.
|
|
|
|
|
## How to Use

To use this model, ensure you have `torch` and `transformers` installed. Because the architecture is custom, you will also need the `TransformerSentimentModel` class definition from this repository.
|
|
|
|
|
```python
import torch
from transformers import DistilBertTokenizer

# 1. Initialize Tokenizer
tokenizer = DistilBertTokenizer.from_pretrained('Nefflymicn/amazon-sentiment-transformer')

# 2. Load Model (architecture must match the trained checkpoint;
#    TransformerSentimentModel is defined in this repository)
model = TransformerSentimentModel(
    vocab_size=tokenizer.vocab_size,
    embed_dim=128,
    num_heads=8,
    ff_dim=512,
    num_layers=4,
    output_dim=2
)
model.load_state_dict(torch.load("pytorch_model.bin", map_location='cpu'))
model.eval()
```
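
Once the model and tokenizer are loaded as above, inference can look like the following. `predict_sentiment` is a hypothetical helper (not part of this repository), and it assumes the model's `forward` takes `input_ids` and returns `(batch, 2)` logits with index 1 meaning positive:

```python
import torch
import torch.nn.functional as F

def predict_sentiment(text, model, tokenizer, max_len=300, device="cpu"):
    """Hypothetical helper: classify one review as positive or negative."""
    # Tokenize, truncating/padding to the model's 300-token context
    enc = tokenizer(text, truncation=True, padding="max_length",
                    max_length=max_len, return_tensors="pt")
    with torch.no_grad():
        logits = model(enc["input_ids"].to(device))
    probs = F.softmax(logits, dim=-1).squeeze(0)
    label = "positive" if probs[1] > probs[0] else "negative"
    return label, probs.max().item()
```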