---
language:
- en
metrics:
- accuracy
---
# Custom Transformer for Amazon Sentiment Analysis
This repository contains a **custom-built Transformer Encoder** model for binary sentiment classification, trained on the **Amazon Polarity** dataset.
## Model Overview
Unlike standard pre-trained models (like BERT), this architecture was built from scratch to demonstrate the implementation of **Self-Attention** and **Positional Encodings** in PyTorch.
* **Architecture**: 4-Layer Transformer Encoder
* **Task**: Binary Sentiment Analysis (Positive/Negative)
* **Accuracy**: 89.73% on Test Set
* **Size**: Compact parameter count, optimized for efficient inference on edge devices
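Since the card highlights positional encodings as a from-scratch component, here is a sketch of the standard sinusoidal formulation from "Attention Is All You Need", sized to this model's 300-token context and 128-dim embeddings. The repository's exact implementation may differ (e.g. learned positional embeddings); this is only the textbook variant.

```python
import torch

def sinusoidal_positional_encoding(max_len: int, embed_dim: int) -> torch.Tensor:
    """Standard sinusoidal positional encodings: sin on even dims, cos on odd dims."""
    position = torch.arange(max_len).unsqueeze(1).float()            # (max_len, 1)
    div_term = torch.exp(
        torch.arange(0, embed_dim, 2).float()
        * (-torch.log(torch.tensor(10000.0)) / embed_dim)
    )                                                                # (embed_dim/2,)
    pe = torch.zeros(max_len, embed_dim)
    pe[:, 0::2] = torch.sin(position * div_term)                     # even indices
    pe[:, 1::2] = torch.cos(position * div_term)                     # odd indices
    return pe

# For this model: a (300, 128) table added to the token embeddings.
pe = sinusoidal_positional_encoding(max_len=300, embed_dim=128)
```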
## Technical Specifications
* **Embedding Dimension**: 128
* **Attention Heads**: 8
* **Feed-Forward Dimension**: 512
* **Sequence Length**: 300 tokens
* **Optimizer**: AdamW with Linear Learning Rate Warmup
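The specifications above can be assembled into an encoder like the sketch below. This is an illustrative reconstruction using `nn.TransformerEncoder`, not the repository's actual class: the card says attention was implemented from scratch, so the real `TransformerSentimentModel` likely hand-rolls these layers, and the pooling strategy (mean-pooling here) is an assumption.

```python
import torch
import torch.nn as nn

class TransformerSentimentModel(nn.Module):
    """Sketch matching the card's specs: 4 layers, 8 heads, d_model=128, FFN=512."""
    def __init__(self, vocab_size, embed_dim=128, num_heads=8,
                 ff_dim=512, num_layers=4, output_dim=2, max_len=300):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Learned positional embeddings (assumption; sinusoidal also fits the card)
        self.pos_embedding = nn.Parameter(torch.zeros(1, max_len, embed_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads,
            dim_feedforward=ff_dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.classifier = nn.Linear(embed_dim, output_dim)

    def forward(self, input_ids, attention_mask=None):
        x = self.embedding(input_ids) + self.pos_embedding[:, :input_ids.size(1)]
        pad_mask = (attention_mask == 0) if attention_mask is not None else None
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        x = x.mean(dim=1)                     # mean-pool over the sequence
        return self.classifier(x)             # (batch, 2) logits
```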
## Training Environment
This model was trained locally on an **Apple Mac mini M4** with **24GB of Unified Memory**.
* **Accelerator**: Metal Performance Shaders (MPS)
* **Training Time**: ~1.5 hours
* **Dataset**: Subset of 500,000 samples from Amazon Polarity
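To reproduce the MPS-accelerated setup on Apple Silicon, device selection in PyTorch looks like this (a minimal sketch with a CPU fallback for machines without Metal support):

```python
import torch

# Prefer Apple's Metal Performance Shaders backend when available, else CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Model and batches must both be moved to the selected device before training,
# e.g. model.to(device) and input_ids.to(device).
```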
## Performance & Insights
During development, the model was benchmarked against a Bidirectional LSTM. The Transformer architecture achieved a **~5% improvement in accuracy**, demonstrating its superior ability to capture long-range dependencies in product reviews.
## How to Use
To use this model, ensure you have `torch` and `transformers` installed.
```python
from transformers import DistilBertTokenizer
import torch

# 1. Initialize the tokenizer (the model reuses DistilBERT's vocabulary)
tokenizer = DistilBertTokenizer.from_pretrained('Nefflymicn/amazon-sentiment-transformer')

# 2. Instantiate the model. TransformerSentimentModel is the custom encoder
#    class from this repository; its hyperparameters must match the checkpoint.
model = TransformerSentimentModel(
    vocab_size=tokenizer.vocab_size,
    embed_dim=128,
    num_heads=8,
    ff_dim=512,
    num_layers=4,
    output_dim=2,
)

# 3. Load the trained weights and switch to inference mode
model.load_state_dict(torch.load("pytorch_model.bin", map_location='cpu'))
model.eval()
```
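Once the model is loaded, a small helper can turn a raw review into a label. This is a hypothetical convenience function, not part of the repository: it assumes the model's `forward` takes token IDs and returns `(batch, 2)` logits, and that index 0 is negative and index 1 is positive, as in the Amazon Polarity label convention.

```python
import torch

def predict_sentiment(model, tokenizer, text: str) -> str:
    """Tokenize one review, run the encoder, and map logits to a label."""
    enc = tokenizer(text, truncation=True, max_length=300, return_tensors="pt")
    with torch.no_grad():
        logits = model(enc["input_ids"])          # (1, 2) logits
    probs = torch.softmax(logits, dim=-1)
    return "positive" if probs[0, 1] > probs[0, 0] else "negative"
```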