---
language:
- en
metrics:
- accuracy
---
# Custom Transformer for Amazon Sentiment Analysis
This repository contains a **custom-built Transformer Encoder** model for binary sentiment classification, trained on the **Amazon Polarity** dataset.
## Model Overview
Unlike standard pre-trained models (like BERT), this architecture was built from scratch to demonstrate the implementation of **Self-Attention** and **Positional Encodings** in PyTorch.
* **Architecture**: 4-Layer Transformer Encoder
* **Task**: Binary Sentiment Analysis (Positive/Negative)
* **Accuracy**: 89.73% on Test Set
* **Size**: Compact parameter count, optimized for efficient inference on edge devices
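Since the card highlights positional encodings as a from-scratch component, here is a sketch of the standard sinusoidal formulation from "Attention Is All You Need", sized to this model's 300-token context and 128-dim embeddings. The repository's exact implementation may differ (e.g. learned positional embeddings); this is only the textbook variant.

```python
import torch

def sinusoidal_positional_encoding(max_len: int, embed_dim: int) -> torch.Tensor:
    """Standard sinusoidal positional encodings: sin on even dims, cos on odd dims."""
    position = torch.arange(max_len).unsqueeze(1).float()            # (max_len, 1)
    div_term = torch.exp(
        torch.arange(0, embed_dim, 2).float()
        * (-torch.log(torch.tensor(10000.0)) / embed_dim)
    )                                                                # (embed_dim/2,)
    pe = torch.zeros(max_len, embed_dim)
    pe[:, 0::2] = torch.sin(position * div_term)                     # even indices
    pe[:, 1::2] = torch.cos(position * div_term)                     # odd indices
    return pe

# For this model: a (300, 128) table added to the token embeddings.
pe = sinusoidal_positional_encoding(max_len=300, embed_dim=128)
```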
## Technical Specifications
* **Embedding Dimension**: 128
* **Attention Heads**: 8
* **Feed-Forward Dimension**: 512
* **Sequence Length**: 300 tokens
* **Optimizer**: AdamW with Linear Learning Rate Warmup
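The specifications above can be assembled into an encoder like the sketch below. This is an illustrative reconstruction using `nn.TransformerEncoder`, not the repository's actual class: the card says attention was implemented from scratch, so the real `TransformerSentimentModel` likely hand-rolls these layers, and the pooling strategy (mean-pooling here) is an assumption.

```python
import torch
import torch.nn as nn

class TransformerSentimentModel(nn.Module):
    """Sketch matching the card's specs: 4 layers, 8 heads, d_model=128, FFN=512."""
    def __init__(self, vocab_size, embed_dim=128, num_heads=8,
                 ff_dim=512, num_layers=4, output_dim=2, max_len=300):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Learned positional embeddings (assumption; sinusoidal also fits the card)
        self.pos_embedding = nn.Parameter(torch.zeros(1, max_len, embed_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads,
            dim_feedforward=ff_dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.classifier = nn.Linear(embed_dim, output_dim)

    def forward(self, input_ids, attention_mask=None):
        x = self.embedding(input_ids) + self.pos_embedding[:, :input_ids.size(1)]
        pad_mask = (attention_mask == 0) if attention_mask is not None else None
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        x = x.mean(dim=1)                     # mean-pool over the sequence
        return self.classifier(x)             # (batch, 2) logits
```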
## Training Environment
This model was trained locally on an **Apple Mac mini M4** with **24GB of Unified Memory**.
* **Accelerator**: Metal Performance Shaders (MPS)
* **Training Time**: ~1.5 hours
* **Dataset**: Subset of 500,000 samples from Amazon Polarity
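To reproduce the MPS-accelerated setup on Apple Silicon, device selection in PyTorch looks like this (a minimal sketch with a CPU fallback for machines without Metal support):

```python
import torch

# Prefer Apple's Metal Performance Shaders backend when available, else CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Model and batches must both be moved to the selected device before training,
# e.g. model.to(device) and input_ids.to(device).
```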
## Performance & Insights
During development, the model was benchmarked against a Bidirectional LSTM. The Transformer architecture achieved a **~5% improvement in accuracy**, demonstrating its superior ability to capture long-range dependencies in product reviews.
## How to Use
To use this model, ensure you have `torch` and `transformers` installed.
```python
from transformers import DistilBertTokenizer
import torch

# 1. Initialize the tokenizer (the model reuses DistilBERT's vocabulary)
tokenizer = DistilBertTokenizer.from_pretrained('Nefflymicn/amazon-sentiment-transformer')

# 2. Instantiate the model. TransformerSentimentModel is the custom encoder
#    class from this repository; its hyperparameters must match the checkpoint.
model = TransformerSentimentModel(
    vocab_size=tokenizer.vocab_size,
    embed_dim=128,
    num_heads=8,
    ff_dim=512,
    num_layers=4,
    output_dim=2,
)

# 3. Load the trained weights and switch to inference mode
model.load_state_dict(torch.load("pytorch_model.bin", map_location='cpu'))
model.eval()
```
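Once the model is loaded, a small helper can turn a raw review into a label. This is a hypothetical convenience function, not part of the repository: it assumes the model's `forward` takes token IDs and returns `(batch, 2)` logits, and that index 0 is negative and index 1 is positive, as in the Amazon Polarity label convention.

```python
import torch

def predict_sentiment(model, tokenizer, text: str) -> str:
    """Tokenize one review, run the encoder, and map logits to a label."""
    enc = tokenizer(text, truncation=True, max_length=300, return_tensors="pt")
    with torch.no_grad():
        logits = model(enc["input_ids"])          # (1, 2) logits
    probs = torch.softmax(logits, dim=-1)
    return "positive" if probs[0, 1] > probs[0, 0] else "negative"
```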