---
language:
- en
metrics:
- accuracy
---
|
|
# Custom Transformer for Amazon Sentiment Analysis

This repository contains a **custom-built Transformer Encoder** model for binary sentiment classification, trained on the **Amazon Polarity** dataset.
|
|
|
|
|
## Model Overview

Unlike standard pre-trained models (like BERT), this architecture was built from scratch to demonstrate the implementation of **Self-Attention** and **Positional Encodings** in PyTorch.
|
|
|
|
|
* **Architecture**: 4-layer Transformer Encoder
* **Task**: Binary sentiment analysis (positive/negative)
* **Accuracy**: 89.73% on the test set
* **Parameters**: Optimized for efficient inference on edge devices
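
For reference, the sinusoidal positional encodings mentioned above can be sketched in a few lines of PyTorch. This is a minimal illustration of the standard formulation; the exact implementation in this repository may differ:

```python
import math
import torch

def sinusoidal_positional_encoding(max_len: int, embed_dim: int) -> torch.Tensor:
    """Standard sinusoidal position encodings ("Attention Is All You Need")."""
    position = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)  # (max_len, 1)
    div_term = torch.exp(
        torch.arange(0, embed_dim, 2, dtype=torch.float32)
        * (-math.log(10000.0) / embed_dim)
    )                                                # (embed_dim / 2,)
    pe = torch.zeros(max_len, embed_dim)
    pe[:, 0::2] = torch.sin(position * div_term)     # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)     # odd dimensions
    return pe
```

Each position gets a unique pattern of sines and cosines, which lets the attention layers distinguish token order without any learned parameters.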
|
|
|
|
|
## Technical Specifications
|
|
* **Embedding Dimension**: 128
* **Attention Heads**: 8
* **Feed-Forward Dimension**: 512
* **Sequence Length**: 300 tokens
* **Optimizer**: AdamW with linear learning-rate warmup
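
Taken together, these hyperparameters correspond to an encoder along the following lines. This is only a sketch built on PyTorch's `nn.TransformerEncoder`, using a learned positional embedding as a stand-in; the repository's actual `TransformerSentimentModel` may differ in detail:

```python
import torch
import torch.nn as nn

class TransformerSentimentModel(nn.Module):
    """Sketch of a 4-layer encoder matching the listed hyperparameters."""
    def __init__(self, vocab_size, embed_dim=128, num_heads=8,
                 ff_dim=512, num_layers=4, output_dim=2, max_len=300):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Learned positional embedding (a placeholder for the repo's
        # positional-encoding scheme)
        self.pos_embedding = nn.Parameter(torch.zeros(1, max_len, embed_dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads,
            dim_feedforward=ff_dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.classifier = nn.Linear(embed_dim, output_dim)

    def forward(self, input_ids):
        x = self.embedding(input_ids) + self.pos_embedding[:, :input_ids.size(1)]
        x = self.encoder(x)        # (batch, seq, embed_dim)
        x = x.mean(dim=1)          # mean-pool over tokens
        return self.classifier(x)  # (batch, 2) logits
```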
|
|
|
|
|
## Training Environment

This model was trained locally on an **Apple Mac mini M4** with **24GB of Unified Memory**.

* **Accelerator**: Metal Performance Shaders (MPS)
* **Training Time**: ~1.5 hours
* **Dataset**: Subset of 500,000 samples from Amazon Polarity
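
On Apple Silicon, PyTorch exposes the Metal backend through the `mps` device. A typical device-selection snippet looks like this:

```python
import torch

# Prefer Apple's Metal backend (MPS) when available,
# then CUDA, then fall back to CPU.
if torch.backends.mps.is_available():
    device = torch.device("mps")
elif torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

print(f"Using device: {device}")
```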
|
|
|
|
|
## Performance & Insights

During development, the model was benchmarked against a Bidirectional LSTM. The Transformer architecture achieved a **~5% improvement in accuracy**, demonstrating its superior ability to capture long-range dependencies in product reviews.
|
|
|
|
|
## How to Use

To use this model, ensure you have `torch` and `transformers` installed. Because the architecture is custom, you will also need the `TransformerSentimentModel` class definition from this repository.
|
|
|
|
|
```python
import torch
from transformers import DistilBertTokenizer

# 1. Initialize Tokenizer
tokenizer = DistilBertTokenizer.from_pretrained('Nefflymicn/amazon-sentiment-transformer')

# 2. Load Model (architecture must match the trained checkpoint;
#    TransformerSentimentModel is defined in this repository)
model = TransformerSentimentModel(
    vocab_size=tokenizer.vocab_size,
    embed_dim=128,
    num_heads=8,
    ff_dim=512,
    num_layers=4,
    output_dim=2
)
model.load_state_dict(torch.load("pytorch_model.bin", map_location='cpu'))
model.eval()
```
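
Once the model and tokenizer are loaded as above, inference can look like the following. `predict_sentiment` is a hypothetical helper (not part of this repository), and it assumes the model's `forward` takes `input_ids` and returns `(batch, 2)` logits with index 1 meaning positive:

```python
import torch
import torch.nn.functional as F

def predict_sentiment(text, model, tokenizer, max_len=300, device="cpu"):
    """Hypothetical helper: classify one review as positive or negative."""
    # Tokenize, truncating/padding to the model's 300-token context
    enc = tokenizer(text, truncation=True, padding="max_length",
                    max_length=max_len, return_tensors="pt")
    with torch.no_grad():
        logits = model(enc["input_ids"].to(device))
    probs = F.softmax(logits, dim=-1).squeeze(0)
    label = "positive" if probs[1] > probs[0] else "negative"
    return label, probs.max().item()
```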