	---
	language: en
	license: mit
	tags:
	- sentiment-analysis
	- text-classification
	- decoder
	library_name: sentimentizer
	task: text-classification
	---
	# Sentimentizer DECODER Sentiment Model
	## Description

	A Transformer encoder-decoder model for sentiment classification, built on pre-trained GloVe embeddings. The encoder processes the input token sequence, and the decoder attends to the encoder outputs to produce a sentiment prediction.
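To illustrate what "the decoder attends to the encoder outputs" means, here is a minimal sketch of single-head scaled dot-product cross-attention in plain Python. It is not the model's implementation: this card does not specify the number of heads, the learned projections, or how the decoder output is mapped to class probabilities, so the vectors and the `cross_attention` helper below are hypothetical and for illustration only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(query, encoder_outputs):
    """One decoder query attends over the encoder output vectors.

    Assumption: keys and values are both the raw encoder outputs
    (no learned projections), which a real model would not do.
    """
    d = len(query)
    # Scaled dot-product score between the query and each encoder vector.
    scores = [
        sum(q * k for q, k in zip(query, vec)) / math.sqrt(d)
        for vec in encoder_outputs
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the encoder outputs.
    context = [
        sum(w * vec[i] for w, vec in zip(weights, encoder_outputs))
        for i in range(d)
    ]
    return weights, context

# Toy encoder outputs for a 3-token input (hypothetical 2-d vectors).
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]  # hypothetical decoder query vector

weights, context = cross_attention(query, encoder_outputs)
```

The attention weights sum to 1, and the encoder position most aligned with the query (here the third) receives the largest weight. In a full model, a classification head would then map the resulting context vector to sentiment probabilities.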

	## Training Data

	Trained on reviews from the [Yelp Open Dataset](https://www.yelp.com/dataset), using pre-trained GloVe Wiki-Gigaword-100 embeddings. Reviews are tokenized with a custom dictionary (20k-token vocabulary, minimum token frequency of 3) and padded or truncated to 200 tokens.
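The preprocessing described above can be sketched in plain Python as follows. This is a rough illustration, not the library's tokenizer: the whitespace tokenization, the reserved `PAD_ID` and `UNK_ID` values, and the helper names `build_vocab` and `encode` are all assumptions, and the sketch treats "min frequency" as a total token count, which may differ from how the actual dictionary was filtered.

```python
from collections import Counter

PAD_ID, UNK_ID = 0, 1  # hypothetical reserved ids for padding / unknown tokens

def build_vocab(tokenized_reviews, max_size=20_000, min_freq=3):
    """Keep the most frequent tokens that occur at least `min_freq` times."""
    counts = Counter(tok for review in tokenized_reviews for tok in review)
    kept = [tok for tok, c in counts.most_common() if c >= min_freq]
    kept = kept[:max_size]
    # Ids start at 2 because 0 and 1 are reserved above.
    return {tok: i + 2 for i, tok in enumerate(kept)}

def encode(text, vocab, max_len=200):
    """Map tokens to ids, then truncate or right-pad to exactly `max_len`."""
    ids = [vocab.get(tok, UNK_ID) for tok in text.lower().split()]
    ids = ids[:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))

# Tiny toy corpus standing in for the Yelp reviews.
reviews = [
    "great food great service".split(),
    "great place bad service".split(),
    "service was slow".split(),
]
vocab = build_vocab(reviews)
encoded = encode("great service today", vocab)
```

In this toy corpus only `great` and `service` reach the frequency threshold, so `today` maps to the unknown id and the remaining positions are filled with padding.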

	## Usage

	```python
	from sentimentizer.hf import download_weights
	from sentimentizer.config import DriverConfig, weights_path_for

	# Download weights + dictionary from Hugging Face Hub
	weights_path = weights_path_for("decoder")
	download_weights(
	    "decoder",
	    weights_path,
	    repo_id="ryeyoo/sentimentizer-decoder",
	    dict_path=DriverConfig.files.dictionary_file_path,
	)

	# Load and run inference
	from sentimentizer.models.decoder import get_trained_model
	from sentimentizer.tokenizer import get_trained_tokenizer

	model = get_trained_model(device="cpu")
	tokenizer = get_trained_tokenizer()

	probs = model.predict_text('amazing food great service')
	# Print labels from most to least probable
	for label, prob in sorted(probs.items(), key=lambda x: -x[1]):
	    print(f'{label}: {prob:.4f}')
	# e.g. positive: 0.8300, neutral: 0.1200, negative: 0.0500 (one label per line)
	```

	## Files

	- `decoder_weights.pth` — Model state dictionary
	- `yelp.dictionary` — Gensim dictionary for tokenization