---
language: en
license: mit
tags:
- sentiment-analysis
- text-classification
- decoder
library_name: sentimentizer
task: text-classification
---

# Sentimentizer DECODER Sentiment Model

## Description

A Transformer Encoder-Decoder for sentiment classification built on pre-trained GloVe embeddings. The encoder processes the input sequence, and the decoder attends to the encoder outputs to produce a sentiment prediction.

## Training Data

Trained on the [Yelp Open Dataset](https://www.yelp.com/dataset) reviews, with GloVe Wiki-Gigaword-100 pre-trained embeddings. Reviews are tokenized with a custom dictionary (20k vocab, min frequency 3) and padded/truncated to 200 tokens.

## Usage

```python
from sentimentizer.hf import download_weights
from sentimentizer.config import DriverConfig, weights_path_for

# Download weights + dictionary from Hugging Face Hub
weights_path = weights_path_for("decoder")
download_weights(
    "decoder",
    weights_path,
    repo_id="ryeyoo/sentimentizer-decoder",
    dict_path=DriverConfig.files.dictionary_file_path,
)

# Load and run inference
from sentimentizer.models.decoder import get_trained_model
from sentimentizer.tokenizer import get_trained_tokenizer

model = get_trained_model(device="cpu")
tokenizer = get_trained_tokenizer()

probs = model.predict_text('amazing food great service')
for label, prob in sorted(probs.items(), key=lambda x: -x[1]):
    print(f'{label}: {prob:.4f}')
# e.g. positive: 0.8300, neutral: 0.1200, negative: 0.0500
```

## Files

- `decoder_weights.pth`: Model state dictionary
- `yelp.dictionary`: Gensim dictionary for tokenization
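
## Preprocessing Sketch

The preprocessing described under Training Data (a 20k-token vocabulary with minimum frequency 3, and sequences padded or truncated to 200 tokens) can be illustrated roughly as below. This is a minimal standalone sketch, not the package's actual tokenizer: it uses plain Python rather than the Gensim dictionary, and the whitespace tokenization, the helper names `build_vocab` and `encode`, and the reserved `PAD_ID` and `UNK_ID` indices are all assumptions made for illustration.

```python
from collections import Counter

MAX_LEN = 200         # sequence length stated in the model card
VOCAB_SIZE = 20_000   # vocabulary cap stated in the model card
MIN_FREQ = 3          # minimum token frequency stated in the model card
PAD_ID = 0            # assumption: index reserved for padding
UNK_ID = 1            # assumption: index reserved for out-of-vocabulary tokens


def build_vocab(texts, vocab_size=VOCAB_SIZE, min_freq=MIN_FREQ):
    """Map the most frequent tokens to integer ids, dropping rare tokens."""
    # Assumption: lowercase + whitespace split stands in for the real tokenizer.
    counts = Counter(tok for text in texts for tok in text.lower().split())
    kept = [tok for tok, n in counts.most_common(vocab_size) if n >= min_freq]
    # Ids 0 and 1 are reserved for padding and unknown tokens.
    return {tok: i + 2 for i, tok in enumerate(kept)}


def encode(text, vocab, max_len=MAX_LEN):
    """Convert text to a fixed-length list of token ids."""
    ids = [vocab.get(tok, UNK_ID) for tok in text.lower().split()]
    ids = ids[:max_len]                            # truncate long reviews
    return ids + [PAD_ID] * (max_len - len(ids))   # right-pad short reviews


reviews = ["great food great service"] * 3 + ["terrible"]
vocab = build_vocab(reviews)
ids = encode("great food unknownword", vocab)
print(len(ids), ids[:4])
```

Fixed-length inputs let reviews be batched into a single tensor. Whether the real pipeline pads on the left or the right, and which ids it reserves, is not stated in this card.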