ryeyoo/sentimentizer-encoder

Text Classification

sentiment-analysis


ryeyoo committed 5 days ago

Commit 3df0af0 · verified · 1 Parent(s): 7648118

Update encoder model card

Files changed (1)

README.md +50 -0

README.md ADDED

	@@ -0,0 +1,50 @@

+---
+language: en
+license: mit
+tags:
+  - sentiment-analysis
+  - text-classification
+  - encoder
+library_name: sentimentizer
+task: text-classification
+---
+# Sentimentizer ENCODER Sentiment Model
+## Description
+A Transformer encoder for sentiment classification built on pre-trained GloVe embeddings. The model uses multi-head self-attention with positional encodings and a classification (CLS) token to produce a sentiment score.
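To make the CLS-token idea in the description concrete, here is a minimal NumPy sketch of one single-head self-attention step that reads the sentiment score off the CLS position. It is not the sentimentizer implementation: the function name `cls_attention_score`, the weight shapes, the sigmoid output head, and the choice to put CLS at position 0 are all assumptions, and positional encodings, multiple heads, residual connections, and layer normalization are left out for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cls_attention_score(embeddings, cls_vec, w_q, w_k, w_v, w_out):
    """Single-head self-attention followed by a sigmoid read-out at CLS.

    embeddings: (seq_len, d) token embeddings, e.g. GloVe vectors
    cls_vec:    (d,) learned CLS embedding (hypothetical parameter)
    w_q/w_k/w_v: (d, d) projection matrices; w_out: (d,) output weights
    """
    # Assumption: the CLS vector is placed in front of the token embeddings.
    x = np.vstack([cls_vec, embeddings])            # (seq_len + 1, d)
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    attn = softmax(q @ k.T / np.sqrt(x.shape[1]))   # scaled dot-product weights
    h = attn @ v                                    # attended representations
    logit = h[0] @ w_out                            # read the CLS row only
    return float(1.0 / (1.0 + np.exp(-logit)))      # squash to (0, 1)
```

Because every row of `attn` sums to 1, the CLS row of `h` is a weighted average over all positions, which is what lets a single vector summarize the whole review.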
+## Training Data
+Trained on reviews from the [Yelp Open Dataset](https://www.yelp.com/dataset), using pre-trained GloVe Wiki-Gigaword-100 embeddings. Reviews are tokenized with a custom dictionary (20k-token vocabulary, minimum frequency 3) and padded or truncated to 200 tokens.
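The preprocessing described in Training Data can be sketched in plain Python as follows. This is an illustration only, not the package's code: the helper names `build_vocab` and `pad_or_truncate` are hypothetical, the padding id of 0 and right-side padding are assumptions (the card does not state either), and "minimum frequency" is read here as raw token count, whereas a Gensim dictionary may filter by document frequency instead.

```python
from collections import Counter
from typing import Dict, Iterable, List

MAX_LEN = 200   # sequence length stated in the model card
PAD_ID = 0      # assumption: the card does not name the padding index


def build_vocab(tokenized_reviews: Iterable[List[str]],
                max_size: int = 20_000,
                min_freq: int = 3) -> Dict[str, int]:
    """Keep the most frequent tokens that occur at least min_freq times."""
    counts = Counter(tok for review in tokenized_reviews for tok in review)
    kept = [tok for tok, c in counts.most_common(max_size) if c >= min_freq]
    # Ids start at 1 so that PAD_ID (0) never collides with a real token.
    return {tok: i for i, tok in enumerate(kept, start=1)}


def pad_or_truncate(token_ids: List[int],
                    max_len: int = MAX_LEN,
                    pad_id: int = PAD_ID) -> List[int]:
    """Return exactly max_len ids: cut long inputs, right-pad short ones."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))
```

A fixed length lets reviews be stacked into one tensor per batch; the cost is that anything past token 200 in a long review is dropped.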
+## Usage
+```python
+from sentimentizer.config import DriverConfig, weights_path_for
+from sentimentizer.hf import download_weights
+from sentimentizer.models.encoder import get_trained_model
+from sentimentizer.tokenizer import get_trained_tokenizer
+
+# Download weights + dictionary from the Hugging Face Hub
+weights_path = weights_path_for("encoder")
+download_weights(
+    "encoder",
+    weights_path,
+    dict_path=DriverConfig.files.dictionary_file_path,
+)
+
+# Load the trained model and tokenizer, then run inference on CPU
+model = get_trained_model(device="cpu")
+tokenizer = get_trained_tokenizer()
+token_ids = tokenizer.tokenize_text("amazing food great service")
+score = model.predict(token_ids)
+print(f"Sentiment score: {score.item():.4f}")  # >0.5 = positive, <0.5 = negative
+```
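The usage snippet prints a raw score and notes the 0.5 cut-off in a comment. If a label is more convenient, a small helper along these lines could wrap that rule. `score_to_label` is a hypothetical name, not part of sentimentizer, and because the card does not say how a score of exactly 0.5 is treated, this sketch arbitrarily maps it to negative.

```python
def score_to_label(score: float, threshold: float = 0.5) -> str:
    """Map a sentiment score to a label using the card's 0.5 cut-off.

    Assumption: a score exactly equal to the threshold counts as negative.
    """
    return "positive" if score > threshold else "negative"
```

With the snippet above, this would be called as `score_to_label(score.item())`.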
+## Files
+- `encoder_weights.pth` — Model state dictionary
+- `yelp.dictionary` — Gensim dictionary for tokenization