ryeyoo committed on
Commit d40d209 · verified · 1 Parent(s): 0980024

Update decoder model card

Files changed (1)
  1. README.md +51 -0
README.md ADDED
@@ -0,0 +1,51 @@
+ ---
+ language: en
+ license: mit
+ tags:
+ - sentiment-analysis
+ - text-classification
+ - decoder
+ library_name: sentimentizer
+ task: text-classification
+ ---
+ # Sentimentizer DECODER Sentiment Model
+ ## Description
+
+ A Transformer encoder-decoder for sentiment classification, built on pre-trained GloVe embeddings. The encoder processes the input sequence, and the decoder attends to the encoder outputs to produce a sentiment prediction.
+
+ ## Training Data
+
+ Trained on reviews from the [Yelp Open Dataset](https://www.yelp.com/dataset), with pre-trained GloVe Wiki-Gigaword-100 embeddings. Reviews are tokenized with a custom dictionary (20k-token vocabulary, minimum frequency 3) and padded or truncated to 200 tokens.
+
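The preprocessing described under Training Data (a capped vocabulary with a minimum token frequency, then padding or truncation to a fixed length) can be sketched in plain Python. This is only an illustration under stated assumptions: the model card says the real dictionary is a Gensim dictionary, and the names `build_vocab`, `encode`, `PAD_ID`, and `UNK_ID`, the whitespace tokenization, and the choice of ids 0 and 1 are all hypothetical and not taken from `sentimentizer`.

```python
from collections import Counter

PAD_ID = 0  # hypothetical id used for padding
UNK_ID = 1  # hypothetical id for out-of-vocabulary tokens


def build_vocab(texts, max_size=20_000, min_freq=3):
    """Map the most frequent tokens to integer ids, dropping rare ones."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    kept = [tok for tok, n in counts.most_common() if n >= min_freq]
    # Ids 0 and 1 are reserved for padding and unknown tokens.
    return {tok: i + 2 for i, tok in enumerate(kept[:max_size])}


def encode(text, vocab, max_len=200):
    """Convert text to exactly max_len ids by truncating or right-padding."""
    ids = [vocab.get(tok, UNK_ID) for tok in text.lower().split()][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Toy corpus: "slow" appears once, so it falls below min_freq.
reviews = ["great food great service"] * 3 + ["slow service"]
vocab = build_vocab(reviews)
encoded = encode("great food terrible parking", vocab, max_len=6)
print(encoded)  # [2, 4, 1, 1, 0, 0]
```

With the defaults from the model card (`max_size=20_000`, `min_freq=3`, `max_len=200`), every review becomes a fixed-length list of 200 ids, which is what allows reviews of different lengths to be batched together.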
+ ## Usage
+
+ ```python
+ from sentimentizer.hf import download_weights
+ from sentimentizer.config import DriverConfig, weights_path_for
+
+ # Download weights + dictionary from Hugging Face Hub
+ weights_path = weights_path_for("decoder")
+ download_weights(
+     "decoder",
+     weights_path,
+     repo_id="ryeyoo/sentimentizer-decoder",
+     dict_path=DriverConfig.files.dictionary_file_path,
+ )
+
+ # Load and run inference
+ from sentimentizer.models.decoder import get_trained_model
+ from sentimentizer.tokenizer import get_trained_tokenizer
+
+ model = get_trained_model(device="cpu")
+ tokenizer = get_trained_tokenizer()
+
+ probs = model.predict_text('amazing food great service')
+ for label, prob in sorted(probs.items(), key=lambda x: -x[1]):
+     print(f'{label}: {prob:.4f}')
+ # e.g. positive: 0.8300, neutral: 0.1200, negative: 0.0500
+ ```
+
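The usage example prints a label-to-probability mapping. As a rough illustration of how such a mapping could be derived from raw class scores, here is a minimal softmax sketch. It is an assumption rather than the library's actual implementation: the helper `to_label_probs`, the label order in `LABELS`, and the example scores are hypothetical, and only the three label names come from the example output above.

```python
import math

# Assumed label order; the model card does not state how classes are indexed.
LABELS = ("negative", "neutral", "positive")


def to_label_probs(logits, labels=LABELS):
    """Apply a numerically stable softmax and key the result by label."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract max to avoid overflow
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


probs = to_label_probs([-1.0, 0.0, 2.0])  # made-up scores for illustration
best = max(probs, key=probs.get)
print(best)  # positive
```

Picking `max(probs, key=probs.get)` gives the single predicted label, while sorting the items by probability, as in the usage example, shows the full ranking.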
+ ## Files
+
+ - `decoder_weights.pth`: Model state dictionary
+ - `yelp.dictionary`: Gensim dictionary for tokenization