---
language:
- tw
tags:
- albert
- classification
license: afl-3.0
metrics:
- Accuracy
---

# Traditional Chinese Sentiment Classification: Negative (0), Positive (1)

Fine-tuned from the ckiplab/albert pretrained model on a training set of only 80,000 samples; it serves as an example model for a course.
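
For reference, below is a minimal fine-tuning sketch using the Hugging Face `Trainer` API. The exact base checkpoint variant, dataset files, column names, and hyperparameters are assumptions for illustration, not the course's actual configuration.

```python
# Illustrative fine-tuning sketch; paths, column names, and hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

base_model = "ckiplab/albert-base-chinese"  # assumed ALBERT variant from ckiplab
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=2)

# Hypothetical CSV files with "text" and "label" columns (0 = negative, 1 = positive).
dataset = load_dataset("csv", data_files={"train": "train.csv", "validation": "dev.csv"})

def tokenize(batch):
    # Truncate to the same maximum length used at inference time.
    return tokenizer(batch["text"], truncation=True, max_length=200)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="albert-sentiment",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()
```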

# Usage example

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("clhuang/albert-sentiment")
model = AutoModelForSequenceClassification.from_pretrained("clhuang/albert-sentiment")

## Prediction
target_names = ['Negative', 'Positive']
max_length = 200  # maximum sequence length; longer inputs are truncated to the model's limit

def get_sentiment_proba(text):
    # prepare the text as a tokenized sequence
    inputs = tokenizer(text, padding=True, truncation=True, max_length=max_length, return_tensors="pt")
    # run inference with the model
    outputs = model(**inputs)
    # convert logits to probabilities with softmax
    probs = outputs[0].softmax(1)

    response = {'Negative': round(float(probs[0, 0]), 2), 'Positive': round(float(probs[0, 1]), 2)}
    # use argmax instead to get the predicted label index
    # return probs.argmax()
    return response

get_sentiment_proba('我喜歡這本書')
get_sentiment_proba('不喜歡這款產品')
```
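
Alternatively, the same checkpoint can be loaded through the `pipeline` API. Note that the returned label strings come from the model's config (the generic "LABEL_0"/"LABEL_1" mapping shown in the comments is an assumption).

```python
from transformers import pipeline

# Convenience wrapper around the same checkpoint.
classifier = pipeline("text-classification", model="clhuang/albert-sentiment")

print(classifier("我喜歡這本書"))    # e.g. [{'label': 'LABEL_1', 'score': ...}] -> positive
print(classifier("不喜歡這款產品"))  # e.g. [{'label': 'LABEL_0', 'score': ...}] -> negative
```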