---
license: mit
language: ja
tags:
- luke
- sentiment-analysis
- wrime
- SentimentAnalysis
- pytorch
---

# This model is a fine-tuned version of Luke-japanese-large-lite

This model is fine-tuned from studio-ousia/luke-japanese-large-lite. It can detect which of eight emotions (joy, sadness, anticipation, surprise, anger, fear, disgust, or trust) a piece of text contains. It was fine-tuned on the WRIME dataset (https://huggingface.co/datasets/shunk031/wrime).

# What is LUKE? [1]

LUKE (Language Understanding with Knowledge-based Embeddings) is a pre-trained contextualized representation of words and entities based on the transformer. LUKE treats words and entities in a given text as independent tokens and outputs contextualized representations of them. It adopts an entity-aware self-attention mechanism, an extension of the transformer's self-attention that considers the types of tokens (words or entities) when computing attention scores.

LUKE achieves state-of-the-art results on five popular NLP benchmarks: SQuAD v1.1 (extractive question answering), CoNLL-2003 (named entity recognition), ReCoRD (cloze-style question answering), TACRED (relation classification), and Open Entity (entity typing).

luke-japanese is the Japanese version of LUKE, a knowledge-enhanced pre-trained transformer model for words and entities.
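The "entity-aware" scoring described above can be caricatured in a few lines. This is purely an illustration, not LUKE's implementation: the scalar `scales` stand in for the four learned query matrices (word-to-word, word-to-entity, entity-to-word, entity-to-entity), and all vectors and values below are made up.

```python
import math

d = 4  # toy hidden size

# LUKE keeps a separate query transform for each (query type, key type)
# pair; these scalars are stand-ins for the four learned matrices.
scales = {('word', 'word'): 1.0, ('word', 'entity'): 0.5,
          ('entity', 'word'): 0.8, ('entity', 'entity'): 1.2}

def attention_score(query, key, q_type, k_type):
    """Scaled dot-product score with a type-dependent query transform."""
    s = scales[(q_type, k_type)]
    q_proj = [s * x for x in query]  # stand-in for the type-pair matrix applied to query
    dot = sum(a * b for a, b in zip(q_proj, key))
    return dot / math.sqrt(d)

# The same vectors score differently depending on the token types:
print(attention_score([1.0, 0.0, 1.0, 0.0], [0.5, 0.5, 0.5, 0.5], 'word', 'word'))    # 0.5
print(attention_score([1.0, 0.0, 1.0, 0.0], [0.5, 0.5, 0.5, 0.5], 'word', 'entity'))  # 0.25
```

The point is only that the type pair changes the score even for identical vectors, which is what lets the model treat word-entity interactions differently from word-word ones.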

# How to use

Step 1: Install Python, PyTorch, and SentencePiece, and update transformers (versions that are too old do not ship LukeTokenizer).
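The packages from step 1 can be captured in a requirements file. A minimal sketch, assuming a pip-based environment (the names below are the standard PyPI package names; this card does not pin exact versions):

```
transformers
torch
sentencepiece
```

Install with `pip install -r requirements.txt`, or pass the names to `pip install` directly; if transformers is already installed, add `--upgrade` so that `LukeTokenizer` is available.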

Step 2: Run the code below.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, LukeConfig
import torch

tokenizer = AutoTokenizer.from_pretrained('Mizuiro-sakura/luke-japanese-large-sentiment-analysis-wrime')
config = LukeConfig.from_pretrained('Mizuiro-sakura/luke-japanese-large-sentiment-analysis-wrime', output_hidden_states=True)
model = AutoModelForSequenceClassification.from_pretrained('Mizuiro-sakura/luke-japanese-large-sentiment-analysis-wrime', config=config)

text = 'すごく楽しかった。また行きたい。'

max_seq_length = 512
token = tokenizer(text,
                  truncation=True,
                  max_length=max_seq_length,
                  padding='max_length')
output = model(torch.tensor(token['input_ids']).unsqueeze(0),
               torch.tensor(token['attention_mask']).unsqueeze(0))
max_index = torch.argmax(output.logits)  # logits is already a tensor

if max_index == 0:
    print('joy、うれしい')
elif max_index == 1:
    print('sadness、悲しい')
elif max_index == 2:
    print('anticipation、期待')
elif max_index == 3:
    print('surprise、驚き')
elif max_index == 4:
    print('anger、怒り')
elif max_index == 5:
    print('fear、恐れ')
elif max_index == 6:
    print('disgust、嫌悪')
elif max_index == 7:
    print('trust、信頼')
```
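The example above prints only the top emotion. To get a full distribution, apply a softmax to `output.logits`. The sketch below shows the arithmetic with made-up logits in place of a real model output; the index-to-emotion mapping follows the `if`/`elif` chain above.

```python
import math

# Label order taken from the if/elif chain in the example above.
emotions = ['joy', 'sadness', 'anticipation', 'surprise',
            'anger', 'fear', 'disgust', 'trust']

# Hypothetical logits standing in for output.logits[0]; real values
# depend on the input text and the model weights.
logits = [3.2, -1.1, 1.4, -0.5, -2.0, -1.8, -2.2, 0.9]

# Softmax: exponentiate (shifted by the max for numerical stability)
# and normalize so the scores sum to 1.
m = max(logits)
exps = [math.exp(x - m) for x in logits]
total = sum(exps)
probs = {label: e / total for label, e in zip(emotions, exps)}

top = max(probs, key=probs.get)
print(top, round(probs[top], 3))
```

With PyTorch the same distribution comes from `torch.softmax(output.logits, dim=-1)`.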

# Acknowledgments

I would like to thank Mr. Yamada (@ikuyamada), the developer of LUKE, and Studio Ousia (@StudioOusia).

# Citation

[1] @inproceedings{yamada2020luke,
      title={LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention},
      author={Ikuya Yamada and Akari Asai and Hiroyuki Shindo and Hideaki Takeda and Yuji Matsumoto},
      booktitle={EMNLP},
      year={2020}
    }