lukecq committed

Commit 2c7b6c1 · 1 Parent(s): 5583dbe

Update README.md

Files changed (1): README.md (+79 -1)
README.md CHANGED
@@ -10,4 +11,81 @@ and first released in [this repository](https://github.com/DAMO-NLP-SG/SSTuning)
 
 The model backbone is RoBERTa-large.
 
-Please refer to [zero-shot-classify-SSTuning-base](https://huggingface.co/DAMO-NLP-SG/zero-shot-classify-SSTuning-base) for more information.
---

license: mit
---
# Zero-shot text classification (large-sized model) trained with self-supervised tuning

The model backbone is RoBERTa-large.

## Model description
The model is tuned on unlabeled data with a learning objective called first sentence prediction (FSP).
The FSP task is designed by considering both the nature of the unlabeled corpus and the input/output format of classification tasks.
The training and validation sets are constructed from the unlabeled corpus using FSP.

During tuning, BERT-like pre-trained masked language models such as RoBERTa and ALBERT are employed as the backbone, and an output layer for classification is added.
The learning objective of FSP is to predict the index of the correct label.
A cross-entropy loss is used for tuning the model.
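
For intuition, an FSP training example can be sketched roughly as follows: the first sentence of a paragraph is the correct option, first sentences of other paragraphs serve as distractors, and the model must predict the index of the correct one. This is an illustrative sketch only; the helper name, option format, and separator token are assumptions, not the paper's exact recipe.

```python
import random

def make_fsp_example(paragraphs, idx, num_options=4, seed=0):
    """Build one illustrative FSP example.

    paragraphs: list of (first_sentence, rest_of_paragraph) pairs
    from an unlabeled corpus (hypothetical toy data below).
    """
    rng = random.Random(seed)
    correct_first, rest = paragraphs[idx]
    # Distractor options: first sentences drawn from other paragraphs.
    others = [p[0] for i, p in enumerate(paragraphs) if i != idx]
    options = rng.sample(others, num_options - 1) + [correct_first]
    rng.shuffle(options)
    label = options.index(correct_first)  # index the model must predict
    # Prepend the options "(A) ... (B) ..." to the remaining text.
    s_option = ' '.join(f'({chr(65 + i)}) {o}' for i, o in enumerate(options))
    return f'{s_option} </s> {rest}', label

paragraphs = [
    ("Cats are popular pets.", "They are kept in many homes worldwide."),
    ("The market fell sharply.", "Investors reacted to the rate decision."),
    ("Rain is expected today.", "Bring an umbrella when you go out."),
    ("The team won the final.", "Fans celebrated across the city."),
]
text, label = make_fsp_example(paragraphs, idx=1)
print(label, text)
```

Because such examples can be generated from any unlabeled corpus, the label space seen during tuning is effectively unbounded, which is what enables zero-shot transfer to unseen classification labels.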

## Model variations
Three versions of the model have been released:

| Model | Backbone | #Params | Accuracy | Speed | #Training data |
|------------|-----------|----------|-------|-------|----|
| [zero-shot-classify-SSTuning-base](https://huggingface.co/DAMO-NLP-SG/zero-shot-classify-SSTuning-base) | [roberta-base](https://huggingface.co/roberta-base) | 125M | Low | High | 20.48M |
| [zero-shot-classify-SSTuning-large](https://huggingface.co/DAMO-NLP-SG/zero-shot-classify-SSTuning-large) | [roberta-large](https://huggingface.co/roberta-large) | 355M | Medium | Medium | 5.12M |
| [zero-shot-classify-SSTuning-ALBERT](https://huggingface.co/DAMO-NLP-SG/zero-shot-classify-SSTuning-ALBERT) | [albert-xxlarge-v2](https://huggingface.co/albert-xxlarge-v2) | 235M | High | Low | 5.12M |

Note that zero-shot-classify-SSTuning-base is trained with more data (20.48M examples) than reported in the paper, which increases its accuracy.

## Intended uses & limitations
The model can be used for zero-shot text classification such as sentiment analysis and topic classification. No further fine-tuning is needed.

The number of labels should be between 2 and 20.

### How to use
You can try the model with this Colab [Notebook](https://colab.research.google.com/drive/17bqc8cXFF-wDmZ0o8j7sbrQB9Cq7Gowr?usp=sharing).

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch, string, random

tokenizer = AutoTokenizer.from_pretrained("DAMO-NLP-SG/zero-shot-classify-SSTuning-base")
model = AutoModelForSequenceClassification.from_pretrained("DAMO-NLP-SG/zero-shot-classify-SSTuning-base")

text = "I love this place! The food is always so fresh and delicious."
list_label = ["negative", "positive"]

list_ABC = [x for x in string.ascii_uppercase]

def add_prefix(text, list_label, shuffle=False):
    # Append a period to each label and pad the option list to 20 slots.
    list_label = [x + '.' if x[-1] != '.' else x for x in list_label]
    list_label_new = list_label + [tokenizer.pad_token] * (20 - len(list_label))
    if shuffle:
        random.shuffle(list_label_new)
    # Prepend the options "(A) ... (B) ..." and the separator token to the text.
    s_option = ' '.join(['(' + list_ABC[i] + ') ' + list_label_new[i] for i in range(len(list_label_new))])
    return f'{s_option} {tokenizer.sep_token} {text}', list_label_new

text_new, list_label_new = add_prefix(text, list_label, shuffle=False)

encoding = tokenizer([text_new], truncation=True, padding='max_length', max_length=512, return_tensors='pt')
with torch.no_grad():
    logits = model(**encoding).logits
    probs = torch.nn.functional.softmax(logits, dim=-1).tolist()
    predictions = torch.argmax(logits, dim=-1)

print(probs)
print(predictions)  # index into list_label_new, e.g. tensor([1]) -> "positive."
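
The model returns a distribution over the 20 option slots, so the predicted index must be mapped back to a readable label via `list_label_new`. A small helper for this post-processing step (hypothetical, not part of the model card) might look like:

```python
def decode_prediction(probs, list_label_new):
    """Map the highest-probability option slot back to its label string."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return list_label_new[best].rstrip('.')  # drop the trailing period added by add_prefix

# Toy values mirroring the shapes produced by the snippet above.
probs = [0.02, 0.95] + [0.0] * 18
list_label_new = ['negative.', 'positive.'] + ['<pad>'] * 18
print(decode_prediction(probs, list_label_new))  # positive
```

If `shuffle=True` was passed to `add_prefix`, the option order differs from the input label order, which is why the returned `list_label_new` (not the original `list_label`) must be used for decoding.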

### BibTeX entry and citation info
```bibtex
@inproceedings{acl23/SSTuning,
  author    = {Chaoqun Liu and
               Wenxuan Zhang and
               Guizhen Chen and
               Xiaobao Wu and
               Anh Tuan Luu and
               Chip Hong Chang and
               Lidong Bing},
  title     = {Zero-Shot Text Classification via Self-Supervised Tuning},
  booktitle = {Findings of the Association for Computational Linguistics: ACL 2023},
  year      = {2023},
  url       = {},
}
```