Transformers
PyTorch
roberta
xww033 committed on
Commit
13a5a6c
·
1 Parent(s): 3cf215e

Update README.md

Files changed (1): README.md +10 -44
README.md CHANGED
@@ -2,8 +2,7 @@
  license: mit
  ---
  # From Clozing to Comprehending: Retrofitting Pre-trained Masked Language Model to Pre-trained Machine Reader
-
- Pre-trained Machine Reading Comprehension (MRC) model trained with Wikipedia Hyperlinks.
+ Pre-trained Machine Reader (PMR) is pre-trained with 18 million Machine Reading Comprehension (MRC) examples constructed from Wikipedia hyperlinks.
  It was introduced in the paper From Clozing to Comprehending: Retrofitting Pre-trained Masked Language Model to Pre-trained Machine Reader by
  Weiwen Xu, Xin Li, Wenxuan Zhang, Meng Zhou, Wai Lam, Luo Si, Lidong Bing
  and first released in [this repository](https://github.com/DAMO-NLP-SG/PMR).
@@ -12,7 +11,6 @@ The model is initialized with roberta-base and further continued pre-trained wit
 
  ## Model description
  The model is pre-trained with distantly labeled data using a learning objective called Wiki Anchor Extraction (WAE).
-
  Specifically, we constructed a large volume of general-purpose and high-quality MRC-style training data based on Wikipedia anchors (i.e., hyperlinked texts).
  For each Wikipedia anchor, we composed a pair of correlated articles.
  One side of the pair is the Wikipedia article that contains detailed descriptions of the hyperlinked entity, which we defined as the definition article.
@@ -28,60 +26,28 @@ During fine-tuning, we unified downstream NLU tasks in our MRC formulation, whic
  (2) span extraction with natural questions (e.g., EQA) in which the question is treated as the query for answer extraction from the given passage (context);
  (3) sequence classification with pre-defined task labels, such as sentiment analysis. Each task label is used as a query for the input text (context); and
  (4) sequence classification with natural questions on multiple choices, such as multi-choice QA (MCQA). We treated the concatenation of the question and one choice as the query for the given passage (context).
-
  Then, in the output space, we tackle span extraction problems by predicting the probability of context span being the answer.
  We tackle sequence classification problems by conducting relevance classification on [CLS] (extracting [CLS] if relevant).
 
  ## Model variations
- There are three versions of models released. The details are:
+ There are five versions of models released (including two multilingual variants). The details are:
 
- | Model | Backbone | #params | accuracy | Speed | #Training data
- |------------|-----------|----------|-------|-------|----|
- | [zero-shot-classify-SSTuning-base](https://huggingface.co/DAMO-NLP-SG/zero-shot-classify-SSTuning-base) | [roberta-base](https://huggingface.co/roberta-base) | 125M | Low | High | 20.48M |
- | [zero-shot-classify-SSTuning-large](https://huggingface.co/DAMO-NLP-SG/zero-shot-classify-SSTuning-large) | [roberta-large](https://huggingface.co/roberta-large) | 355M | Medium | Medium | 5.12M |
- | [zero-shot-classify-SSTuning-ALBERT](https://huggingface.co/DAMO-NLP-SG/zero-shot-classify-SSTuning-ALBERT) | [albert-xxlarge-v2](https://huggingface.co/albert-xxlarge-v2) | 235M | High | Low | 5.12M |
+ | Model | Backbone | #params |
+ |------------|-----------|----------|
+ | [PMR-base](https://huggingface.co/DAMO-NLP-SG/PMR-base) | [roberta-base](https://huggingface.co/roberta-base) | 125M |
+ | [PMR-large](https://huggingface.co/DAMO-NLP-SG/PMR-large) | [roberta-large](https://huggingface.co/roberta-large) | 355M |
+ | [PMR-xxlarge](https://huggingface.co/DAMO-NLP-SG/PMR-xxlarge) | [albert-xxlarge-v2](https://huggingface.co/albert-xxlarge-v2) | 235M |
+ | [mPMR-base](https://huggingface.co/DAMO-NLP-SG/mPMR-base) | [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) | 270M |
+ | [mPMR-large](https://huggingface.co/DAMO-NLP-SG/mPMR-large) | [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) | 550M |
 
 
- Please note that zero-shot-classify-SSTuning-base is trained with more data (20.48M) than the paper, as this will increase the accuracy.
 
 
  ## Intended uses & limitations
- The model can be used for zero-shot text classification such as sentiment analysis and topic classification. No further finetuning is needed.
-
- The number of labels should be 2 ~ 20.
+ The models need to be fine-tuned on the data of downstream tasks. During fine-tuning, no task-specific layer is required.
 
  ### How to use
- You can try the model with the Colab [Notebook](https://colab.research.google.com/drive/17bqc8cXFF-wDmZ0o8j7sbrQB9Cq7Gowr?usp=sharing).
-
- ```python
- from transformers import AutoTokenizer, AutoModelForSequenceClassification
- import torch, string, random
-
- tokenizer = AutoTokenizer.from_pretrained("DAMO-NLP-SG/zero-shot-classify-SSTuning-base")
- model = AutoModelForSequenceClassification.from_pretrained("DAMO-NLP-SG/zero-shot-classify-SSTuning-base")
-
- text = "I love this place! The food is always so fresh and delicious."
- list_label = ["negative", "positive"]
-
- list_ABC = [x for x in string.ascii_uppercase]
- def add_prefix(text, list_label, shuffle=False):
-     list_label = [x + '.' if x[-1] != '.' else x for x in list_label]
-     list_label_new = list_label + [tokenizer.pad_token] * (20 - len(list_label))
-     if shuffle:
-         random.shuffle(list_label_new)
-     s_option = ' '.join(['(' + list_ABC[i] + ') ' + list_label_new[i] for i in range(len(list_label_new))])
-     return f'{s_option} {tokenizer.sep_token} {text}', list_label_new
-
- text_new, list_label_new = add_prefix(text, list_label, shuffle=False)
-
- encoding = tokenizer([text_new], truncation=True, padding='max_length', max_length=512, return_tensors='pt')
- with torch.no_grad():
-     logits = model(**encoding).logits
-     probs = torch.nn.functional.softmax(logits, dim=-1).tolist()
-     predictions = torch.argmax(logits, dim=-1)
-
- print(probs)
- print(predictions)
- ```
+ You can try the scripts from [this repo](https://github.com/DAMO-NLP-SG/PMR).
 
 
  ### BibTeX entry and citation info
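
The unified MRC formulation described in the updated card — every downstream task becomes a (query, context) pair, with [CLS] handling sequence-level relevance — can be sketched in a few lines of plain Python. This is an illustrative assumption of the input layout only; `mrc_input` and the example queries below are hypothetical and not part of the PMR repository:

```python
# Sketch of the unified MRC input format: each task is cast as a
# (query, context) pair laid out as "[CLS] query [SEP] context".
# Helper name and example queries are illustrative, not from the PMR code.

def mrc_input(query: str, context: str, cls: str = "[CLS]", sep: str = "[SEP]") -> str:
    """Compose the unified MRC input string from a query and a context."""
    return f"{cls} {query} {sep} {context}"

# (1)/(2) span extraction: a label description (e.g., NER) or a natural
# question (e.g., EQA) serves as the query; the answer is a context span.
eqa_example = mrc_input(
    "Where is the Eiffel Tower?",
    "The Eiffel Tower is a landmark in Paris, France.",
)

# (3) sequence classification: each task label is a query over the text;
# the model extracts [CLS] if the label is relevant to the context.
sa_example = mrc_input(
    "positive",
    "The food is always so fresh and delicious.",
)

# (4) MCQA: the question concatenated with one choice forms the query.
mcqa_example = mrc_input(
    "What city is described? Paris.",
    "The Eiffel Tower is a landmark in Paris, France.",
)
```

During fine-tuning no task-specific layer is added on top of this layout: span tasks score context spans, and classification tasks reduce to deciding whether [CLS] itself should be "extracted" for a given label query.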