Simon Levine committed c26dd12 (parent: b24bb18): Create README.md
- You'll need to instantiate a special RoBERTa class. Although the elongated model is technically a Longformer, it still needs to be loaded as a RoBERTa model.
- To do so, use the following classes:

```python
from transformers import RobertaForMaskedLM
from transformers.models.longformer.modeling_longformer import LongformerSelfAttention


class RobertaLongSelfAttention(LongformerSelfAttention):
    # Narrow the call signature to what RoBERTa's encoder actually passes.
    def forward(
        self,
        hidden_states,
        attention_mask=None,
        head_mask=None,
        encoder_hidden_states=None,
        encoder_attention_mask=None,
        output_attentions=False,
    ):
        return super().forward(
            hidden_states,
            attention_mask=attention_mask,
            output_attentions=output_attentions,
        )


class RobertaLongForMaskedLM(RobertaForMaskedLM):
    def __init__(self, config):
        super().__init__(config)
        for i, layer in enumerate(self.roberta.encoder.layer):
            # replace the `modeling_bert.BertSelfAttention` object with `LongformerSelfAttention`
            layer.attention.self = RobertaLongSelfAttention(config, layer_id=i)
```
- Then, pull the model with `RobertaLongForMaskedLM.from_pretrained('simonlevine/bioclinical-roberta-long')`.
- Now, it can be used as usual. Note that you may get warnings about untrained weights.
- Note that you can replace `RobertaForMaskedLM` with a different task-specific RoBERTa class from Hugging Face, such as `RobertaForSequenceClassification`.
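
Swapping in a different task head follows the same pattern as the masked-LM class above. A minimal sketch for sequence classification (the class name `RobertaLongForSequenceClassification` is illustrative, not part of this repository):

```python
from transformers import RobertaForSequenceClassification
from transformers.models.longformer.modeling_longformer import LongformerSelfAttention


class RobertaLongSelfAttention(LongformerSelfAttention):
    # Same wrapper as in the masked-LM example: narrow the call
    # signature to what RoBERTa's encoder actually passes.
    def forward(
        self,
        hidden_states,
        attention_mask=None,
        head_mask=None,
        encoder_hidden_states=None,
        encoder_attention_mask=None,
        output_attentions=False,
    ):
        return super().forward(
            hidden_states,
            attention_mask=attention_mask,
            output_attentions=output_attentions,
        )


class RobertaLongForSequenceClassification(RobertaForSequenceClassification):
    def __init__(self, config):
        super().__init__(config)
        for i, layer in enumerate(self.roberta.encoder.layer):
            # swap each self-attention module for its Longformer counterpart
            layer.attention.self = RobertaLongSelfAttention(config, layer_id=i)


# Hypothetical usage (downloads the checkpoint):
# model = RobertaLongForSequenceClassification.from_pretrained(
#     'simonlevine/bioclinical-roberta-long', num_labels=2
# )
```

The only change from the masked-LM version is the base class; the attention-replacement loop in `__init__` is identical.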