- You'll need to instantiate a special RoBERTa class. Though the model is technically a "Longformer", the elongated RoBERTa model still needs to be loaded as a RoBERTa model.
- To do so, define the following classes:
```python
from transformers import RobertaForMaskedLM
from transformers.models.longformer.modeling_longformer import LongformerSelfAttention

class RobertaLongSelfAttention(LongformerSelfAttention):
    def forward(
        self,
        hidden_states,
        attention_mask=None,
        head_mask=None,
        encoder_hidden_states=None,
        encoder_attention_mask=None,
        output_attentions=False,
    ):
        # Accept RoBERTa's call signature, but forward only the arguments
        # that LongformerSelfAttention understands.
        return super().forward(
            hidden_states,
            attention_mask=attention_mask,
            output_attentions=output_attentions,
        )

class RobertaLongForMaskedLM(RobertaForMaskedLM):
    def __init__(self, config):
        super().__init__(config)
        for i, layer in enumerate(self.roberta.encoder.layer):
            # replace the BERT-style self-attention object with `LongformerSelfAttention`
            layer.attention.self = RobertaLongSelfAttention(config, layer_id=i)
```
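The core change here is plain attribute replacement on the instantiated model. Stripped of the `transformers` machinery, the pattern can be sketched with toy stand-in classes (the names below are hypothetical, chosen only for illustration):

```python
class StandardSelfAttention:
    """Stand-in for RoBERTa's BERT-style self-attention module."""
    def forward(self, hidden_states, attention_mask=None):
        return ("standard", hidden_states)

class WindowedSelfAttention(StandardSelfAttention):
    """Stand-in for LongformerSelfAttention: same interface, different internals."""
    def forward(self, hidden_states, attention_mask=None):
        return ("windowed", hidden_states)

class Layer:
    def __init__(self):
        self.self_attention = StandardSelfAttention()

class Encoder:
    def __init__(self, num_layers):
        self.layer = [Layer() for _ in range(num_layers)]

encoder = Encoder(num_layers=12)

# The swap: iterate over the encoder layers and replace each
# attention module in place, exactly as the subclass above does.
for i, layer in enumerate(encoder.layer):
    layer.self_attention = WindowedSelfAttention()
```

Because the replacement happens in `__init__`, the swapped modules exist before `from_pretrained` loads the checkpoint weights into them.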
- Then, pull the model as `RobertaLongForMaskedLM.from_pretrained('simonlevine/bioclinical-roberta-long')`.
- Now, it can be used as usual. Note that you may get warnings about untrained weights.
- Note that you can replace `RobertaForMaskedLM` with a different task-specific RoBERTa class from Hugging Face, such as `RobertaForSequenceClassification`.