nsadeq committed on
Commit 08b1b81 · 1 Parent(s): 3e362ff

Update README.md

Files changed (1)
  1. README.md +28 -0
README.md CHANGED
@@ -1,3 +1,31 @@
1
  ---
2
  license: apache-2.0
3
  ---
4
+
5
+ # InformBERT
6
+
7
+ ## Introduction
8
+
9
+ InformBERT is a pretrained model trained with a variable masking strategy, in which informative tokens are masked more frequently than other tokens. InformBERT outperforms pretrained models that use random masking on the factual recall benchmark LAMA and the extractive question answering benchmark SQuAD.
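To make the idea of variable masking concrete, here is a minimal sketch in plain Python. It is not the InforMask procedure from the cited paper: the informativeness scores are made-up numbers, the helper names `scale_to_budget` and `mask_tokens` are hypothetical, and the 15% masking budget is an assumption borrowed from conventional BERT-style pretraining. It only illustrates masking each token with its own probability instead of one uniform rate.

```python
import random

MASK = "[MASK]"


def scale_to_budget(scores, budget=0.15):
    """Convert non-negative informativeness scores into per-token masking
    probabilities whose mean equals `budget` (before clipping at 1.0)."""
    total = sum(scores)
    if total == 0:
        # No signal at all: fall back to uniform random masking.
        return [budget] * len(scores)
    n = len(scores)
    return [min(1.0, budget * n * s / total) for s in scores]


def mask_tokens(tokens, mask_probs, rng):
    """Replace each token with [MASK] using that token's own probability."""
    if len(tokens) != len(mask_probs):
        raise ValueError("tokens and mask_probs must have the same length")
    return [MASK if rng.random() < p else tok
            for tok, p in zip(tokens, mask_probs)]


if __name__ == "__main__":
    tokens = ["the", "capital", "of", "france", "is", "paris"]
    # Hypothetical scores: content words rated higher than function words.
    scores = [0.1, 1.0, 0.1, 2.0, 0.1, 2.0]
    probs = scale_to_budget(scores)
    print([round(p, 3) for p in probs])
    print(mask_tokens(tokens, probs, random.Random(0)))
```

Under this scheme the overall fraction of masked tokens stays close to that of uniform masking, while words such as "france" and "paris" are hidden far more often than "the" or "of".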
10
+
11
+ ## How to use
12
+ ```python
13
+ from transformers import AutoTokenizer, AutoModel
14
+ tokenizer = AutoTokenizer.from_pretrained("nsadeq/InformBERT")
15
+ model = AutoModel.from_pretrained("nsadeq/InformBERT")
16
+ ```
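Once the tokenizer and model are loaded, one common way to use the encoder is to turn sentences into fixed-size vectors. The sketch below assumes PyTorch is installed and uses mean pooling over the last hidden state, which is a widespread convention rather than something this model card prescribes. The helper names `mean_pool` and `embed` are hypothetical.

```python
import torch


def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors per sequence, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1.0)                          # avoid division by zero
    return summed / counts


def embed(texts, tokenizer, model):
    """Encode a list of strings into one vector per string."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        output = model(**batch)
    return mean_pool(output.last_hidden_state, batch["attention_mask"])
```

With the `tokenizer` and `model` objects from the snippet above, `embed(["Paris is the capital of France."], tokenizer, model)` would return a tensor with one row per input sentence.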
17
+
18
+ ## Citation
19
+
20
+ ```bibtex
21
+ @misc{https://doi.org/10.48550/arxiv.2210.11771,
22
+ doi = {10.48550/ARXIV.2210.11771},
23
+ url = {https://arxiv.org/abs/2210.11771},
24
+ author = {Sadeq, Nafis and Xu, Canwen and McAuley, Julian},
25
+ keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
26
+ title = {InforMask: Unsupervised Informative Masking for Language Model Pretraining},
27
+ publisher = {arXiv},
28
+ year = {2022},
29
+ copyright = {arXiv.org perpetual, non-exclusive license}
30
+ }
31
+ ```