Commit 0746f41
1 Parent(s): eff95de
Update README.md
README.md CHANGED
@@ -1,9 +1,10 @@
 ---
-license: cc-by-
+license: cc-by-sa-4.0
 pipeline_tag: fill-mask
+arxiv: 2210.05529
 language: en
 tags:
--
+- long-documents
 datasets:
 - c4
 model-index:
@@ -15,7 +16,7 @@ model-index:
 
 ## Model description
 
-[Longformer](https://arxiv.org/abs/2004.05150) is a transformer model for long documents. This version of Longformer presented in [An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification (Chalkidis et al., 2022)](https://arxiv.org/abs/
+[Longformer](https://arxiv.org/abs/2004.05150) is a transformer model for long documents. This version of Longformer was presented in [An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification (Chalkidis et al., 2022)](https://arxiv.org/abs/2210.05529).
 
 The model has been warm-started re-using the weights of RoBERTa (Liu et al., 2019), and further pre-trained for MLM on long sequences, following the paradigm of the original Longformer released by Beltagy et al. (2020). It supports sequences of length up to 4,096.
 
@@ -39,12 +40,12 @@ mlm_model = pipeline('fill-mask', model='kiddothe2b/longformer-base-4096', trust_remote_code=True)
 mlm_model("Hello I'm a <mask> model.")
 ```
 
-You can also fine-
+You can also fine-tune it for SequenceClassification, SequentialSentenceClassification, and MultipleChoice downstream tasks:
 
 ```python
 from transformers import AutoTokenizer, AutoModelForSequenceClassification
 tokenizer = AutoTokenizer.from_pretrained("kiddothe2b/longformer-base-4096", trust_remote_code=True)
-doc_classifier = AutoModelForSequenceClassification(
+doc_classifier = AutoModelForSequenceClassification.from_pretrained("kiddothe2b/longformer-base-4096", trust_remote_code=True)
 ```
 
 ## Limitations and bias
@@ -94,18 +95,22 @@ The following hyperparameters were used during training:
 
 
 ## Citing
-
-
+If you use this model in your research, please cite:
+[An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification](https://arxiv.org/abs/2210.05529). Ilias Chalkidis, Xiang Dai, Manos Fergadiotis, Prodromos Malakasiotis, and Desmond Elliott. 2022. arXiv:2210.05529 (Preprint).
 
 ```
 @misc{chalkidis-etal-2022-hat,
-url = {https://arxiv.org/abs/
+url = {https://arxiv.org/abs/2210.05529},
 author = {Chalkidis, Ilias and Dai, Xiang and Fergadiotis, Manos and Malakasiotis, Prodromos and Elliott, Desmond},
 title = {An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification},
 publisher = {arXiv},
 year = {2022},
 }
+```
 
+Also cite the original work: [Longformer: The Long-Document Transformer](https://arxiv.org/abs/2004.05150).
+
+```
 @article{Beltagy2020Longformer,
 title={Longformer: The Long-Document Transformer},
 author={Iz Beltagy and Matthew E. Peters and Arman Cohan},
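For reference, the fine-tuning snippet corrected in this commit can be exercised end-to-end along these lines. This is a minimal sketch, not part of the card: the binary head (`num_labels=2`) and the synthetic long document are illustrative assumptions, while the repo id and `trust_remote_code=True` come from the README itself.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Tokenizer and classification model; trust_remote_code pulls in the
# model's custom Longformer implementation, as in the card's own snippets.
tokenizer = AutoTokenizer.from_pretrained("kiddothe2b/longformer-base-4096", trust_remote_code=True)
doc_classifier = AutoModelForSequenceClassification.from_pretrained(
    "kiddothe2b/longformer-base-4096",
    num_labels=2,  # illustrative assumption: a binary document-classification task
    trust_remote_code=True,
)

# Synthetic long document, just to exercise the 4,096-token window.
long_document = " ".join(["The parties agree to the terms set out below."] * 500)
inputs = tokenizer(long_document, truncation=True, max_length=4096, return_tensors="pt")

# Single forward pass; actual fine-tuning would wrap this in a training loop.
with torch.no_grad():
    logits = doc_classifier(**inputs).logits
print(logits.argmax(dim=-1).item())
```

The classification head is freshly initialized here, so the printed label is meaningless until the model is fine-tuned; the point is only that the corrected `from_pretrained` call loads and accepts a full 4,096-token input.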