nlpie/miniALBERT-128

mohammadmahdinouri committed on Feb 7, 2023

Commit 74ae76c · 1 Parent(s): 6376ae0

Update README.md

Files changed (1)

README.md +33 -1

@@ -5,4 +5,36 @@ license: mit
 # Model
 miniALBERT is a recursive transformer model which uses cross-layer parameter sharing, embedding factorisation, and bottleneck adapters to achieve high parameter efficiency.
 Since miniALBERT is a compact model, it is trained using a layer-to-layer distillation technique, using the bert-base model as the teacher. Currently, this model is trained for one epoch on the English subset of Wikipedia.
-In terms of architecture, this model uses an embedding dimension of 128, a hidden size of 768, an MLP expansion rate of 4, and a reduction factor of 16 for bottleneck adapters. In general, this model uses 6 recursions and has a unique parameter count of 11 million parameters.

+In terms of architecture, this model uses an embedding dimension of 128, a hidden size of 768, an MLP expansion rate of 4, and a reduction factor of 16 for bottleneck adapters. In general, this model uses 6 recursions and has a unique parameter count of 11 million parameters.
+# Usage
+Since miniALBERT uses a unique architecture, it cannot be loaded using transformers.AutoModel for now. To load the model, first clone the miniALBERT GitHub project using the command below:
+```bash
+git clone https://github.com/nlpie-research/MiniALBERT.git
+```
+Then use `sys.path.append` to add the miniALBERT files to your project, and import the miniALBERT modeling file using the code below:
+```python
+import sys
+sys.path.append("PATH_TO_CLONED_PROJECT/MiniALBERT/")
+from minialbert_modeling import MiniAlbertForSequenceClassification, MiniAlbertForTokenClassification
+```
+Finally, load the model like a regular model in the transformers library using the code below:
+```python
+# For NER (token classification), use the code below
+model = MiniAlbertForTokenClassification.from_pretrained("nlpie/miniALBERT-128")
+# For sequence classification, use the code below
+model = MiniAlbertForSequenceClassification.from_pretrained("nlpie/miniALBERT-128")
+```
+# Citation
+If you use the model, please cite our paper:
+```
+@article{nouriborji2022minialbert,
+  title={MiniALBERT: Model Distillation via Parameter-Efficient Recursive Transformers},
+  author={Nouriborji, Mohammadmahdi and Rohanian, Omid and Kouchaki, Samaneh and Clifton, David A},
+  journal={arXiv preprint arXiv:2210.06425},
+  year={2022}
+}
+```
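
The path setup described in the Usage section fails with a plain `ModuleNotFoundError` if the placeholder path is wrong. The sketch below is one possible way to check the path before importing. It assumes, as the import statement in the README implies, that `minialbert_modeling.py` sits directly in the cloned directory. The helper name `add_project_to_path` is hypothetical and is not part of the MiniALBERT project.

```python
import importlib
import importlib.util
import sys
from pathlib import Path


def add_project_to_path(project_dir: str, module_name: str = "minialbert_modeling") -> bool:
    """Append project_dir to sys.path once and report whether module_name can be found.

    Hypothetical helper: assumes the module is a single .py file located
    directly inside project_dir.
    """
    path = Path(project_dir).expanduser().resolve()
    # Return early if the expected module file is not in the directory.
    if not (path / f"{module_name}.py").is_file():
        return False
    # Avoid adding the same directory twice across repeated calls.
    if str(path) not in sys.path:
        sys.path.append(str(path))
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()
    # find_spec locates the module without executing it.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Same placeholder as in the README; replace it with the real clone location.
    found = add_project_to_path("PATH_TO_CLONED_PROJECT/MiniALBERT/")
    print("minialbert_modeling importable:", found)
```

If the helper returns `True`, the `from minialbert_modeling import ...` line from the README should resolve. If it returns `False`, the clone location is the first thing to check.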