This is ELECTRA-Tiny, a language model with fewer than 6 million parameters that is effective on discriminative tasks. Its embeddings were enriched with multimodal embeddings drawn from a multiplex network, and the model was trained on the BabyLM 100M dataset.
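A minimal usage sketch is shown below, assuming the checkpoint is published on the Hugging Face Hub and loadable with the `transformers` library; the identifier `ELECTRA-Tiny` is a hypothetical placeholder, not a confirmed repository name.

```python
# Minimal sketch: load the model for a discriminative (classification) task.
# "ELECTRA-Tiny" is a hypothetical Hub identifier; substitute the real repo name.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "ELECTRA-Tiny"  # placeholder, not a confirmed identifier
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Run a single sentence through the model.
inputs = tokenizer("The children played in the park.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits)  # unnormalized class scores; fine-tune before relying on these
```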
### Citation

```bibtex
@inproceedings{fields-etal-2023-tiny,
    title = "Tiny Language Models Enriched with Multimodal Knowledge from Multiplex Networks",
    author = "Fields, Clayton  and
      Natouf, Osama  and
      McMains, Andrew  and
      Henry, Catherine  and
      Kennington, Casey",
    editor = "Warstadt, Alex  and
      Mueller, Aaron  and
      Choshen, Leshem  and
      Wilcox, Ethan  and
      Zhuang, Chengxu  and
      Ciro, Juan  and
      Mosquera, Rafael  and
      Paranjabe, Bhargavi  and
      Williams, Adina  and
      Linzen, Tal  and
      Cotterell, Ryan",
    booktitle = "Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.conll-babylm.3/",
    doi = "10.18653/v1/2023.conll-babylm.3",
    pages = "47--57"
}
```
### Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. 2140642.