This is an ELECTRA-Tiny language model with fewer than 6 million parameters that is effective for discriminative tasks. Its embeddings were enriched with multimodal embeddings from a multiplex network, and the model was trained on the BabyLM 100M dataset.

### Citation

@inproceedings{fields-etal-2023-tiny,
    title = "Tiny Language Models Enriched with Multimodal Knowledge from Multiplex Networks",
    author = "Fields, Clayton  and
      Natouf, Osama  and
      McMains, Andrew  and
      Henry, Catherine  and
      Kennington, Casey",
    editor = "Warstadt, Alex  and
      Mueller, Aaron  and
      Choshen, Leshem  and
      Wilcox, Ethan  and
      Zhuang, Chengxu  and
      Ciro, Juan  and
      Mosquera, Rafael  and
      Paranjabe, Bhargavi  and
      Williams, Adina  and
      Linzen, Tal  and
      Cotterell, Ryan",
    booktitle = "Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.conll-babylm.3/",
    doi = "10.18653/v1/2023.conll-babylm.3",
    pages = "47--57"
}

### Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. 2140642.