---
license: mit
datasets:
- HuggingFaceTB/smollm-corpus
language:
- en
---

# Raw 1B Shared

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("l2t-project/raw-1b-shared")
tokenizer = AutoTokenizer.from_pretrained("l2t-project/raw-1b-shared")
```

## Citation

```
@article{yamaguchi2026enhancinglinguisticcompetencelanguage,
  title={Enhancing Linguistic Competence of Language Models through Pre-training with Language Learning Tasks},
  author={Atsuki Yamaguchi and Maggie Mi and Nikolaos Aletras},
  year={2026},
  eprint={2601.03448},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2601.03448},
  journal={arXiv},
  volume={abs/2601.03448}
}
```
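
As a minimal usage sketch (not part of the original card), the snippet below shows how the loaded model and tokenizer could be used to generate text with the standard `transformers` API; the prompt and generation settings are illustrative assumptions, not values prescribed by the card.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer as shown in "How to Get Started with the Model".
model = AutoModelForCausalLM.from_pretrained("l2t-project/raw-1b-shared")
tokenizer = AutoTokenizer.from_pretrained("l2t-project/raw-1b-shared")

# Illustrative prompt and generation parameters (assumptions for this sketch).
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```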