All datasets used for evaluation in our paper, "Continually Adding New Languages to Multilingual Language Models".