Some weights of XLMRobertaForQuestionAnswering were not initialized from the model checkpoint at xlm-roberta-base and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. ***Training with "xlm-roberta-base" *** Map: 0%| | 0/10570 [00:00