Polyglot Teachers: Evaluating Language Models for Multilingual Synthetic Data Generation
Paper • 2604.11290 • Published • 1
Evaluating LLMs for Multilingual Synthetic Data Generation
Note Multilingual synthetic data generated using the best teacher from our experiments across seven languages.
Note Instruction-tuned model using the best teacher model for Arabic
Note Instruction-tuned model using the best teacher model for Arabic
Note Instruction-tuned model using the best teacher model for Spanish
Note Instruction-tuned model using the best teacher model for Czech
Note Instruction-tuned model using the best teacher model for German
Note Instruction-tuned model using the best teacher model for German
Note Instruction-tuned model using the best teacher model for Indonesian
Note Instruction-tuned model using the best teacher model for Tagalog