Sample of a textbook corpus of 2.2B+ words spanning 32K+ books, 5K+ subjects, and 14 languages, intended for LLM training and multilingual knowledge modeling.