Experiment with and compare different tokenizers
Classify images of documents into 16 types
Evaluate LLMs on BhashaBench tasks