Dataset (#1, opened by Kyliroco)
Hello, what dataset are you using to train the classifier? Did you create it yourself or did you use a pre-existing one?
The dataset is synthetically generated. We render text from NLTK's inaugural corpus in a variety of fonts, sizes, and positions to create the training images (50 per font). The fonts are a hand-picked selection of macOS system fonts and Google Fonts.
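For anyone who wants to reproduce something similar, here is a rough sketch of that kind of pipeline. It is not the actual generation script. Only the 50 images per font and the NLTK inaugural corpus come from the description above; the image size, font size range, words per image, seed, and all names (`RenderSpec`, `make_specs`, `render`, `load_inaugural_words`) are assumptions made for illustration. Rendering assumes Pillow is installed, and loading the corpus assumes `nltk.download("inaugural")` has been run.

```python
import random
from dataclasses import dataclass

IMAGES_PER_FONT = 50  # taken from the description above


@dataclass
class RenderSpec:
    """One training image to render (hypothetical structure)."""
    font_path: str
    text: str
    font_size: int
    position: tuple  # (x, y) of the top-left corner of the text


def load_inaugural_words():
    """Load the corpus words; needs nltk and nltk.download('inaugural')."""
    from nltk.corpus import inaugural
    return list(inaugural.words())


def make_specs(font_paths, words, image_size=(256, 256),
               size_range=(16, 64), words_per_image=5, seed=0):
    """Pick a random snippet, font size, and position for each image.

    All default values here are assumed, not the author's settings.
    """
    rng = random.Random(seed)
    specs = []
    for font_path in font_paths:
        for _ in range(IMAGES_PER_FONT):
            start = rng.randrange(max(1, len(words) - words_per_image + 1))
            text = " ".join(words[start:start + words_per_image])
            font_size = rng.randint(*size_range)
            # Keep the text origin in the top-left quadrant so some text is visible.
            position = (rng.randrange(image_size[0] // 2),
                        rng.randrange(image_size[1] // 2))
            specs.append(RenderSpec(font_path, text, font_size, position))
    return specs


def render(spec, image_size=(256, 256)):
    """Draw one spec as black text on a white image; needs Pillow."""
    from PIL import Image, ImageDraw, ImageFont
    image = Image.new("RGB", image_size, "white")
    font = ImageFont.truetype(spec.font_path, spec.font_size)
    ImageDraw.Draw(image).text(spec.position, spec.text, font=font, fill="black")
    return image
```

With a list of `.ttf` paths, `make_specs(font_paths, load_inaugural_words())` yields 50 specs per font, and each one can be passed to `render` and saved with the font name as its label.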