Dataset

by Kyliroco - opened Dec 3, 2025

Dec 3, 2025

Hello, what dataset are you using to train the classifier? Did you create it yourself or did you use a pre-existing one?

mlahr

Owner Dec 3, 2025

The dataset is synthetically generated. We render text from NLTK's inaugural corpus in a range of fonts, sizes, and positions to create the training images (50 per font). The fonts are a hand-picked selection of macOS system fonts and Google Fonts.
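For anyone who wants to try something similar, here is a rough sketch of how such a generator could look. It is not the actual pipeline: the names `RenderSpec`, `sample_specs`, and `render`, the 256x256 canvas, the 16 to 48 px size range, and the four-word snippets are all assumptions made for illustration. Only the 50 images per font comes from the description above. Rendering assumes Pillow is installed and that each font path points to a real `.ttf`/`.otf` file.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderSpec:
    """One training image to render (hypothetical structure)."""
    font_path: str  # also serves as the class label
    text: str
    font_size: int
    x: int
    y: int


def sample_specs(words, font_paths, per_font=50, canvas=(256, 256),
                 size_range=(16, 48), words_per_image=4, seed=0):
    """Sample `per_font` random text/size/position combinations per font."""
    rng = random.Random(seed)
    specs = []
    for font_path in font_paths:
        for _ in range(per_font):
            # Pick a short run of consecutive words from the corpus.
            start = rng.randrange(max(1, len(words) - words_per_image + 1))
            snippet = " ".join(words[start:start + words_per_image])
            size = rng.randint(*size_range)
            # Keep the text origin inside the canvas (assumed layout rule).
            x = rng.randint(0, canvas[0] // 2)
            y = rng.randint(0, max(0, canvas[1] - size))
            specs.append(RenderSpec(font_path, snippet, size, x, y))
    return specs


def render(spec, canvas=(256, 256)):
    """Render one spec as black text on a white grayscale image (needs Pillow)."""
    from PIL import Image, ImageDraw, ImageFont  # imported lazily

    image = Image.new("L", canvas, color=255)
    font = ImageFont.truetype(spec.font_path, spec.font_size)
    ImageDraw.Draw(image).text((spec.x, spec.y), spec.text, font=font, fill=0)
    return image
```

The word list could come from `nltk.corpus.inaugural.words()` after running `nltk.download("inaugural")`. Keeping the sampling separate from the rendering makes the dataset reproducible from a seed and lets you inspect the class balance before any image is drawn.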
