Fraktur/Other Text-Line Classifier
A binary CNN classifier that determines whether a scanned text-line image is set in Fraktur (blackletter / Gothic script) or Other (primarily Latin / Roman / Antiqua script).
Developed for the Impresso digital humanities project, which processes millions of historical newspaper pages in German, French, Luxembourgish, and other European languages.
Model Details
| Property | Value |
|---|---|
| Architecture | BinaryClassificationCNN (3-layer CNN with LayerNorm and Dropout) |
| Input | Grayscale text-line image, resized/padded to 60 × 800 px |
| Output | Single logit; logit > 0 → Fraktur (equivalent to sigmoid(logit) > 0.5) |
| Parameters | ~2.1 M |
| Training data | ~32 000 manually labeled line crops from Swiss/Luxembourgish newspapers |
| Framework | PyTorch |
Architecture
Input (1, 60, 800)
→ Conv2d(1→32) + ReLU + MaxPool2d → LayerNorm[32, 30, 400]
→ Conv2d(32→64) + ReLU + MaxPool2d → LayerNorm[64, 15, 200] + Dropout(0.15)
→ Conv2d(64→128) + LayerNorm[128, 15, 200] + ReLU + AdaptiveMaxPool2d(1×8)
→ Flatten(1 024) → FC(128) + ReLU → FC(1)
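The parameter figure in the table can be sanity-checked from the layer shapes listed above. The sketch below is plain arithmetic, not the model code. It assumes 3×3 convolution kernels with bias and LayerNorm with a learnable weight and bias per element, neither of which is stated in this card. The helper names are illustrative.

```python
# Rough parameter count for the layer stack listed above.
# Assumptions (not stated in the card): 3x3 convolution kernels with bias,
# and LayerNorm with one learnable weight and one bias per normalized element.

def conv2d_params(in_ch: int, out_ch: int, kernel: int = 3) -> int:
    """Kernel weights plus one bias per output channel."""
    return in_ch * out_ch * kernel * kernel + out_ch

def layernorm_params(*shape: int) -> int:
    """Elementwise-affine LayerNorm: weight and bias for every element."""
    n = 1
    for dim in shape:
        n *= dim
    return 2 * n

def linear_params(in_features: int, out_features: int) -> int:
    """Weight matrix plus one bias per output feature."""
    return in_features * out_features + out_features

layers = {
    "conv1": conv2d_params(1, 32),
    "norm1": layernorm_params(32, 30, 400),
    "conv2": conv2d_params(32, 64),
    "norm2": layernorm_params(64, 15, 200),
    "conv3": conv2d_params(64, 128),
    "norm3": layernorm_params(128, 15, 200),
    "fc1": linear_params(128 * 1 * 8, 128),  # Flatten(1 024) -> FC(128)
    "fc2": linear_params(128, 1),
}

total = sum(layers.values())
norm_share = sum(v for k, v in layers.items() if k.startswith("norm")) / total

for name, count in layers.items():
    print(f"{name:>6}: {count:>9,}")
print(f" total: {total:>9,}  (LayerNorm share: {norm_share:.0%})")
```

Under these assumptions the total comes to roughly 2.14 M, in line with the ~2.1 M reported, and most of it sits in the three LayerNorm layers rather than in the convolutions.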
Training
- Loss: BCEWithLogitsLoss
- Optimizer: Adam, lr = 1e-4 with ReduceLROnPlateau(factor 0.5, patience 2)
- Epochs: up to 20 with early stopping (patience 5)
- Augmentation: random rotation ±2°, Gaussian noise (σ=0.05), random right-masking (p=0.15, up to 50 % of width) to improve robustness on short lines
- Class balancing: WeightedRandomSampler(other ≈ 20 k, fraktur ≈ 14 k)
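To illustrate the class-balancing step, here is a standard-library simulation of what a weighted sampler does. It assumes the two figures above are per-class line counts and that per-sample weights are the inverse of the class frequency, which is a common choice but is not confirmed by this card. The actual training code would pass such weights to `torch.utils.data.WeightedRandomSampler` instead of using `random.choices`.

```python
import random
from collections import Counter

# Hypothetical label list mirroring the approximate class sizes above.
labels = ["other"] * 20_000 + ["fraktur"] * 14_000

# Inverse-frequency weight per sample: lines from the rarer class weigh more.
class_counts = Counter(labels)
sample_weights = [1.0 / class_counts[label] for label in labels]

# The summed weight is the same for both classes, so each class is
# expected to make up about half of the drawn samples.
weight_per_class = Counter()
for label, weight in zip(labels, sample_weights):
    weight_per_class[label] += weight

# Simulate drawing 10 000 samples with replacement.
rng = random.Random(0)
drawn = rng.choices(labels, weights=sample_weights, k=10_000)
fraktur_share = drawn.count("fraktur") / len(drawn)
print(f"fraktur share in drawn samples: {fraktur_share:.3f}")
```

With equal total weight per class, the drawn batches are close to balanced even though the underlying data is not.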
Performance
Evaluated on the companion held-out test set (impresso-project/frakturline-testset), which contains 2 000 balanced images (1 000 per class) strictly excluded from training:
| Metric | Score |
|---|---|
| Accuracy | 99.75 % |
| Precision (Fraktur) | 100.0 % |
| Recall (Fraktur) | 99.5 % |
| F1 (Fraktur) | 99.75 % |
| FP / FN | 0 FP / 5 FN |
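The scores in the table follow directly from the error counts. A short check, taking Fraktur as the positive class and deriving the confusion-matrix counts from 1 000 images per class with 0 FP and 5 FN:

```python
# Confusion-matrix counts implied by the table (Fraktur = positive class).
tp, fn = 995, 5    # 1 000 Fraktur lines, 5 predicted as Other
tn, fp = 1000, 0   # 1 000 Other lines, none predicted as Fraktur

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy  {accuracy:.2%}")
print(f"precision {precision:.2%}")
print(f"recall    {recall:.2%}")
print(f"f1        {f1:.2%}")
```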
Evaluation Dataset
The test set is published as a separate frozen HF dataset:
→ impresso-project/frakturline-testset
It is not included in the training corpus and is released under CC BY-NC 4.0. Do not use it for training.
from datasets import load_dataset
ds = load_dataset("impresso-project/frakturline-testset", split="test")
# 2 000 images: {"image": <PIL.Image>, "label": "fraktur"|"other", ...}
Usage
Install dependencies
pip install torch torchvision Pillow huggingface_hub
Classify images
from huggingface_hub import hf_hub_download
import importlib.util
# Load pipeline.py from the hub
spec = importlib.util.spec_from_file_location(
"pipeline",
hf_hub_download("impresso-project/frakturline-classification-cnn", "pipeline.py"),
)
pipeline_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(pipeline_module)
pipe = pipeline_module.FrakturPipeline.from_pretrained(
"impresso-project/frakturline-classification-cnn"
)
# Single image: local path
result = pipe("path/to/line.png")
# → {"label": "fraktur", "score": 0.9731}
# Single image: https:// URL (fetched via urllib, no extra dependencies)
result = pipe("https://example.com/line.png")
# Batch
results = pipe(["line1.png", "line2.png", "line3.png"])
Input format
- Any PIL-readable image format (PNG, JPEG, TIFF, …)
- Ideally a single text line crop extracted by an OCR layout-analysis tool
- The pipeline handles grayscale conversion and resizing internally
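The exact resize and padding policy is not spelled out in this card. As a rough illustration of how a line crop could be fitted to the 60 × 800 px input, the sketch below assumes aspect-preserving scaling followed by right and bottom padding. The function `fit_line` and this policy are hypothetical and may differ from what the pipeline does.

```python
TARGET_H, TARGET_W = 60, 800  # input size from the Model Details table

def fit_line(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (new_width, new_height, pad_right, pad_bottom) for a line crop.

    Assumed policy: scale by the largest factor that keeps the image
    inside the 60 x 800 canvas without changing its aspect ratio, then
    pad whatever space is left on the right and at the bottom.
    """
    scale = min(TARGET_H / height, TARGET_W / width)
    new_w = min(TARGET_W, max(1, round(width * scale)))
    new_h = min(TARGET_H, max(1, round(height * scale)))
    return new_w, new_h, TARGET_W - new_w, TARGET_H - new_h

# A short line is scaled up to full height and padded on the right.
print(fit_line(400, 40))
# A very long line is limited by the width and padded at the bottom.
print(fit_line(3000, 60))
```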
Output format
{"label": "fraktur", "score": 0.9731} # sigmoid probability of predicted class
{"label": "other", "score": 0.9954}
Limitations
- Designed for single text lines. Mixed-typeface lines or non-text content may produce unreliable results.
- Short headers, ornaments, or lines with very few characters can be ambiguous.
- The training data is drawn primarily from 19th–20th century European (mainly German-language) newspapers; performance on other periods or regions is not guaranteed.
Citation
If you use this model, please cite the Impresso project:
@misc{impresso2025fraktur,
title = {Fraktur/Antiqua Text-Line Classifier},
author = {Impresso Project},
year = {2025},
url = {https://huggingface.co/impresso-project/frakturline-classification-cnn}
}
License
The code in this repository is released under the GNU Affero General Public License v3.0 (AGPL-3.0).
The model was trained on data derived from multiple upstream sources. Rights in the underlying source materials remain subject to their respective original terms. For dataset-specific provenance and licensing details, please consult the linked dataset cards.
If you use this model, please cite the Impresso project and link to this repository.