language:
  - de
  - fr
  - lb
license: agpl-3.0
tags:
  - image-classification
  - pytorch
  - fraktur
  - other
  - historical-documents
  - ocr
  - impresso
datasets:
  - impresso-project/frakturline-dataset
  - impresso-project/frakturline-testset
pipeline_tag: image-classification

Fraktur/Other Text-Line Classifier

A binary CNN classifier that determines whether a scanned text-line image is set in Fraktur (blackletter / Gothic script) or Other (primarily Latin / Roman / Antiqua script).

Developed for the Impresso digital humanities project, which processes millions of historical newspaper pages in German, French, Luxembourgish, and other European languages.


Model Details

Property        Value
Architecture    BinaryClassificationCNN: 3-layer CNN with LayerNorm and Dropout
Input           Grayscale text-line image, resized/padded to 60 × 800 px
Output          Single logit; logit > 0 → Fraktur (equivalent to sigmoid(logit) > 0.5)
Parameters      ~2.1 M
Training data   ~32 000 manually labeled line crops from Swiss/Luxembourgish newspapers
Framework       PyTorch

Architecture

Input (1, 60, 800)
  → Conv2d(1→32) + ReLU + MaxPool2d      → LayerNorm[32, 30, 400]
  → Conv2d(32→64) + ReLU + MaxPool2d     → LayerNorm[64, 15, 200] + Dropout(0.15)
  → Conv2d(64→128) + LayerNorm[128, 15, 200] + ReLU + AdaptiveMaxPool2d(1×8)
  → Flatten(1 024) → FC(128) + ReLU → FC(1)
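
As a sanity check on the layer shapes and the ~2.1 M parameter figure, the arithmetic below tallies parameters per layer. It is a sketch under assumptions the card does not state: 3×3 convolution kernels, a bias term on every layer, and LayerNorm with an element-wise scale and shift (the PyTorch default). The helper names are illustrative only.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a Conv2d layer with a k x k kernel (k=3 is assumed)."""
    return c_in * c_out * k * k + c_out


def layernorm_params(*shape):
    """Element-wise scale and shift: two values per normalized element."""
    n = 1
    for d in shape:
        n *= d
    return 2 * n


def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# AdaptiveMaxPool2d(1x8) on 128 channels leaves 128 * 1 * 8 features.
flatten_size = 128 * 1 * 8

counts = {
    "conv1": conv_params(1, 32),
    "ln1": layernorm_params(32, 30, 400),
    "conv2": conv_params(32, 64),
    "ln2": layernorm_params(64, 15, 200),
    "conv3": conv_params(64, 128),
    "ln3": layernorm_params(128, 15, 200),
    "fc1": linear_params(flatten_size, 128),
    "fc2": linear_params(128, 1),
}
total = sum(counts.values())

print(flatten_size)  # 1024
print(total)         # 2144001
```

Under these assumptions the total is about 2.14 M, in line with the stated ~2.1 M. Roughly 1.92 M of it sits in the three LayerNorm layers, because each one normalizes over a full (C, H, W) feature map.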

Training

  • Loss: BCEWithLogitsLoss
  • Optimizer: Adam, lr = 1e-4 with ReduceLROnPlateau (factor 0.5, patience 2)
  • Epochs: up to 20 with early stopping (patience 5)
  • Augmentation: random rotation ±2°, Gaussian noise (σ = 0.05), random right-masking (p = 0.15, up to 50 % of width) to improve robustness on short lines
  • Class balancing: WeightedRandomSampler (other ≈ 20 k, fraktur ≈ 14 k)
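
The class-balancing step can be illustrated with plain arithmetic. The card does not give the exact sample weights, so the sketch below assumes the common inverse-frequency scheme, where every sample is weighted by one over its class count. The function name and the rounded class counts are illustrative.

```python
def inverse_frequency_weights(labels):
    """Return one weight per sample: 1 / (number of samples in its class)."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return [1.0 / counts[label] for label in labels]


# Approximate class sizes taken from the list above.
labels = ["other"] * 20_000 + ["fraktur"] * 14_000
weights = inverse_frequency_weights(labels)

# Total sampling mass per class; equal mass means balanced draws on average.
mass = {}
for label, w in zip(labels, weights):
    mass[label] = mass.get(label, 0.0) + w

# In PyTorch these weights would typically be passed as
# torch.utils.data.WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
```

With these weights both classes carry the same total sampling mass, so a sampler drawing with replacement returns Fraktur and Other lines at roughly equal rates despite the 20 k / 14 k imbalance.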

Performance

Evaluated on the companion held-out test set (impresso-project/frakturline-testset), which contains 2 000 balanced images (1 000 per class) that were strictly excluded from training:

Metric               Score
Accuracy             99.75 %
Precision (Fraktur)  100.0 %
Recall (Fraktur)     99.5 %
F1 (Fraktur)         99.75 %
FP / FN              0 FP / 5 FN
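
The reported scores follow directly from the confusion counts. With 1 000 images per class, 0 false positives and 5 false negatives imply 995 true positives and 1 000 true negatives, treating Fraktur as the positive class. A short check:

```python
# Confusion counts derived from the table: 1 000 images per class, 0 FP, 5 FN.
tp, fn = 995, 5      # Fraktur lines: correctly found / missed
fp, tn = 0, 1000     # Other lines: wrongly flagged / correctly rejected

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

for name, value in [("accuracy", accuracy), ("precision", precision),
                    ("recall", recall), ("f1", f1)]:
    print(f"{name}: {value * 100:.2f} %")
```

F1 comes out at about 99.749 %, which rounds to the 99.75 % shown in the table.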

Evaluation Dataset

The test set is published as a separate, frozen Hugging Face dataset:

→ impresso-project/frakturline-testset

It is not included in the training corpus and is released under CC BY-NC 4.0. Do not use it for training.

from datasets import load_dataset
ds = load_dataset("impresso-project/frakturline-testset", split="test")
# 2 000 images: {"image": <PIL.Image>, "label": "fraktur"|"other", ...}

Usage

Install dependencies

pip install torch torchvision Pillow huggingface_hub

Classify images

import importlib.util

from huggingface_hub import hf_hub_download

# Load pipeline.py from the hub
spec = importlib.util.spec_from_file_location(
    "pipeline",
    hf_hub_download("impresso-project/frakturline-classification-cnn", "pipeline.py"),
)
pipeline_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(pipeline_module)

pipe = pipeline_module.FrakturPipeline.from_pretrained(
    "impresso-project/frakturline-classification-cnn"
)

# Single image: local path
result = pipe("path/to/line.png")
# → {"label": "fraktur", "score": 0.9731}

# Single image: https:// URL (fetched via urllib, no extra dependencies)
result = pipe("https://example.com/line.png")

# Batch
results = pipe(["line1.png", "line2.png", "line3.png"])

Input format

  • Any PIL-readable image format (PNG, JPEG, TIFF, …)
  • Ideally a single text line crop extracted by an OCR layout-analysis tool
  • The pipeline handles grayscale conversion and resizing internally
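
The card does not spell out the resizing policy, so the following is only one plausible reading of "resized/padded to 60 × 800 px": scale the crop to fit inside the canvas while keeping its aspect ratio, then pad the remainder. The function `fit_to_canvas` is hypothetical and is not taken from pipeline.py. It computes geometry only and does not touch pixels.

```python
def fit_to_canvas(width, height, target_w=800, target_h=60):
    """Hypothetical geometry for an aspect-preserving resize followed by padding.

    Returns (new_w, new_h, pad_right, pad_bottom) in pixels.
    """
    # Use the largest scale at which both dimensions still fit the canvas.
    scale = min(target_w / width, target_h / height)
    new_w = max(1, min(target_w, round(width * scale)))
    new_h = max(1, min(target_h, round(height * scale)))
    return new_w, new_h, target_w - new_w, target_h - new_h


# A typical line crop: height is the limiting dimension, padding goes on the right.
print(fit_to_canvas(1200, 120))   # (600, 60, 200, 0)

# A very long line: width is the limiting dimension, padding goes below.
print(fit_to_canvas(4000, 100))   # (800, 20, 0, 40)
```

Whatever the pipeline actually does internally, callers do not need to resize crops themselves. The sketch is only meant to show why short lines end up with a large padded region.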

Output format

{"label": "fraktur",  "score": 0.9731}  # sigmoid probability of predicted class
{"label": "other",    "score": 0.9954}
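
To make the relation between the raw logit and the reported score concrete, here is a minimal sketch of the mapping described in this card: logit > 0 means Fraktur, and the score is the sigmoid probability of whichever class was predicted. The function names and the rounding to four decimals are assumptions based on the example values, not code from pipeline.py.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def to_prediction(logit):
    """Map a single logit to a {"label", "score"} dict (illustrative helper)."""
    p_fraktur = sigmoid(logit)
    if logit > 0:  # same decision as sigmoid(logit) > 0.5
        return {"label": "fraktur", "score": round(p_fraktur, 4)}
    # For "other", report the probability of the predicted class, i.e. 1 - p.
    return {"label": "other", "score": round(1.0 - p_fraktur, 4)}


print(to_prediction(3.0))    # {'label': 'fraktur', 'score': 0.9526}
print(to_prediction(-3.0))   # {'label': 'other', 'score': 0.9526}
```

Because the score always refers to the predicted class, it never drops below 0.5 under this mapping.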

Limitations

  • Designed for single text lines. Mixed-typeface lines or non-text content may produce unreliable results.
  • Short headers, ornaments, or lines with very few characters can be ambiguous.
  • The training data is drawn primarily from 19th–20th century European (mainly German-language) newspapers; performance on other periods or regions is not guaranteed.
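
Given these limitations, one pragmatic safeguard is to set aside low-score predictions for manual review. The sketch below works on result dicts in the output format shown earlier. The helper name and the 0.9 threshold are arbitrary illustrations, not recommendations from the model authors.

```python
def split_by_confidence(results, threshold=0.9):
    """Separate predictions into (confident, uncertain) by their score."""
    confident, uncertain = [], []
    for result in results:
        if result["score"] >= threshold:
            confident.append(result)
        else:
            uncertain.append(result)
    return confident, uncertain


# Hand-written example results in the pipeline's output format.
results = [
    {"label": "fraktur", "score": 0.9731},
    {"label": "other", "score": 0.9954},
    {"label": "fraktur", "score": 0.6120},   # e.g. a short header or ornament
]
confident, uncertain = split_by_confidence(results)
print(len(confident), len(uncertain))   # 2 1
```

A suitable threshold depends on the collection and is best chosen by checking a labeled sample of its own lines.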

Citation

If you use this model, please cite the Impresso project:

@misc{impresso2025fraktur,
  title  = {Fraktur/Antiqua Text-Line Classifier},
  author = {Impresso Project},
  year   = {2025},
  url    = {https://huggingface.co/impresso-project/frakturline-classification-cnn}
}

License

The code in this repository is released under the GNU Affero General Public License v3.0 (AGPL-3.0).

The model was trained on data derived from multiple upstream sources. Rights in the underlying source materials remain subject to their respective original terms. For dataset-specific provenance and licensing details, please consult the linked dataset cards.

If you use this model, please cite the Impresso project and link to this repository.