metadata
license: cc-by-nc-3.0
pipeline_tag: image-to-image
tags:
  - vision
  - document-processing
  - binarization
  - segmentation

Tzefa Binarization Model (mit_b5 HighResMAnet)

Custom-trained document binarization model for the Tzefa OCR pipeline.

Architecture

  • Encoder: MiT-B5 (Mix Transformer)
  • Decoder: MAnet with custom High-Resolution Stem + Fusion Head
  • Framework: segmentation-models-pytorch
  • Input: RGB image tiles (640x640)
  • Output: Binary mask (ink=0, paper=255)
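Because the model takes fixed 640x640 tiles, a full page has to be cut into tiles before inference. The card does not say how the Tzefa pipeline does this (overlap, padding colour, and stitching are not documented), so the following is only a minimal sketch under the assumption of non-overlapping tiles with the page padded up to a multiple of the tile size. The helper names `tile_boxes` and `padded_size` are hypothetical and not part of the released code.

```python
from typing import Iterator, Tuple

TILE = 640  # tile side length stated in the model card

Box = Tuple[int, int, int, int]  # (top, left, bottom, right)


def padded_size(height: int, width: int, tile: int = TILE) -> Tuple[int, int]:
    """Return the page size rounded up to a whole number of tiles."""
    if height <= 0 or width <= 0 or tile <= 0:
        raise ValueError("height, width and tile must be positive")
    rows = -(-height // tile)  # ceiling division
    cols = -(-width // tile)
    return rows * tile, cols * tile


def tile_boxes(height: int, width: int, tile: int = TILE) -> Iterator[Box]:
    """Yield non-overlapping tile boxes covering the padded page.

    Assumption: no overlap between tiles. Boxes in the last row or column
    may extend past the original image, so the caller must pad the page
    to padded_size() first.
    """
    full_h, full_w = padded_size(height, width, tile)
    for top in range(0, full_h, tile):
        for left in range(0, full_w, tile):
            yield top, left, top + tile, left + tile


if __name__ == "__main__":
    # A 1000x1300 page needs 2 rows and 3 columns of tiles.
    for box in tile_boxes(1000, 1300):
        print(box)
```

With an image held as a NumPy array, each box can be used as `page[top:bottom, left:right]` after padding. Padding with white is a plausible choice given that paper maps to 255 in the output, but that is a guess rather than something the card states.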

Usage

from huggingface_hub import hf_hub_download
import torch

# Download weights
ckpt_path = hf_hub_download("WARAJA/b5_model", "b5_model.pth")

# Load the checkpoint on CPU. This only reads the saved weights; building the
# network requires the architecture code from the Tzefa Binarization Space.
checkpoint = torch.load(ckpt_path, map_location="cpu")
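
The card does not document the layout of the loaded `checkpoint` object. It may be a bare state dict, or a dictionary that wraps the weights next to training metadata. The sketch below shows one way to handle both cases before calling `load_state_dict` on a model built from the Space's architecture code. The wrapper key names it checks are assumptions based on common PyTorch conventions, not something confirmed for this checkpoint, and `extract_state_dict` is a hypothetical helper.

```python
from collections.abc import Mapping

# Assumed wrapper keys; the actual checkpoint may use none of these.
CANDIDATE_KEYS = ("state_dict", "model_state_dict", "model")


def extract_state_dict(checkpoint):
    """Return the weight mapping from a checkpoint of unknown layout.

    If the checkpoint nests a mapping under one of CANDIDATE_KEYS, that
    inner mapping is returned; otherwise the checkpoint itself is treated
    as the state dict.
    """
    if not isinstance(checkpoint, Mapping):
        raise TypeError(
            f"expected a mapping, got {type(checkpoint).__name__}"
        )
    for key in CANDIDATE_KEYS:
        inner = checkpoint.get(key)
        if isinstance(inner, Mapping):
            return dict(inner)
    return dict(checkpoint)
```

Printing a few entries of the result, for example `list(extract_state_dict(checkpoint))[:5]`, is a quick way to check that the parameter names match the model being built.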

Related