metadata

license: cc-by-nc-3.0
pipeline_tag: image-to-image
tags:
  - vision
  - document-processing
  - binarization
  - segmentation

Tzefa Binarization Model (mit_b5 HighResMAnet)

Custom-trained document binarization model for the Tzefa OCR pipeline.

Architecture

Encoder: MiT-B5 (Mix Transformer)
Decoder: MAnet with custom High-Resolution Stem + Fusion Head
Framework: segmentation-models-pytorch
Input: RGB image tiles (640x640)
Output: Binary mask (ink=0, paper=255)
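
The tile size and mask convention above suggest a tile-and-stitch workflow for full pages. The following is a minimal sketch of that idea, not the Tzefa pipeline's actual preprocessing: it assumes non-overlapping tiles, white padding at the page edges, and a model that yields a per-pixel ink probability thresholded at 0.5. The helper names (`split_into_tiles`, `stitch_tiles`, `to_binary_mask`) are hypothetical.

```python
import numpy as np

TILE = 640  # tile side length stated in the model card


def pad_to_multiple(image, tile=TILE, fill=255):
    """Pad the bottom/right edges so height and width are multiples of `tile`.

    Padding with white (255) is an assumption, chosen to look like paper.
    """
    h, w = image.shape[:2]
    pad_h = (-h) % tile
    pad_w = (-w) % tile
    pad = ((0, pad_h), (0, pad_w)) + ((0, 0),) * (image.ndim - 2)
    return np.pad(image, pad, mode="constant", constant_values=fill)


def split_into_tiles(image, tile=TILE):
    """Return row-major, non-overlapping tiles and the (rows, cols) grid."""
    padded = pad_to_multiple(image, tile)
    rows, cols = padded.shape[0] // tile, padded.shape[1] // tile
    tiles = [
        padded[r * tile:(r + 1) * tile, c * tile:(c + 1) * tile]
        for r in range(rows)
        for c in range(cols)
    ]
    return tiles, (rows, cols)


def stitch_tiles(mask_tiles, grid, original_shape, tile=TILE):
    """Reassemble single-channel mask tiles and crop away the padding."""
    rows, cols = grid
    h, w = original_shape[:2]
    canvas = np.zeros((rows * tile, cols * tile), dtype=mask_tiles[0].dtype)
    for i, mask in enumerate(mask_tiles):
        r, c = divmod(i, cols)
        canvas[r * tile:(r + 1) * tile, c * tile:(c + 1) * tile] = mask
    return canvas[:h, :w]


def to_binary_mask(ink_probability, threshold=0.5):
    """Map ink probabilities to the card's convention: ink=0, paper=255.

    The probability input and the 0.5 threshold are assumptions.
    """
    return np.where(ink_probability >= threshold, 0, 255).astype(np.uint8)
```

In practice each tile from `split_into_tiles` would be passed through the model before `to_binary_mask` and `stitch_tiles`; overlapping tiles with blending may give cleaner seams, but that is outside this sketch.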

Usage

from huggingface_hub import hf_hub_download
import torch

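# Download the weights from the Hugging Face Hub (cached locally after the first call)
ckpt_path = hf_hub_download(repo_id="WARAJA/b5_model", filename="b5_model.pth")

# Load the checkpoint on CPU. Rebuilding the network itself requires the
# architecture code from the Tzefa Binarization Space.
checkpoint = torch.load(ckpt_path, map_location="cpu")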
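
The model card does not document what the loaded `checkpoint` object contains. As a hedged sketch, the helper below unwraps a few common layouts (a bare state dict, a dict wrapping one under a conventional key, or a pickled module) and lists the first few parameter names and shapes. The wrapper keys in `CANDIDATE_KEYS` and the function names are assumptions for illustration, not part of this repository.

```python
from collections.abc import Mapping

# Hypothetical wrapper keys; the checkpoint layout is not documented in the card.
CANDIDATE_KEYS = ("state_dict", "model_state_dict", "model")


def extract_state_dict(checkpoint):
    """Return the parameter mapping from a checkpoint of unknown layout."""
    if isinstance(checkpoint, Mapping):
        for key in CANDIDATE_KEYS:
            inner = checkpoint.get(key)
            if isinstance(inner, Mapping):
                return inner  # weights were wrapped under a conventional key
        return checkpoint  # assume the mapping already is the state dict
    state_dict_fn = getattr(checkpoint, "state_dict", None)
    if callable(state_dict_fn):
        return state_dict_fn()  # a whole module object was saved
    raise TypeError(f"Unrecognized checkpoint type: {type(checkpoint).__name__}")


def summarize(state_dict, limit=5):
    """List the first `limit` entries as 'name: shape' strings."""
    lines = []
    for name, value in list(state_dict.items())[:limit]:
        shape = tuple(getattr(value, "shape", ()))
        lines.append(f"{name}: {shape}")
    return lines
```

Calling `summarize(extract_state_dict(checkpoint))` on the object loaded above and printing the result is a quick way to check whether the parameter names match the architecture code before calling `load_state_dict`.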
