DiTo97/binarization-segformer-b3

Image Segmentation

Generated from Trainer

document-image-binarization


DiTo97 committed on May 14, 2023

Commit e4e5c3d · 1 Parent(s): f41419b

Fixed pseudo F-metric in README.md

Files changed (1)

README.md +5 -2


@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 # binarization-segformer-b3
-This model is a fine-tuned version of [nvidia/segformer-b3-finetuned-cityscapes-1024-1024](https://huggingface.co/nvidia/segformer-b3-finetuned-cityscapes-1024-1024) on the same ensemble of datasets as the [SauvolaNet work](https://arxiv.org/pdf/2105.05521.pdf). The ensemble is publicly available in the official [SauvolaNet repository](https://github.com/Leedeng/SauvolaNet#datasets).
 It achieves the following results on the evaluation set on DIBCO metrics:
 - loss: 0.1017
@@ -26,13 +26,16 @@ It achieves the following results on the evaluation set on DIBCO metrics:
 where PSNR stands for peak signal-to-noise ratio and DND for distance reciprocal distortion.
-For more information on DIBCO metrics, see the 2017 introductory [paper](https://ieeexplore.ieee.org/document/8270159).
 **Warning:** This model only accepts images with a resolution of 640 due to compute constraints on Colab free tier during training.
 ## Model description
 This model is part of on-going research on pure semantic segmentation models as a formulation of document image binarization (DIBCO).
 ## Intended uses & limitations

 # binarization-segformer-b3
+This model is a fine-tuned version of [nvidia/segformer-b3](https://huggingface.co/nvidia/segformer-b3-finetuned-cityscapes-1024-1024) on the same ensemble of 13 datasets as the [SauvolaNet work](https://arxiv.org/pdf/2105.05521.pdf). The ensemble is publicly available in the official [SauvolaNet repository](https://github.com/Leedeng/SauvolaNet#datasets).
 It achieves the following results on the evaluation set on DIBCO metrics:
 - loss: 0.1017
 where PSNR stands for peak signal-to-noise ratio and DND for distance reciprocal distortion.
+For more information on the above DIBCO metrics, see the 2017 introductory [paper](https://ieeexplore.ieee.org/document/8270159).
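As a rough illustration of the PSNR metric named above, the following is a minimal sketch assuming binary images stored as nested lists of 0/1 values. The function name `psnr` and the `max_diff` parameter are hypothetical and do not come from this repository; the evaluation code behind the reported numbers may differ.

```python
import math


def psnr(pred, truth, max_diff=1.0):
    """Peak signal-to-noise ratio between two same-sized binary images.

    `pred` and `truth` are nested lists of 0/1 pixel values (an assumption
    made for this sketch). `max_diff` is the difference between foreground
    and background values, which is 1 for 0/1 images.
    """
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("images must have the same shape")

    n_pixels = sum(len(row) for row in truth)
    if n_pixels == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixels.
    squared_error = sum(
        (p - t) ** 2
        for p_row, t_row in zip(pred, truth)
        for p, t in zip(p_row, t_row)
    )
    mse = squared_error / n_pixels

    # Identical images have zero error, so the ratio is unbounded.
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_diff ** 2 / mse)


if __name__ == "__main__":
    prediction = [[0, 1], [1, 1]]
    ground_truth = [[0, 1], [1, 0]]
    print(f"PSNR: {psnr(prediction, ground_truth):.2f} dB")
```

Higher values indicate a prediction closer to the ground truth; a prediction that matches it exactly yields infinity in this sketch.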
 **Warning:** This model only accepts images with a resolution of 640 due to compute constraints on Colab free tier during training.
 ## Model description
 This model is part of on-going research on pure semantic segmentation models as a formulation of document image binarization (DIBCO).
+This is in contrast to the late trend of adapting classic binarization algorithms with neural networks,
+such as [DeepOtsu](https://arxiv.org/abs/1901.06081) or the aforementioned SauvolaNet work,
+as extensions of the classical Otsu's method and Sauvola thresholding, respectively.
 ## Intended uses & limitations