metadata
title: Binary Doc Classifier (Chunked)
emoji: π
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
Binary Document Classifier: Gradio Space
This Space hosts a Gradio app for binary text classification on uploaded documents. It supports long documents by chunking (512-token windows with overlap) and aggregates chunk probabilities into a document-level prediction.
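The chunking and aggregation described above can be sketched in plain Python. This is only an illustration, not the code in app.py: the helper names `chunk_token_ids` and `aggregate_mean` are hypothetical, averaging is an assumed aggregation rule (the app may use a different one), and the sketch ignores the special tokens a real tokenizer adds to each window.

```python
from typing import List, Sequence, Tuple


def chunk_token_ids(
    token_ids: Sequence[int], max_length: int = 512, stride: int = 128
) -> List[List[int]]:
    """Split token IDs into windows of at most `max_length` tokens.

    Consecutive windows share `stride` tokens, so each new window
    starts `max_length - stride` tokens after the previous one.
    """
    if not 0 <= stride < max_length:
        raise ValueError("stride must satisfy 0 <= stride < max_length")
    step = max_length - stride
    chunks: List[List[int]] = []
    start = 0
    while True:
        chunks.append(list(token_ids[start:start + max_length]))
        # Stop once the current window reaches the end of the document.
        if start + max_length >= len(token_ids):
            break
        start += step
    return chunks


def aggregate_mean(
    chunk_probs: Sequence[float], threshold: float = 0.5
) -> Tuple[float, int]:
    """Average per-chunk positive-class probabilities (assumed rule)
    and threshold the mean to get a document-level label."""
    if not chunk_probs:
        raise ValueError("need at least one chunk probability")
    doc_prob = sum(chunk_probs) / len(chunk_probs)
    return doc_prob, int(doc_prob >= threshold)


if __name__ == "__main__":
    chunks = chunk_token_ids(list(range(1000)))
    print([len(c) for c in chunks])
    print(aggregate_mean([0.2, 0.9, 0.7]))
```

With the defaults, a 1000-token document yields three windows starting at tokens 0, 384, and 768. Other aggregation rules, such as taking the maximum chunk probability, would drop into `aggregate_mean` the same way.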
Configuration
Set the following Space variables in the UI (Settings → Variables):
- MODEL_ID: your trained model repo (e.g., your-username/bert-binclass)
- MAX_LENGTH: tokens per chunk (default: 512)
- STRIDE: overlap tokens between chunks (default: 128)
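Space variables are exposed to the app as environment variables, so they can be read with `os.environ`. The sketch below shows one way to do that with the defaults listed above. The `load_config` helper is hypothetical, and treating MODEL_ID as required is an assumption, since no default is given for it.

```python
import os
from typing import Dict, Mapping, Optional, Union


def load_config(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Union[str, int]]:
    """Read classifier settings from environment variables.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env

    model_id = env.get("MODEL_ID")
    if not model_id:
        # Assumption: there is no sensible default model repo.
        raise RuntimeError("MODEL_ID is not set")

    max_length = int(env.get("MAX_LENGTH", "512"))
    stride = int(env.get("STRIDE", "128"))
    if not 0 <= stride < max_length:
        raise ValueError("STRIDE must satisfy 0 <= STRIDE < MAX_LENGTH")

    return {"model_id": model_id, "max_length": max_length, "stride": stride}


if __name__ == "__main__":
    print(load_config({"MODEL_ID": "your-username/bert-binclass"}))
```

Validating that STRIDE is smaller than MAX_LENGTH at startup avoids a chunking loop that never advances.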
Local run
pip install -r requirements.txt
python app.py
Notes
- PDF extraction uses pypdf for simplicity.