Spaces: tomerz14/BERT_Text_Source_Classifier
tomerz14 committed on Oct 4, 2025

Commit 988dff5 (verified) · 1 Parent(s): c6f65f2

Delete README.md

Files changed (1)

README.md DELETED

@@ -1,27 +0,0 @@
-# Binary Document Classifier — Gradio Space
-This Space hosts a Gradio app for **binary text classification** on uploaded documents.
-It supports long documents by **chunking** (512-token windows with overlap) and aggregates
-chunk probabilities into a **document-level** prediction.
-## Configure
-Set the environment variable `MODEL_ID` in your Space to point to your trained model,
-e.g. `your-username/bert-binclass`. You can also set:
-- `MAX_LENGTH` — tokens per chunk (default: 512)
-- `STRIDE` — overlap tokens between chunks (default: 128)
-## Run locally
-```bash
-pip install -r requirements.txt
-python app.py
-```
-Then open the printed Gradio URL.
-## Notes
-- PDF extraction uses `pypdf` for simplicity. For higher-quality results or OCR,
-  consider `pymupdf` (fitz) or `unstructured`.
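For reference, the chunking and aggregation that the deleted README describes can be sketched roughly as follows. This is a minimal illustration, not the Space's actual `app.py`: the helper names `read_config`, `chunk_token_ids`, and `aggregate_mean` are hypothetical, the README does not say how chunk probabilities are combined (a plain mean is assumed here), and special tokens such as `[CLS]` and `[SEP]` are ignored for simplicity. Only the `MAX_LENGTH` and `STRIDE` variables and their defaults come from the README.

```python
import os


def read_config(env=os.environ):
    """Read chunking settings, falling back to the defaults stated in the README."""
    max_length = int(env.get("MAX_LENGTH", 512))  # tokens per chunk
    stride = int(env.get("STRIDE", 128))          # overlap tokens between chunks
    if not 0 <= stride < max_length:
        raise ValueError("STRIDE must be >= 0 and smaller than MAX_LENGTH")
    return max_length, stride


def chunk_token_ids(token_ids, max_length=512, stride=128):
    """Split token ids into windows of max_length that overlap by stride tokens."""
    if not token_ids:
        return []
    step = max_length - stride  # how far each new window advances
    chunks = []
    start = 0
    while True:
        chunks.append(token_ids[start:start + max_length])
        if start + max_length >= len(token_ids):
            break  # this window already reaches the end of the document
        start += step
    return chunks


def aggregate_mean(chunk_probs):
    """Combine per-chunk positive-class probabilities (assumption: simple mean)."""
    if not chunk_probs:
        raise ValueError("need at least one chunk probability")
    return sum(chunk_probs) / len(chunk_probs)


if __name__ == "__main__":
    max_length, stride = read_config()
    fake_ids = list(range(1000))  # stand-in for real tokenizer output
    chunks = chunk_token_ids(fake_ids, max_length, stride)
    print([len(c) for c in chunks])
```

In a real app the per-chunk probabilities would come from running the model named by `MODEL_ID` on each window. Other aggregation rules (for example, taking the maximum chunk probability) are equally plausible readings of the README.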