Upload folder using huggingface_hub

Files changed (7)

.gitattributes +1 -0
README.md +230 -3
class_index.json +1 -0
fingerprint.pb +3 -0
saved_model.pb +3 -0
variables/variables.data-00000-of-00001 +3 -0
variables/variables.index +0 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+variables/variables.data-00000-of-00001 filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,3 +1,230 @@
----
-license: mit
----

+---
+language: en
+tags:
+  - image-classification
+  - document-classification
+  - tensorflow
+  - efficientnet
+  - computer-vision
+license: mit
+framework: tensorflow
+pipeline_tag: image-classification
+---
+# Document Classifier
+A TensorFlow SavedModel that classifies photographs of real-world documents into four categories: visiting cards, prescriptions, shop banners, and invalid images. It is built on **EfficientNet** and uses the matching `efficientnet.preprocess_input` step. The model is designed for production use and is paired with a validation pipeline covering image quality, fake/AI-image detection, and confidence thresholding.
+---
+## Supported Document Types
+| Class Key | Label | Description |
+|---|---|---|
+| `1_visiting_card` | Visiting Card | Business cards, name cards |
+| `2_prescription` | Prescription | Medical prescriptions |
+| `3_shop_banner` | Shop Banner | Storefront signage, banners |
+| `4_invalid_image` | Invalid | Rejected / unrecognized documents |
+---
+## Model Details
+| Property | Value |
+|---|---|
+| Architecture | EfficientNet (TF SavedModel) |
+| Input Size | Configured via `settings.IMAGE_SIZE` |
+| Preprocessing | `efficientnet.preprocess_input` |
+| Output | Softmax class probabilities |
+| Confidence Threshold | Configured via `settings.CONFIDENCE_THRESHOLD` |
+---
+## Repository Structure
+```
+document-classifier/
+├── saved_model.pb
+├── fingerprint.pb
+├── variables/
+│   ├── variables.index
+│   └── variables.data-00000-of-00001
+├── class_index.json
+└── README.md
+```
+### `class_index.json` format
+```json
+{
+    "1_visiting_card": 0,
+    "2_prescription":  1,
+    "3_shop_banner":   2,
+    "4_invalid_image": 3
+}
+```
+---
+## Installation
+```bash
+pip install tensorflow opencv-python pillow huggingface_hub
+# Optional but recommended:
+pip install pytesseract   # AI watermark OCR detection; also needs the Tesseract binary installed on the system
+```
+---
+## Usage
+### Load from Hugging Face
+```python
+from huggingface_hub import snapshot_download
+import tensorflow as tf
+import json
+# Download model + class index
+local_path = snapshot_download(repo_id="your-username/document-classifier")
+# Load model
+model = tf.saved_model.load(local_path)
+infer = model.signatures["serving_default"]
+# Load class labels
+with open(f"{local_path}/class_index.json") as f:
+    class_indices = json.load(f)
+LABELS = {int(v): k for k, v in class_indices.items()}
+```
+### Run Inference
+```python
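+import cv2
+import numpy as np
+from tensorflow.keras.applications.efficientnet import preprocess_input
+# Reuses `tf`, `infer`, and `LABELS` from the loading snippet above.
+IMAGE_SIZE = (224, 224)   # match your training config
+def predict(image_path: str):
+    img = cv2.imread(image_path)
+    if img is None:  # cv2.imread returns None when the file cannot be decoded
+        raise ValueError(f"Could not read image: {image_path}")
+    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads images as BGR
+    resized = cv2.resize(img_rgb, IMAGE_SIZE)
+    input_arr = np.expand_dims(resized.astype(np.float32), axis=0)
+    # In Keras, EfficientNet's preprocess_input is a pass-through; the model rescales inputs itself.
+    input_arr = preprocess_input(input_arr)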
+    outputs = infer(tf.constant(input_arr))
+    preds = list(outputs.values())[0].numpy()[0]
+    class_id   = int(np.argmax(preds))
+    confidence = float(np.max(preds))
+    label      = LABELS.get(class_id, "unknown")
+    return {"label": label, "confidence": round(confidence * 100, 2)}
+result = predict("my_document.jpg")
+print(result)
+# {'label': '1_visiting_card', 'confidence': 97.43}
+```
+---
+## Validation Pipeline
+Before inference runs, every image passes through a multi-stage validation pipeline. Requests are rejected early and cheaply when possible.
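The early-rejection idea can be sketched as an ordered list of checks that stops at the first failure. This is a minimal illustration rather than the repository's code: `run_checks`, the two check functions, and the dict used as a stand-in for an image are all hypothetical. Only the rejection codes and the thresholds (12 and 10) come from the tables in this README.

```python
from typing import Callable, List, Optional

# Each check returns a rejection code, or None when the image passes.
# The dict-based "image" is a hypothetical stand-in for precomputed statistics.
Check = Callable[[dict], Optional[str]]


def check_blank(image: dict) -> Optional[str]:
    return "BLANK_IMAGE" if image["gray_std"] < 12 else None


def check_blur(image: dict) -> Optional[str]:
    return "BLURRED_IMAGE" if image["laplacian_var"] < 10 else None


def run_checks(image: dict, checks: List[Check]) -> Optional[str]:
    """Run checks in order and return the first rejection code, if any."""
    for check in checks:
        code = check(image)
        if code is not None:
            return code  # stop early; later (costlier) checks never run
    return None


CHECKS = [check_blank, check_blur]  # ordered cheapest first

print(run_checks({"gray_std": 3.0, "laplacian_var": 250.0}, CHECKS))
print(run_checks({"gray_std": 40.0, "laplacian_var": 250.0}, CHECKS))
```

Keeping the checks in a plain list makes the cost ordering explicit and lets a new check be added without touching the loop.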
+### Image Quality Checks
+| Check | Condition | Rejection Code |
+|---|---|---|
+| Blank image | Grayscale standard deviation < 12 | `BLANK_IMAGE` |
+| Blurry image | Laplacian variance < 10 | `BLURRED_IMAGE` |
+| Ruled paper | ≥5 evenly spaced horizontal lines | `RULED_PAPER` |
+| No text | Fewer than 6 text-like connected components | `NO_MEANINGFUL_TEXT` |
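As a rough illustration of the first two rows, the sketch below computes both statistics with NumPy only. It is an assumption about how the checks might look, not the pipeline's actual code: the function names are hypothetical, and the hand-written 4-neighbour Laplacian skips border pixels, so its values can differ slightly from an OpenCV-based implementation.

```python
import numpy as np

BLANK_STD_THRESHOLD = 12.0      # "Grayscale std < 12" from the table
BLUR_VARIANCE_THRESHOLD = 10.0  # "Laplacian variance < 10" from the table


def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels."""
    g = gray.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())


def quality_rejection(gray: np.ndarray):
    """Return a rejection code for a 2-D grayscale image, or None if it passes."""
    if float(gray.std()) < BLANK_STD_THRESHOLD:
        return "BLANK_IMAGE"
    if laplacian_variance(gray) < BLUR_VARIANCE_THRESHOLD:
        return "BLURRED_IMAGE"
    return None


blank = np.full((64, 64), 255, dtype=np.uint8)                      # flat white
ramp = np.tile(np.arange(64) * 4, (64, 1)).astype(np.uint8)         # smooth gradient, no edges
noisy = np.random.default_rng(0).integers(0, 256, size=(64, 64), dtype=np.uint8)

for name, img in [("blank", blank), ("ramp", ramp), ("noisy", noisy)]:
    print(name, quality_rejection(img))
```

The blank check runs first because a standard deviation is cheaper than a Laplacian and a flat image would fail the blur test anyway.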
+### AI / Fake Image Detection
+The pipeline runs AI-detection checks from cheapest to most expensive:
+| Step | Method | Description |
+|---|---|---|
+| 1 | **EXIF/XMP Metadata** | Scans for AI tool keywords (`midjourney`, `dall-e`, `stable-diffusion`, etc.) and flags images that carry a Google ICC profile but no camera EXIF tags |
+| 2 | **Screenshot / UI detection** | Rejects app screenshots with >55% near-white pixels or flat white corners |
+| 3 | **AI watermark OCR** | Scans the bottom 20% of the image for known AI generator watermarks via Tesseract |
+| 4 | **Gemini ✦ sparkle** | Detects the characteristic Gemini/Imagen sparkle artifact in the bottom-right corner using both absolute and local-contrast blob analysis |
+| 5 | **AI staged background** | Detects bokeh-blurred backgrounds with a sharp foreground card (card/background sharpness ratio > 5.0) |
+| 6 | **Perspective tilt** | Flags images where >35% of detected lines fall in the 15°–45° diagonal range |
+| 7 | **DCT frequency analysis** | Flags unnaturally uniform high-frequency energy (ratio > 0.12) |
+| 8 | **Texture uniformity** | Flags a low coefficient of variation of per-patch variance (< 0.4) combined with a low mean patch variance (< 50) |
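To make step 8 concrete, here is one possible reading of the texture-uniformity rule in NumPy. The thresholds (0.4 and 50) come from the table; the patch size of 32, the non-overlapping tiling, the handling of flat or tiny images, and the function name are assumptions made for this sketch.

```python
import numpy as np


def texture_is_uniform(gray, patch=32, cv_max=0.4, mean_var_max=50.0):
    """Flag images whose per-patch variance is both low and evenly spread."""
    h, w = gray.shape
    g = gray.astype(np.float64)
    # Variance of each non-overlapping patch (patch size is an assumption).
    variances = np.array([
        g[y:y + patch, x:x + patch].var()
        for y in range(0, h - patch + 1, patch)
        for x in range(0, w - patch + 1, patch)
    ])
    if variances.size == 0:
        return False  # image smaller than one patch: not enough evidence
    mean_var = variances.mean()
    if mean_var == 0:
        return True   # perfectly flat image: treated as uniform (assumption)
    cv = variances.std() / mean_var  # coefficient of variation
    return bool(cv < cv_max and mean_var < mean_var_max)


rng = np.random.default_rng(0)
smooth = 128 + rng.normal(0.0, 3.0, size=(128, 128))   # faint, even texture
noisy = rng.integers(0, 256, size=(128, 128))           # strong texture

print(texture_is_uniform(smooth))
print(texture_is_uniform(noisy))
```

Requiring both conditions avoids flagging images that are merely dark or merely evenly textured.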
+### Response Format
+**Valid document:**
+```json
+{
+    "status": "VALID",
+    "title": "Document Verified Successfully",
+    "message": "Your document has been identified as a Visiting Card.",
+    "document_type": "1_visiting_card",
+    "document_type_label": "Visiting Card",
+    "confidence": 97.43,
+    "doc_type_received": null
+}
+```
+**Invalid / rejected:**
+```json
+{
+    "status": "INVALID",
+    "reason_code": "AI_GENERATED_IMAGE",
+    "title": "AI-Generated Image Detected",
+    "message": "The uploaded image appears to be AI-generated and cannot be accepted.",
+    "suggestion": "Please upload a real photograph of your document."
+}
+```
+### All Rejection Codes
+| Code | Meaning |
+|---|---|
+| `BLANK_IMAGE` | Blank or uniformly white/black image |
+| `BLURRED_IMAGE` | Image too blurry to process |
+| `RULED_PAPER` | Lined/ruled paper detected |
+| `NO_MEANINGFUL_TEXT` | No readable text components found |
+| `SCREENSHOT_DOCUMENT` | App screenshot or web UI render |
+| `AI_GENERATED_IMAGE` | AI-generated image (any detection method) |
+| `MODEL_REJECTED` | Model confidence below threshold or invalid class |
+| `UNREADABLE_IMAGE` | File could not be decoded |
+| `SERVER_ERROR` | Unexpected server-side error |
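The `MODEL_REJECTED` rule can be sketched as a small function that turns a prediction into one of the two response shapes shown under Response Format. The function name, the default threshold of `0.75`, and the title, message, and suggestion text of the rejection are illustrative assumptions; only the field names, the valid-response wording, and the class keys come from this README.

```python
DISPLAY_LABELS = {
    "1_visiting_card": "Visiting Card",
    "2_prescription": "Prescription",
    "3_shop_banner": "Shop Banner",
}
INVALID_CLASS = "4_invalid_image"


def build_response(class_key: str, confidence: float, threshold: float = 0.75) -> dict:
    """Map a model prediction (confidence in 0..1) to a response payload."""
    rejected = (
        class_key == INVALID_CLASS
        or class_key not in DISPLAY_LABELS
        or confidence < threshold
    )
    if rejected:
        return {
            "status": "INVALID",
            "reason_code": "MODEL_REJECTED",
            # The wording below is illustrative, not taken from the service.
            "title": "Document Not Recognized",
            "message": "The document could not be identified with enough confidence.",
            "suggestion": "Please upload a clearer photograph of your document.",
        }
    label = DISPLAY_LABELS[class_key]
    return {
        "status": "VALID",
        "title": "Document Verified Successfully",
        "message": f"Your document has been identified as a {label}.",
        "document_type": class_key,
        "document_type_label": label,
        "confidence": round(confidence * 100, 2),  # reported as a percentage
        "doc_type_received": None,
    }


print(build_response("1_visiting_card", 0.9743)["status"])
print(build_response("2_prescription", 0.40)["reason_code"])
```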
+---
+## Dependencies
+| Package | Purpose |
+|---|---|
+| `tensorflow` | Model loading and inference |
+| `opencv-python` | Image decoding, quality checks, AI detection |
+| `pillow` | EXIF/XMP metadata reading |
+| `pytesseract` | AI watermark OCR scan (optional) |
+| `numpy` | Array operations |
+---
+## Configuration
+The serving code reads its settings from the object returned by `get_settings()` in `config.py`. Key settings:
+| Setting | Description |
+|---|---|
+| `MODEL_PATH` | Path to the SavedModel directory |
+| `CLASS_INDEX_FILE` | Path to `class_index.json` |
+| `IMAGE_SIZE` | Tuple, e.g. `(224, 224)` |
+| `CONFIDENCE_THRESHOLD` | Float, e.g. `0.75` — minimum confidence to accept |
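The repository does not include `config.py`, so the following is only a guess at its shape: a frozen dataclass holding the four settings, with a cached `get_settings()` accessor. The two path defaults are placeholders; the `IMAGE_SIZE` and `CONFIDENCE_THRESHOLD` defaults reuse the example values from the table.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Tuple


@dataclass(frozen=True)
class Settings:
    MODEL_PATH: str = "./model"                          # placeholder path
    CLASS_INDEX_FILE: str = "./model/class_index.json"   # placeholder path
    IMAGE_SIZE: Tuple[int, int] = (224, 224)             # example value from the table
    CONFIDENCE_THRESHOLD: float = 0.75                   # example value from the table


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    """Build the settings once and reuse the same instance afterwards."""
    return Settings()


settings = get_settings()
print(settings.IMAGE_SIZE, settings.CONFIDENCE_THRESHOLD)
```

Caching the accessor means every caller sees the same immutable object, which keeps the threshold and image size consistent across the validation and inference steps.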
+---
+## License
+MIT

class_index.json ADDED

@@ -0,0 +1 @@
+{"1_visiting_card": 0, "2_prescription": 1, "3_shop_banner": 2, "4_invalid_image": 3}

fingerprint.pb ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bb66d3a7ea49d1815a57db751e427f83733297c68ae54d5def094ab65f915bcc
+size 97

saved_model.pb ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:830d7b949a0d27fa1af2f0fae6cf093f46ae14c66396d46c25bd88eeee018169
+size 2350545

variables/variables.data-00000-of-00001 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:39fa2d05ba34b21ff5dfb0b669060ce2a58c60397dd68d1e2340b5b4be392adb
+size 32548732

variables/variables.index ADDED

Binary file (36.6 kB).