---
language: en
tags:
- image-classification
- document-classification
- tensorflow
- efficientnet
- computer-vision
license: mit
pipeline_tag: image-classification
library_name: tf-keras
---

# Document Classifier

A Keras EfficientNet model for classifying real-world document images into structured categories. Includes a full validation pipeline covering image quality checks and AI/fake image detection.

---

## How to use this model

```python
# Step 1 — Install dependencies
# pip install huggingface_hub tensorflow opencv-python pillow

# Step 2 — Copy and run this complete code
from huggingface_hub import snapshot_download
import tensorflow as tf
import numpy as np
import cv2
import json
from tensorflow.keras.applications.efficientnet import preprocess_input

# Download model from Hugging Face (cached after first run)
local_path = snapshot_download(repo_id="shailgsits/document-classifier")

# Load model + class labels
model = tf.keras.models.load_model(f"{local_path}/document_classifier_final.keras")
with open(f"{local_path}/class_index.json") as f:
    class_indices = json.load(f)
LABELS = {int(v): k for k, v in class_indices.items()}

DOCUMENT_TYPE_LABELS = {
    "1_visiting_card": "Visiting Card",
    "2_prescription": "Prescription",
    "3_shop_banner": "Shop Banner",
    "4_invalid_image": "Invalid",
}


def predict(image_path: str) -> dict:
    img = cv2.imread(image_path)
    if img is None:
        return {"status": "ERROR", "message": "Could not read image"}

    # Letterbox to 224×224 (aspect ratio preserved, white padding) — see Preprocessing Details
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    h, w = img_rgb.shape[:2]
    scale = min(224 / w, 224 / h)
    nw, nh = int(w * scale), int(h * scale)
    canvas = np.full((224, 224, 3), 255, np.uint8)
    top, left = (224 - nh) // 2, (224 - nw) // 2
    canvas[top:top + nh, left:left + nw] = cv2.resize(img_rgb, (nw, nh))

    input_arr = preprocess_input(np.expand_dims(canvas.astype(np.float32), axis=0))
    preds = model.predict(input_arr, verbose=0)[0]

    class_id = int(np.argmax(preds))
    confidence = float(np.max(preds))
    label = LABELS.get(class_id, "unknown")
    friendly = DOCUMENT_TYPE_LABELS.get(label, label)

    return {
        "status": "VALID" if confidence >= 0.75 else "LOW_CONFIDENCE",
        "document_type": label,
        "document_type_label": friendly,
        "confidence": round(confidence * 100, 2),
        "all_scores": {
            DOCUMENT_TYPE_LABELS.get(LABELS[i], LABELS[i]): round(float(p) * 100, 2)
            for i, p in enumerate(preds)
        },
    }


# --- Run prediction ---
result = predict("your_image.jpg")
print(result)

# Example output:
# {
#     'status': 'VALID',
#     'document_type': '1_visiting_card',
#     'document_type_label': 'Visiting Card',
#     'confidence': 97.43,
#     'all_scores': {'Visiting Card': 97.43, 'Prescription': 1.2, 'Shop Banner': 0.9, 'Invalid': 0.47}
# }
```

---

## Supported Document Types

| Label | Description |
|---|---|
| `1_visiting_card` | Business / name cards |
| `2_prescription` | Medical prescriptions |
| `3_shop_banner` | Storefront signage, banners |
| `4_invalid_image` | Rejected / unrecognized documents |

---

## Files in this repo

| File | Description |
|---|---|
| `document_classifier_final.keras` | Trained Keras model (EfficientNet) |
| `class_index.json` | Class name → index mapping |

---

## Quick Test in Google Colab

```python
!pip install huggingface_hub tensorflow pillow opencv-python requests -q

import tensorflow as tf, numpy as np, cv2, requests, json
from PIL import Image
from io import BytesIO
from huggingface_hub import hf_hub_download
from tensorflow.keras.applications.efficientnet import preprocess_input

# Load model + class mapping
model = tf.keras.models.load_model(
    hf_hub_download("shailgsits/document-classifier", "document_classifier_final.keras")
)
with open(hf_hub_download("shailgsits/document-classifier", "class_index.json")) as f:
    index_to_label = {int(v): k.split("_", 1)[1] for k, v in json.load(f).items()}


# Predict from any image URL
def predict_from_url(url: str):
    img = np.array(Image.open(BytesIO(requests.get(url).content)).convert("RGB"))
    # Letterbox to 224×224 with white padding
    h, w = img.shape[:2]
    scale = min(224 / w, 224 / h)
    nw, nh = int(w * scale), int(h * scale)
    res = cv2.resize(img, (nw, nh))
    canvas = np.full((224, 224, 3), 255, np.uint8)
    top, left = (224 - nh) // 2, (224 - nw) // 2
    canvas[top:top + nh, left:left + nw] = res

    input_arr = preprocess_input(np.expand_dims(canvas.astype(np.float32), 0))
    pred = model.predict(input_arr, verbose=0)[0]
    idx = int(np.argmax(pred))
    return {"label": index_to_label[idx], "confidence": round(float(pred[idx]) * 100, 2)}


# Test with a Google Drive image
url = "https://drive.google.com/uc?export=download&id=YOUR_FILE_ID"
print(predict_from_url(url))
# {'label': 'visiting_card', 'confidence': 97.43}
```

---

## Predict from local file (Colab upload)

Run the Quick Test cell above first; this cell reuses its imports, `model`, and `index_to_label`.

```python
from google.colab import files

uploaded = files.upload()
image_path = list(uploaded.keys())[0]

img = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)

# Letterbox to 224×224 with white padding
h, w = img.shape[:2]
scale = min(224 / w, 224 / h)
nw, nh = int(w * scale), int(h * scale)
res = cv2.resize(img, (nw, nh))
canvas = np.full((224, 224, 3), 255, np.uint8)
top, left = (224 - nh) // 2, (224 - nw) // 2
canvas[top:top + nh, left:left + nw] = res

input_arr = preprocess_input(np.expand_dims(canvas.astype(np.float32), 0))
pred = model.predict(input_arr, verbose=0)[0]
idx = int(np.argmax(pred))
print({"label": index_to_label[idx], "confidence": round(float(pred[idx]) * 100, 2)})
```

---

## Preprocessing Details

Images are resized with **letterboxing** (aspect ratio preserved, white padding) to 224×224, then passed through EfficientNet's `preprocess_input`.
---

## Validation Pipeline

Before inference, every image passes through the following checks:

| Check | Condition |
|---|---|
| Blank image | Grayscale std < 12 |
| Blurry image | Laplacian variance < 10 |
| Ruled paper | ≥5 evenly spaced horizontal lines |
| No text detected | Fewer than 6 connected text components |
| AI metadata | EXIF/XMP contains AI tool keywords |
| Screenshot/UI | >55% near-white pixels |
| AI watermark | OCR detects generator text in bottom strip |
| Gemini sparkle | Sparkle artifact in bottom-right corner |
| AI staged background | Card/background sharpness ratio > 5.0 |
| Perspective tilt | >35% of lines in 15°–45° diagonal range |
| DCT frequency | High-frequency energy ratio > 0.12 |
| Texture uniformity | Patch variance CV < 0.4 and mean variance < 50 |

---

## License

MIT

---

## Author

Developed and trained by **[Shailendra Singh Tiwari](https://www.linkedin.com/in/shailendra-singh-tiwari/)**