noticecheck

Running on Zero

App Files Files Community

noticecheck / docs /field-notes.md

Abid Ali Awan

Add local AI field notes

453edd1 17 days ago

preview code

Raw

History Blame Contribute Delete

3.26 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade

Field Notes: Making NoticeCheck Fully Local

NoticeCheck started as a cloud-backed version of my Pakistan Notice Helper app. For the Hugging Face Hackathon, I rebuilt it so the same pipeline could also run locally with Docker Compose and an NVIDIA GPU.

What I Tried

I tested several vision-language model setups before settling on the current architecture. Smaller MiniCPM-V experiments were not reliable enough on high-risk scam cases. Qwen experiments performed better, but introduced larger models, separate vision projectors, cold starts, and more infrastructure.

The final app uses:

openbmb/MiniCPM5-1B for structured notice assessment
nvidia/NVIDIA-Nemotron-Parse-v1.2 for screenshot text extraction
Hugging Face ZeroGPU for the hosted demo
Docker Compose and local CUDA for private local deployment

Problems I Hit

Structured output was one of the first major issues. Models sometimes returned incomplete or malformed JSON. I added a strict schema, bounded prompts, normalization, retries, and a repair pass so every successful result follows the same contract.

The ZeroGPU deployment exposed several integration problems:

CUDA and PyTorch ABI mismatches
missing OCR dependencies such as einops, open_clip_torch, and ftfy
GPU quota handling and Hugging Face iframe token forwarding
model-loading and cold-start failures hidden behind worker wrappers

Screenshot handling also required more than OCR. Ordinary photos could produce image descriptions or parser output instead of notice text. Sending that output to the language model caused generic generation failures. I added semantic region filtering and a dedicated warning that asks the user to upload a clear notice or message screenshot.

The local Docker build revealed another practical problem: one Python dependency needed compilation, so the CUDA image required build-essential. CUDA base images and model caches are also large, which made persistent volumes and Docker disk cleanup important parts of testing.

What I Learned

Making an AI application local is not only about downloading model weights. A usable local product also needs:

reproducible GPU and dependency setup
predictable structured output
explicit input validation
clear user-facing failure messages
privacy-aware tracing
persistent model caching
realistic disk and VRAM planning

I also learned to treat model evaluation as part of product development. A model that works in a simple smoke test may still fail on phishing links, OTP theft, Roman Urdu screenshots, harmless reminders, or the application's JSON contract.

Result

NoticeCheck now has a redesigned English interface and can run in two modes:

hosted on Hugging Face ZeroGPU
fully local on an NVIDIA GPU

The local version starts with:

docker compose up --build

Field Notes: Making NoticeCheck Fully Local

What I Tried

Problems I Hit

What I Learned

Result

Links