---
title: Project Halide
sdk: gradio
sdk_version: 6.10.0
app_file: app.py
license: apache-2.0
models:
  - Lonelyguyse1/halide-vision
  - openbmb/MiniCPM-V-4.6
  - nvidia/Nemotron-Mini-4B-Instruct
tags:
  - gradio
  - film
  - computer-vision
  - diagnostics
  - track:backyard
  - sponsor:openbmb
  - sponsor:nvidia
  - sponsor:modal
  - sponsor:openai
  - badge:off-brand
  - badge:offbrand
  - badge:tiny-titan
  - badge:tiny
  - badge:best-demo
  - badge:demo
  - badge:best-agent
  - badge:bonus-quest
  - badge:quest-champion
  - badge:quest
  - achievement:offgrid
  - achievement:welltuned
  - achievement:offbrand
  - achievement:fieldnotes
---

# Project Halide

Project Halide is an edge-native diagnostic workbench for analog film scans by [Lonelyguyse1](https://huggingface.co/Lonelyguyse1).

The runtime uses MiniCPM-V 4.6 for defect extraction and Nemotron-Mini-4B-Instruct for diagnostic reasoning. The vision pass combines full-frame inspection, tiled fallback for large scans, a conservative image-analysis validator for obvious scratches, and geometric filtering for sprocket or frame-edge artifacts. Model inference runs on the Space GPU runtime without cloud inference APIs.

Fine-tuned vision model:

Fine-tuning improved the vision stage where it mattered most for the app: structured defect JSON, consistent film-defect labels, scratch and emulsion-damage vocabulary, and fewer obvious false positives on clean or lookalike regions. The runtime still treats model output as candidate evidence and validates every box.

The data bottleneck was central to the build. Public damaged-film examples are scattered, noisy, and often not real negatives, so the training curriculum combines FilmDamageSimulator annotations, procedural defect positives, synthetic scratches and stains, hard clean negatives, and lookalike counterexamples such as grass, subject hair, sprocket holes, borders, and glare. The five private negatives stayed held out for evaluation only.
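
To make "validates every box" concrete, here is a minimal sketch of one way such a validator could work. It is not the code in `app.py`: the `Box` type, the `validate` function, and every threshold (`min_area_frac`, `max_area_frac`, `edge_frac`, `iou_thresh`) are assumptions chosen for illustration, and the edge-band test is only a stand-in for the sprocket and frame-edge filtering described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical candidate defect in pixel coordinates."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


def _clamp(value: float, upper: float) -> float:
    return min(max(value, 0.0), float(upper))


def normalize(box: Box, width: int, height: int) -> Box:
    """Order the corners and clamp them to the image bounds."""
    x0, x1 = sorted((box.x0, box.x1))
    y0, y1 = sorted((box.y0, box.y1))
    return Box(box.label, _clamp(x0, width), _clamp(y0, height),
               _clamp(x1, width), _clamp(y1, height))


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter = Box("", max(a.x0, b.x0), max(a.y0, b.y0),
                min(a.x1, b.x1), min(a.y1, b.y1)).area
    union = a.area + b.area - inter
    return inter / union if union > 0 else 0.0


def validate(boxes, width, height, min_area_frac=1e-4, max_area_frac=0.9,
             edge_frac=0.03, iou_thresh=0.6):
    """Keep boxes with plausible geometry; all thresholds are assumed values."""
    frame_area = float(width * height)
    band_x, band_y = edge_frac * width, edge_frac * height
    kept = []
    for raw in boxes:
        box = normalize(raw, width, height)
        frac = box.area / frame_area
        if not (min_area_frac <= frac <= max_area_frac):
            continue  # degenerate or frame-filling box
        hugs_edge = (box.x1 <= band_x or box.x0 >= width - band_x
                     or box.y1 <= band_y or box.y0 >= height - band_y)
        if hugs_edge:
            continue  # lies entirely inside a thin border band
        if any(k.label == box.label and iou(k, box) >= iou_thresh for k in kept):
            continue  # near-duplicate of a box already kept
        kept.append(box)
    return kept
```

Running the filters in this order (normalize, geometry, edge band, duplicates) means the duplicate check only ever compares boxes that already passed the cheaper tests.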

- Source repository:
- Demo video:
- Public launch post:
- Technical blog:

Modal was used for offline training, held-out GPU evaluation, checkpoint upload, GGUF conversion, and Space deployment. The runtime app itself does not call Modal or any hosted inference API.

## How It Works

1. Upload a film scan, negative photo, or contact-sheet crop.
2. MiniCPM-V 4.6 extracts candidate defects as structured JSON.
3. The validator normalizes boxes, filters bad geometry, removes duplicate or sprocket-like edge artifacts, and adds high-precision scratch candidates when clear linear evidence is visible.
4. Nemotron-Mini-4B-Instruct reads the validated evidence plus user metadata and writes a lab-style diagnosis with physical fixes.
5. SQLite stores local diagnostic history so earlier runs can be reopened.

## Sponsor Usage

- OpenBMB: MiniCPM-V 4.6 is the primary vision model, fine-tuned for film defect extraction and published at `Lonelyguyse1/halide-vision`.
- NVIDIA: Nemotron-Mini-4B-Instruct produces the diagnostic report and keeps uncertain film metadata lower priority than visible evidence.
- Modal: used offline for training, evaluation, checkpoint export, GGUF conversion, model upload, and Space deployment support.
- OpenAI: assisted implementation, review, and source-control hygiene through the linked repository workflow.

## Field Guide Alignment

- Gradio Space under the official `build-small-hackathon` organization.
- All runtime inference uses open weights on the Space GPU, with no hosted model API calls.
- Model sizes stay under the 32B limit, with MiniCPM-V 4.6 at 1.3B parameters and Nemotron-Mini-4B-Instruct at 4B parameters.
- Custom autumn-themed UI with a purpose-built compare viewer and diagnostic history.
- Fine-tuned vision model and GGUF artifact are published on the author's Hugging Face profile.
- Demo video, technical blog, public launch post, and field notes are linked from this Space.
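
The diagnostic history (step 5 of How It Works) could be kept with Python's standard `sqlite3` module. This is a minimal sketch and not the app's actual schema: the `runs` table, its columns, and the helpers `open_history`, `save_run`, and `load_run` are all hypothetical names.

```python
import json
import sqlite3
from datetime import datetime, timezone


def open_history(path=":memory:"):
    """Open the history database (assumed schema) and create the table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS runs (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               created_at TEXT NOT NULL,
               image_name TEXT NOT NULL,
               defects_json TEXT NOT NULL,
               report TEXT NOT NULL
           )"""
    )
    return conn


def save_run(conn, image_name, defects, report):
    """Store one diagnostic run and return its row id."""
    cur = conn.execute(
        "INSERT INTO runs (created_at, image_name, defects_json, report) "
        "VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), image_name,
         json.dumps(defects), report),
    )
    conn.commit()
    return cur.lastrowid


def load_run(conn, run_id):
    """Reopen an earlier run, or return None if the id is unknown."""
    row = conn.execute(
        "SELECT image_name, defects_json, report FROM runs WHERE id = ?",
        (run_id,),
    ).fetchone()
    if row is None:
        return None
    image_name, defects_json, report = row
    return {"image_name": image_name,
            "defects": json.loads(defects_json),
            "report": report}
```

Storing the validated defects as a JSON string keeps the sketch to one table; a real deployment might prefer a separate defects table if runs need to be queried by label.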

Held-out validation summary:

- Four visibly damaged private negatives were detected with scratch and emulsion-damage evidence.
- One near-clean private negative returned zero defects.
- A broad lifted crack network that failed full-frame inference was recovered by the tiled fallback.
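
A tiled fallback of the kind mentioned in the last item can be sketched as overlapping windows whose detections are shifted back into full-frame coordinates. The tile size, overlap, and the function names `tile_windows` and `to_full_frame` are assumptions for illustration; the app's real tiling parameters are not stated here.

```python
def tile_windows(width, height, tile=1024, overlap=128):
    """Return (x0, y0, x1, y1) windows that cover the image with overlap.

    `tile` and `overlap` are assumed values, not the app's settings.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    stride = tile - overlap

    def starts(size):
        if size <= tile:
            return [0]  # one window spans this axis
        points = list(range(0, size - tile, stride))
        points.append(size - tile)  # final window sits flush with the far edge
        return points

    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in starts(height) for x in starts(width)]


def to_full_frame(box, tile_origin):
    """Shift a tile-local (x0, y0, x1, y1) box into full-frame coordinates."""
    ox, oy = tile_origin
    x0, y0, x1, y1 = box
    return (x0 + ox, y0 + oy, x1 + ox, y1 + oy)
```

Because neighbouring windows overlap, a defect near a seam can be reported twice, so boxes mapped back this way would still need the same duplicate filtering as full-frame candidates.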