---
title: Project Halide
sdk: gradio
sdk_version: 6.10.0
app_file: app.py
license: apache-2.0
models:
- Lonelyguyse1/halide-vision
- openbmb/MiniCPM-V-4.6
- nvidia/Nemotron-Mini-4B-Instruct
tags:
- gradio
- film
- computer-vision
- diagnostics
- track:backyard
- sponsor:openbmb
- sponsor:nvidia
- sponsor:modal
- sponsor:openai
- badge:off-brand
- badge:offbrand
- badge:tiny-titan
- badge:tiny
- badge:best-demo
- badge:demo
- badge:best-agent
- badge:bonus-quest
- badge:quest-champion
- badge:quest
- achievement:offgrid
- achievement:welltuned
- achievement:offbrand
- achievement:fieldnotes
---
# Project Halide
Project Halide is an edge-native diagnostic workbench for analog film scans by
[Lonelyguyse1](https://huggingface.co/Lonelyguyse1).
The runtime uses MiniCPM-V 4.6 for defect extraction and
Nemotron-Mini-4B-Instruct for diagnostic reasoning. The vision pass combines
full-frame inspection, tiled fallback for large scans, a conservative
image-analysis validator for obvious scratches, and geometric filtering for
sprocket or frame-edge artifacts. Model inference runs on the Space GPU runtime
without cloud inference APIs.
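The tiled fallback for large scans can be illustrated with a short sketch. It is not the Space's actual implementation: the tile size, overlap, and the helper names `tile_boxes` and `to_full_frame` are assumptions chosen for illustration. It shows one way to cut a frame into overlapping tiles and map a tile-local defect box back to full-frame coordinates.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


def _starts(length: int, tile: int, stride: int) -> List[int]:
    """Tile start offsets along one axis; the last tile sits flush with the edge."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)
    return starts


def tile_boxes(width: int, height: int, tile: int = 1024, overlap: int = 128) -> List[Box]:
    """Return overlapping tile rectangles covering a width x height frame.

    The tile and overlap defaults are illustrative assumptions.
    """
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    stride = tile - overlap
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in _starts(height, tile, stride)
        for x in _starts(width, tile, stride)
    ]


def to_full_frame(local_box: Box, tile_box: Box) -> Box:
    """Shift a box reported in tile coordinates into full-frame coordinates."""
    tx, ty = tile_box[0], tile_box[1]
    x0, y0, x1, y1 = local_box
    return (x0 + tx, y0 + ty, x1 + tx, y1 + ty)


tiles = tile_boxes(2000, 1000)
shifted = to_full_frame((10, 20, 30, 40), tiles[1])
```

The overlap is there so that a defect straddling a tile boundary appears whole in at least one tile. Boxes reported twice in the overlap region then need deduplication in the validation step.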
Fine-tuned vision model:
<https://huggingface.co/Lonelyguyse1/halide-vision>
Fine-tuning improved the vision stage where it mattered most for the app:
structured defect JSON, consistent film-defect labels, scratch and
emulsion-damage vocabulary, and fewer obvious false positives on clean or
lookalike regions. The runtime still treats model output as candidate evidence
and validates every box.
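As a rough illustration of treating model output as candidate evidence, the sketch below parses a JSON reply defensively and keeps only well-formed entries. The schema (a `defects` list with `label`, `box`, and `confidence` fields), the label set, and the names `Candidate` and `parse_candidates` are hypothetical; the README does not document the actual output format.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical label vocabulary, assumed for this example only.
ALLOWED_LABELS = {"scratch", "emulsion_damage", "dust", "stain"}


@dataclass(frozen=True)
class Candidate:
    label: str
    box: Tuple[float, float, float, float]  # (x0, y0, x1, y1)
    confidence: float


def _is_number(value) -> bool:
    """True for int or float, but not bool (which is a subclass of int)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def parse_candidates(raw: str) -> List[Candidate]:
    """Extract well-formed defect candidates from a model reply.

    Anything malformed is dropped rather than repaired.
    """
    # The reply may wrap the JSON in extra prose, so slice out the outermost braces.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return []
    try:
        payload = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return []
    items = payload.get("defects", [])
    if not isinstance(items, list):
        return []

    candidates = []
    for item in items:
        if not isinstance(item, dict):
            continue
        label, box = item.get("label"), item.get("box")
        if not isinstance(label, str) or label not in ALLOWED_LABELS:
            continue
        if not (isinstance(box, list) and len(box) == 4 and all(map(_is_number, box))):
            continue
        conf = item.get("confidence", 0.0)
        conf = float(conf) if _is_number(conf) else 0.0
        candidates.append(Candidate(label, tuple(float(v) for v in box), conf))
    return candidates


reply = (
    'Result: {"defects": ['
    '{"label": "scratch", "box": [10, 20, 110, 24], "confidence": 0.9}, '
    '{"label": "grass", "box": [0, 0, 5, 5]}, '
    '{"label": "dust", "box": [1, 2, 3]}]}'
)
parsed = parse_candidates(reply)
```

In this example only the scratch entry survives: the `grass` label is outside the assumed vocabulary and the `dust` box has three coordinates instead of four.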
The data bottleneck was central to the build. Public damaged-film examples are
scattered, noisy, and often not real negatives, so the training curriculum
combines FilmDamageSimulator annotations, procedural defect positives, synthetic
scratches and stains, hard clean negatives, and lookalike counterexamples such
as grass, subject hair, sprocket holes, borders, and glare. Five private
negatives were held out and used only for evaluation.
- Source repository: <https://github.com/LonelyGuy-SE1/Project-Halide>
- Demo video: <https://youtube.com/watch?si=apzCiBZcIZWC1nFt&v=DGJ2M1aQCrE&feature=youtu.be>
- Public launch post: <https://x.com/lonelyguyse1/status/2066631507956105423?s=20>
- Technical blog: <https://lonelyguy.vercel.app/articles/2026-06-16-project-halide>

Modal was used for offline training, held-out GPU evaluation, checkpoint upload,
GGUF conversion, and Space deployment. The runtime app itself does not call
Modal or any hosted inference API.
## How It Works
1. Upload a film scan, negative photo, or contact-sheet crop.
2. MiniCPM-V 4.6 extracts candidate defects as structured JSON.
3. The validator normalizes boxes, filters bad geometry, removes duplicate or
sprocket-like edge artifacts, and adds high-precision scratch candidates
when clear linear evidence is visible.
4. Nemotron-Mini-4B-Instruct reads the validated evidence plus user metadata and
writes a lab-style diagnosis with physical fixes.
5. SQLite stores local diagnostic history so earlier runs can be reopened.
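The validation in step 3 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the app's validator: the thresholds (`min_area`, `iou_thresh`, `margin_frac`) and the edge-band heuristic standing in for sprocket and frame-edge filtering are invented for illustration, and scratch detection from linear evidence is not covered.

```python
from typing import Iterable, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def clamp(box: Box, width: int, height: int) -> Box:
    """Order the corners and clip the box to the image bounds."""
    x0, x1 = sorted((box[0], box[2]))
    y0, y1 = sorted((box[1], box[3]))
    return (max(0.0, x0), max(0.0, y0), min(float(width), x1), min(float(height), y1))


def area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter = area((max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def hugs_edge(box: Box, width: int, height: int, margin_frac: float = 0.03) -> bool:
    """Assumed heuristic: the box lies entirely inside a thin border band."""
    mx, my = width * margin_frac, height * margin_frac
    return (box[2] <= mx or box[0] >= width - mx
            or box[3] <= my or box[1] >= height - my)


def validate(boxes: Iterable[Box], width: int, height: int,
             min_area: float = 16.0, iou_thresh: float = 0.7) -> List[Box]:
    """Normalize boxes, drop bad geometry and edge artifacts, remove duplicates."""
    kept: List[Box] = []
    for raw in boxes:
        box = clamp(raw, width, height)
        if area(box) < min_area or hugs_edge(box, width, height):
            continue
        if any(iou(box, k) >= iou_thresh for k in kept):
            continue  # near-duplicate of a box already kept
        kept.append(box)
    return kept


result = validate(
    [
        (100, 100, 300, 120),   # plausible scratch
        (102, 101, 301, 121),   # near-duplicate of the first
        (0, 50, 20, 90),        # thin box in the left border band
        (500, 500, 500, 600),   # zero-width box
        (900, 700, 1200, 900),  # overflows the frame, gets clipped
    ],
    width=1000,
    height=800,
)
```

Of the five inputs, two survive: the first scratch box and the clipped box in the lower-right corner. Keeping the first box among duplicates is a simplification; ranking by model confidence before deduplicating would be a natural refinement.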
## Sponsor Usage
- OpenBMB: MiniCPM-V 4.6 is the primary vision model, fine-tuned for film defect
extraction and published at `Lonelyguyse1/halide-vision`.
- NVIDIA: Nemotron-Mini-4B-Instruct produces the diagnostic report and gives
uncertain film metadata lower priority than visible evidence.
- Modal: used offline for training, evaluation, checkpoint export, GGUF
conversion, model upload, and Space deployment support.
- OpenAI: assisted with implementation, review, and source-control hygiene
through the linked repository workflow.
## Field Guide Alignment
- Gradio Space under the official `build-small-hackathon` organization.
- All runtime inference uses open weights on the Space GPU, with no hosted model
API calls.
- Model sizes stay under the 32B limit, with MiniCPM-V 4.6 at 1.3B parameters
and Nemotron-Mini-4B-Instruct at 4B parameters.
- Custom autumn-themed UI with a purpose-built compare viewer and diagnostic
history.
- Fine-tuned vision model and GGUF artifact are published on the author's
Hugging Face profile.
- Demo video, technical blog, public launch post, and field notes are linked
from this Space.
Held-out validation summary:
- Four visibly damaged private negatives were detected with scratch and
emulsion-damage evidence.
- One near-clean private negative returned zero defects.
- A broad lifted crack network that failed full-frame inference was recovered by
the tiled fallback.