Space: build-small-hackathon/FitCheck

cn0303 committed 2 days ago

Commit ebf621f (verified) · 1 parent: e34beb2


Fix kernels version clash: pin to transformers' declared range

Files changed (2)

README.md +47 -41
requirements.txt +3 -1

README.md CHANGED

@@ -15,53 +15,56 @@ models:
 ---
 <!--
-ZeroGPU: select "ZeroGPU" hardware in the Space's Settings (the README can't
-set it). The model brick (/api/ask) only loads the LLM when SPACE_ID is set, so
-local `python app.py` stays instant and the chat uses a deterministic fallback.
-Swap the model with no code change via the FITCHECK_MODEL Space secret/variable,
-e.g. FITCHECK_MODEL=Qwen/Qwen3-4B-Instruct-2507 (clean Apache fallback).
 -->
 # FitCheck
-**What AI can your computer actually run?**
-Tell FitCheck about your computer in plain words. It tells you honestly what AI
-you can run today, from chatbots to object detection, what it will feel like,
-and exactly how to start. No jargon, no hype, no fake promises.
-Built for the [Build Small hackathon](https://huggingface.co/build-small-hackathon)
-(Backyard AI track).
-## What makes it different
-- **Plain words.** Built for people who do not know what VRAM or quantisation
-  means, and it explains those words when they appear.
-- **The whole realm, not just chatbots.** Goals span LLMs and specialised deep
-  learning: object detection (YOLO), segmentation, 6-DoF pose (FoundationPose),
-  image and video generation, speech-to-text, embeddings, fine-tuning, plus a
-  custom "describe it" option.
-- **Honest by design.** Three plain bands (Runs great, Tight but works, Will not
-  fit) with conservative estimates that under-promise rather than over.
-## How it is built
-Three bricks behind one hand-built HTML/CSS/JS frontend (no framework, no build
-step), served by Gradio server mode (`gr.Server`, which is a FastAPI app):
-1. **The frontend** (`static/`) gathers your setup in plain words.
-2. **The deterministic engine** (`engine/`) does the real memory arithmetic and
-   returns an honest verdict over `POST /api/advise`. No AI in the loop, so
-   every number is inspectable. (LLM goals run on the engine today; vision /
-   image / audio / data goals use a conservative placeholder until the engine
-   models those families.)
-3. **The model brick** (`model_brick.py`) is a small local LLM
-   (NVIDIA Nemotron 3 Nano 4B) that *explains* the engine's numbers in plain
-   words over `/gradio_api/call/ask`. It is a closed-context narrator: it never
-   invents a number, only re-voices the facts the engine produced. On a ZeroGPU
-   Space it runs on a GPU via `@spaces.GPU`; locally it degrades to a
-   deterministic explainer so the chat always answers.
 ## Run it locally
@@ -72,6 +75,9 @@ pip install -r requirements.txt
 python app.py
 ```
-Then open http://127.0.0.1:7860/ (add `?go` for a sample result). Locally the
-follow-up chat uses the deterministic explainer; the Nemotron model loads only
-on a Space (when `SPACE_ID` is set).

 ---
 <!--
+ZeroGPU is selected in the Space's Settings (the README can't set it). The
+model brick (/api/ask) only loads the LLM when SPACES_ZERO_GPU is set, so
+local `python app.py` stays instant.
 -->
 # FitCheck
+**What AI can your computer actually run?** And the other way round: **what
+computer do you need for the AI you want to run?**
+Tell FitCheck about your machine in plain words. It answers honestly — real
+models, real memory figures, real licenses, real copy-paste commands — from
+chatbots to object detection, image generation, speech, and robotics.
+## Why it's trustworthy
+- **A deterministic engine does the math, not an AI.** Verdicts come from a
+  transparent rules engine over `catalogue.json` — 83 real models verified
+  against the Hugging Face API. Nothing in the verdict can be hallucinated.
+- **Model sizes are exact.** For GGUF models the weights figure is the actual
+  file size in bytes from the Hub — not a params-times-bits estimate. Chat
+  memory uses each model's real architecture (GQA-aware), and every estimate
+  includes a 0.58 GB safety buffer (the 95% load-success margin fitted from
+  ~19,500 community measurements).
+- **Provenance on every number.** The UI says whether a figure is an exact
+  file size, a vendor-published number, community-reported, or estimated.
+- **Licenses up front.** AGPL, non-commercial, and gated models are labelled
+  on every card — before you build your project on one.
+- **Conservative by design.** Three plain bands (Runs great / Tight, but works
+  / Won't fit) that would rather under-promise than over-promise.
+## What's inside
+1. **The catalogue** — `scripts/curation.json` (hand-picked models across
+   LLM, vision-language, vision, image/video generation, speech, music,
+   embeddings, forecasting) enriched by `scripts/refresh_catalogue.py` from
+   public Hub endpoints into `catalogue.json`. Refreshed nightly; baked in at
+   build time so the running app is fully offline.
+2. **The engine** (`engine/`) — pure Python memory math and honest banding.
+   Also answers the reverse question: minimum vs comfortable hardware tiers
+   for a goal ("Help me pick one" mode).
+3. **The model brick** (`model_brick.py`) — NVIDIA Nemotron 3 Nano 4B running
+   in-Space on ZeroGPU (hybrid Mamba-2, accelerated by prebuilt hub kernels),
+   explaining the engine's numbers in plain words. It never does the math; if
+   it states a figure that isn't in the engine's facts, the gate logs it.
+4. **The frontend** (`static/`) — hand-built HTML/CSS/JS, no framework, served
+   by Gradio server mode (`gr.Server`). Optional extra: paste any Hugging Face
+   model id and FitCheck walks its finetune/quantized lineage to a known base
+   ("if the base runs, your finetune runs") — the one clearly-labelled online
+   feature.
 ## Run it locally
 python app.py
 ```
+Open http://127.0.0.1:7860/ (add `?go` for an instant sample result). Locally
+the explainer reports the model isn't loaded (it only loads on the Space) —
+everything else works fully offline.
+Built for the [Build Small hackathon](https://huggingface.co/build-small-hackathon)
+(Backyard AI track).
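
The new README describes an estimate built from the exact weights size, a GQA-aware chat memory term, and a 0.58 GB safety buffer, sorted into three bands. The sketch below illustrates that arithmetic only; it is not the code in `engine/`. The function names, the decimal-gigabyte unit, and the `TIGHT_FRACTION` threshold are assumptions made for illustration, and the KV-cache formula is the standard one for attention layers, so it would not cover the Mamba-2 layers of a hybrid model.

```python
GB = 1_000_000_000        # assumption: the README's "GB" is decimal gigabytes
SAFETY_BUFFER_GB = 0.58   # buffer figure quoted in the README
TIGHT_FRACTION = 0.85     # hypothetical band threshold, not taken from the project


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Standard KV-cache size: keys and values for every attention layer.

    Using n_kv_heads (not the number of query heads) is what makes the
    estimate GQA-aware.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def estimate_gb(weights_bytes, kv_bytes):
    """Total memory estimate in GB: weights + KV cache + fixed safety buffer."""
    return (weights_bytes + kv_bytes) / GB + SAFETY_BUFFER_GB


def band(needed_gb, available_gb):
    """Map an estimate onto the README's three plain bands."""
    if needed_gb > available_gb:
        return "Won't fit"
    if needed_gb > TIGHT_FRACTION * available_gb:
        return "Tight, but works"
    return "Runs great"


if __name__ == "__main__":
    # Hypothetical model: 4 GB GGUF file, 32 layers, 8 KV heads, head_dim 128.
    kv = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, context_len=4096)
    needed = estimate_gb(weights_bytes=4_000_000_000, kv_bytes=kv)
    for available in (8, 6, 4):
        print(f"{available} GB available: {band(needed, available)}")
```

Because the buffer is a fixed addend rather than a percentage, it weighs most heavily on small models, which is consistent with the README's stated preference for under-promising.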

requirements.txt CHANGED Viewed

@@ -3,7 +3,9 @@ gradio==6.16.0            # gr.Server (FastAPI) + @app.api queue + ZeroGPU glue
 spaces                    # @spaces.GPU — ZeroGPU allocation on Hugging Face
 torch>=2.8.0              # ZeroGPU requirement (>=2.8)
 transformers>=5.8.0       # native NemotronH wired to hub kernels (>=5.8)
-kernels>=0.10.2           # pulls kernels-community/mamba-ssm + causal-conv1d
                           # prebuilt for torch>=2.8 — replaces the naive Mamba
                           # path (pip mamba-ssm wheels stop at torch 2.7 and
                           # ZeroGPU build containers have no nvcc; never add

 spaces                    # @spaces.GPU — ZeroGPU allocation on Hugging Face
 torch>=2.8.0              # ZeroGPU requirement (>=2.8)
 transformers>=5.8.0       # native NemotronH wired to hub kernels (>=5.8)
+kernels>=0.12.0,<0.13     # transformers' own declared range — 0.15.x broke
+                          # LayerRepository (revision became mandatory).
+                          # Pulls kernels-community/mamba-ssm + causal-conv1d
                           # prebuilt for torch>=2.8 — replaces the naive Mamba
                           # path (pip mamba-ssm wheels stop at torch 2.7 and
                           # ZeroGPU build containers have no nvcc; never add
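
The requirements change narrows `kernels` from an open-ended `>=0.10.2` to the bounded range `>=0.12.0,<0.13`. The sketch below shows what that range accepts and rejects. It is a simplified stand-in for real PEP 440 specifier matching, handling plain numeric versions only, and the helper names `parse_version`, `in_pinned_range`, and `installed_kernels_version` are hypothetical. It uses only the standard library.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '0.12.3' into (0, 12, 3), padding short versions with zeros.

    Simplification: pre-release and local suffixes are ignored.
    """
    numbers = [int(p) for p in re.findall(r"\d+", text.split("+")[0])]
    return tuple((numbers + [0, 0, 0])[:3])


def in_pinned_range(version, lower=(0, 12, 0), upper=(0, 13, 0)):
    """True when lower <= version < upper, mirroring '>=0.12.0,<0.13'."""
    return lower <= parse_version(version) < upper


def installed_kernels_version():
    """Return the installed `kernels` version string, or None if absent."""
    try:
        return metadata.version("kernels")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for candidate in ("0.10.2", "0.12.0", "0.12.3", "0.13.0", "0.15.1"):
        print(candidate, in_pinned_range(candidate))
    print("installed:", installed_kernels_version())
```

The old lower bound `0.10.2` now falls outside the range, so an environment that resolved to it before this commit would be upgraded, while the upper bound keeps the 0.15.x releases named in the comment from being selected.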