Commit ebf621f · verified · committed by cn0303 · 1 Parent(s): e34beb2

Fix kernels version clash: pin to transformers' declared range

Files changed (2)
  1. README.md +47 -41
  2. requirements.txt +3 -1
README.md CHANGED
@@ -15,53 +15,56 @@ models:
  ---

  <!--
- ZeroGPU: select "ZeroGPU" hardware in the Space's Settings (the README can't
- set it). The model brick (/api/ask) only loads the LLM when SPACE_ID is set, so
- local `python app.py` stays instant and the chat uses a deterministic fallback.
- Swap the model with no code change via the FITCHECK_MODEL Space secret/variable,
- e.g. FITCHECK_MODEL=Qwen/Qwen3-4B-Instruct-2507 (clean Apache fallback).
+ ZeroGPU is selected in the Space's Settings (the README can't set it). The
+ model brick (/api/ask) only loads the LLM when SPACES_ZERO_GPU is set, so
+ local `python app.py` stays instant.
  -->

-
  # FitCheck

- **What AI can your computer actually run?**
-
- Tell FitCheck about your computer in plain words. It tells you honestly what AI
- you can run today, from chatbots to object detection, what it will feel like,
- and exactly how to start. No jargon, no hype, no fake promises.
-
- Built for the [Build Small hackathon](https://huggingface.co/build-small-hackathon)
- (Backyard AI track).
+ **What AI can your computer actually run?** And the other way round: **what
+ computer do you need for the AI you want to run?**

- ## What makes it different
+ Tell FitCheck about your machine in plain words. It answers honestly — real
+ models, real memory figures, real licenses, real copy-paste commands — from
+ chatbots to object detection, image generation, speech, and robotics.

- - **Plain words.** Built for people who do not know what VRAM or quantisation
- means, and it explains those words when they appear.
- - **The whole realm, not just chatbots.** Goals span LLMs and specialised deep
- learning: object detection (YOLO), segmentation, 6-DoF pose (FoundationPose),
- image and video generation, speech-to-text, embeddings, fine-tuning, plus a
- custom "describe it" option.
- - **Honest by design.** Three plain bands (Runs great, Tight but works, Will not
- fit) with conservative estimates that under-promise rather than over.
+ ## Why it's trustworthy

- ## How it is built
+ - **A deterministic engine does the math, not an AI.** Verdicts come from a
+ transparent rules engine over `catalogue.json` — 83 real models verified
+ against the Hugging Face API. Nothing in the verdict can be hallucinated.
+ - **Model sizes are exact.** For GGUF models the weights figure is the actual
+ file size in bytes from the Hub — not a params-times-bits estimate. Chat
+ memory uses each model's real architecture (GQA-aware), and every estimate
+ includes a 0.58 GB safety buffer (the 95% load-success margin fitted from
+ ~19,500 community measurements).
+ - **Provenance on every number.** The UI says whether a figure is an exact
+ file size, a vendor-published number, community-reported, or estimated.
+ - **Licenses up front.** AGPL, non-commercial, and gated models are labelled
+ on every card — before you build your project on one.
+ - **Conservative by design.** Three plain bands (Runs great / Tight, but works
+ / Won't fit) that would rather under-promise than over-promise.

- Three bricks behind one hand-built HTML/CSS/JS frontend (no framework, no build
- step), served by Gradio server mode (`gr.Server`, which is a FastAPI app):
+ ## What's inside

- 1. **The frontend** (`static/`) gathers your setup in plain words.
- 2. **The deterministic engine** (`engine/`) does the real memory arithmetic and
- returns an honest verdict over `POST /api/advise`. No AI in the loop, so
- every number is inspectable. (LLM goals run on the engine today; vision /
- image / audio / data goals use a conservative placeholder until the engine
- models those families.)
- 3. **The model brick** (`model_brick.py`) is a small local LLM
- (NVIDIA Nemotron 3 Nano 4B) that *explains* the engine's numbers in plain
- words over `/gradio_api/call/ask`. It is a closed-context narrator: it never
- invents a number, only re-voices the facts the engine produced. On a ZeroGPU
- Space it runs on a GPU via `@spaces.GPU`; locally it degrades to a
- deterministic explainer so the chat always answers.
+ 1. **The catalogue** `scripts/curation.json` (hand-picked models across
+ LLM, vision-language, vision, image/video generation, speech, music,
+ embeddings, forecasting) enriched by `scripts/refresh_catalogue.py` from
+ public Hub endpoints into `catalogue.json`. Refreshed nightly; baked in at
+ build time so the running app is fully offline.
+ 2. **The engine** (`engine/`) — pure Python memory math and honest banding.
+ Also answers the reverse question: minimum vs comfortable hardware tiers
+ for a goal ("Help me pick one" mode).
+ 3. **The model brick** (`model_brick.py`) NVIDIA Nemotron 3 Nano 4B running
+ in-Space on ZeroGPU (hybrid Mamba-2, accelerated by prebuilt hub kernels),
+ explaining the engine's numbers in plain words. It never does the math; if
+ it states a figure that isn't in the engine's facts, the gate logs it.
+ 4. **The frontend** (`static/`) — hand-built HTML/CSS/JS, no framework, served
+ by Gradio server mode (`gr.Server`). Optional extra: paste any Hugging Face
+ model id and FitCheck walks its finetune/quantized lineage to a known base
+ ("if the base runs, your finetune runs") — the one clearly-labelled online
+ feature.

  ## Run it locally

@@ -72,6 +75,9 @@ pip install -r requirements.txt
  python app.py
  ```

- Then open http://127.0.0.1:7860/ (add `?go` for a sample result). Locally the
- follow-up chat uses the deterministic explainer; the Nemotron model loads only
- on a Space (when `SPACE_ID` is set).
+ Open http://127.0.0.1:7860/ (add `?go` for an instant sample result). Locally
+ the explainer reports the model isn't loaded (it only loads on the Space) —
+ everything else works fully offline.
+
+ Built for the [Build Small hackathon](https://huggingface.co/build-small-hackathon)
+ (Backyard AI track).
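Reviewer sketch, not part of the commit: the memory claims in the new README (exact GGUF file size, GQA-aware chat memory, a 0.58 GB safety buffer, three bands) could be illustrated roughly as follows. The `ModelFacts` fields, the `comfort_ratio` threshold, the 2-byte cache elements, and the use of decimal gigabytes are all assumptions made for illustration; the real `engine/` code is not shown in this diff and may work differently.

```python
from dataclasses import dataclass

SAFETY_BUFFER_GB = 0.58  # figure quoted in the README diff
GB = 1e9                 # assumption: decimal gigabytes


@dataclass
class ModelFacts:
    """Hypothetical record; field names are not taken from catalogue.json."""
    weights_bytes: int  # exact GGUF file size in bytes
    n_layers: int
    n_kv_heads: int     # under GQA this is smaller than the attention head count
    head_dim: int


def kv_cache_gb(model: ModelFacts, context_len: int, bytes_per_elem: int = 2) -> float:
    """Keys plus values for every layer, KV head, and token (standard transformer cache)."""
    elems = 2 * model.n_layers * model.n_kv_heads * model.head_dim * context_len
    return elems * bytes_per_elem / GB


def estimate_gb(model: ModelFacts, context_len: int) -> float:
    """Weights + chat memory + the fixed safety buffer."""
    return model.weights_bytes / GB + kv_cache_gb(model, context_len) + SAFETY_BUFFER_GB


def band(needed_gb: float, available_gb: float, comfort_ratio: float = 0.8) -> str:
    """Map an estimate onto the three bands; comfort_ratio is a made-up threshold."""
    if needed_gb <= available_gb * comfort_ratio:
        return "Runs great"
    if needed_gb <= available_gb:
        return "Tight, but works"
    return "Won't fit"


# Hypothetical 4 GB model with 32 layers, 8 KV heads, head_dim 128, 4096-token chat.
example = ModelFacts(weights_bytes=4_000_000_000, n_layers=32, n_kv_heads=8, head_dim=128)
needed = estimate_gb(example, context_len=4096)
```

With these made-up numbers, `band(needed, 8.0)` lands in the top band and `band(needed, 4.0)` in the bottom one. Because the buffer is added before banding, borderline machines fall into the more conservative band, which matches the "under-promise" intent stated in the README.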
requirements.txt CHANGED
@@ -3,7 +3,9 @@ gradio==6.16.0 # gr.Server (FastAPI) + @app.api queue + ZeroGPU glue
  spaces # @spaces.GPU — ZeroGPU allocation on Hugging Face
  torch>=2.8.0 # ZeroGPU requirement (>=2.8)
  transformers>=5.8.0 # native NemotronH wired to hub kernels (>=5.8)
- kernels>=0.10.2 # pulls kernels-community/mamba-ssm + causal-conv1d
+ kernels>=0.12.0,<0.13 # transformers' own declared range — 0.15.x broke
+ # LayerRepository (revision became mandatory).
+ # Pulls kernels-community/mamba-ssm + causal-conv1d
  # prebuilt for torch>=2.8 — replaces the naive Mamba
  # path (pip mamba-ssm wheels stop at torch 2.7 and
  # ZeroGPU build containers have no nvcc; never add
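Reviewer sketch, not part of the commit: the new pin keeps `kernels` at `>=0.12.0,<0.13`. A startup check along the following lines could surface a drifted install early instead of failing later inside model loading. The helper names are hypothetical, the parser is deliberately simplified (it ignores pre-release tags and is not a full PEP 440 implementation), and only the standard library is assumed.

```python
import re
from importlib import metadata


def parse_version(text: str, width: int = 3) -> tuple:
    """Turn '0.12.3' into (0, 12, 3); pads short versions and drops suffixes like 'rc1'."""
    parts = []
    for piece in text.split(".")[:width]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    parts += [0] * (width - len(parts))
    return tuple(parts)


def in_declared_range(version: str, lower: str = "0.12.0", upper: str = "0.13") -> bool:
    """True when lower <= version < upper, mirroring the requirements.txt pin."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


def installed_kernels_in_range():
    """Check the installed `kernels` package; returns None when it is not installed."""
    try:
        return in_declared_range(metadata.version("kernels"))
    except metadata.PackageNotFoundError:
        return None
```

Under this sketch, `in_declared_range("0.12.3")` is true, while `"0.10.2"` (the old floor) and `"0.15.1"` (the series the commit message blames) are both rejected.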