DIV-45 committed on
Commit 9fc7779 · verified · 1 Parent(s): a3d1427

chore: upgrade PacketCourt to Gradio 6.18.0

Built with OpenAI Codex: upgrade the public release candidate to Gradio 6.18.0 and include the full Hugging Face article source.

HF_ARTICLE.md ADDED
@@ -0,0 +1,259 @@
+ # PacketCourt: The packet takes the stand
+
+ Food packets are unusually good at telling two different stories at once.
+
+ The front has seconds to persuade: **HIGH PROTEIN**, **MULTIGRAIN**, **100%
+ NATURAL**, **BAKED NOT FRIED**. The back carries the evidence needed to
+ interpret those claims: ingredient order, nutrition basis, package size,
+ licensing text, dates, and instructions that only matter after opening.
+
+ PacketCourt is my attempt to make those two surfaces answer to each other.
+
+ It is a phone-first Gradio app for Indian packaged-food labels. A user
+ photographs the front and back of a packet. PacketCourt reads the visible text,
+ plans an evidence investigation, performs deterministic calculations, and
+ returns conservative verdicts with citations.
+
+ It does not produce a health score. It asks a narrower question:
+
+ > Does the evidence printed on this packet support the impression created by
+ > its front?
+
+ Try the Space: https://huggingface.co/spaces/build-small-hackathon/packetcourt
+
+ Read the Codex-attributed source:
+ https://github.com/N-45div/PacketCourt
+
+ ## The product decision that shaped everything
+
+ An early version of the idea was a general nutrition scanner. That direction
+ was broad, crowded, and difficult to trust. A single red, yellow, or green score
+ would hide too many judgments:
+
+ - Is sugar always worse than protein is good?
+ - How should serving size affect a score?
+ - Does an FSSAI license imply a health endorsement?
+ - Can OCR uncertainty silently change the answer?
+
+ PacketCourt therefore avoids ranking products. It audits claims against
+ evidence from the same supplied packet.
+
+ The output language is intentionally constrained:
+
+ - `SUPPORTED BY PROVIDED LABEL`
+ - `CONTRADICTED BY PROVIDED LABEL`
+ - `TECHNICALLY TRUE, CONTEXT MISSING`
+ - `CANNOT VERIFY`
+
+ The phrase **provided label** matters. PacketCourt does not pretend that a
+ photograph is a laboratory analysis or that a missing line of text does not
+ exist.
+
+ ## A three-model investigation with a deterministic judge
+
+ PacketCourt uses small models where interpretation is useful and deterministic
+ code where exactness is required.
+
+ ```mermaid
+ flowchart LR
+ Phone["Phone or desktop<br/>front + back photos"] --> App["PacketCourt<br/>custom Gradio app"]
+ App --> Vision["OpenBMB MiniCPM-V-4.6<br/>1.30B visual witness"]
+ Vision --> Agent["Evidence investigation agent"]
+ Agent --> Router["Fine-tuned PacketCourt router<br/>4.38M parameters"]
+ Router --> Agent
+ Agent --> Nemotron["NVIDIA Nemotron Mini 4B<br/>independent evidence-gap review"]
+ Nemotron --> Agent
+ Agent --> Judge["Deterministic evidence judge"]
+ Judge --> Report["Verdicts, citations,<br/>calculations, and trace"]
+ ```
+
+ ### OpenBMB MiniCPM-V-4.6: the visual witness
+
+ The vision companion runs privately on ZeroGPU. It receives a packet image and
+ transcribes only visibly printed evidence. The front prompt focuses on claims.
+ The back prompt focuses on ingredients, nutrition values and basis, net weight,
+ FSSAI license text, dates, and after-opening instructions.
+
+ The model is asked not to explain or infer. Its responsibility is to surface
+ what is visible for the next stage.
+
+ ### A fine-tuned 4.38M-parameter evidence router
+
+ Different claims require different evidence.
+
+ - `NO ADDED SUGAR` requires ingredient inspection.
+ - `HIGH PROTEIN` requires nutrition values and their measurement basis.
+ - `FSSAI APPROVED` requires license evidence and a registration-versus-
+ endorsement distinction.
+ - `100% NATURAL` requires the safety boundary because the absolute claim cannot
+ be established from packet text alone.
+
+ I fine-tuned a tiny BERT classifier to route claims to five bounded tools:
+ `ingredients`, `nutrition`, `license`, `dates`, and `refuse_absolute`.
+
+ The first training run reached only `0.40` held-out accuracy. The random split
+ did not preserve every routing class, and the dataset was too thin. I did not
+ enable that checkpoint.
+
+ After balancing the claim variants and using a stratified five-class holdout,
+ the corrected checkpoint reached `1.000` on the small held-out set. That result
+ is useful evidence that the routing task is learnable, not proof of broad
+ generalization. Deterministic policy fallback remains available when the model
+ cannot load.
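
A minimal sketch of the router-with-fallback behavior described above, under stated assumptions. Only the five tool names come from the article; the keyword table, the conservative default, and the names `policy_route`, `route_claim`, and `model` are hypothetical and are not taken from the PacketCourt source.

```python
from typing import Callable, Optional, Tuple

# The five bounded tools named in the article.
TOOLS = ("ingredients", "nutrition", "license", "dates", "refuse_absolute")

# Hypothetical keyword policy, checked in order. The real fallback rules
# are not shown in this commit.
_POLICY = (
    ("100%", "refuse_absolute"),
    ("fssai", "license"),
    ("protein", "nutrition"),
    ("sugar", "ingredients"),
    ("best before", "dates"),
)


def policy_route(claim: str) -> str:
    """Deterministic fallback: first matching keyword wins."""
    text = claim.lower()
    for keyword, tool in _POLICY:
        if keyword in text:
            return tool
    # Assumption: unknown claims take the most conservative route.
    return "refuse_absolute"


def route_claim(
    claim: str, model: Optional[Callable[[str], str]] = None
) -> Tuple[str, str]:
    """Return (tool, source), where source records who made the decision."""
    if model is not None:
        try:
            tool = model(claim)
            if tool in TOOLS:
                return tool, "fine_tuned_router"
        except Exception:
            # Any model failure falls through to the policy path.
            pass
    return policy_route(claim), "policy_fallback"
```

Returning the source alongside the tool mirrors the trace field that records whether the fine-tuned router or the policy fallback selected each tool.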
+
+ Model: https://huggingface.co/build-small-hackathon/packetcourt-evidence-router
+
+ Training data:
+ https://huggingface.co/datasets/build-small-hackathon/packetcourt-router-training
+
+ ### NVIDIA Nemotron: an independent reviewer, not the judge
+
+ After the investigation plan completes, NVIDIA
+ `Nemotron-Mini-4B-Instruct` reviews the structured case for missing evidence.
+ It can identify the highest-priority next action or confirm that the bounded
+ investigation is complete.
+
+ It cannot change a verdict.
+
+ This separation matters. A language model is useful for reviewing whether the
+ investigation overlooked an evidence gap. It should not silently override
+ arithmetic or invent a regulatory conclusion.
+
+ The first Nemotron deployment also failed. I initially used
+ `NVIDIA-Nemotron-3-Nano-4B-BF16`, but a real ZeroGPU probe exposed a dependency
+ on a specialized Mamba CUDA runtime unavailable in the standard Gradio image.
+ I switched to Nemotron Mini 4B only after the replacement completed a real
+ ZeroGPU review.
+
+ ## The deterministic evidence judge
+
+ The final verdict path is ordinary Python.
+
+ That code:
+
+ - detects known front claims;
+ - extracts ingredients;
+ - parses nutrition values and their declared basis;
+ - calculates whole-packet protein, sugar, sodium, and saturated fat;
+ - converts total sugar into a teaspoon equivalent;
+ - resolves direct and relative best-before dates;
+ - extracts after-opening deadlines;
+ - applies conservative claim-specific verdict rules.
+
+ For example, when a nutrition panel declares values per `100g` and the packet
+ contains `300g`, PacketCourt scales the values by exactly `3`. It does not ask a
+ language model to perform that arithmetic.
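
The whole-packet scaling in the example above can be sketched as follows. This is an illustration under assumptions, not the PacketCourt implementation: the function name `scale_to_packet` and the dictionary keys are hypothetical, and `fractions.Fraction` is used here only to keep the arithmetic exact.

```python
from fractions import Fraction
from typing import Dict, Union

Number = Union[int, float, str]


def scale_to_packet(
    per_basis: Dict[str, Number], basis_g: Number, net_weight_g: Number
) -> Dict[str, Fraction]:
    """Scale declared per-basis nutrition values to the whole packet.

    Values pass through str() so that a decimal such as 12.5 is read as
    the printed number rather than its binary float approximation.
    """
    basis = Fraction(str(basis_g))
    if basis <= 0:
        raise ValueError("nutrition basis must be positive")
    factor = Fraction(str(net_weight_g)) / basis
    return {name: Fraction(str(value)) * factor for name, value in per_basis.items()}


# Per 100g values on a 300g packet are scaled by exactly 3.
whole = scale_to_packet({"protein_g": 12, "total_sugars_g": 22}, 100, 300)
print({name: float(value) for name, value in whole.items()})
```

With the values from the published trace (protein 12g and total sugars 22g per 100g, net weight 300g), the whole-packet totals are 36g of protein and 66g of sugar.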
+
+ ## Persuasion Gap
+
+ Claim verification alone did not capture the most interesting part of the
+ problem.
+
+ A `HIGH PROTEIN` claim can be supported by visible protein evidence while the
+ complete packet also contains substantial sugar or sodium. A multigrain claim
+ can be technically true while refined flour remains the first ingredient.
+
+ PacketCourt therefore calculates a **Persuasion Gap**: material context on the
+ back that competes with the impression emphasized on the front.
+
+ Examples include:
+
+ - “Protein leads. Whole-packet sugar stays quiet.”
+ - “A positive front claim competes with substantial sodium.”
+ - “Grain variety is prominent. The first ingredient is refined.”
+ - “Registration language can look like a health endorsement.”
+
+ Each finding cites the exact evidence or calculation. PacketCourt still leaves
+ the final decision with the user.
+
+ ## What makes the agent bounded
+
+ For every packet, PacketCourt emits an explicit investigation record:
+
+ - objective;
+ - selected evidence tools;
+ - reason each tool was selected;
+ - whether the fine-tuned router or policy fallback selected it;
+ - missing-evidence requests;
+ - stop reason;
+ - independent Nemotron review;
+ - deterministic verdicts and limitations.
+
+ There are only two valid stopping conditions:
+
+ 1. every evidence tool required by the detected claims completed; or
+ 2. required evidence is missing, so PacketCourt stops and asks for it.
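
The two stopping conditions above can be expressed as a small pure function. This is a sketch under assumptions: `StopDecision`, `stop_reason`, and the reason strings are hypothetical names chosen for illustration, not identifiers from the repository.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class StopDecision:
    reason: str                # hypothetical label for the stop reason
    missing: Tuple[str, ...]   # evidence tools that still need input


def stop_reason(required: Iterable[str], completed: Iterable[str]) -> StopDecision:
    """Apply the two valid stopping conditions, and nothing else."""
    missing = tuple(sorted(set(required) - set(completed)))
    if missing:
        # Condition 2: required evidence is missing, so stop and ask for it.
        return StopDecision("missing_evidence", missing)
    # Condition 1: every required evidence tool completed.
    return StopDecision("all_required_tools_completed", ())
```

Because the function has only two return paths, any other stop reason appearing in a trace would indicate a bug rather than a judgment call.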
+
+ The public trace dataset contains no hidden chain-of-thought. It exposes tool
+ decisions, evidence outputs, calculations, and boundaries suitable for
+ inspection.
+
+ Traces:
+ https://huggingface.co/datasets/build-small-hackathon/packetcourt-traces
+
+ ## Evaluation
+
+ The current release has:
+
+ - `9` passing unit tests;
+ - `35/35` passing checks across `10` golden packet cases;
+ - `10` transparent investigation traces;
+ - one published real end-to-end Nemotron review trace;
+ - a successful live audit using the fine-tuned router and Nemotron reviewer.
+
+ The golden cases cover contradictions, supported claims, missing context,
+ whole-packet calculations, refined-grain context, FSSAI registration language,
+ relative shelf-life arithmetic, and after-opening instructions.
+
+ Golden cases:
+ https://huggingface.co/datasets/build-small-hackathon/packetcourt-golden-cases
+
+ ## The interface is part of the evidence standard
+
+ PacketCourt uses a custom responsive frontend mounted over a Gradio engine.
+ The phone workflow matters because the packet is physically in the user's
+ hand. The results view shows the investigation path before the verdict cards,
+ then separates persuasion gaps, claim findings, nutrition calculations, date
+ evidence, and machine-readable JSON.
+
+ Uncertainty is not hidden in a tooltip. It is part of the primary result.
+
+ ## What PacketCourt refuses to claim
+
+ PacketCourt does not declare a food:
+
+ - healthy;
+ - safe;
+ - illegal;
+ - fraudulent;
+ - suitable for a medical condition.
+
+ It audits only supplied packet evidence. OCR should be checked against the
+ physical label. `CANNOT VERIFY` is a successful outcome when the evidence is
+ insufficient.
+
+ That refusal is not a missing feature. It is PacketCourt's standard of proof.
+
+ ## Built small
+
+ The complete model budget is approximately `5.3B` parameters:
+
+ - OpenBMB MiniCPM-V-4.6: `1.30B`;
+ - NVIDIA Nemotron Mini: approximately `4B`;
+ - fine-tuned PacketCourt router: `4.38M`.
+
+ The main evidence judge remains deterministic and CPU-based. ZeroGPU is
+ requested only for visual transcription and the independent Nemotron review.
+
+ PacketCourt was built with OpenAI Codex as the primary coding agent. The public
+ GitHub repository preserves Codex-attributed commits covering the architecture,
+ tests, fine-tuning workflow, model companions, trace publication, UI, and
+ deployment.
+
+ Space: https://huggingface.co/spaces/build-small-hackathon/packetcourt
+
+ GitHub: https://github.com/N-45div/PacketCourt
+
+ Model: https://huggingface.co/build-small-hackathon/packetcourt-evidence-router
+
+ Traces: https://huggingface.co/datasets/build-small-hackathon/packetcourt-traces
README.md CHANGED
@@ -4,7 +4,7 @@ emoji: ⚖️
  colorFrom: yellow
  colorTo: red
  sdk: gradio
- sdk_version: 5.49.1
+ sdk_version: 6.18.0
  app_file: app.py
  pinned: false
  license: mit
requirements.txt CHANGED
@@ -1,5 +1,5 @@
- gradio==5.49.1
- gradio_client==1.13.3
+ gradio==6.18.0
+ gradio_client==2.5.0
  pillow>=11.0.0
  pydantic>=2.10.0
  pytesseract>=0.3.13
src/packetcourt/remote_vision.py CHANGED
@@ -12,10 +12,16 @@ def is_configured() -> bool:
  def _client():
      from gradio_client import Client

-     return Client(
-         os.environ["PACKETCOURT_VISION_SPACE"],
-         hf_token=os.getenv("HF_TOKEN"),
-     )
+     try:
+         return Client(
+             os.environ["PACKETCOURT_VISION_SPACE"],
+             hf_token=os.getenv("HF_TOKEN"),
+         )
+     except TypeError:
+         return Client(
+             os.environ["PACKETCOURT_VISION_SPACE"],
+             token=os.getenv("HF_TOKEN"),
+         )


  def extract_remote(image_path: str, side: str) -> str:
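
The `_client()` change above retries with `token=` when `hf_token=` raises `TypeError`, which suggests the keyword name differs between `gradio_client` releases. One possible alternative, sketched here as an assumption rather than a recommendation from the commit, is to inspect the constructor signature first, so that a `TypeError` raised for an unrelated reason is not mistaken for a keyword mismatch. The helper `token_kwarg` and the stand-in classes `OldStyleClient` and `NewStyleClient` are hypothetical; they only imitate the two keyword names seen in the diff.

```python
import inspect
from typing import Any, Dict, Optional


def token_kwarg(client_cls: type, token: Optional[str]) -> Dict[str, Any]:
    """Pick whichever token keyword the constructor declares, if any."""
    params = inspect.signature(client_cls).parameters
    for name in ("token", "hf_token"):
        if name in params:
            return {name: token}
    # Neither keyword is declared explicitly; pass no token keyword.
    return {}


class OldStyleClient:
    """Stand-in imitating a constructor that accepts hf_token=."""

    def __init__(self, src: str, hf_token: Optional[str] = None) -> None:
        self.src, self.auth = src, hf_token


class NewStyleClient:
    """Stand-in imitating a constructor that accepts token=."""

    def __init__(self, src: str, token: Optional[str] = None) -> None:
        self.src, self.auth = src, token


for cls in (OldStyleClient, NewStyleClient):
    client = cls("owner/space", **token_kwarg(cls, "secret"))
    print(cls.__name__, client.auth)
```

The trade-off is that signature inspection cannot see keywords accepted only through `**kwargs`, so the try/except approach in the commit remains the more tolerant of the two.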
traces/README.md CHANGED
@@ -26,3 +26,7 @@ decision outputs suitable for debugging and evaluation. Each trace records:
  - whether a tool came from the fine-tuned router or policy fallback;
  - explicit missing-evidence requests and stop reason;
  - extracted evidence, calculations, verdicts, and safety limitations.
+
+ `nemotron_live_review.json` records a real end-to-end review from the private
+ NVIDIA Nemotron Mini 4B ZeroGPU companion. It demonstrates that Nemotron can
+ review evidence gaps but cannot alter deterministic verdicts.
traces/nemotron_live_review.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "trace_id": "trace-nemotron-live-001",
+   "input": {
+     "front_text": "HIGH PROTEIN 100% NATURAL",
+     "back_text": "Ingredients: oats, sugar, salt. Nutrition per 100g: Protein 12g, Total Sugars 22g. Net weight 300g."
+   },
+   "fine_tuned_router": {
+     "model": "build-small-hackathon/packetcourt-evidence-router",
+     "status": "active"
+   },
+   "nemotron_review": {
+     "status": "COMPLETE",
+     "priority": "Verify the '100% Natural' claim against the evidence printed on the same packet.",
+     "evidence_request": "",
+     "rationale": "PacketCourt cannot verify an absolute naturalness claim based on the front claim alone.",
+     "model": "nvidia/Nemotron-Mini-4B-Instruct"
+   },
+   "deterministic_verdicts": [
+     {
+       "claim": "High Protein",
+       "verdict": "TECHNICALLY TRUE, CONTEXT MISSING"
+     },
+     {
+       "claim": "100% Natural",
+       "verdict": "CANNOT VERIFY"
+     }
+   ],
+   "boundary": "Nemotron reviews evidence gaps but cannot alter deterministic verdicts."
+ }
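
A trace like the one above can be checked mechanically against the four verdict labels listed in the article. The following is a minimal sketch, assuming a hypothetical `check_trace` helper; the rule that `nemotron_review` must not carry a `verdict` key is one possible reading of the stated boundary, not a check confirmed by the repository. The embedded JSON is a trimmed copy of the trace.

```python
import json
from typing import Any, Dict, List

# The constrained output language from the article.
ALLOWED_VERDICTS = {
    "SUPPORTED BY PROVIDED LABEL",
    "CONTRADICTED BY PROVIDED LABEL",
    "TECHNICALLY TRUE, CONTEXT MISSING",
    "CANNOT VERIFY",
}


def check_trace(trace: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the trace passes."""
    problems = []
    for item in trace.get("deterministic_verdicts", []):
        if item.get("verdict") not in ALLOWED_VERDICTS:
            problems.append(f"unknown verdict for claim {item.get('claim')!r}")
    # Assumption: the reviewer section should never contain a verdict.
    if "verdict" in trace.get("nemotron_review", {}):
        problems.append("nemotron_review must not carry a verdict")
    return problems


# Trimmed copy of traces/nemotron_live_review.json.
TRACE = json.loads("""
{
  "trace_id": "trace-nemotron-live-001",
  "nemotron_review": {"status": "COMPLETE", "model": "nvidia/Nemotron-Mini-4B-Instruct"},
  "deterministic_verdicts": [
    {"claim": "High Protein", "verdict": "TECHNICALLY TRUE, CONTEXT MISSING"},
    {"claim": "100% Natural", "verdict": "CANNOT VERIFY"}
  ]
}
""")

print(check_trace(TRACE))
```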