mlbench123 committed on
Commit 2a47873 · verified · 1 Parent(s): 19f9b44

Create app.py

Files changed (1)
  1. app.py +922 -0
app.py ADDED
@@ -0,0 +1,922 @@
+"""
+Amazon Trailer Inspector - app.py
+HuggingFace Spaces · FastAPI · Free vision LLMs
+
+REST API that accepts 6 labeled images and runs all 6 aspect inspections
+in parallel, returning a structured JSON inspection report.
+
+Endpoint: POST /inspect
+"""
+
+import base64
+import concurrent.futures
+import io
+import json
+import os
+import re
+import traceback
+from typing import Optional
+
+import uvicorn
+from fastapi import FastAPI, HTTPException
+from fastapi.middleware.cors import CORSMiddleware
+from fastapi.responses import JSONResponse
+from PIL import Image
+from huggingface_hub import InferenceClient
+from pydantic import BaseModel, Field
+
+# ──────────────────────────────────────────────────────────────────────────────
+# MODELS  (tried in order - first success wins per image)
+# ──────────────────────────────────────────────────────────────────────────────
+MODELS = [
+    # ── Tier 1: best vision quality ──────────────────────────────────────────
+    "google/gemma-4-27b-it",  # Primary - Gemma 4 27B (stable HF serverless name)
+    "meta-llama/Llama-4-Scout-17B-16E-Instruct",  # Llama 4 Scout - excellent vision, free HF
+    # ── Tier 2: dedicated vision models ──────────────────────────────────────
+    "Qwen/Qwen2.5-VL-7B-Instruct",  # Qwen 2.5 VL - strong free-tier vision
+    "Qwen/Qwen2-VL-7B-Instruct",  # Qwen 2 VL - previous gen, very stable
+    # ── Tier 3: additional fallbacks ─────────────────────────────────────────
+    "meta-llama/Llama-3.2-11B-Vision-Instruct",  # Llama 3.2 11B Vision - reliable fallback
+    "microsoft/Phi-3.5-vision-instruct",  # Phi-3.5 Vision - lightweight, good accuracy
+    "HuggingFaceM4/idefics3-8b-llama3",  # IDEFICS3 - HF native, always available
+    "mistralai/Pixtral-12B-2409",  # Pixtral 12B - Mistral's vision model, free tier
+]
+
+# ──────────────────────────────────────────────────────────────────────────────
+# ASPECT PROMPTS
+# ──────────────────────────────────────────────────────────────────────────────
+
+PROMPTS = {
+
+    "front": """You are a precise visual inspector for Amazon trailer fleets.
+
+════════════════════════════════════════════════════════
+STEP 1 - IMAGE VALIDATION (do this BEFORE anything else)
+════════════════════════════════════════════════════════
+Determine whether this is a valid FRONT LEFT or FRONT RIGHT image of an Amazon trailer.
+
+A VALID front-aspect image shows the trailer from the FRONT or FRONT-CORNER area:
+- The main subject is the SIDE PANEL of the trailer - the large blue/white body with branding
+- The image is shot from the FRONT HALF looking toward the rear, OR from the front corner
+- The rear dual-axle truck tires are NOT visible (or are tiny/distant in the far background)
+- Components like sensors, GPS, Prime logo, and the green Trailer ID label are the focus
+
+An INVALID image is one where:
+- The trailer's REAR DUAL-AXLE TRUCK TIRES are LARGE, PROMINENT, and CLEARLY VISIBLE
+- These are specifically: large inflated rubber truck tires on the REAR BOGIE AXLES,
+  appearing as 4 large grouped tires (2 axles × 2 tires each = 4 tires together)
+  at the REAR UNDERCARRIAGE of the trailer body
+- They appear in the foreground or mid-frame, bottom-center of the image, large in size
+- The shot is clearly taken from behind or the rear half of the trailer
+
+⚠️ CRITICAL - DO NOT CONFUSE THESE WITH REAR TIRES:
+❌ LANDING GEAR / SUPPORT LEGS: The retractable metal support struts/legs under the
+   front of the trailer when it is parked (not attached to a truck). These are METAL
+   POLES/STRUTS, not rubber tires. They hold up the front of a parked trailer.
+   → DO NOT flag landing gear as rear tires.
+❌ SINGLE FRONT STEER AXLE: If a truck cab is attached, its single front steering wheel
+   (one tire on each side, much smaller than rear bogie) is NOT the rear dual axle.
+   → DO NOT flag single front steer wheels as rear dual-axle tires.
+❌ TRAILER DOLLIES / SMALL WHEELS: Any small wheels used for maneuvering a parked
+   trailer are not the rear axle tires.
+
+POSITIVE IDENTIFICATION - only flag as INVALID if you see ALL of these:
+✔ Large inflated RUBBER TRUCK TIRES (clearly rubber, round, with tread)
+✔ DUAL AXLE configuration - two sets of large tires grouped together (4 tires total)
+✔ Located at the REAR of the trailer body / rear undercarriage
+✔ LARGE in the frame - prominent, not a tiny distant element
+
+DECISION:
+→ If rear dual-axle RUBBER TRUCK TIRES (4 grouped) are LARGE AND PROMINENT in frame:
+  Set image_valid = "missing", Set ALL other components to "missing"
+
+→ In ALL other cases (no tires, landing gear visible, single wheels, distant tires, etc.):
+  Set image_valid = "detected"
+  Proceed to STEP 2 below.
+
+════════════════════════════════════════════════════════
+STEP 2 - COMPONENT DETECTION (only if image_valid = "detected")
+════════════════════════════════════════════════════════
+This image shows the FRONT-LEFT or FRONT-RIGHT corner of an Amazon trailer - the rear corner
+area is visible from the side/front angle showing the side panels and rear corner post.
+Carefully locate all 4 components described below.
+
+────────────────────────────────────────────────────────
+COMPONENT 1 - SENSORS
+────────────────────────────────────────────────────────
+WHERE: On the REAR DOOR FACE or the lower area of the trailer near the rear corner.
+Look at the lower-middle or lower-left area of the rear panel visible in this image.
+
+WHAT: Exactly TWO metal plates shaped like DIAMONDS (rotated squares / rhombuses).
+- Each plate has diagonal cross-bracing visible on its face (an X pattern of raised ridges)
+- They are mounted SIDE BY SIDE, touching or close together
+- Color: beige, gold, tan, or silver-gray metallic
+- Size: roughly the size of a dinner plate each
+- They appear as a PAIR - two identical diamond shapes next to each other
+- May be on the rear face of the trailer or on the lower panel near the door area
+
+────────────────────────────────────────────────────────
+COMPONENT 2 - GPS_DEVICE
+────────────────────────────────────────────────────────
+⚠️ THIS IS THE MOST COMMONLY MISSED COMPONENT - READ CAREFULLY ⚠️
+
+WHERE: At the VERY TOP of the REAR CORNER POST. The corner post is the narrow vertical
+aluminum pillar/column at the rear corner of the trailer - where the SIDE WALL meets
+the rear face. Look at the TOP of this post, right at or just below the ROOF LINE.
+
+CRITICAL SEARCH STRATEGY - do this before answering:
+1. First locate the GREEN TRAILER ID STRIP (component 4 - the lime-green vertical label)
+2. Look DIRECTLY ABOVE that green strip, on the SAME vertical corner post
+3. Search for a small white or light-gray rectangular box mounted there
+4. Also check the VERY TOP CORNER where the corner post meets the roof rail
+
+WHAT IT LOOKS LIKE:
+- A small white, off-white, or light gray rectangular electronic housing/box
+- Roughly the size of a large book or small tablet (wider than tall, or square)
+- Has a visible FRONT FACE - may show a small digital display, sensor window, or LED
+- Mounted FLUSH to or BRACKETED onto the corner post or roof/top rail junction
+
+CONFIDENCE GUIDANCE: If you see ANY small rectangular box or housing at the top of the
+corner post, even if partially visible or unclear, mark "detected". Only mark "missing"
+if you can clearly confirm there is NO box/device at the top of the corner post.
+
+────────────────────────────────────────────────────────
+COMPONENT 3 - PRIME_LOGO
+────────────────────────────────────────────────────────
+WHERE: On the main side panels of the trailer body - the large blue (or white) surface.
+
+WHAT: Any Amazon Prime branding - ANY of the following counts:
+- The word "prime" in white letters on the trailer body
+- The word "amazon" with or without the arrow/smile logo
+- The Amazon arrow/smile swoosh logo alone (curved arrow shape)
+- Any partial visibility of the above - even one letter or partial arrow
+
+────────────────────────────────────────────────────────
+COMPONENT 4 - TRAILER_ID
+────────────────────────────────────────────────────────
+WHERE: On the REAR VERTICAL CORNER POST - the narrow vertical aluminum pillar/column
+at the rear corner of the trailer, where the side panel meets the rear face.
+
+WHAT: A fluorescent GREEN or LIME-GREEN vertical label strip affixed to this corner post.
+- The strip runs VERTICALLY down a section of the corner post
+- Displays an alphanumeric code running vertically: e.g. "SV2602705", "AZNG..."
+- The green background color is very distinctive - bright lime-green
+- Located roughly at mid-height to upper-middle of the corner post
+
+IMPORTANT: Even if only PART of the green strip is visible → still mark "detected".
+
+Reply ONLY with a single flat JSON object - no extra text, no markdown fences, no nested objects:
+{
+"image_valid": "detected",
+"sensors": "missing",
+"gps_device": "missing",
+"prime_logo": "detected",
+"trailer_id": "detected"
+}
+Each value must be exactly "detected" or "missing". Nothing else.""",
+
+    "rear": """You are a precise visual inspector for Amazon trailer fleets.
+
+════════════════════════════════════════════════════════
+STEP 1 - IMAGE VALIDATION: IS THIS A VALID REAR-SIDE VIEW?
+════════════════════════════════════════════════════════
+Your FIRST task is to determine whether this image shows the REAR HALF / REAR SIDE of an Amazon
+trailer. This is critical - FRONT-SIDE views of the trailer must be rejected.
+
+THE SINGLE MOST RELIABLE RULE - TIRE PROXIMITY TEST:
+Look at the BOTTOM of the image, near the side of the trailer CLOSEST TO THE CAMERA:
+
+REAR-SIDE IMAGE (VALID):
+→ The trailer's REAR DUAL-AXLE TIRES are on the NEAR SIDE - CLOSE to the camera,
+  appearing LARGE and PROMINENT in the lower portion of the image.
+→ "Rear dual axle" = a GROUP of 4 large rubber truck tires (2 axles × 2 tires each),
+  all packed together at the rear undercarriage.
+→ The trailer's REAR DOORS / REAR FACE is also visible in this view.
+
+FRONT-SIDE IMAGE (INVALID - must reject):
+→ The area CLOSEST TO THE CAMERA shows NO LARGE TIRES - only:
+  • Metal support legs / landing gear struts
+  • Open undercarriage with no dominant tire group visible on the near side
+→ The rear dual-axle tires, IF visible at all, appear SMALL and FAR AWAY.
+
+VALIDATION DECISION:
+Q1: Are large rubber truck tires (dual-axle group) visible CLOSE TO THE CAMERA?
+→ YES → image_valid = "detected" → proceed to STEP 2
+→ NO → image_valid = "missing", set ALL other components to "missing", STOP.
+
+════════════════════════════════════════════════════════
+STEP 2 - COMPONENT DETECTION (only if image_valid = "detected")
+════════════════════════════════════════════════════════
+
+════════════════════════════════════════════════════════
+COMPONENT 1 - SIDE SKIRT / FIN
+════════════════════════════════════════════════════════
+WHERE: Directly below the trailer body floor, along the BOTTOM SIDE of the trailer.
+Just below the horizontal red-and-white reflective tape stripe at the trailer bottom.
+
+WHAT: A flat, solid rectangular panel hanging vertically below the trailer chassis.
+- Fills the gap between the trailer floor underside and the ground level, beside the axles
+- May be dark gray, charcoal, black, silver, or metallic in color
+- IN SHADOW: look for its RECTANGULAR OUTLINE and STRAIGHT EDGES instead of color
+- Look for a SOLID FLAT SURFACE blocking the view through to the undercarriage
+
+════════════════════════════════════════════════════════
+COMPONENT 2 - EDGE KIT
+════════════════════════════════════════════════════════
+WHERE: On the SIDE SURFACE of the trailer body, near the REAR END.
+Located at roughly mid-to-upper height on the side panel, just before the rear corner.
+
+WHAT:
+- A BODY-COLORED rectangular panel - the SAME COLOR as the trailer body
+- Has VISIBLE BOLT HOLES or screw holes (several dots/holes visible in the panel)
+- Taller than it is wide - roughly portrait-orientation rectangle
+- Mounted flush against the trailer side near the rear-door corner post area
+
+════════════════════════════════════════════════════════
+SIDE IDENTIFICATION RULES
+════════════════════════════════════════════════════════
+- First identify: which side of the trailer is facing the camera?
+- LEFT SIDE view: rear doors visible on the LEFT; side extends RIGHT
+- RIGHT SIDE view: rear doors visible on the RIGHT; side extends LEFT
+- Only mark a side as "detected" if that side is ACTUALLY VISIBLE in this image
+- "left" and "right" are from the TRAILER'S own perspective (driver's point of view)
+
+Reply ONLY with a single flat JSON object - no extra text, no markdown fences, no nested objects:
+{
+"image_valid": "detected",
+"side_skirts_left": "missing",
+"side_skirts_right": "detected",
+"edge_kit_left": "missing",
+"edge_kit_right": "detected"
+}
+Each value must be exactly "detected" or "missing". Nothing else.""",
+
+    "inside": """You are a precise visual inspector for Amazon trailer fleets.
+Examine this image of an Amazon trailer interior.
+
+════════════════════════════════════════════════════════
+STEP 1 - DOOR STATUS CHECK (do this FIRST)
+════════════════════════════════════════════════════════
+DOORS ARE OPEN if you can see INTO the trailer cargo area:
+- A long dark tunnel/corridor extending into the trailer depth
+- Corrugated ribbed metal side walls running into the distance
+- A wooden or composite floor surface at the entrance threshold
+
+DOORS ARE CLOSED if the image shows flat door panel surfaces as the main subject.
+
+If doors are CLOSED → set BOTH components to "missing"
+If doors are OPEN → proceed to STEP 2.
+
+════════════════════════════════════════════════════════
+STEP 2 - COMPONENT DETECTION (only if doors are OPEN)
+════════════════════════════════════════════════════════
+
+COMPONENT 1 - SIDE_GUARDS
+WHERE: Along the LEFT and RIGHT interior side walls of the trailer cargo area.
+WHAT: Corrugated or ribbed protective panels lining the inside walls - typically silver/gray
+metal with horizontal or diagonal ribbing/corrugation. They run from near the floor upward
+along both interior side walls. Mark "detected" if visible on at least one side wall.
+
+COMPONENT 2 - FLOORING
+WHERE: At the BOTTOM of the trailer interior opening - the floor surface at the entrance.
+WHAT: Wooden plank flooring - individual wooden planks running parallel lengthwise.
+- Color: brown, amber, tan, or light brown wood tone
+- The planks span the full width of the trailer floor
+- ONLY mark "detected" if you can clearly see the brown wooden plank surface INSIDE the trailer
+- Do NOT count asphalt/concrete ground outside the trailer
+
+Reply ONLY with a single flat JSON object - no extra text, no markdown fences, no nested objects:
+{
+"side_guards": "detected",
+"flooring": "missing"
+}
+Each value must be exactly "detected" or "missing". Nothing else.""",
+
+    "door": """You are a precise visual inspector for Amazon trailer fleets.
+
+════════════════════════════════════════════════════════
+STEP 1 - IMAGE VALIDATION (do this BEFORE anything else)
+════════════════════════════════════════════════════════
+A VALID door-details image has ALL of the following:
+✔ The REAR SWING DOORS of the trailer are the main subject - both door panels visible face-on
+✔ The doors are CLOSED (flat white/gray/metal door panels visible - NOT an open interior view)
+✔ The BOTTOM of the door frame is visible
+✔ The image is taken straight-on or slightly angled from the REAR of the trailer
+
+An INVALID image:
+- A FRONT or SIDE view of the trailer
+- Doors are OPEN
+- Not showing the rear swing door panels as the main subject
+- Bottom of door frame is cut off
+
+DECISION:
+→ If NOT a valid door-details image:
+  Set image_valid = "missing", Set BOTH other components to "missing"
+→ If IS a valid closed rear-door image:
+  Set image_valid = "detected", Proceed to STEP 2.
+
+════════════════════════════════════════════════════════
+STEP 2 - COMPONENT DETECTION (only if image_valid = "detected")
+════════════════════════════════════════════════════════
+
+COMPONENT 1 - LATCH_KIT_LASH_LINKS
+Door securing hardware - ANY of the following:
+a) LATCH KIT: Metal door latching/locking mechanism - horizontal latch bars, vertical locking
+   rods, T-handles, cam locks, keeper plates, door handle assemblies, lock rod brackets,
+   or any hardware that keeps the door closed.
+b) LASH LINKS: Metal chain links, D-rings, anchor hooks, or tie-down rings on door/inner frame.
+Mark "detected" if ANY latch hardware OR lash link hardware is visible.
+
+COMPONENT 2 - GROTE_LED_LIGHTS
+LED light fixtures at the bottom of the door frame:
+- Look specifically at the BOTTOM CORNERS of the rear door frame / underside of the trailer
+- Grote lights appear as rectangular or square metal housing boxes (silver, black, or chrome)
+  with LED lenses inside - typically red but may be white or amber
+- They are mounted at the lower edge of the door frame, one on each bottom corner
+- Even if only one side is visible, mark "detected"
+- Do NOT count reflective tape or passive reflectors - only active LED light fixtures
+
+Reply ONLY with a single flat JSON object - no extra text, no markdown fences, no nested objects:
+{
+"image_valid": "detected",
+"latch_kit_lash_links": "detected",
+"grote_led_lights": "missing"
+}
+Each value must be exactly "detected" or "missing". Nothing else."""
+
+}
+
+# ──────────────────────────────────────────────────────────────────────────────
+# ASPECT METADATA
+# ──────────────────────────────────────────────────────────────────────────────
+
+ASPECT_KEYS = {
+    "front": ["image_valid", "sensors", "gps_device", "prime_logo", "trailer_id"],
+    "rear": ["image_valid", "side_skirts_left", "side_skirts_right", "edge_kit_left", "edge_kit_right"],
+    "inside": ["side_guards", "flooring"],
+    "door": ["image_valid", "latch_kit_lash_links", "grote_led_lights"],
+}
+
+CONF_RANK = {"high": 3, "medium": 2, "low": 1, "": 0}
+
+# Valid label names accepted by the API
+VALID_LABELS = {"front_right", "front_left", "rear_right", "rear_left", "inside", "door"}
+
+# Map each label to its inspection aspect
+LABEL_TO_ASPECT = {
+    "front_right": "front",
+    "front_left": "front",
+    "rear_right": "rear",
+    "rear_left": "rear",
+    "inside": "inside",
+    "door": "door",
+}
+
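The label table above maps six image labels onto four inspection aspects, so two labels can share one prompt. The sketch below shows one way a request's labels might be validated and grouped by aspect before dispatch. `group_labels_by_aspect` is a hypothetical helper, not something this commit defines, and the table is copied here only to keep the example self-contained.

```python
from collections import defaultdict

# Copied from the diff above so the example runs on its own.
LABEL_TO_ASPECT = {
    "front_right": "front",
    "front_left": "front",
    "rear_right": "rear",
    "rear_left": "rear",
    "inside": "inside",
    "door": "door",
}


def group_labels_by_aspect(labels):
    """Hypothetical helper: reject unknown labels, group the rest by aspect."""
    grouped = defaultdict(list)
    for label in labels:
        aspect = LABEL_TO_ASPECT.get(label)
        if aspect is None:
            raise ValueError(f"unknown label: {label!r}")
        grouped[aspect].append(label)
    return dict(grouped)


grouped = group_labels_by_aspect(["front_left", "front_right", "door"])
print(grouped)
```

Grouping first makes it clear that both front images are answered by the same `"front"` prompt, while each image is still analyzed separately.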
+# ──────────────────────────────────────────────────────────────────────────────
+# HF CLIENT
+# ──────────────────────────────────────────────────────────────────────────────
+
+_hf_client: InferenceClient | None = None
+
+def _get_client(token: str) -> InferenceClient:
+    global _hf_client
+    if _hf_client is None:
+        _hf_client = InferenceClient(provider="auto", api_key=token)
+    return _hf_client
+
+# ──────────────────────────────────────────────────────────────────────────────
+# IMAGE HELPERS
+# ──────────────────────────────────────────────────────────────────────────────
+
+def pil_to_b64(img: Image.Image, max_side: int = 1024) -> str:
+    img = img.copy().convert("RGB")
+    if max(img.size) > max_side:
+        img.thumbnail((max_side, max_side), Image.LANCZOS)
+    buf = io.BytesIO()
+    img.save(buf, format="JPEG", quality=82)
+    return base64.b64encode(buf.getvalue()).decode("utf-8")
+
+
+def decode_b64_image(b64_str: str) -> Image.Image:
+    """Decode a base64 string (with or without data-URI prefix) to a PIL Image."""
+    if "," in b64_str:
+        b64_str = b64_str.split(",", 1)[1]
+    raw = base64.b64decode(b64_str)
+    return Image.open(io.BytesIO(raw)).convert("RGB")
+
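`decode_b64_image` accepts either a bare base64 payload or a `data:` URI by splitting on the first comma. The stdlib-only sketch below isolates that prefix handling without Pillow. `strip_data_uri` is a hypothetical name used only for illustration, and the three-byte payload is a stand-in, not a real image.

```python
import base64


def strip_data_uri(b64_str: str) -> bytes:
    """Hypothetical helper: drop an optional data-URI prefix, then decode."""
    # The standard base64 alphabet contains no comma, so a comma can only
    # come from a "data:<mime>;base64," prefix.
    if "," in b64_str:
        b64_str = b64_str.split(",", 1)[1]
    return base64.b64decode(b64_str)


payload = base64.b64encode(b"\xff\xd8\xff").decode("ascii")  # stand-in bytes
raw_plain = strip_data_uri(payload)
raw_uri = strip_data_uri(f"data:image/jpeg;base64,{payload}")
print(raw_plain == raw_uri)
```

Both call styles yield the same bytes, so a client can send whichever form its image library produces.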
+# ──────────────────────────────────────────────────────────────────────────────
+# JSON EXTRACTION
+# ──────────────────────────────────────────────────────────────────────────────
+
+def extract_json(text: str, keys: list) -> dict | None:
+    if not text:
+        return None
+    text = re.sub(r"<think>[\s\S]*?</think>", "", text, flags=re.IGNORECASE)
+    text = re.sub(r"```(?:json)?", "", text, flags=re.IGNORECASE).replace("```", "")
+    brace = text.find("{")
+    if brace > 0:
+        text = text[brace:]
+    text = text.strip()
+    m = re.search(r"\{[\s\S]*\}", text)
+    if not m:
+        return None
+    raw = m.group()
+    try:
+        return json.loads(raw)
+    except json.JSONDecodeError:
+        pass
+    fixed = re.sub(r",\s*([}\]])", r"\1", raw)
+    try:
+        return json.loads(fixed)
+    except json.JSONDecodeError:
+        pass
+    try:
+        rebuilt = {}
+        for key in keys:
+            m_str = re.search(rf'"{key}"\s*:\s*"([^"]+)"', raw)
+            if m_str:
+                rebuilt[key] = m_str.group(1)
+                continue
+            m_obj = re.search(rf'"{key}"\s*:\s*(\{{[^}}]+\}})', raw, re.DOTALL)
+            if m_obj:
+                try:
+                    rebuilt[key] = json.loads(m_obj.group(1))
+                except Exception:
+                    pass
+        if rebuilt:
+            return rebuilt
+    except Exception:
+        pass
+    return None
+
+
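`extract_json` cleans a model reply in stages: it strips `<think>` blocks and markdown fences, takes the outermost `{...}` span, and retries parsing after removing trailing commas. The sketch below is a reduced version of the first stages only and leaves out the per-key regex rebuild. `extract_first_object` and the sample reply are illustrative assumptions, not part of the commit.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence


def extract_first_object(text):
    """Reduced sketch: strip reasoning and fences, then parse the JSON object."""
    text = re.sub(r"<think>[\s\S]*?</think>", "", text, flags=re.IGNORECASE)
    text = text.replace(FENCE + "json", "").replace(FENCE, "")
    match = re.search(r"\{[\s\S]*\}", text)
    if not match:
        return None
    raw = match.group()
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    # Second attempt: drop trailing commas before a closing brace or bracket.
    try:
        return json.loads(re.sub(r",\s*([}\]])", r"\1", raw))
    except json.JSONDecodeError:
        return None


reply = (
    "<think>checking the floor</think>Sure!\n"
    f"{FENCE}json\n"
    '{"side_guards": "detected", "flooring": "missing",}\n'
    f"{FENCE}"
)
parsed = extract_first_object(reply)
print(parsed)
```

The sample reply breaks three of the prompt's rules at once (reasoning text, a fence, a trailing comma) and still parses, which is the failure mode the staged cleanup targets.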
+def validate_result(data: dict, keys: list) -> dict | None:
+    if not data:
+        return None
+    out = {}
+    for key in keys:
+        item = data.get(key)
+        if item is None:
+            return None
+        if isinstance(item, str):
+            found = item.strip().lower() == "detected"
+        elif isinstance(item, dict):
+            found = item.get("found", False)
+            if isinstance(found, str):
+                found = found.lower() in ("true", "yes", "1")
+            found = bool(found)
+        else:
+            return None
+        out[key] = {"found": found, "confidence": "high", "notes": ""}
+    return out
+
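`validate_result` rejects a reply if any expected key is absent and treats every string other than "detected" (case-insensitive, after trimming) as not found. The sketch below reproduces only the string branch to make those two rules concrete. `normalize_flags` is a hypothetical name, and the dict branch of the original is omitted.

```python
def normalize_flags(data, keys):
    """Sketch of the string branch only: map each key to True/False, or None."""
    out = {}
    for key in keys:
        value = data.get(key)
        if not isinstance(value, str):
            return None  # a missing or non-string value rejects the whole reply
        out[key] = value.strip().lower() == "detected"
    return out


keys = ["side_guards", "flooring"]
ok = normalize_flags({"side_guards": " Detected ", "flooring": "present"}, keys)
bad = normalize_flags({"side_guards": "detected"}, keys)
print(ok, bad)
```

Note that "present" counts as not found here: a model that paraphrases the label is silently read as "missing" instead of being retried on the next model.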
+# ──────────────────────────────────────────────────────────────────────────────
+# PER-IMAGE ANALYSIS
+# ──────────────────────────────────────────────────────────────────────────────
+
+def analyze_one(img: Image.Image, aspect: str, token: str) -> tuple:
+    """
+    Try MODELS in order for a single image.
+    Returns (result_dict, model_short_name) on success,
+    (None, joined_error_string) on total failure.
+    """
+    b64 = pil_to_b64(img)
+    keys = ASPECT_KEYS[aspect]
+    prompt = PROMPTS[aspect]
+    errors = []
+
+    for model in MODELS:
+        short = model.split("/")[-1]
+        try:
+            client = _get_client(token)
+            resp = client.chat_completion(
+                model=model,
+                messages=[
+                    {
+                        "role": "system",
+                        "content": (
+                            "You are a JSON-only API for trailer inspection. "
+                            "You MUST respond with a single valid flat JSON object and absolutely "
+                            "nothing else - no explanation, no preamble, no markdown fences, "
+                            "no reasoning text, no nested objects. "
+                            "Every value must be exactly the string \"detected\" or \"missing\". "
+                            "Start your response with '{' and end with '}'."
+                        ),
+                    },
+                    {
+                        "role": "user",
+                        "content": [
+                            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
+                            {"type": "text", "text": prompt},
+                        ],
+                    },
+                ],
+                max_tokens=120,
+                temperature=0.05,
+            )
+            # Guard against a None content field so slicing below cannot raise.
+            raw_content = resp.choices[0].message.content or ""
+            print(f"[{short}][{aspect}] raw: {raw_content[:300]}")
+            data = extract_json(raw_content, keys)
+            result = validate_result(data, keys)
+            if result is not None:
+                return result, short
+            errors.append(f"{short}: JSON parse failed. Raw: {raw_content[:150]}")
+        except Exception as e:
+            err = str(e)
+            if "401" in err or "403" in err:
+                errors.append(f"{short}: auth error - check HF_TOKEN ({err[:100]})")
+            elif "404" in err:
+                errors.append(f"{short}: 404 - model unavailable ({err[:100]})")
+            elif "429" in err:
+                errors.append(f"{short}: rate limited - retrying next model")
+            elif "503" in err or "502" in err:
+                errors.append(f"{short}: model loading - retrying next model")
+            else:
+                errors.append(f"{short}: {err[:180]}")
+
+    return None, " | ".join(errors)
+
+
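`analyze_one` walks the model list in order and returns the first reply that validates, collecting one error string per failed model. The sketch below isolates that control flow with a stubbed call in place of the network request. `first_success`, `fake_call`, and the model names are all hypothetical.

```python
def first_success(models, call):
    """Return (result, model) for the first usable reply, else (None, errors)."""
    errors = []
    for model in models:
        try:
            result = call(model)
        except Exception as exc:  # any failure just moves on to the next model
            errors.append(f"{model}: {exc}")
            continue
        if result is not None:
            return result, model
        errors.append(f"{model}: unusable reply")
    return None, " | ".join(errors)


def fake_call(model):
    """Stub standing in for a remote inference call."""
    if model == "model-a":
        raise RuntimeError("503 loading")
    if model == "model-b":
        return None  # replied, but nothing parseable
    return {"flooring": "detected"}


result, used = first_success(["model-a", "model-b", "model-c"], fake_call)
failed = first_success(["model-a"], fake_call)
print(result, used, failed)
```

Because the first success wins, two images in one request can be answered by different models, which is presumably why the function also returns the short model name.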
+def merge_results(results: list, aspect: str) -> dict:
+    """OR-merge multiple image results: if any image detected it, it's found."""
+    keys = ASPECT_KEYS[aspect]
+    merged = {k: {"found": False, "confidence": "low", "notes": ""} for k in keys}
+    for res in results:
+        if not res:
+            continue
+        for k in keys:
+            src = res.get(k, {})
+            if src.get("found"):
+                merged[k]["found"] = True
+                if CONF_RANK.get(src.get("confidence", ""), 0) > CONF_RANK.get(merged[k]["confidence"], 0):
+                    merged[k]["confidence"] = src["confidence"]
+    return merged
+
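The OR-merge above means a component counts as found if any image for that aspect reported it, and a failed image (a `None` entry) is skipped. The sketch below shows the same rule on plain booleans, with the confidence ranking left out. `or_merge` is a hypothetical name.

```python
def or_merge(results, keys):
    """Sketch: a key is True if any non-empty result marks it as found."""
    merged = {key: False for key in keys}
    for res in results:
        if not res:
            continue  # a failed image contributes nothing
        for key in keys:
            if res.get(key, {}).get("found"):
                merged[key] = True
    return merged


keys = ["sensors", "gps_device"]
merged = or_merge(
    [
        {"sensors": {"found": True}, "gps_device": {"found": False}},
        None,  # analysis failed for this image
        {"sensors": {"found": False}, "gps_device": {"found": False}},
    ],
    keys,
)
print(merged)
```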
+# ──────────────────────────────────────────────────────────────────────────────
+# REPORT BUILDERS
+# ──────────────────────────────────────────────────────────────────────────────
+
+def build_front_report(merged_left: dict | None, merged_right: dict | None) -> dict:
+    """
+    Combine front_left and front_right results.
+    For image_valid: if either side's image is invalid, note it.
+    For components: detected if found in EITHER side (OR logic).
+    """
+    components = {}
+
+    comp_keys = ["sensors", "gps_device", "prime_logo", "trailer_id"]
+    comp_names = {
+        "sensors": "Sensors",
+        "gps_device": "GPS Device",
+        "prime_logo": "Prime Logo",
+        "trailer_id": "Trailer ID Label",
+    }
+
+    for key in comp_keys:
+        left_found = merged_left.get(key, {}).get("found", False) if merged_left else False
+        right_found = merged_right.get(key, {}).get("found", False) if merged_right else False
+        detected = left_found or right_found
+        components[comp_names[key]] = "detected" if detected else "missing"
+
+    # Image validity notes
+    notes = []
+    if merged_left is None:
+        notes.append("front_left: image missing from input")
+    elif not merged_left.get("image_valid", {}).get("found", True):
+        notes.append("front_left: invalid image (wrong angle)")
+    if merged_right is None:
+        notes.append("front_right: image missing from input")
+    elif not merged_right.get("image_valid", {}).get("found", True):
+        notes.append("front_right: invalid image (wrong angle)")
+
+    return {"components": components, "notes": notes}
+
+
+def build_rear_report(merged_left: dict | None, merged_right: dict | None) -> dict:
+    """
+    Combine rear_left and rear_right results.
+    Each component shows X/2 detection count.
+    Both sides MUST be independently detected for full confirmation.
+    """
+    components = {}
+    notes = []
+
+    # Check image validity
+    left_valid = merged_left is not None and merged_left.get("image_valid", {}).get("found", True)
+    right_valid = merged_right is not None and merged_right.get("image_valid", {}).get("found", True)
+
+    if merged_left is None:
+        notes.append("rear_left: image missing from input")
+    elif not left_valid:
+        notes.append("rear_left: invalid image (wrong angle/side)")
+
+    if merged_right is None:
+        notes.append("rear_right: image missing from input")
+    elif not right_valid:
+        notes.append("rear_right: invalid image (wrong angle/side)")
+
+    comp_pairs = [
+        ("side_skirts_left", "side_skirts_right", "Side Skirts / Fins"),
+        ("edge_kit_left", "edge_kit_right", "Edge Kit"),
+    ]
+
+    for left_key, right_key, display_name in comp_pairs:
+        left_found = merged_left.get(left_key, {}).get("found", False) if merged_left else False
+        right_found = merged_right.get(right_key, {}).get("found", False) if merged_right else False
+
+        count = int(left_found) + int(right_found)
+        status = f"{count}/2"
+
+        if count == 2:
+            result = "detected"
+        elif count == 1:
+            side = "left side" if left_found else "right side"
+            result = f"partially detected ({side} only)"
+        else:
+            result = "missing"
+
+        components[display_name] = {
+            "status": result,
+            "count": status,
+        }
+
+    return {"components": components, "notes": notes}
+
+
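Unlike the front report, the rear report does not OR the two sides together: each paired component is scored out of two. The sketch below isolates that scoring rule for one component pair. `pair_status` is a hypothetical helper that mirrors the branch logic in `build_rear_report`.

```python
def pair_status(left_found: bool, right_found: bool) -> dict:
    """Sketch: score one left/right component pair as an X/2 count."""
    count = int(left_found) + int(right_found)
    if count == 2:
        status = "detected"
    elif count == 1:
        side = "left side" if left_found else "right side"
        status = f"partially detected ({side} only)"
    else:
        status = "missing"
    return {"status": status, "count": f"{count}/2"}


both = pair_status(True, True)
right_only = pair_status(False, True)
neither = pair_status(False, False)
print(both, right_only, neither)
```

Reporting the side by name lets a reader of the JSON see which image to retake when the count is 1/2.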
+def build_inside_report(merged: dict | None) -> dict:
+    if merged is None:
+        return {
+            "components": {
+                "Side Guards": "missing",
+                "Flooring": "missing",
+            },
+            "notes": ["inside: image missing from input"],
+        }
+    return {
+        "components": {
+            "Side Guards": "detected" if merged.get("side_guards", {}).get("found") else "missing",
+            "Flooring": "detected" if merged.get("flooring", {}).get("found") else "missing",
+        },
+        "notes": [],
+    }
+
+
664
+ def build_door_report(merged: dict | None) -> dict:
665
+ if merged is None:
666
+ return {
667
+ "components": {
668
+ "Latch Kit & Lash Links": "missing",
669
+ "Grote LED Lights": "missing",
670
+ },
671
+ "notes": ["door: image missing from input"],
672
+ }
673
+ notes = []
674
+ if not merged.get("image_valid", {}).get("found", True):
675
+ notes.append("door: invalid image (not a valid rear door view)")
676
+
677
+ return {
678
+ "components": {
679
+ "Latch Kit & Lash Links": "detected" if merged.get("latch_kit_lash_links", {}).get("found") else "missing",
680
+ "Grote LED Lights": "detected" if merged.get("grote_led_lights", {}).get("found") else "missing",
681
+ },
682
+ "notes": notes,
683
+ }
684
+
685
+ # ──────────────────────────────────────────────────────────────────────────────
+ # FASTAPI APP
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ app = FastAPI(
+     title="Amazon Trailer Inspector API",
+     description=(
+         "AI-powered trailer inspection API. "
+         "Submit up to 6 labeled images (front_right, front_left, rear_right, rear_left, inside, door) "
+         "and receive a structured component detection report."
+     ),
+     version="2.0.0",
+ )
+
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+
+ # ── Pydantic models ─────────────────────────────────────────────────────────
+
+ class ImageInput(BaseModel):
+     # Pydantic v2 takes `examples` (a list); the singular `example` kwarg is deprecated
+     label: str = Field(
+         ...,
+         description="One of: front_right, front_left, rear_right, rear_left, inside, door",
+         examples=["front_left"],
+     )
+     image_base64: str = Field(
+         ...,
+         description=(
+             "Base64-encoded image. Can be raw base64 or a data URI "
+             "(e.g. 'data:image/jpeg;base64,...'). "
+             "Supported formats: JPEG, PNG, WEBP."
+         ),
+     )
+
+
+ class InspectRequest(BaseModel):
+     images: list[ImageInput] = Field(
+         ...,
+         min_length=1,
+         max_length=6,
+         description="List of labeled images. Each label may appear at most once.",
+         examples=[[
+             {"label": "front_left", "image_base64": "<base64>"},
+             {"label": "front_right", "image_base64": "<base64>"},
+             {"label": "rear_left", "image_base64": "<base64>"},
+             {"label": "rear_right", "image_base64": "<base64>"},
+             {"label": "inside", "image_base64": "<base64>"},
+             {"label": "door", "image_base64": "<base64>"},
+         ]],
+     )
+
+
+ # ── Routes ──────────────────────────────────────────────────────────────────
+
+ @app.get("/", tags=["Health"])
+ def root():
+     return {
+         "status": "ok",
+         "service": "Amazon Trailer Inspector API",
+         "version": "2.0.0",
+         "endpoint": "POST /inspect",
+     }
+
+
+ @app.get("/health", tags=["Health"])
+ def health():
+     token = os.environ.get("HF_TOKEN", "").strip()
+     return {
+         "status": "ok",
+         "hf_token_set": bool(token),
+         "models": [m.split("/")[-1] for m in MODELS],
+     }
+
+
+ @app.post("/inspect", tags=["Inspection"])
+ def inspect(request: InspectRequest):
+     """
+     Run full trailer inspection on all submitted images in parallel.
+
+     **Input:** Up to 6 labeled base64 images.
+     **Output:** Per-area report (front, rear, inside, door) with component detection results.
+
+     Labels accepted: `front_right`, `front_left`, `rear_right`, `rear_left`, `inside`, `door`
+     """
+     token = os.environ.get("HF_TOKEN", "").strip()
+     if not token:
+         raise HTTPException(
+             status_code=503,
+             detail=(
+                 "HF_TOKEN not configured. "
+                 "Set it in Space Settings → Repository Secrets."
+             ),
+         )
+
+     # Validate labels and reject duplicates
+     seen_labels = {}
+     for item in request.images:
+         if item.label not in VALID_LABELS:
+             raise HTTPException(
+                 status_code=422,
+                 detail=f"Invalid label '{item.label}'. Must be one of: {sorted(VALID_LABELS)}",
+             )
+         if item.label in seen_labels:
+             raise HTTPException(
+                 status_code=422,
+                 detail=f"Duplicate label '{item.label}'. Each label may only appear once.",
+             )
+         seen_labels[item.label] = item.image_base64
+
+     # Decode all images
+     decoded: dict[str, Image.Image] = {}
+     for label, b64 in seen_labels.items():
+         try:
+             decoded[label] = decode_b64_image(b64)
+         except Exception as e:
+             raise HTTPException(
+                 status_code=422,
+                 detail=f"Could not decode image for label '{label}': {e}",
+             )
+
+     # ── Run all label analyses in parallel ──────────────────────────────────
+     # Each label → its aspect → analyze_one(img, aspect, token)
+     # front_left and front_right each run independently with the "front" prompt
+     # rear_left and rear_right each run independently with the "rear" prompt
+     # Results for front and rear are then merged across their respective sides
+
+     label_results: dict[str, dict | None] = {}
+
+     def run_label(label: str) -> tuple[str, dict | None]:
+         aspect = LABEL_TO_ASPECT[label]
+         img = decoded[label]
+         # analyze_one returns (result_dict, model_name_or_error)
+         result, meta = analyze_one(img, aspect, token)
+         if result is not None:
+             print(f"[API] {label} → success via {meta}")
+             return label, result
+         else:
+             print(f"[API] {label} → all models failed: {meta}")
+             return label, None
+
+     with concurrent.futures.ThreadPoolExecutor(max_workers=6) as pool:
+         futures = {pool.submit(run_label, label): label for label in decoded}
+         for fut in concurrent.futures.as_completed(futures):
+             label, result = fut.result()
+             label_results[label] = result
+
+     # ── Build the final report ───────────────────────────────────────────────
+
+     # FRONT: merge left + right with OR logic
+     front_left_raw = label_results.get("front_left")
+     front_right_raw = label_results.get("front_right")
+
+     front_report = None
+     if "front_left" in decoded or "front_right" in decoded:
+         front_report = build_front_report(front_left_raw, front_right_raw)
+
+     # REAR: left and right reported with X/2 count logic
+     rear_left_raw = label_results.get("rear_left")
+     rear_right_raw = label_results.get("rear_right")
+
+     rear_report = None
+     if "rear_left" in decoded or "rear_right" in decoded:
+         rear_report = build_rear_report(rear_left_raw, rear_right_raw)
+
+     # INSIDE
+     inside_report = None
+     if "inside" in decoded:
+         inside_report = build_inside_report(label_results.get("inside"))
+
+     # DOOR
+     door_report = None
+     if "door" in decoded:
+         door_report = build_door_report(label_results.get("door"))
+
+     # ── Assemble response ────────────────────────────────────────────────────
+     report = {}
+
+     if front_report is not None:
+         report["front"] = {
+             "label": "Front Left / Right",
+             "images_provided": [lbl for lbl in ("front_left", "front_right") if lbl in decoded],
+             "components": front_report["components"],
+             "notes": front_report["notes"],
+         }
+
+     if rear_report is not None:
+         report["rear"] = {
+             "label": "Rear Left / Right",
+             "images_provided": [lbl for lbl in ("rear_left", "rear_right") if lbl in decoded],
+             "components": rear_report["components"],
+             "notes": rear_report["notes"],
+         }
+
+     if inside_report is not None:
+         report["inside"] = {
+             "label": "Inside Trailer",
+             "images_provided": ["inside"],
+             "components": inside_report["components"],
+             "notes": inside_report["notes"],
+         }
+
+     if door_report is not None:
+         report["door"] = {
+             "label": "Door Details",
+             "images_provided": ["door"],
+             "components": door_report["components"],
+             "notes": door_report["notes"],
+         }
+
+     # Note any labels that were not submitted
+     missing_labels = sorted(VALID_LABELS - set(decoded.keys()))
+
+     return JSONResponse(content={
+         "status": "success",
+         "images_received": list(decoded.keys()),
+         "labels_missing": missing_labels,
+         "report": report,
+     })
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # STARTUP
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ # Strip whitespace so the banner agrees with the checks in /health and /inspect
+ _tok = os.environ.get("HF_TOKEN", "").strip()
+ print("=" * 60)
+ print(" Amazon Trailer Inspector - API mode")
+ print(f" HF_TOKEN : {'SET (' + str(len(_tok)) + ' chars)' if _tok else 'NOT SET ⚠️'}")
+ print(f" Models : {[m.split('/')[-1] for m in MODELS]}")
+ print("=" * 60)
+
+ if __name__ == "__main__":
+     uvicorn.run("app:app", host="0.0.0.0", port=7860, reload=False)
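
For reference, a minimal client-side sketch of how a caller might build the `InspectRequest` body and send it to `POST /inspect`. It is not part of this commit. `build_payload`, `post_inspect`, and `BASE_URL` are hypothetical names introduced for illustration; the label set is copied from the field description above, and the port is assumed from the `uvicorn.run(...)` call. Only the standard library is used.

```python
import base64
import json
import urllib.request

# Hypothetical base URL; the port is assumed from uvicorn.run(..., port=7860).
BASE_URL = "http://localhost:7860"

# Labels listed in the ImageInput field description.
VALID_LABELS = {"front_right", "front_left", "rear_right", "rear_left", "inside", "door"}


def build_payload(images: dict[str, bytes]) -> dict:
    """Turn {label: raw image bytes} into the JSON shape InspectRequest expects."""
    unknown = set(images) - VALID_LABELS
    if unknown:
        raise ValueError(f"Unknown labels: {sorted(unknown)}")
    return {
        "images": [
            {"label": label, "image_base64": base64.b64encode(data).decode("ascii")}
            for label, data in images.items()
        ]
    }


def post_inspect(payload: dict, base_url: str = BASE_URL, timeout: float = 120.0) -> dict:
    """POST the payload to /inspect and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{base_url}/inspect",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A caller would read each image file as bytes, pass the mapping to `build_payload`, and hand the result to `post_inspect`. Because the keys of a dict are unique, this shape cannot produce the duplicate-label 422 that the endpoint guards against.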