RuslanKain committed
Commit 5378b38 · Parent(s): f656247

add app file and requirements.txt

Files changed (2):
  1. app.py            +529 -0
  2. requirements.txt  +55 -0
app.py ADDED
@@ -0,0 +1,529 @@
"""
╔══════════════════════════════════════════════════════════════════════════════╗
║ CISC 121 - HAND GESTURE RECOGNITION APP ║
║ Queen's University ║
║ ║
║ PURPOSE: This app uses AI to recognize hand gestures (one, peace, etc.) ║
║ VERSION: Procedural (step-by-step) - Great for beginners! ║
║ ║
║ HOW TO RUN: python app.py ║
╚══════════════════════════════════════════════════════════════════════════════╝
"""

# ==============================================================================
# SECTION 1: IMPORTS
# ==============================================================================
# What are imports?
# Imports let us use code that other people wrote.
# Instead of writing everything from scratch, we can use "libraries".
#
# Think of it like borrowing tools:
#   - gradio       = tools for building web pages
#   - transformers = tools for AI/machine learning
#   - time         = tools for measuring how long things take
#   - os           = tools for working with the operating system (like reading files)
# ==============================================================================

import gradio as gr
# "gr" is a short nickname for "gradio" - it saves us typing!
# Example: instead of gradio.Button(), we can write gr.Button()

from transformers import pipeline
# "pipeline" is a function that makes using AI models easy.
# It handles all the complicated setup for us.

from time import perf_counter
# "perf_counter" is like a stopwatch - it measures time very precisely.

import os
# "os" lets us interact with the operating system
# We use it to read environment variables (like secret tokens)
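
# A quick illustration of the "stopwatch" idea from above (just an aside - the app
# itself only uses perf_counter inside analyze_image further down):
#   start = perf_counter()
#   total = sum(range(1_000_000))            # some work we want to time
#   print(f"That took {perf_counter() - start:.4f} seconds")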


# ==============================================================================
# SECTION 2: CONFIGURATION (SETTINGS)
# ==============================================================================
# What is configuration?
# These are settings we can change to customize how the app works.
# By putting them at the top, they're easy to find and modify.
# ==============================================================================

# The AI model we will use for hand gesture recognition
#
# MODEL OPTIONS:
#   1. "dima806/hand_gestures_image_detection" (RECOMMENDED)
#      - Recognizes: one, two, three, four, fist, ok, like, peace, etc.
#      - Trained specifically for hand gestures!
#
#   2. "google/vit-base-patch16-224" (General purpose)
#      - Recognizes 1000 everyday objects (cats, cars, etc.)
#      - NOT trained for hand gestures - won't work for finger counting
#
#   3. "microsoft/resnet-50" (General purpose, faster)
#      - Similar to Google's model, but faster
#
MODEL_NAME = "dima806/hand_gestures_image_detection"

# Hugging Face Token (Optional but recommended)
# Some models require authentication to download.
# Get your free token at: https://huggingface.co/settings/tokens
#
# Option 1: Set as environment variable (recommended for security)
#   export HF_TOKEN="your_token_here"
#
# Option 2: Paste directly here (less secure, but okay for learning)
#   HF_TOKEN = "hf_xxxxxxxxxxxxxxxxxxxxx"
#
HF_TOKEN = os.environ.get("HF_TOKEN", None)
# os.environ.get() tries to read the HF_TOKEN from environment variables
# If not found, it returns None (which means "no token")
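#
# If you want to sanity-check the model on its own (outside Gradio), a minimal
# sketch looks like this - "my_hand.jpg" is just a placeholder for any test photo:
#   from transformers import pipeline
#   classifier = pipeline("image-classification", model=MODEL_NAME, token=HF_TOKEN)
#   print(classifier("my_hand.jpg"))    # a list of {'label': ..., 'score': ...} dicts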

# App title and description
APP_TITLE = "## 🎓 CISC 121 - Hand Gesture Recognition App"
APP_DESCRIPTION = """
Welcome! This app uses AI to recognize **hand gestures**.

**Supported Gestures:**
✋ one, ✌️ two/peace, 🤟 three, 🖖 four, ✊ fist, 👍 like, 👎 dislike, 👌 ok, 🤚 stop

**How to use:**
1. **Upload an image** OR **use your webcam**
2. Show a hand gesture clearly in frame
3. Click **"🔍 Analyze Image"** to see the AI's prediction

> 💡 **Tip:** Make sure your hand is well-lit and clearly visible!
"""


# ==============================================================================
# SECTION 3: HELPER FUNCTIONS
# ==============================================================================
# What are functions?
# Functions are reusable blocks of code that do one specific job.
# We give them a name, and then we can "call" them whenever we need them.
#
# Why use functions?
#   1. Reusability  - write once, use many times
#   2. Organization - break big problems into small pieces
#   3. Readability  - give meaningful names to actions
# ==============================================================================

def create_greeting(name):
    """
    Creates a personalized greeting message.

    What is a docstring? (This text you're reading!)
    A docstring explains what a function does.
    It helps other programmers (and future you!) understand the code.

    Parameters:
    -----------
    name : str
        The name of the person to greet.
        "str" means "string" - a piece of text.

    Returns:
    --------
    str
        A greeting message as a string.

    Example:
    --------
    >>> create_greeting("Alice")
    "Hello Alice! Welcome to CISC 121!"
    """
    # f-strings let us put variables inside text
    # The {name} gets replaced with the actual value of 'name'
    greeting = f"Hello {name}! Welcome to CISC 121!"
    return greeting


def analyze_image(image):
    """
    Sends an image to the AI model and gets back predictions.

    How does this work?
    1. We download the AI model from Hugging Face (only the first time)
    2. The model analyzes the image on this computer
    3. We get back a list of predictions with confidence scores

    Parameters:
    -----------
    image : PIL.Image or numpy.ndarray
        The image to analyze. Gradio handles the format for us.

    Returns:
    --------
    tuple
        A tuple containing:
        - results (list): The AI's predictions
        - elapsed_time (float): How long the analysis took in seconds

    What is a tuple?
    A tuple is like a container that holds multiple values.
    We use it when a function needs to return more than one thing.
    """
    # Safety check: make sure we actually received an image
    # "None" means "nothing" - the user might not have taken a photo yet
    if image is None:
        print("⚠️ No image provided")
        return None, 0.0

    # Debug: Print what type of image we received
    print(f"📷 Received image type: {type(image)}")
    print(f"📷 Image info: {image if not hasattr(image, 'size') else f'Size: {image.size}'}")

    # Start the stopwatch
    start_time = perf_counter()

    # Create the AI classifier
    # "pipeline" sets up everything we need to use the model
    try:
        print(f"🔄 Loading model: {MODEL_NAME}")
        print(f"🔑 HF Token: {'Set' if HF_TOKEN else 'Not set (may limit some models)'}")

        # Create the classifier with optional token
        classifier = pipeline(
            task="image-classification",  # What kind of task?
            model=MODEL_NAME,             # Which AI model to use?
            token=HF_TOKEN                # Authentication token (optional)
        )

        print("📷 Analyzing image...")

        # Handle different image formats that Gradio might send
        # Gradio can send: PIL Image, numpy array, or file path
        from PIL import Image

        if isinstance(image, str):
            # It's a file path - open it
            print(" (Converting from file path)")
            image = Image.open(image)
        elif hasattr(image, 'convert'):
            # It's already a PIL Image - ensure it's in RGB format
            print(" (Image is PIL format)")
            if image.mode != 'RGB':
                image = image.convert('RGB')
        else:
            # It might be a numpy array - convert to PIL
            print(" (Converting from numpy array)")
            import numpy as np
            if isinstance(image, np.ndarray):
                image = Image.fromarray(image)

        # Send the image to the model and get predictions
        results = classifier(image)

        print(f"✅ Analysis complete! Found {len(results)} predictions.")

    except Exception as error:
        # If something goes wrong, we catch the error
        # This prevents the app from crashing
        print(f"❌ Error during image analysis: {error}")
        print(f" Error type: {type(error).__name__}")

        # Print full traceback for debugging
        import traceback
        traceback.print_exc()

        # Common error explanations
        if "401" in str(error) or "unauthorized" in str(error).lower():
            print(" 💡 This might be an authentication issue. Try setting HF_TOKEN.")
        elif "connection" in str(error).lower() or "network" in str(error).lower():
            print(" 💡 Check your internet connection.")
        elif "memory" in str(error).lower():
            print(" 💡 The model might be too large. Try a smaller model.")

        return None, 0.0

    # Stop the stopwatch
    end_time = perf_counter()

    # Calculate how long it took
    elapsed_time = end_time - start_time

    return results, elapsed_time
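
# What does the 'results' list look like? Roughly like this (the exact labels and
# numbers depend on the model and on your image - these values are made up):
#   [{'label': 'peace', 'score': 0.92}, {'label': 'two_up', 'score': 0.05}, ...]
# The predictions come back sorted from most confident to least confident.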


def format_results(results, elapsed_time):
    """
    Formats the AI predictions into a readable string.

    Why format results?
    The raw data from the AI is hard to read.
    We transform it into a nice, human-friendly format.

    Parameters:
    -----------
    results : list or None
        The predictions from the AI model.
        Each prediction has a 'label' and a 'score' (confidence).

    elapsed_time : float
        How long the analysis took, in seconds.

    Returns:
    --------
    str
        A formatted string showing the predictions.
    """
    # Handle the case where analysis failed
    if results is None:
        return "❌ Could not analyze the image. Please try again."

    # Start building our output message
    output_lines = []

    # Add a header
    output_lines.append("## 🔍 Analysis Results\n")
    output_lines.append(f"⏱️ *Analysis completed in {elapsed_time:.2f} seconds*\n")

    # What does :.2f mean?
    # It formats a number to show 2 decimal places.
    # Example: 1.23456 becomes "1.23"

    output_lines.append("### Top Predictions:\n")

    # Loop through the top 5 predictions
    # enumerate() gives us both the index (i) and the item (prediction)
    for i, prediction in enumerate(results[:5]):
        label = prediction['label']   # What the AI thinks it sees
        score = prediction['score']   # How confident it is (0 to 1)
        percentage = score * 100      # Convert to percentage

        # Add a medal emoji for top 3
        if i == 0:
            medal = "🥇"
        elif i == 1:
            medal = "🥈"
        elif i == 2:
            medal = "🥉"
        else:
            medal = " "

        output_lines.append(f"{medal} **{label}**: {percentage:.1f}%\n")

    # Join all lines into one string
    # Each line already ends with '\n' (a new line - like pressing Enter)
    return ''.join(output_lines)
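
# Putting the two helpers together (an illustrative sketch - the app itself does
# exactly this inside the button handler in create_app below; "photo.jpg" is just
# a placeholder path):
#   from PIL import Image
#   results, seconds = analyze_image(Image.open("photo.jpg"))
#   print(format_results(results, seconds))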


# ==============================================================================
# SECTION 4: MAIN APPLICATION
# ==============================================================================
# This section builds the actual web interface.
# We use Gradio's "Blocks" system to create a custom layout.
#
# What is gr.Blocks()?
# It's like a container for our app.
# Everything inside the "with" block becomes part of the interface.
#
# What does "with" do?
# "with" creates a context - it's like saying "everything in here belongs together"
# When we exit the "with" block, Gradio knows our app is complete.
# ==============================================================================
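
# A tiny version of the same pattern, just to show the shape (not used by this app):
#   with gr.Blocks() as demo:
#       name_box = gr.Textbox(label="Name")
#       greet_button = gr.Button("Greet")
#       output_box = gr.Markdown()
#       greet_button.click(fn=create_greeting, inputs=[name_box], outputs=[output_box])
#   demo.launch()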

def create_app():
    """
    Creates and returns the Gradio application.

    Why put this in a function?
    1. It keeps the code organized
    2. We can easily test or modify the app
    3. It's a good habit for larger programs

    Returns:
    --------
    gr.Blocks
        The complete Gradio application, ready to launch.
    """

    # Create the app container
    with gr.Blocks(
        title="CISC 121 Gesture App",  # Browser tab title
        theme=gr.themes.Soft()         # A nice, modern look
    ) as app:

        # ----------------------------------------------------------------------
        # PART A: HEADER SECTION
        # ----------------------------------------------------------------------
        # gr.Markdown() lets us add formatted text using Markdown syntax
        # Markdown is a simple way to format text (like in README files)

        gr.Markdown(APP_TITLE)
        gr.Markdown(APP_DESCRIPTION)

        # Add a horizontal line for visual separation
        gr.Markdown("---")

        # ----------------------------------------------------------------------
        # PART B: IMAGE INPUT AND RESULTS SECTION
        # ----------------------------------------------------------------------
        # gr.Row() puts components side by side (horizontal layout)
        # gr.Column() stacks components on top of each other (vertical layout)

        with gr.Row():

            # Left column: Image input
            with gr.Column(scale=1):
                gr.Markdown("### 📸 Image Input")

                # Create tabs for different input methods
                # This makes it clearer for users how to provide an image
                with gr.Tabs():

                    # Tab 1: Upload an image file
                    with gr.TabItem("📁 Upload"):
                        upload_input = gr.Image(
                            label="Click to upload or drag an image here",
                            sources=["upload"],
                            type="pil",
                            height=250
                        )

                    # Tab 2: Use webcam (captures on click)
                    with gr.TabItem("📷 Webcam"):
                        webcam_input = gr.Image(
                            label="Click the 📷 button below the preview to capture",
                            sources=["webcam"],
                            type="pil",
                            height=250,
                            mirror_webcam=True
                        )

                # Status indicator - shows when image is ready
                status_display = gr.Markdown("👆 *Choose a tab above and provide an image*")

                # The submit button
                submit_button = gr.Button(
                    value="🔍 Analyze Image",
                    variant="primary",
                    size="lg"
                )

            # Right column: Results
            with gr.Column(scale=1):
                gr.Markdown("### 📊 Results")

                # gr.Markdown() can also display dynamic content
                # We'll update this when the user clicks the button
                results_display = gr.Markdown(
                    value="*Upload or capture an image, then click 'Analyze Image' to see results.*"
                )

        # ----------------------------------------------------------------------
        # PART C: CONNECTING COMPONENTS (EVENT HANDLING)
        # ----------------------------------------------------------------------
        # Now we connect the inputs to our functions.
        # We have TWO input sources (upload and webcam) that both need to work.

        # State variable to store the current image (from either source)
        # gr.State() is a special Gradio component that stores data between interactions
        current_image = gr.State(value=None)

        def on_upload(image):
            """Called when user uploads an image."""
            if image is not None:
                return image, "✅ **Image uploaded!** Click 'Analyze Image' to continue."
            return None, "👆 *Choose a tab above and provide an image*"

        def on_webcam_capture(image):
            """Called when user captures from webcam."""
            if image is not None:
                return image, "✅ **Photo captured!** Click 'Analyze Image' to continue."
            return None, "👆 *Choose a tab above and provide an image*"

        def on_submit(stored_image):
            """
            This function runs when the user clicks the submit button.

            It's called an "event handler" because it handles the click event.

            Parameters:
            -----------
            stored_image : PIL.Image
                The image stored from upload or webcam capture.

            Returns:
            --------
            str
                Formatted results to display.
            """
            # Check if we have an image
            if stored_image is None:
                return "⚠️ **No image detected!**\n\n**To fix this:**\n\n📁 **Upload Tab:** Click the upload area and select an image file\n\n📷 **Webcam Tab:** Click the camera button (📷) to capture a photo\n\nThen click 'Analyze Image' again."

            # Step 1: Analyze the image
            results, elapsed_time = analyze_image(stored_image)

            # Step 2: Format the results nicely
            formatted = format_results(results, elapsed_time)

            # Step 3: Return the formatted text (Gradio displays it)
            return formatted

        # Connect upload input - when image changes, store it
        upload_input.change(
            fn=on_upload,
            inputs=[upload_input],
            outputs=[current_image, status_display]
        )

        # Connect webcam input - when image is captured, store it
        webcam_input.change(
            fn=on_webcam_capture,
            inputs=[webcam_input],
            outputs=[current_image, status_display]
        )

        # Connect the button click to analyze the stored image
        submit_button.click(
            fn=on_submit,
            inputs=[current_image],
            outputs=[results_display]
        )

        # ----------------------------------------------------------------------
        # PART D: FOOTER
        # ----------------------------------------------------------------------
        gr.Markdown("---")
        gr.Markdown(
            "*Made for CISC 121 at Queen's University* 🎓"
        )

    # Return the completed app
    return app


# ==============================================================================
# SECTION 5: RUNNING THE APP
# ==============================================================================
# This is where we actually start the application.
#
# What does if __name__ == "__main__" mean?
# This checks if we're running this file directly (not importing it).
# If we run:    python app.py                  → this code runs
# If we import: from app import create_app     → this code doesn't run
#
# Why is this useful?
# It lets us use the same file in two ways:
#   1. As a standalone app (run it directly)
#   2. As a module (import functions into other files)
# ==============================================================================
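
# For example, another Python file could reuse the interface like this (an
# illustrative sketch of the "import" route - not something this file does itself):
#   from app import create_app
#   demo = create_app()
#   demo.launch(share=False)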

if __name__ == "__main__":
    # Print a welcome message to the terminal
    print("=" * 60)
    print("🎓 CISC 121 - Gesture Recognition App")
    print("=" * 60)
    print("Starting the application...")
    print("Once ready, open the URL shown below in your browser.")
    print("=" * 60)

    # Create the app
    app = create_app()

    # Launch the app
    # share=True creates a public URL anyone can access
    # This is useful for sharing with classmates or instructors
    app.launch(share=True)
requirements.txt ADDED
@@ -0,0 +1,55 @@
# ==============================================================================
# CISC 121 - Gesture Recognition App
# Required Python Packages
# ==============================================================================
#
# HOW TO INSTALL:
# Open your terminal and run:
#   pip install -r requirements.txt
#
# WHAT IS THIS FILE?
# This file lists all the external libraries (packages) our project needs.
# Python's package manager (pip) reads this file and installs everything.
#
# ==============================================================================
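#
# A NOTE ON VERSION NUMBERS:
# A line like "gradio>=4.0.0" means "install gradio version 4.0.0 or newer".
# Pinning one exact version would look like this instead (the exact number here
# is only an example):
#   gradio==4.44.0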

# ------------------------------------------------------------------------------
# CORE PACKAGES
# ------------------------------------------------------------------------------

# Gradio - Creates web interfaces for Python apps
# Website: https://gradio.app
# We use it to build the camera input and results display
gradio>=4.0.0

# Transformers - Hugging Face's AI/ML library
# Website: https://huggingface.co/transformers
# We use it to access pre-trained image classification models
transformers>=4.30.0

# ------------------------------------------------------------------------------
# IMAGE PROCESSING
# ------------------------------------------------------------------------------

# Pillow - Python Imaging Library
# Used for loading and processing images
Pillow>=9.0.0

# ------------------------------------------------------------------------------
# MACHINE LEARNING BACKEND
# ------------------------------------------------------------------------------

# PyTorch - Deep learning framework
# Required by transformers to run the AI models
# Note: This is a large package (~2GB) - installation may take a few minutes
torch>=2.0.0

# ------------------------------------------------------------------------------
# OPTIONAL (for development)
# ------------------------------------------------------------------------------

# Uncomment these if you want additional development tools:

# pytest   # For running tests
# black    # For code formatting
# flake8   # For code style checking