Asmit committed on
Commit 9012453 · 0 Parent(s):

Initial commit

.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,62 @@
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+ env.bak/
+ venv.bak/
+
+ .DS_Store
+ .DS_Store?
+ ._*
+ .Spotlight-V100
+ .Trashes
+ ehthumbs.db
+ Thumbs.db
+
+ *.log
+ logs/
+ temp/
+ tmp/
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+
+ # Model files
+ *.pt
+ *.pth
+ models/
+
+ # Credentials
+ key.json
+ *.json
+
+ # Gradio
+ gradio_cached_examples/
+ flagged/
+
+ run.localconfig
README.md ADDED
@@ -0,0 +1,64 @@
+ ---
+ title: DeceptivePatternDetector
+ emoji: 🐨
+ colorFrom: purple
+ colorTo: red
+ sdk: gradio
+ sdk_version: 5.46.0
+ app_file: app.py
+ pinned: false
+ license: cc-by-nc-4.0
+ ---
+
+ # 🔍 Deceptive Pattern Detector
+
+ An AI-powered tool that analyzes website screenshots to detect potentially deceptive design patterns (also known as "dark patterns").
+
+ ## 🚀 Features
+
+ - **Image Upload**: Upload screenshots of websites for analysis
+ - **OCR Analysis**: Extracts text and UI elements from images
+ - **Element Detection**: Identifies buttons, checkboxes, and other interactive elements
+ - **AI Analysis**: Uses Google Gemini AI to classify potential deceptive patterns
+ - **Pattern Categories**: Detects various types, including:
+   - Confirm-shaming
+   - Urgency manipulation
+   - Scarcity tactics
+   - Misdirection
+   - Privacy violations
+   - And more...
+
+ ## 📋 How to Use
+
+ 1. **Upload Image**: Take a screenshot of a website and upload it
+ 2. **API Key**: Enter your Google Gemini API key ([Get one here](https://makersuite.google.com/app/apikey))
+ 3. **Analyze**: Click the analyze button and wait for results
+ 4. **Review**: Examine the detected patterns and explanations
+
+ ## 🔧 Requirements
+
+ - Google Gemini API key for AI analysis
+ - Google Cloud Vision API credentials (optional, for enhanced OCR)
+
+ ## 🛠️ Technical Details
+
+ This tool combines:
+ - **Computer Vision**: For UI element detection
+ - **OCR**: For text extraction using Google Cloud Vision
+ - **AI Analysis**: Using Google Gemini for pattern classification
+ - **Rule-based Fallbacks**: For basic detection when AI is unavailable
+
+ ## ⚠️ Disclaimer
+
+ This tool uses AI analysis and may miss some deceptive patterns or flag legitimate design elements. Treat the results as a supplementary guide, not a definitive assessment.
+
+ ## 🏗️ Architecture
+
+ - **Frontend**: Gradio interface
+ - **Backend**: Python with simplified computer vision
+ - **AI**: Google Gemini for pattern analysis
+ - **Deployment**: HuggingFace Spaces compatible
+
+ ## 📄 License
+
+ CC BY-NC 4.0 - See LICENSE file for details.
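The "Rule-based Fallbacks" mentioned in the README can be illustrated with a minimal, self-contained sketch. The keyword lists and category names below are hypothetical examples for illustration only, not the app's actual rules:

```python
# Hypothetical rule-based fallback: flag text matching simple keyword
# heuristics for a few deceptive-pattern categories. The keyword sets
# are illustrative, not taken from this repository.
URGENCY = {"hurry", "expires", "last chance", "act fast"}
SCARCITY = {"only", "left in stock", "limited", "almost gone"}
CONFIRMSHAMING = {"no thanks, i hate savings", "i don't want to save"}

def rule_based_flags(text: str) -> list:
    """Return the pattern categories whose keywords appear in the text."""
    t = text.lower()
    flags = []
    if any(k in t for k in URGENCY):
        flags.append("urgency")
    if any(k in t for k in SCARCITY):
        flags.append("scarcity")
    if any(k in t for k in CONFIRMSHAMING):
        flags.append("confirmshaming")
    return flags
```

A fallback like this is cheap and deterministic, but it only catches wording it already knows about, which is why the app treats it as a backup to the AI classifier.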
app.py ADDED
@@ -0,0 +1,1156 @@
+ import os
+ import shutil
+ import tempfile
+ import time
+ import uuid
+ from pathlib import Path
+
+ import gradio as gr
+ import pandas as pd
+ import pybboxes as pbx
+ from PIL import Image
+ from huggingface_hub import CommitScheduler
+
+ # Import our custom modules
+ from py_files import yolo
+ from py_files import dataset_upload
+ from py_files.ocr import get_text_from_image_doc
+
+ def take_screenshot_and_process(url, gemini_api_key):
+     """
+     Take a screenshot of the provided URL and process it for deceptive pattern detection.
+     Yields (dataframe, status_message, image_path, eval_dir_for_cleanup) tuples.
+     """
+     print(f"\n[CONSOLE] ===== STARTING ANALYSIS PROCESS =====")
+     print(f"[CONSOLE] URL: {url}")
+     print(f"[CONSOLE] Gemini API Key provided: {'Yes' if gemini_api_key else 'No'}")
+
+     if not url or not (url.startswith("http://") or url.startswith("https://")):
+         print(f"[CONSOLE] ERROR: Invalid URL format - {url}")
+         yield (None, "❌ Invalid URL format - please use http:// or https://", None, None)
+         raise gr.Error("Please enter a valid URL (starting with http:// or https://).")
+
+     if not gemini_api_key:
+         print(f"[CONSOLE] ERROR: No Gemini API key provided")
+         yield (None, "❌ No Gemini API key provided", None, None)
+         raise gr.Error("Please provide a Gemini API Key.")
+
+     # Set the Gemini API key in the environment
+     os.environ["GEMINI_API"] = gemini_api_key
+     print(f"[CONSOLE] Gemini API key set in environment")
+
+     # Create temporary directory for processing
+     eval_dir = tempfile.mkdtemp()
+     print(f"[CONSOLE] Created temporary directory: {eval_dir}")
+
+     try:
+         # Step 1: Taking screenshot
+         print(f"[CONSOLE] STEP 1/6: Taking screenshot of the website...")
+         yield (None, "Step 1/6: Taking screenshot of the website...", None, eval_dir)
+         screenshots_dir = os.path.join(eval_dir, "screenshots")
+         ocr_dir = os.path.join(eval_dir, "ocr")
+         yolo_dir = os.path.join(eval_dir, "yolo")
+         csv_yolo_dir = os.path.join(eval_dir, "csv_with_yolo")
+         gemini_fs_dir = os.path.join(eval_dir, "gemini_fs")
+
+         for d in [screenshots_dir, ocr_dir, yolo_dir, csv_yolo_dir, gemini_fs_dir]:
+             os.makedirs(d, exist_ok=True)
+             print(f"[CONSOLE] Created directory: {d}")
+
+         # Take screenshot using Selenium
+         image_path = os.path.join(screenshots_dir, "screenshot.png")
+         print(f"[CONSOLE] Taking screenshot, saving to: {image_path}")
+         image = take_website_screenshot(url, image_path)
+         print(f"[CONSOLE] Screenshot completed successfully")
+
+         # Display the original screenshot immediately
+         print(f"[CONSOLE] Displaying original screenshot")
+         yield (None, "📷 Screenshot captured! Starting analysis...", image_path, eval_dir)
+
+         # Step 2: Setup directories
+         print(f"[CONSOLE] STEP 2/6: Setting up processing directories...")
+         yield (None, "Step 2/6: Setting up processing directories...", image_path, eval_dir)
+
+         # Step 3: Run OCR
+         print(f"[CONSOLE] STEP 3/6: Running OCR analysis...")
+         yield (None, "Step 3/6: Running OCR analysis...", image_path, eval_dir)
+         csv_path = os.path.join(ocr_dir, "screenshot.csv")
+         print(f"[CONSOLE] Running OCR on image...")
+         ocr_result = get_text_from_image_doc(image)[0]
+         ocr_df = ocr_result.get_dataframe(image)
+         ocr_df.to_csv(csv_path, index=False)
+         print(f"[CONSOLE] OCR completed, saved to: {csv_path}")
+         print(f"[CONSOLE] OCR found {len(ocr_df)} text elements")
+
+         # Step 4: Run YOLO object detection
+         print(f"[CONSOLE] STEP 4/6: Running YOLO object detection...")
+         yield (None, "Step 4/6: Running YOLO object detection...", image_path, eval_dir)
+         yolo_result_path = os.path.join(yolo_dir, "screenshot.txt")
+
+         # Use real YOLO ensemble
+         print(f"[CONSOLE] Loading YOLO ensemble models...")
+         models = yolo.YoloEnsemble(weights=["models/vision/16.pt", "models/vision/15.pt", "models/vision/14.pt"])
+         print(f"[CONSOLE] Running YOLO prediction with confidence threshold 0.3...")
+         results = models.predict(image_path, conf=0.3, verbose=True)
+
+         if results[0].boxes is None:
+             print(f"[CONSOLE] YOLO: No objects detected")
+             with open(yolo_result_path, 'w') as f:
+                 f.write("")
+         else:
+             print(f"[CONSOLE] YOLO: {len(results[0].boxes)} objects detected")
+             results[0].save_txt(yolo_result_path)
+             print(f"[CONSOLE] YOLO results saved to: {yolo_result_path}")
+
+         # Step 5: Combine OCR and YOLO results
+         print(f"[CONSOLE] STEP 5/6: Combining OCR and element detection results...")
+         yield (None, "Step 5/6: Combining OCR and element detection results...", image_path, eval_dir)
+         combined_csv_path = os.path.join(csv_yolo_dir, "screenshot.csv")
+
+         # Combine results using original logic
+         print(f"[CONSOLE] Combining OCR and YOLO results...")
+         combined_df = combine_ocr_yolo_results_original(ocr_df, yolo_result_path, image)
+         combined_df.to_csv(combined_csv_path, index=False)
+         print(f"[CONSOLE] Combined results saved to: {combined_csv_path}")
+         print(f"[CONSOLE] Combined dataframe has {len(combined_df)} rows")
+
+         # Step 6: Analyze with Gemini
+         print(f"[CONSOLE] STEP 6/6: Analyzing for deceptive patterns with Gemini...")
+         yield (None, "Step 6/6: Analyzing for deceptive patterns with Gemini...", image_path, eval_dir)
+
+         # Use the generator version for real-time notifications
+         from py_files.gemini_analysis import few_shots_generator
+
+         # Enhanced progress reporting for Gemini analysis
+         yield (None, "🔧 Preparing data for Gemini analysis...", image_path, eval_dir)
+
+         # Save the combined results for few_shots processing
+         os.makedirs(gemini_fs_dir, exist_ok=True)
+         print(f"[CONSOLE] Running Gemini few_shots analysis...")
+         print(f"[CONSOLE] Input file: {combined_csv_path}")
+
+         yield (None, f"📊 Processing {len(combined_df)} UI elements for deceptive pattern analysis...", image_path, eval_dir)
+
+         # Use the generator version that yields real-time notifications
+         final_df = None
+         try:
+             for status, data in few_shots_generator(eval_dir=eval_dir, files=[combined_csv_path], api_key=gemini_api_key):
+                 if status == 'notification':
+                     # Yield the notification immediately to the UI
+                     yield None, data, image_path, eval_dir
+                 elif status == 'result':
+                     final_df = data
+                     break
+             print(f"[CONSOLE] Gemini analysis completed")
+         except gr.Error:
+             # Re-raise gr.Error exceptions as they should propagate to the UI
+             print(f"[CONSOLE] Gemini analysis raised gr.Error, propagating...")
+             raise
+         except Exception as gemini_error:
+             # Handle any other unexpected errors from Gemini analysis
+             print(f"[CONSOLE] Unexpected error in Gemini analysis: {str(gemini_error)}")
+             error_msg = f"❌ Gemini analysis failed: {str(gemini_error)}"
+             yield (None, error_msg, image_path, eval_dir)
+             # Don't raise here - let the function continue with final_df = None
+
+         if final_df is None:
+             print(f"[CONSOLE] Gemini analysis failed completely")
+             yield (None, "❌ Gemini analysis failed - please check your API key and try again", image_path, eval_dir)
+
+         if final_df is not None:
+             print(f"[CONSOLE] Final analysis result: {len(final_df)} rows detected")
+
+             deceptive_count = len(final_df[final_df['Deceptive Design Category'].str.lower() != 'non-deceptive']) if 'Deceptive Design Category' in final_df.columns else 0
+             total_count = len(final_df)
+             yield (None, f"📊 Analysis complete! Found {deceptive_count} deceptive patterns out of {total_count} UI elements", image_path, eval_dir)
+             yield (None, "🎨 Creating annotated screenshot with colored highlights...", image_path, eval_dir)
+
+             # Create annotated screenshot
+             annotated_path = create_annotated_screenshot(image_path, final_df, eval_dir)
+             print(f"[CONSOLE] Annotated screenshot created at: {annotated_path}")
+
+             # Yield the final results with annotated screenshot replacing the original
+             # annotated_path will always be valid now (either annotated or original as fallback)
+             status_message = "✅ Analysis complete! All elements annotated with colored bounding boxes."
+             if annotated_path == image_path:
+                 status_message = "✅ Analysis complete! (Note: Screenshot annotation failed, showing original)"
+
+             yield (final_df, status_message, annotated_path, eval_dir)
+         else:
+             print(f"[CONSOLE] WARNING: Final analysis result is None")
+             yield (None, "❌ Analysis failed - unable to process results", None, eval_dir)
+
+         print(f"[CONSOLE] ===== ANALYSIS PROCESS COMPLETED =====")
+
+     except Exception as e:
+         print(f"[CONSOLE] ERROR in take_screenshot_and_process: {str(e)}")
+         print(f"[CONSOLE] Exception type: {type(e).__name__}")
+
+         # Send notification to user about the error before yielding error state
+         error_msg = f"❌ Error occurred: {str(e)}"
+         yield (None, error_msg, None, eval_dir)
+
+         raise gr.Error(f"Error processing website: {str(e)}")
+
+
+ def cleanup_temp_directory(eval_dir):
+     """
+     Clean up temporary files after the image has been displayed to the frontend.
+     This should be called after the UI has had time to display the image.
+     """
+     if not eval_dir:
+         return
+
+     try:
+         print(f"[CONSOLE] Cleaning up temporary directory: {eval_dir}")
+         if os.path.exists(eval_dir):
+             shutil.rmtree(eval_dir)
+             print(f"[CONSOLE] Cleanup completed successfully")
+         else:
+             print(f"[CONSOLE] Temp directory {eval_dir} does not exist or was already cleaned up")
+     except Exception as cleanup_error:
+         print(f"[CONSOLE] WARNING: Failed to cleanup temp directory: {cleanup_error}")
+         # Try to clean up individual files if directory removal fails
+         try:
+             if eval_dir and os.path.exists(eval_dir):
+                 for root, dirs, files in os.walk(eval_dir):
+                     for file in files:
+                         try:
+                             os.remove(os.path.join(root, file))
+                             print(f"[CONSOLE] Removed individual file: {file}")
+                         except Exception as file_error:
+                             print(f"[CONSOLE] Failed to remove file {file}: {file_error}")
+                 # Try to remove empty directories
+                 for root, dirs, files in os.walk(eval_dir, topdown=False):
+                     for dir in dirs:
+                         try:
+                             os.rmdir(os.path.join(root, dir))
+                         except Exception:
+                             pass
+                 # Try to remove the main directory
+                 os.rmdir(eval_dir)
+                 print(f"[CONSOLE] Manual cleanup completed")
+         except Exception as manual_cleanup_error:
+             print(f"[CONSOLE] ERROR: Complete cleanup failure: {manual_cleanup_error}")
+             print(f"[CONSOLE] Temp directory may not be fully cleaned: {eval_dir}")
+
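The manual `os.walk` fallback above can be expressed more compactly with `shutil.rmtree`'s per-entry `onerror` hook, which is invoked for each failing entry so one stubborn file does not abort the rest of the cleanup. This is a sketch of the idea, not a drop-in replacement for every failure mode:

```python
import os
import shutil
import tempfile

def cleanup_dir(path):
    """Remove a temp tree, collecting (rather than raising on) failures.

    rmtree calls the onerror hook once per entry it cannot remove, so the
    walk continues past locked or permission-protected files.
    """
    if not path or not os.path.exists(path):
        return True
    failures = []
    shutil.rmtree(path, onerror=lambda fn, p, exc: failures.append(p))
    return not failures

# Usage: a freshly created temp tree is removed completely.
d = tempfile.mkdtemp()
os.makedirs(os.path.join(d, "screenshots"))
open(os.path.join(d, "screenshots", "shot.png"), "w").close()
```

Note that `onerror` is soft-deprecated in favor of `onexc` from Python 3.12, but both remain available.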
+ def take_website_screenshot(url, output_path):
+     """
+     Take a screenshot of a website using Selenium WebDriver.
+     """
+     print(f"[CONSOLE] take_website_screenshot: Starting selenium screenshot capture for {url}")
+     print(f"[CONSOLE] Output path: {output_path}")
+
+     from selenium import webdriver
+     from selenium.webdriver.chrome.options import Options
+
+     try:
+         # Setup Chrome options for headless mode
+         print(f"[CONSOLE] Setting up Chrome WebDriver in headless mode...")
+         chrome_options = Options()
+         chrome_options.add_argument("--headless")
+         chrome_options.add_argument("--no-sandbox")
+         chrome_options.add_argument("--disable-dev-shm-usage")
+         # chrome_options.add_argument("--disable-gpu")
+         chrome_options.add_argument("--window-size=1280,1024")
+         # chrome_options.add_argument("--disable-extensions")
+         # chrome_options.add_argument("--disable-plugins")
+         # chrome_options.add_argument("--disable-images")  # Faster loading
+         # chrome_options.add_argument("--disable-javascript")  # Faster loading, optional
+
+         # CSS/JS used to force a light color scheme on the captured page
+         css_to_inject = ":root { color-scheme: only light; }"
+         javascript_code = """
+         var style = document.createElement('style');
+         style.type = 'text/css';
+         style.innerHTML = arguments[0];
+         document.head.appendChild(style);
+         """
+
+         # Create WebDriver instance
+         print(f"[CONSOLE] Creating Chrome WebDriver instance...")
+         driver = webdriver.Chrome(options=chrome_options)
+         driver.set_window_size(1280, 1024)
+
+         try:
+             # Set page load timeout
+             driver.set_page_load_timeout(30)
+
+             # Navigate to the URL
+             print(f"[CONSOLE] Navigating to URL: {url}")
+             driver.get(url)
+
+             # Inject the light-mode style after navigation so it applies to the
+             # loaded document (a script injected before driver.get() is discarded
+             # when the new page loads)
+             driver.execute_script(javascript_code, css_to_inject)
+
+             # Wait a bit for the page to render
+             print(f"[CONSOLE] Waiting for page to load... (5 secs)")
+             time.sleep(5)
+
+             # Take screenshot
+             print(f"[CONSOLE] Taking screenshot...")
+             driver.save_screenshot(output_path)
+             print(f"[CONSOLE] Screenshot saved to: {output_path}")
+
+             # Load and return the image
+             image = Image.open(output_path)
+             print(f"[CONSOLE] Screenshot completed successfully, image size: {image.size}")
+             return image
+
+         finally:
+             # Always close the driver
+             print(f"[CONSOLE] Closing WebDriver...")
+             driver.quit()
+
+     except Exception as e:
+         print(f"[CONSOLE] Exception in selenium screenshot: {str(e)}")
+         print(f"[CONSOLE] Exception type: {type(e).__name__}")
+         raise Exception(f"Screenshot failed: {str(e)}")
+
+
+ def combine_ocr_yolo_results_original(ocr_df, yolo_result_path, image):
+     """
+     Combine OCR results with YOLO detection results using the original logic.
+     """
+     W, H = image.size
+
+     # Load YOLO results
+     if not os.path.exists(yolo_result_path) or os.path.getsize(yolo_result_path) == 0:
+         # If no YOLO results, just add Element Type column and return
+         ocr_df['Element Type'] = 'text'
+         return ocr_df
+
+     # Read YOLO results
+     yolo_df = pd.read_csv(yolo_result_path, sep=" ", names=["class", "x1", "y1", "x2", "y2"])
+
+     # Convert YOLO format to pixel coordinates
+     for j in range(len(yolo_df)):
+         scaled = pbx.convert_bbox(
+             [yolo_df.iloc[j]['x1'], yolo_df.iloc[j]['y1'], yolo_df.iloc[j]['x2'], yolo_df.iloc[j]['y2']],
+             from_type="yolo", to_type="voc", image_size=(W, H)
+         )
+         yolo_df.iat[j, 1], yolo_df.iat[j, 2], yolo_df.iat[j, 3], yolo_df.iat[j, 4] = scaled
+
+     # Class mapping
+     cls_dict = {
+         0: "button", 1: "checked checkbox", 2: "unchecked checkbox",
+         3: "checked radio button", 4: "unchecked radio button",
+         5: "checked switch", 6: "unchecked switch"
+     }
+
+     # Ensure coordinate columns exist and are strings before processing
+     if 'Top Co-ordinates' not in ocr_df.columns or 'Bottom Co-ordinates' not in ocr_df.columns:
+         ocr_df['Element Type'] = 'text'
+         return ocr_df
+
+     # Create coordinates column for easier processing
+     ocr_df['Coordinates'] = (
+         ocr_df['Top Co-ordinates'].astype(str).str.replace('(', '', regex=False).str.replace(')', '', regex=False) + ', ' +
+         ocr_df['Bottom Co-ordinates'].astype(str).str.replace('(', '', regex=False).str.replace(')', '', regex=False)
+     )
+
+     ele_types = ["text"] * len(ocr_df)
+     bboxes = yolo_df[['x1', 'y1', 'x2', 'y2']].values.tolist()
+     clss = yolo_df['class'].tolist()
+     if not isinstance(clss, list):
+         clss = [clss]
+     coords = ocr_df['Coordinates'].tolist()
+
+     # Match YOLO detections with OCR text
+     for ele_cls, ele_rect in zip(clss, bboxes):
+         distance_dict = {}
+         for ci, coord in enumerate(coords):
+             try:
+                 rect_text = list(map(float, coord.split(',')))
+             except (ValueError, AttributeError):
+                 continue  # Skip if coordinate string is invalid
+
+             if ele_cls == 0:  # button
+                 if yolo.do_rectangles_overlap(ele_rect, rect_text):
+                     ele_types[ci] = cls_dict[ele_cls]
+                     break
+             elif ele_cls in [1, 2, 3, 4]:  # checkbox or radio
+                 e_y1, e_y2 = ele_rect[1], ele_rect[3]
+                 r_y1, r_y2 = rect_text[1], rect_text[3]
+                 text_mid_y = (r_y1 + r_y2) / 2
+                 if e_y1 < text_mid_y < e_y2 and rect_text[0] > ele_rect[0] and rect_text[0] - ele_rect[2] < 100:
+                     distance_dict[rect_text[0] - ele_rect[2]] = ci
+
+         if ele_cls > 0 and len(distance_dict) > 0:
+             ele_types[sorted(distance_dict.items(), key=lambda x: x[0])[0][1]] = cls_dict[ele_cls]
+
+     ocr_df['Element Type'] = ele_types
+     ocr_df = ocr_df.drop(columns=['Coordinates'])
+
+     # Reorder columns
+     cols = ocr_df.columns.tolist()
+     cols = cols[:1] + cols[-1:] + cols[1:-1]
+     ocr_df = ocr_df[cols]
+
+     return ocr_df
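For reference, the `pbx.convert_bbox(..., from_type="yolo", to_type="voc")` call used above maps a normalized YOLO box (center x, center y, width, height in [0, 1]) to VOC pixel corners (x1, y1, x2, y2). A dependency-free sketch of roughly that conversion (exact rounding behavior may differ from pybboxes):

```python
def yolo_to_voc(box, image_size):
    """Convert a normalized YOLO box (xc, yc, w, h) to VOC pixel corners
    (x1, y1, x2, y2). Illustrates the mapping pybboxes performs; the
    library's rounding details may differ slightly."""
    xc, yc, w, h = box
    W, H = image_size
    x1 = (xc - w / 2) * W   # left edge in pixels
    y1 = (yc - h / 2) * H   # top edge in pixels
    x2 = (xc + w / 2) * W   # right edge in pixels
    y2 = (yc + h / 2) * H   # bottom edge in pixels
    return (round(x1), round(y1), round(x2), round(y2))
```

For a 1280x1024 screenshot (the window size used by this app), a box centered on the page covering half its width and height comes out as `(320, 256, 960, 768)`.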
+
+ def create_result_display(df):
+     """
+     Create a display of the analysis results.
+     """
+     if df is None or df.empty:
+         return "No results to display."
+
+     # Count deceptive patterns
+     if 'Deceptive Design Category' in df.columns:
+         deceptive_count = len(df[df['Deceptive Design Category'].str.lower() != 'non-deceptive'])
+         total_count = len(df)
+
+         html_output = f"""
+         <div style="padding: 20px; border: 1px solid var(--border-color-primary); border-radius: 8px; background-color: var(--block-background-fill); color: var(--body-text-color);">
+             <h3 style="color: var(--body-text-color); margin-top: 0;">Analysis Results</h3>
+             <p style="color: var(--body-text-color);"><strong>Total elements analyzed:</strong> {total_count}</p>
+             <p style="color: var(--body-text-color);"><strong>Potentially deceptive elements:</strong> {deceptive_count}</p>
+             <p style="color: var(--body-text-color);"><strong>Non-deceptive elements:</strong> {total_count - deceptive_count}</p>
+         </div>
+         """
+
+         return html_output
+     else:
+         return "Analysis completed, but results format is unexpected."
+
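The deceptive/non-deceptive tally behind that HTML summary boils down to one case-insensitive comparison against the `'non-deceptive'` label. A small sketch of just that counting rule, assuming pandas is available (the sample DataFrame below is illustrative):

```python
import pandas as pd

def summarize(df):
    """Return (deceptive_count, total_count) using the display function's
    rule: any row whose category is not 'non-deceptive' (case-insensitive,
    whitespace-trimmed) counts as deceptive."""
    if df is None or df.empty or "Deceptive Design Category" not in df.columns:
        return (0, 0)
    cats = df["Deceptive Design Category"].astype(str).str.lower().str.strip()
    return (int((cats != "non-deceptive").sum()), len(df))

# Illustrative input: one benign element, two flagged ones.
df = pd.DataFrame({"Deceptive Design Category": ["Non-Deceptive", "urgency", "sneaking"]})
```

Normalizing case before comparing matters here because the model's labels may arrive with inconsistent capitalization.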
422
+
423
+ def create_annotated_screenshot(image_path, df, eval_dir=None):
424
+ """
425
+ Create an annotated screenshot with bounding boxes for deceptive patterns.
426
+ """
427
+ from PIL import Image, ImageDraw, ImageFont
428
+ import tempfile
429
+
430
+ print(f"[CONSOLE] Creating annotated screenshot from: {image_path}")
431
+
432
+ try:
433
+ # Load the original image
434
+ image = Image.open(image_path)
435
+ annotated_image = image.copy()
436
+ draw = ImageDraw.Draw(annotated_image)
437
+
438
+ # Define colors for different deceptive pattern categories
439
+ color_map = {
440
+ 'forced-action': '#FF0000', # Red
441
+ 'interface-interference': '#FF8C00', # Dark Orange
442
+ 'obstruction': '#800080', # Purple
443
+ 'sneaking': '#FF1493', # Deep Pink
444
+ 'confirmshaming': '#FF4500', # Orange Red
445
+ 'nudge': '#32CD32', # Lime Green
446
+ 'fake-scarcity-fake-urgency': '#FFD700', # Gold
447
+ 'hard-to-cancel': '#DC143C', # Crimson
448
+ 'pre-selection': '#8A2BE2', # Blue Violet
449
+ 'visual-interference': '#FF6347', # Tomato
450
+ 'jargon': '#4169E1', # Royal Blue
451
+ 'hidden-subscription': '#B22222', # Fire Brick
452
+ 'hidden-costs': '#CD5C5C', # Indian Red
453
+ 'disguised-ads': '#FF69B4', # Hot Pink
454
+ 'trick-wording': '#FF7F50', # Coral
455
+ 'non-deceptive': '#90EE90' # Light Green (for non-deceptive elements)
456
+ }
457
+
458
+ # Default color for unknown categories
459
+ default_color = '#FFFF00' # Yellow
460
+
461
+ # Try to load a bigger font (at least 2x size)
462
+ try:
463
+ font = ImageFont.truetype("arial.ttf", 18)
464
+ except:
465
+ try:
466
+ font = ImageFont.load_default().font_variant(size=18)
467
+ except:
468
+ font = ImageFont.load_default()
469
+
470
+ deceptive_count = 0
471
+ non_deceptive_count = 0
472
+
473
+ # Track used text positions to avoid overlaps
474
+ used_text_regions = []
475
+
476
+ # Draw bounding boxes for each element
477
+ for idx, row in df.iterrows():
478
+ if 'Deceptive Design Category' not in df.columns:
479
+ continue
480
+
481
+ category = str(row.get('Deceptive Design Category', '')).lower().strip()
482
+ subtype = str(row.get('Deceptive Design Subtype', '')).lower().strip()
483
+
484
+ # Count deceptive vs non-deceptive elements
485
+ if category == 'non-deceptive' or category == 'not-applicable':
486
+ non_deceptive_count += 1
487
+ else:
488
+ deceptive_count += 1
489
+
490
+ # Get bounding box coordinates
491
+ x1, y1, x2, y2 = None, None, None, None
492
+
493
+ # Method 1: Try to extract from 'Top Co-ordinates' and 'Bottom Co-ordinates' columns
494
+ try:
495
+ top_coords = row.get('Top Co-ordinates')
496
+ bottom_coords = row.get('Bottom Co-ordinates')
497
+
498
+ if top_coords is not None and bottom_coords is not None:
499
+ # Parse tuple strings like "(10, 20)" or tuple objects
500
+ if isinstance(top_coords, str):
501
+ top_coords = top_coords.strip('()')
502
+ x1, y1 = map(float, top_coords.split(','))
503
+ elif isinstance(top_coords, (tuple, list)):
504
+ x1, y1 = float(top_coords[0]), float(top_coords[1])
505
+
506
+ if isinstance(bottom_coords, str):
507
+ bottom_coords = bottom_coords.strip('()')
508
+ x2, y2 = map(float, bottom_coords.split(','))
509
+ elif isinstance(bottom_coords, (tuple, list)):
510
+ x2, y2 = float(bottom_coords[0]), float(bottom_coords[1])
511
+
512
+ except (ValueError, TypeError, AttributeError):
513
+ # Method 2: Try direct coordinate columns (x1, y1, x2, y2)
514
+ try:
515
+ x1 = float(row.get('x1', 0))
516
+ y1 = float(row.get('y1', 0))
517
+ x2 = float(row.get('x2', 0))
518
+ y2 = float(row.get('y2', 0))
519
+ except (ValueError, TypeError):
520
+ # Method 3: Try alternative coordinate column names (X1, Y1, X2, Y2)
521
+ try:
522
+ x1 = float(row.get('X1', 0))
523
+ y1 = float(row.get('Y1', 0))
524
+ x2 = float(row.get('X2', 0))
525
+ y2 = float(row.get('Y2', 0))
526
+ except (ValueError, TypeError):
527
+ print(f"[CONSOLE] Warning: Could not extract coordinates for row {idx}")
528
+ continue
529
+
530
+ # Validate that all coordinates were successfully extracted
531
+ if any(coord is None for coord in [x1, y1, x2, y2]):
532
+ print(f"[CONSOLE] Warning: Missing coordinates for row {idx}")
533
+ continue
534
+
535
+ # Ensure coordinates are within image bounds
536
+ x1 = max(0, min(x1, image.width))
537
+ x2 = max(0, min(x2, image.width))
538
+ y1 = max(0, min(y1, image.height))
539
+ y2 = max(0, min(y2, image.height))
540
+
541
+ # Ensure x1 <= x2 and y1 <= y2 (swap if necessary)
542
+ if x1 > x2:
543
+ x1, x2 = x2, x1
544
+ if y1 > y2:
545
+ y1, y2 = y2, y1
546
+
547
+ # Skip if box is too small or invalid
548
+ if (x2 - x1) < 5 or (y2 - y1) < 5:
549
+ continue
550
+
551
+ # Choose color based on category or subtype
552
+ color = color_map.get(category, color_map.get(subtype, default_color))
553
+
554
+ # Draw bounding box
555
+ draw.rectangle([x1, y1, x2, y2], outline=color, width=2)
556
+
557
+ # Draw label
558
+ text = f"{category}"
559
+ if subtype and subtype != 'not-applicable' and subtype != 'n/a':
560
+ text = f"{category}: {subtype}"
561
+
562
+ # Get text dimensions
563
+ text_bbox = draw.textbbox((0, 0), text, font=font)
564
+ text_width = text_bbox[2] - text_bbox[0]
565
+ text_height = text_bbox[3] - text_bbox[1]
566
+
567
+ # Function to check if a rectangle overlaps with any used regions
568
+ def check_overlap(x, y, w, h, used_regions):
569
+ new_rect = (x, y, x + w, y + h)
570
+ for used_rect in used_regions:
571
+ if not (new_rect[2] < used_rect[0] or new_rect[0] > used_rect[2] or
572
+ new_rect[3] < used_rect[1] or new_rect[1] > used_rect[3]):
573
+ return True
574
+ return False
575
+
576
+ # Try different positions for the text to avoid overlaps
+ text_x = x1
+ text_y = None
+ padding = 4
+
+ # Position 1: Above the bounding box
+ candidate_y = y1 - text_height - padding
+ if candidate_y >= 0: # Within image bounds
+ # Adjust x position to stay within image bounds
+ if text_x + text_width > image.width:
+ text_x = image.width - text_width
+ if text_x < 0:
+ text_x = 0
+
+ # Check for overlaps
+ if not check_overlap(text_x, candidate_y, text_width, text_height, used_text_regions):
+ text_y = candidate_y
+
+ # Position 2: Below the bounding box (if above didn't work)
+ if text_y is None:
+ candidate_y = y2 + padding
+ if candidate_y + text_height <= image.height: # Within image bounds
+ # Adjust x position to stay within image bounds
+ text_x = x1
+ if text_x + text_width > image.width:
+ text_x = image.width - text_width
+ if text_x < 0:
+ text_x = 0
+
+ # Check for overlaps
+ if not check_overlap(text_x, candidate_y, text_width, text_height, used_text_regions):
+ text_y = candidate_y
+
+ # Position 3: To the right of the bounding box
+ if text_y is None:
+ candidate_x = x2 + padding
+ if candidate_x + text_width <= image.width: # Within image bounds
+ candidate_y = y1
+ if candidate_y + text_height > image.height:
+ candidate_y = image.height - text_height
+ if candidate_y < 0:
+ candidate_y = 0
+
+ # Check for overlaps
+ if not check_overlap(candidate_x, candidate_y, text_width, text_height, used_text_regions):
+ text_x = candidate_x
+ text_y = candidate_y
+
+ # Position 4: To the left of the bounding box
+ if text_y is None:
+ candidate_x = x1 - text_width - padding
+ if candidate_x >= 0: # Within image bounds
+ candidate_y = y1
+ if candidate_y + text_height > image.height:
+ candidate_y = image.height - text_height
+ if candidate_y < 0:
+ candidate_y = 0
+
+ # Check for overlaps
+ if not check_overlap(candidate_x, candidate_y, text_width, text_height, used_text_regions):
+ text_x = candidate_x
+ text_y = candidate_y
+
639
+ # Position 5: Find any available space (fallback)
+ if text_y is None:
+ # Try to find space by scanning the image in a grid pattern
+ step_size = 20
+ found = False
+ for scan_y in range(0, image.height - text_height, step_size):
+ if found:
+ break
+ for scan_x in range(0, image.width - text_width, step_size):
+ if not check_overlap(scan_x, scan_y, text_width, text_height, used_text_regions):
+ text_x = scan_x
+ text_y = scan_y
+ found = True
+ break
+
+ # Last resort: place at top-left corner (may overlap)
+ if text_y is None:
+ text_x = 0
+ text_y = 0
+
+ # Draw text background rectangle
+ draw.rectangle([text_x, text_y, text_x + text_width, text_y + text_height],
+ fill=color, outline=color)
+
+ # Draw text
+ draw.text((text_x, text_y), text, fill='white', font=font)
+
+ # Add this text region to used regions to prevent future overlaps
+ used_text_regions.append((text_x, text_y, text_x + text_width, text_y + text_height))
+
+ print(f"[CONSOLE] Annotated screenshot created with {deceptive_count} deceptive patterns and {non_deceptive_count} non-deceptive elements highlighted")
+
+ # Save annotated image to temporary file
+ if eval_dir:
+ # Create in the managed temp directory that will be cleaned up
+ temp_filename = os.path.join(eval_dir, "annotated_screenshot.png")
+ annotated_image.save(temp_filename)
+ return temp_filename
+ else:
+ # Fallback to system temp directory
+ temp_file = tempfile.NamedTemporaryFile(suffix='.png', delete=False)
+ annotated_image.save(temp_file.name)
+ return temp_file.name
+
+ except Exception as e:
+ print(f"[CONSOLE] Error creating annotated screenshot: {e}")
+ print(f"[CONSOLE] Falling back to original image: {image_path}")
+ # Return the original image path as fallback
+ return image_path
+
+
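The greedy label-placement loop above rests on one primitive: the axis-aligned rectangle overlap test in `check_overlap`. A standalone sketch of that test (same names as the diff, with the surrounding PIL drawing state stripped away); note that because the comparisons are strict, rectangles that merely touch along an edge also count as overlapping:

```python
def check_overlap(x, y, w, h, used_regions):
    """Return True if the rectangle (x, y, x+w, y+h) overlaps any rectangle
    in used_regions, where each entry is an (x1, y1, x2, y2) tuple."""
    new_rect = (x, y, x + w, y + h)
    for used_rect in used_regions:
        # Two axis-aligned rectangles overlap unless one lies entirely
        # to the left of, right of, above, or below the other.
        if not (new_rect[2] < used_rect[0] or new_rect[0] > used_rect[2] or
                new_rect[3] < used_rect[1] or new_rect[1] > used_rect[3]):
            return True
    return False

print(check_overlap(0, 0, 10, 10, [(5, 5, 15, 15)]))  # True  (overlapping)
print(check_overlap(0, 0, 4, 4, [(10, 10, 20, 20)]))  # False (disjoint)
```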
690
+ # Create the Gradio interface
+ def create_interface():
+ global scheduler, dataset_dir, jsonl_path
+ with gr.Blocks(title="Deceptive Pattern Detector", theme=gr.themes.Soft()) as demo:
+ gr.HTML("""
+ <div style="text-align: center; margin-bottom: 30px;">
+ <h1>🔍 Deceptive Pattern Detector</h1>
+ <p style="font-size: 18px; color: #666;">
+ Enter a website URL to analyze for deceptive design patterns
+ </p>
+ <div style="margin-top: 12px;">
+ <a href="https://arxiv.org/abs/2411.07441" target="_blank" rel="noopener noreferrer" aria-label="Read our arXiv paper"
+ style="display: inline-block; padding: 10px 14px; border-radius: 999px; background: #2563eb; color: white; text-decoration: none; font-weight: 600; box-shadow: 0 6px 16px rgba(37, 99, 235, 0.35);">
+ 📄 Read our paper on arXiv
+ </a>
+ </div>
+ </div>
+ """)
+
+ # How to Use section - collapsible accordion with tab format
+ with gr.Tabs():
+ with gr.TabItem("🔒 Privacy Policy"):
+ gr.HTML("""
+ <div style="padding: 20px; background-color: var(--block-background-fill); border-radius: 8px; border-left: 4px solid #28a745; border: 1px solid var(--border-color-primary);">
+ <div style="display: flex; gap: 20px; flex-wrap: wrap; align-items: stretch;">
+ <!-- Left Column: Privacy Highlights -->
+ <div style="flex: 1; min-width: 300px; display: flex; flex-direction: column;">
+ <div style="margin-bottom: 16px; padding: 16px; background-color: var(--block-background-fill); border-radius: 6px; border: 1px solid #10b981; opacity: 0.9;">
+ <div style="margin-bottom: 12px;">
+ <strong style="color: #10b981;">🔐 API Keys:</strong> <span style="color: var(--body-text-color); font-size: 14px; line-height: 1.6;">We <strong style="color: #dc2626;">NEVER</strong> save or store your Gemini API keys. They are only used temporarily in memory during your analysis session and are immediately discarded.</span>
+ </div>
+
+ <div style="margin-bottom: 0;">
+ <strong style="color: #8b5cf6;">🚫 No PII Storage:</strong> <span style="color: var(--body-text-color); font-size: 14px; line-height: 1.6;">We do not store any Personally Identifiable Information (PII), including API keys, user identifiers, or sensitive data from analyzed websites.</span>
+ </div>
+ </div>
+ </div>
+
+ <!-- Right Column: Data Usage -->
+ <div style="flex: 1; min-width: 300px; display: flex; flex-direction: column; gap: 12px;">
+ <div style="padding: 16px; background-color: var(--block-background-fill); border-radius: 6px; border: 1px solid #f59e0b; opacity: 0.9; flex-grow: 1;">
+ <strong style="color: #f59e0b;">🌐 Website URLs & Classifications:</strong>
+ <ul style="margin: 8px 0; padding-left: 20px; color: var(--body-text-color); font-size: 14px; line-height: 1.6;">
+ <li style="margin-bottom: 6px;">We <strong style="color: #f59e0b;">may</strong> save the websites you analyze (URLs only) and their corresponding deceptive pattern classifications</li>
+ <li style="margin-bottom: 6px;">This data helps us improve our detection system and fine-tune our framework</li>
+ <li style="margin-bottom: 6px;">No personal information is linked to this data</li>
+ <li>This data is used solely for research and system improvement purposes</li>
+ </ul>
+ </div>
+ </div>
+ </div>
+
+ <div style="margin-top: 20px; padding: 15px; background-color: var(--block-background-fill); border-radius: 6px; border: 1px solid #10b981; text-align: center;">
+ <strong style="color: #10b981;">✅ Summary:</strong> <span style="color: var(--body-text-color); font-size: 14px;">Your API keys are never stored. Anonymized URL and classification data may be retained for system improvement.</span>
+ </div>
+ </div>
+ """)
747
+
+ with gr.TabItem("ℹ️ How to Use"):
+ gr.HTML("""
+ <div style="padding: 20px; background-color: var(--block-background-fill); border-radius: 8px; border-left: 4px solid #2196F3; border: 1px solid var(--border-color-primary);">
+ <div style="display: flex; gap: 20px; flex-wrap: wrap; align-items: stretch;">
+ <!-- Left Column: Steps -->
+ <div style="flex: 1; min-width: 300px; display: flex; flex-direction: column;">
+ <ol style="margin: 0; color: var(--body-text-color); line-height: 1.6; flex-grow: 1;">
+ <li style="margin-bottom: 12px;"><strong>Enter URL:</strong> Provide the website URL you want to analyze (must start with http:// or https://)</li>
+ <li style="margin-bottom: 12px;"><strong>API Key:</strong> Enter your Google Gemini API key (get a free one at <a href="https://makersuite.google.com/app/apikey" target="_blank" style="color: #2196F3;">Google AI Studio</a>). We may make 1-2 Gemini-2.5-Pro API calls per analysis.</li>
+ <li style="margin-bottom: 12px;"><strong>Analyze:</strong> Click the analyze button and watch as the screenshot appears and the analysis runs</li>
+ <li style="margin-bottom: 12px;"><strong>Review:</strong> The annotated screenshot shows all detected elements with colored bounding boxes (light green for non-deceptive elements, various colors for deceptive patterns). Rerun the analysis if the detailed results and the annotations do not match.</li>
+ <li style="margin-bottom: 12px;"><strong>Note:</strong> End-to-end analysis time may range from under 5 seconds to 5 minutes depending on factors such as cloud infrastructure, demand, and the amount of text on the page</li>
+ </ol>
+ </div>
+
+ <!-- Right Column: Disclaimer and Technical Info -->
+ <div style="flex: 1; min-width: 300px; display: flex; flex-direction: column; gap: 12px;">
+ <div style="padding: 12px; background-color: var(--block-background-fill); border-radius: 4px; border: 1px solid var(--border-color-accent); opacity: 0.9; flex-grow: 1;">
+ <strong style="color: #ff9800;">⚠️ Disclaimer:</strong> <span style="color: var(--body-text-color); font-size: 14px; line-height: 1.6;">This tool uses AI analysis and may miss some deceptive patterns or flag legitimate design elements. Use it as a supplementary guide only.</span>
+ </div>
+
+ <div style="padding: 12px; background-color: var(--block-background-fill); border-radius: 4px; border: 1px solid var(--border-color-accent); opacity: 0.9; flex-grow: 1;">
+ <strong style="color: var(--body-text-color);">📷 Screenshot Method:</strong>
+ <ul style="margin: 8px 0; padding-left: 20px; color: var(--body-text-color); font-size: 14px; line-height: 1.6;">
+ <li style="margin-bottom: 6px;"><strong>Selenium WebDriver:</strong> Automatic screenshots using Chrome in headless mode (~1280x1080)</li>
+ <li><span style="color: #dc2626; font-weight: bold;">Static capture of the front page only (no scrolling), with a 5 second wait after initial page load</span></li>
+ </ul>
+ </div>
+ </div>
+ </div>
+ </div>
+ """)
780
+
+
+ with gr.Row():
+ with gr.Column(scale=2):
+ # Input section
+ gr.Markdown("### 🌐 Website Analysis")
+
+ # with gr.Tabs():
+ # with gr.TabItem("📱 URL Analysis"):
+ url_input = gr.Textbox(
+ type="text",
+ label="Website URL (Required)",
+ placeholder="https://example.com"
+ )
+
+
+
+ gemini_api_key = gr.Textbox(
+ type="password",
+ label="Gemini API Key (Required)",
+ placeholder="Enter your Google Gemini API key...",
+ info='Create your free API key by visiting <a href="https://makersuite.google.com/app/apikey" target="_blank" style="color: #2196F3;">Google AI Studio</a>'
+ )
+
+ # Expandable guide for getting API key
+ with gr.Accordion("❓ How to get a free Gemini API key (Step-by-step guide)", open=False):
+ gr.HTML("""
+ <div style="padding: 15px; background-color: var(--block-background-fill); border-radius: 8px; border-left: 4px solid #4CAF50;">
+ <h4 style="color: var(--body-text-color); margin-top: 0; margin-bottom: 15px;">🔑 Get Your Free Google Gemini API Key:</h4>
+
+ <div style="margin-bottom: 20px;">
+ <ol style="color: var(--body-text-color); line-height: 1.8; margin: 0; padding-left: 20px;">
+ <li style="margin-bottom: 10px;">
+ <strong>Visit Google AI Studio:</strong> Go to
+ <a href="https://makersuite.google.com/app/apikey" target="_blank" style="color: #2196F3; text-decoration: underline;">
+ https://makersuite.google.com/app/apikey
+ </a>
+ </li>
+ <li style="margin-bottom: 10px;">
+ <strong>Sign in:</strong> Use your Google account to sign in (create one if needed)
+ </li>
+ <li style="margin-bottom: 10px;">
+ <strong>Create API Key:</strong> Click the "Create API Key" button
+ </li>
+ <li style="margin-bottom: 10px;">
+ <strong>Select Project:</strong> Choose an existing Google Cloud project or create a new one
+ </li>
+ <li style="margin-bottom: 10px;">
+ <strong>Copy Key:</strong> Once generated, copy the API key to your clipboard
+ </li>
+ <li style="margin-bottom: 10px;">
+ <strong>Paste Here:</strong> Paste the API key into the field above and start analyzing!
+ </li>
+ </ol>
+ </div>
+
+ <div style="padding: 12px; background-color: var(--background-fill-secondary); border-radius: 6px; border: 1px solid var(--border-color-accent);">
+ <div style="display: flex; align-items: center; gap: 8px; margin-bottom: 8px;">
+ <span style="font-size: 16px;">💡</span>
+ <strong style="color: var(--body-text-color);">Pro Tips:</strong>
+ </div>
+ <ul style="color: var(--body-text-color); font-size: 14px; line-height: 1.6; margin: 0; padding-left: 20px;">
+ <li style="margin-bottom: 6px;">This tool typically uses 1-2 API calls per analysis</li>
+ <li>Your API key is never <strong>STORED</strong> by this application</li>
+ </ul>
+ </div>
+ </div>
+ """)
+
849
+ analyze_url_btn = gr.Button(
+ "🔍 Analyze Website URL",
+ variant="primary",
+ size="lg"
+ )
+
+ # Status moved to left column
+ gr.Markdown("### 📊 Analysis Status")
+ status_text = gr.Textbox(
+ label="Status",
+ value="Ready to analyze...",
+ lines=2,
+ interactive=False
+ )
+
+ # Results display moved to left column
+ results_display = gr.HTML(
+ value="<div style='text-align: center; padding: 40px; color: var(--body-text-color); opacity: 0.7;'>Enter a URL and click analyze to see results here.</div>"
+ )
+
+ results_dataframe = gr.Dataframe(
+ label="Detailed Results (Scroll right to see all columns)",
+ visible=False
+ )
+
874
+ with gr.Column(scale=3):
+ # Screenshot section - only screenshot in right column
+ gr.Markdown("### 📷 Website Screenshot")
+
+ # Placeholder container for screenshot
+ screenshot_placeholder = gr.HTML(
+ value="""
+ <div style="
+ border: 2px dashed var(--border-color-primary);
+ border-radius: 12px;
+ padding: 60px 20px;
+ text-align: center;
+ background-color: var(--block-background-fill);
+ background-image:
+ radial-gradient(circle at 20% 50%, var(--border-color-accent) 0%, transparent 50%),
+ radial-gradient(circle at 80% 50%, var(--border-color-accent) 0%, transparent 50%);
+ background-size: 100px 100px;
+ background-position: 0 0, 50px 50px;
+ min-height: 400px;
+ display: flex;
+ flex-direction: column;
+ justify-content: center;
+ align-items: center;
+ opacity: 0.8;
+ ">
+ <div style="
+ background-color: var(--block-background-fill);
+ padding: 20px 30px;
+ border-radius: 8px;
+ border: 1px solid var(--border-color-primary);
+ backdrop-filter: blur(10px);
+ ">
+ <h3 style="
+ color: var(--body-text-color);
+ margin: 0 0 10px 0;
+ font-size: 20px;
+ ">📷 Screenshot Preview Area</h3>
+ <p style="
+ color: var(--body-text-color);
+ margin: 0;
+ opacity: 0.8;
+ font-size: 16px;
+ line-height: 1.5;
+ ">
+ Website screenshots will appear here during analysis.<br>
+ <span style="font-size: 14px; opacity: 0.7;">
+ Original screenshot → Annotated with deceptive pattern highlights
+ </span>
+ </p>
+ </div>
+ </div>
+ """,
+ visible=True
+ )
+
+ screenshot_display = gr.Image(
+ label="Website Screenshot",
+ visible=False,
+ interactive=False
+ )
+
935
+ # Event handlers
+ def handle_url_analysis(url, api_key):
+ """Handle URL analysis with screenshot capture."""
+ print(f"[CONSOLE] handle_url_analysis called with URL: {url}")
+ print(f"[CONSOLE] API key provided: {'Yes' if api_key else 'No'}")
+
+ eval_dir_for_cleanup = None # Track eval_dir for cleanup
+
+ try:
+ print(f"[CONSOLE] Starting analysis generator for URL: {url}")
+
+ # Clear any previous error messages at the start of new analysis
+ yield (
+ "🚀 Starting new analysis...",
+ gr.update(visible=True), # Show placeholder initially
+ gr.update(visible=False), # Hide screenshot initially
+ "<div style='text-align: center; padding: 20px; color: var(--body-text-color); opacity: 0.7;'>Preparing analysis...</div>", # Clear previous errors
+ gr.update(visible=False) # Hide dataframe
+ )
+
+ analysis_generator = take_screenshot_and_process(url, api_key)
+
+ final_result = None
+ final_image = None
+ original_image = None # Track original screenshot separately for dataset upload
+ print(f"[CONSOLE] Processing generator results...")
+ for result_tuple in analysis_generator:
+
+ if len(result_tuple) == 4:
+ dataframe_result, status_update, image_path, eval_dir = result_tuple
+ eval_dir_for_cleanup = eval_dir # Store for cleanup
+
+ if dataframe_result is None:
+ # Progress update - show screenshot if available and clear any previous error messages
+ if image_path:
+ # Store the first image as original (before annotation)
+ if original_image is None:
+ original_image = image_path
+ final_image = image_path # Update the current image for display
+ yield (
+ status_update,
+ gr.update(visible=False), # Hide placeholder
+ gr.update(value=image_path, visible=True, label="📷 Original Screenshot"), # Show original screenshot
+ "<div style='text-align: center; padding: 20px; color: var(--body-text-color); opacity: 0.7;'>Analysis in progress...</div>", # Clear previous errors
+ gr.update(visible=False)
+ )
+ else:
+ yield (
+ status_update,
+ gr.update(visible=True), # Keep placeholder visible
+ gr.update(visible=False), # Hide screenshot
+ "<div style='text-align: center; padding: 20px; color: var(--body-text-color); opacity: 0.7;'>Analysis in progress...</div>", # Clear previous errors
+ gr.update(visible=False)
+ )
989
+ else:
+ print(f"[CONSOLE] Received final result with {len(dataframe_result)} rows")
+ final_result = dataframe_result
+ final_status = status_update
+ final_image = image_path # This will be the annotated image for display
+ # Store the first image as original if not already set
+ if original_image is None:
+ original_image = image_path
+ # Clear any previous errors when we get successful results
+ yield (
+ final_status,
+ gr.update(visible=False), # Hide placeholder
+ gr.update(value=final_image, visible=True, label="🎯 Annotated Screenshot (Analysis Complete)") if final_image else gr.update(visible=False),
+ "<div style='text-align: center; padding: 20px; color: var(--body-text-color); opacity: 0.7;'>Processing results...</div>", # Clear previous errors
+ gr.update(visible=False)
+ )
+ break
+
+ # Generator approach provides real-time notifications automatically
+
+ if final_result is not None:
+ print(f"[CONSOLE] Creating result display HTML")
+ results_html = create_result_display(final_result)
+ print(f"[CONSOLE] Yielding final results to UI")
+
+ save_url = url.lower().replace("http://", "").replace("https://", "").strip().replace("www.", "") \
+ .replace(".", "_x01x_") \
+ .replace("/", "_x02x_") \
+ .replace("-", "_x03x_") \
+ .replace("=", "_x04x_") \
+ .replace("?", "_x05x_") \
+ .replace("&", "_x06x_") \
+ .replace("%", "_x07x_") \
+ .replace(":", "_x08x_") \
+ .replace("#", "_x09x_") \
+ .replace("'", "_x10x_") \
+ .replace('"', "_x11x_") \
+ .replace("*", "_x12x_") \
+ .replace("<", "_x13x_") \
+ .replace(">", "_x14x_") \
+ .replace("|", "_x15x_")
+
1031
+ save_url = save_url + "__" + str(uuid.uuid4()).replace("-", "_")
+
+ save_dict = {
+ save_url: final_result
+ }
+
+ # Create DataFrame for image dataset with "id" and "image" columns
+ # Use original screenshot (not annotated) for dataset upload
+ dataset_image_path = original_image if original_image else final_image
+ annotated_image_path = final_image if final_image else original_image
+ print(f"[CONSOLE] Using image for dataset upload: {dataset_image_path} (original: {original_image}, final: {final_image})")
+ print(f"[CONSOLE] Using annotated image for display: {annotated_image_path} (original: {original_image}, final: {final_image})")
+
+ if dataset_image_path and os.path.exists(dataset_image_path) and annotated_image_path and os.path.exists(annotated_image_path):
+ try:
+ # Load the original image using PIL
+ pil_image = Image.open(dataset_image_path)
+ pil_final = Image.open(annotated_image_path)
+ # Convert to RGB if needed (removes alpha channel if present)
+ if pil_image.mode != 'RGB':
+ pil_image = pil_image.convert('RGB')
+
+ if pil_final.mode != 'RGB':
+ pil_final = pil_final.convert('RGB')
+
+ image_df = pd.DataFrame([{"id": save_url, "image": pil_image, "annotated_image": pil_final}])
+ print(f"[CONSOLE] Loaded original image for dataset: {dataset_image_path} -> PIL Image {pil_image.size}")
+ except Exception as e:
+ print(f"[CONSOLE] Error loading image {dataset_image_path}: {e}")
+ # Fallback to path if image loading fails
+ image_df = pd.DataFrame([{"id": save_url, "image": dataset_image_path, "annotated_image": annotated_image_path}])
+ else:
+ print(f"[CONSOLE] Warning: Image path not found or invalid: {dataset_image_path}")
+ image_df = pd.DataFrame([{"id": save_url, "image": None, "annotated_image": None}])
+
+ dataset_upload.update_dataset_with_new_splits(save_dict)
+ dataset_upload.update_dataset_with_new_images(image_df, scheduler=scheduler, dataset_dir=dataset_dir, jsonl_path=jsonl_path)
+
+ # Show final results with annotated screenshot
+ yield (
+ final_status,
+ gr.update(visible=False), # Hide placeholder
+ gr.update(value=final_image, visible=True, label="🎯 Annotated Screenshot (Analysis Complete)") if final_image else gr.update(visible=False),
+ results_html,
+ gr.update(value=final_result, visible=True)
+ )
+
+ # Clean up temporary files after successful display
+ # Add small delay to let frontend finish loading images before cleanup
+ time.sleep(5) # Give frontend time to load the images
+ cleanup_temp_directory(eval_dir_for_cleanup)
+
+ else:
+ print(f"[CONSOLE] No final result generated, analysis failed")
+ # Clean up temp files even on failure
+ cleanup_temp_directory(eval_dir_for_cleanup)
+ yield (
+ "❌ Analysis failed - no results generated",
+ gr.update(visible=True), # Show placeholder again
+ gr.update(visible=False, label="Website Screenshot"), # Hide screenshot and reset label
+ "<div style='color: #ef4444; text-align: center; background-color: var(--block-background-fill); padding: 15px; border-radius: 8px; border: 1px solid #ef4444; opacity: 0.9;'>Analysis failed. Please check your Gemini API key and try again.</div>",
+ gr.update(visible=False)
+ )
1094
+
+ except Exception as e:
+ print(f"[CONSOLE] Exception in handle_url_analysis: {str(e)}")
+ print(f"[CONSOLE] Exception type: {type(e).__name__}")
+ # Clean up temp files on exception
+ cleanup_temp_directory(eval_dir_for_cleanup)
+ error_msg = f"❌ Error: {str(e)}"
+ yield (
+ error_msg,
+ gr.update(visible=True), # Show placeholder again
+ gr.update(visible=False, label="Website Screenshot"), # Hide screenshot and reset label
+ f"<div style='color: #ef4444; text-align: center; background-color: var(--block-background-fill); padding: 15px; border-radius: 8px; border: 1px solid #ef4444; opacity: 0.9;'>{error_msg}</div>",
+ gr.update(visible=False)
+ )
+ if isinstance(e, gr.exceptions.Error):
+ raise e
+
+ # Connect the analyze button
+ print(f"[CONSOLE] Setting up button click handlers")
+ analyze_url_btn.click(
+ fn=handle_url_analysis,
+ inputs=[url_input, gemini_api_key],
+ outputs=[status_text, screenshot_placeholder, screenshot_display, results_display, results_dataframe],
+ show_progress="full"
+ )
+
+ return demo
1121
+
+ # Create unique directory for this session using temp directory
+ session_id = str(uuid.uuid4())[:8]
+ temp_base = Path(tempfile.gettempdir()) / "deceptive_pattern_images"
+ dataset_dir = temp_base / f"{session_id}"
+ dataset_dir.mkdir(parents=True, exist_ok=True)
+ jsonl_path = dataset_dir / "metadata.jsonl"
+
+ scheduler = CommitScheduler(
+ repo_id=os.environ["IMAGE_REPO_ID"],
+ repo_type="dataset",
+ folder_path=dataset_dir,
+ path_in_repo=dataset_dir.name,
+ token=os.environ["HF_TOKEN"],
+ every=1
+ )
+
+ # Create and launch the interface
+ if __name__ == "__main__":
+ # import torch
+ #
+ # print(f"Is CUDA available: {torch.cuda.is_available()}")
+ # print(f"CUDA device: {torch.cuda.get_device_name(torch.cuda.current_device())}")
+
+ from py_files.utils import decrypt_system_prompts
+
+ if not decrypt_system_prompts():
+ print(f"[CONSOLE] Failed to decrypt system prompts, exiting...")
+ exit(1)
+
+ print(f"[CONSOLE] ===== STARTING GRADIO APPLICATION =====")
+ print(f"[CONSOLE] Creating Gradio interface...")
+ demo = create_interface()
+ print(f"[CONSOLE] Interface created successfully")
+ print(f"[CONSOLE] Launching server on 0.0.0.0:7860...")
+ demo.queue().launch(server_name="0.0.0.0", server_port=7860)
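The `save_url` substitution chain above maps each filesystem-unsafe character to a distinct `_xNNx_` token, so different URLs cannot collide after encoding and the mapping can be walked back. A trimmed-down sketch of the idea (only the first three substitutions from the chain; `encode_url`/`decode_url` are illustrative names, not functions from this repo, and decoding assumes the raw URL contains no literal `_xNNx_` tokens):

```python
# Illustrative subset of the save_url substitution chain above.
SUBSTITUTIONS = [(".", "_x01x_"), ("/", "_x02x_"), ("-", "_x03x_")]

def encode_url(url: str) -> str:
    """Normalize a URL and replace unsafe characters with unique tokens."""
    s = url.lower().replace("http://", "").replace("https://", "").strip().replace("www.", "")
    for ch, token in SUBSTITUTIONS:
        s = s.replace(ch, token)
    return s

def decode_url(encoded: str) -> str:
    """Invert the token substitution (scheme/www prefix is not recoverable)."""
    for ch, token in SUBSTITUTIONS:
        encoded = encoded.replace(token, ch)
    return encoded

print(encode_url("https://www.example.com/sign-up"))
# example_x01x_com_x02x_sign_x03x_up
```

The app additionally appends a UUID suffix, so even two analyses of the same URL get distinct dataset ids.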
models/vision/14.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2c917fdc79430e67dc9dd236b3377998a0239f6c59db74d17be6fa9ac0b93648
+ size 33596919
models/vision/15.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ce60b06896e9d42039041c11e4aa394d26cf0bce09ac94238864d6e94a1b1029
+ size 33596919
models/vision/16.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:839c7f6337f3fab6828708e5f12b109d8b210078fe1d2e9b6238e1dd3d4564ca
+ size 33596919
packages.txt ADDED
@@ -0,0 +1 @@
+ chromium-driver
py_files/__init__.py ADDED
@@ -0,0 +1 @@
+ # Empty init file to make this a Python package
py_files/bounding_clustering.py ADDED
@@ -0,0 +1,220 @@
+ """
+ Bounding box clustering module for grouping nearby text elements.
+ Copied from the original with minimal modifications for HuggingFace Spaces.
+ """
+
+ from math import sqrt, dist
+ import extcolors
+ from py_files.pycolor import find
+
+
+ class Node:
+ def __init__(self, data=None, x1=None, y1=None, x2=None, y2=None,
+ font_size=None, children=None):
+ if children is None:
+ children = []
+ self.data = data
+ self.top_left_x = min(int(x1), int(x2)) if x1 is not None and x2 is not None else None
+ self.top_left_y = min(int(y1), int(y2)) if y1 is not None and y2 is not None else None
+ self.bottom_right_x = max(int(x1), int(x2)) if x1 is not None and x2 is not None else None
+ self.bottom_right_y = max(int(y1), int(y2)) if y1 is not None and y2 is not None else None
+
+ self.font_size = abs(self.bottom_right_y - self.top_left_y) if not font_size else font_size
+
+ self.children = children
+
+ def add_child(self, child):
+ child_top_left_x = child.top_left_x
+ child_top_left_y = child.top_left_y
+ child_bottom_right_x = child.bottom_right_x
+ child_bottom_right_y = child.bottom_right_y
+ self.top_left_x = min(self.top_left_x, child_top_left_x)
+ self.top_left_y = min(self.top_left_y, child_top_left_y)
+ self.bottom_right_x = max(self.bottom_right_x, child_bottom_right_x)
+ self.bottom_right_y = max(self.bottom_right_y, child_bottom_right_y)
+ self.data += " " + child.data
+ self.children.append(child)
+ self.font_size = (max(self.font_size, 0) + max(child.font_size, 0)) // 2
+
+ def get_data(self):
+ return self.data
+
+ def __repr__(self):
+ return f'Node({self.data}, size: {self.font_size}, {self.top_left_x}, {self.top_left_y}, {self.bottom_right_x}, {self.bottom_right_y}, {self.children})'
+
+
46
+ class ImageNode(Node):
+ def __init__(self, category=None, top_left_x=None, top_left_y=None, bottom_right_x=None, bottom_right_y=None,
+ text=""):
+ # Node.__init__ takes x1/y1/x2/y2, not top_left_*/bottom_right_* keywords
+ super().__init__(data=None, x1=top_left_x, y1=top_left_y, x2=bottom_right_x,
+ y2=bottom_right_y)
+ self.category = category
+ self.text = text
+
+ def get_boundaries(self):
+ return self.top_left_x, self.top_left_y, self.bottom_right_x, self.bottom_right_y
+
+ def get_text(self):
+ return self.text
+
+ def set_text(self, text):
+ self.text = text
+
+ def get_category(self):
+ return self.category
+
+ def __repr__(self):
+ return f'ImageNode({self.text}, {self.top_left_x}, {self.top_left_y}, {self.bottom_right_x}, {self.bottom_right_y})'
+
69
+
+ def distance_between_parallel_lines(pt1, pt2, pt3, pt4):
+ x1, y1 = pt1
+ x2, y2 = pt2
+ x3, y3 = pt3
+
+ numerator = abs((y2 - y1) * x3 - (x2 - x1) * y3 + x2 * y1 - y2 * x1)
+ denominator = sqrt((y2 - y1) ** 2 + (x2 - x1) ** 2)
+
+ if denominator == 0:
+ return None
+
+ return numerator / denominator
+
+
+ def create_node_lines(node):
+ top_left = (node.top_left_x, node.top_left_y)
+ top_right = (node.bottom_right_x, node.top_left_y)
+ bottom_left = (node.top_left_x, node.bottom_right_y)
+ bottom_right = (node.bottom_right_x, node.bottom_right_y)
+
+ return {
+ "left_line": [top_left, bottom_left],
+ "right_line": [top_right, bottom_right],
+ "top_line": [top_left, top_right],
+ "bottom_line": [bottom_left, bottom_right],
+ }
+
+
98
+ def rect_distance(x1, y1, x1b, y1b, x2, y2, x2b, y2b):
+ left = x2b < x1
+ right = x1b < x2
+ bottom = y2b < y1
+ top = y1b < y2
+ if top and left:
+ return dist((x1, y1b), (x2b, y2))
+ elif left and bottom:
+ return dist((x1, y1), (x2b, y2b))
+ elif bottom and right:
+ return dist((x1b, y1), (x2, y2b))
+ elif right and top:
+ return dist((x1b, y1b), (x2, y2))
+ elif left:
+ return x1 - x2b
+ elif right:
+ return x2 - x1b
+ elif bottom:
+ return y1 - y2b
+ elif top:
+ return y2 - y1b
+ else: # rectangles intersect
+ return 0
+
+
+ def distance_between_nodes(node1, node2):
+ return rect_distance(node1.top_left_x, node1.top_left_y, node1.bottom_right_x, node1.bottom_right_y,
+ node2.top_left_x, node2.top_left_y, node2.bottom_right_x, node2.bottom_right_y)
+
+
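`rect_distance` above computes the minimum distance between two axis-aligned rectangles: zero when they intersect or touch, an edge-to-edge gap when they are separated along one axis, and a corner-to-corner Euclidean distance when they are diagonal neighbours. The same result can be obtained more compactly by clamping the per-axis gaps; a self-contained sketch (`rect_gap` is an illustrative name, not part of this module):

```python
from math import hypot

def rect_gap(ax1, ay1, ax2, ay2, bx1, by1, bx2, by2):
    """Minimum distance between two axis-aligned rectangles,
    equivalent to the branchy rect_distance above."""
    dx = max(bx1 - ax2, ax1 - bx2, 0)  # horizontal gap, 0 if the x-ranges overlap
    dy = max(by1 - ay2, ay1 - by2, 0)  # vertical gap, 0 if the y-ranges overlap
    return hypot(dx, dy)

print(rect_gap(0, 0, 10, 10, 5, 5, 15, 15))   # 0.0  (intersecting)
print(rect_gap(0, 0, 10, 10, 20, 0, 30, 10))  # 10.0 (separated horizontally)
```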
128
+ class QuadTree:
+ def __init__(self, root=None, max_dist=5):
+ self.root = root if isinstance(root, Node) else Node(None, 0, 0, 0, 0)
+ self.max_dist = max_dist
+
+ def insert(self, node: Node, parent=None):
+ if parent is None:
+ parent = self.root
+
+ if len(parent.children) == 0:
+ parent.children.append(node)
+ return
+
+ min_dist = float('inf')
+ min_child = None
+ for child in parent.children:
+ _dist = distance_between_nodes(node, child)
+ if _dist < min_dist:
+ min_dist = _dist
+ min_child = child
+
+ if min_dist <= self.max_dist:
+ min_child.add_child(node)
+ return True
+ else:
+ parent.children.append(node)
+ return False
+
+ def get_root(self):
+ return self.root
+
+ def get_children(self, data=False):
+ boxes = []
+ datas = []
+ for child in self.root.children:
+ boxes.append([child.top_left_x, child.top_left_y, child.bottom_right_x, child.bottom_right_y])
+ datas.append(child.data)
+ if data:
+ return datas, boxes
+ return boxes
+
+ def get_children_nodes(self):
+ return self.root.children
+
+ def get_dataframe(self, image):
+ import pandas as pd
+
+ def custom_sort(row):
+ x, y = row
+ return (y, x)
+
+ data = []
+ for child in self.root.children:
+ x1, y1, x2, y2 = child.top_left_x, child.top_left_y, child.bottom_right_x + 3, child.bottom_right_y + 3
+
+ try:
+ img_crop = image.crop((x1, y1, x2, y2))
+ colors, pixel_count = extcolors.extract_from_image(img_crop)
+
+ # Get background and font colors
+ bg_color = colors[0][0] if colors else (255, 255, 255)
+ font_color = colors[-1][0] if len(colors) > 1 else (0, 0, 0)
+
+ data.append([
+ child.data,
+ (child.top_left_x, child.top_left_y),
+ (child.bottom_right_x, child.bottom_right_y),
+ child.font_size,
+ f"{find(bg_color)}, (RGB: {bg_color[0]}, {bg_color[1]}, {bg_color[2]})",
+ f"{find(font_color)}, (RGB: {font_color[0]}, {font_color[1]}, {font_color[2]})"
198
+ ])
199
+ except Exception as e:
200
+ # Fallback for color extraction errors
201
+ data.append([
202
+ child.data,
203
+ (child.top_left_x, child.top_left_y),
204
+ (child.bottom_right_x, child.bottom_right_y),
205
+ child.font_size,
206
+ "white, (RGB: 255, 255, 255)",
207
+ "black, (RGB: 0, 0, 0)"
208
+ ])
209
+
210
+ df = pd.DataFrame(data, columns=["Text", "Top Co-ordinates", "Bottom Co-ordinates", "Font Size",
211
+ "Background Color", "Font Color"])
212
+ df = df.sort_values(by="Top Co-ordinates", key=lambda x: x.apply(custom_sort))
213
+
214
+ return df
215
+
216
+ def __repr__(self):
217
+ out = ""
218
+ for c in self.root.children:
219
+ out += f'Node({c.data}, size: {c.font_size}, {c.top_left_x}, {c.top_left_y}, {c.bottom_right_x}, {c.bottom_right_y})\n'
220
+ return out
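Despite its name, this `QuadTree` behaves as a greedy one-level clusterer: each inserted box is attached to the nearest existing top-level box when the gap is at most `max_dist`, and otherwise becomes a new top-level box. A condensed, self-contained sketch (the real `Node` class appears earlier in the file; the stand-in below mirrors only the attributes used here, and the `QuadTree` is trimmed to its `insert`/`get_children` logic):

```python
from math import dist

class Node:
    # Stand-in for the Node class defined earlier in the file (assumed interface).
    def __init__(self, data, top_left_x, top_left_y, bottom_right_x, bottom_right_y):
        self.data = data
        self.top_left_x, self.top_left_y = top_left_x, top_left_y
        self.bottom_right_x, self.bottom_right_y = bottom_right_x, bottom_right_y
        self.children = []

    def add_child(self, node):
        self.children.append(node)

def rect_distance(x1, y1, x1b, y1b, x2, y2, x2b, y2b):
    left, right = x2b < x1, x1b < x2
    bottom, top = y2b < y1, y1b < y2
    if top and left:
        return dist((x1, y1b), (x2b, y2))
    if left and bottom:
        return dist((x1, y1), (x2b, y2b))
    if bottom and right:
        return dist((x1b, y1), (x2, y2b))
    if right and top:
        return dist((x1b, y1b), (x2, y2))
    if left:
        return x1 - x2b
    if right:
        return x2 - x1b
    if bottom:
        return y1 - y2b
    if top:
        return y2 - y1b
    return 0

def distance_between_nodes(n1, n2):
    return rect_distance(n1.top_left_x, n1.top_left_y, n1.bottom_right_x, n1.bottom_right_y,
                         n2.top_left_x, n2.top_left_y, n2.bottom_right_x, n2.bottom_right_y)

class QuadTree:
    # Condensed from the insert/get_children methods above.
    def __init__(self, max_dist=5):
        self.root = Node(None, 0, 0, 0, 0)
        self.max_dist = max_dist

    def insert(self, node):
        if not self.root.children:
            self.root.children.append(node)
            return
        nearest = min(self.root.children, key=lambda c: distance_between_nodes(node, c))
        if distance_between_nodes(node, nearest) <= self.max_dist:
            nearest.add_child(node)           # close enough: merge under nearest box
        else:
            self.root.children.append(node)   # too far: new top-level box

    def get_children(self):
        return [[c.top_left_x, c.top_left_y, c.bottom_right_x, c.bottom_right_y]
                for c in self.root.children]

tree = QuadTree(max_dist=5)
tree.insert(Node("Hello", 0, 0, 40, 10))
tree.insert(Node("world", 43, 0, 80, 10))     # 3 px gap -> merged under "Hello"
tree.insert(Node("Footer", 0, 100, 80, 110))  # 90 px gap -> new top-level child
print(tree.get_children())  # [[0, 0, 40, 10], [0, 100, 80, 110]]
```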
py_files/dataset_upload.py ADDED
@@ -0,0 +1,237 @@
1
+ import json
2
+ import os
3
+ import tempfile
4
+ import time
5
+ import uuid
6
+ from pathlib import Path
7
+
8
+ import datasets
9
+ import pandas as pd
10
+ from datasets import Image
11
+ from huggingface_hub import repo_exists, CommitScheduler
12
+
13
+
14
+ def update_dataset_with_new_splits(new_splits: dict, process_name: str = "Main"):
15
+ """
16
+ Add new splits to a regular HuggingFace dataset without downloading existing data.
17
+
18
+ This function pushes individual splits to the Hub using the split parameter,
19
+ which preserves all existing splits and only adds/updates the specified ones.
20
+
21
+ Key Features:
22
+ - No downloading of existing dataset required
23
+ - Existing splits are preserved (not overwritten)
24
+ - Each split is pushed individually using dataset.push_to_hub(split="name")
25
+ - Efficient for large datasets with many splits
26
+
27
+ Args:
28
+ new_splits: dict of {split_name: DataFrame} - splits to add/update
29
+ process_name: Name for logging and commit messages
30
+
31
+ Example:
32
+ new_splits = {
33
+ "validation_2024": val_df,
34
+ "test_batch_1": test_df,
35
+ "custom_split": custom_df
36
+ }
37
+ update_dataset_with_new_splits(new_splits)
38
+ """
39
+ repo_id = os.environ["REPO_ID"]
40
+ hf_token = os.environ["HF_TOKEN"]
41
+ print(f"\n[{process_name}] Starting dataset splits update process...")
42
+
43
+ # --- Start of Critical Section ---
44
+ if not repo_exists(repo_id, repo_type="dataset", token=hf_token):
45
+ print(f"[{process_name}] Repository {repo_id} not found. Cannot update.")
46
+ return
47
+
48
+ # Skip downloading existing dataset - we'll push only new splits
49
+ print(f"[{process_name}] Preparing to push {len(new_splits)} new splits individually...")
50
+
51
+ # Prepare each split for individual pushing
52
+ splits_to_push = []
53
+ for split_id, df in new_splits.items():
54
+ new_split_dataset = datasets.Dataset.from_pandas(df)
55
+ splits_to_push.append((split_id, new_split_dataset))
56
+ print(f"[{process_name}] Prepared split '{split_id}' with {len(new_split_dataset)} entries.")
57
+
58
+ # Push individual splits to Hub with Retry Mechanism
59
+ _push_splits_to_hub(splits_to_push, repo_id, hf_token, process_name)
60
+ print(f"[{process_name}] Finished pushing new dataset splits to Hub.")
61
+
62
+
63
+ def update_dataset_with_new_images(image_df: pd.DataFrame, process_name: str = "Main", scheduler: CommitScheduler=None, dataset_dir: Path=None, jsonl_path: Path=None):
64
+ """
65
+ Add new images to an image HuggingFace dataset using smart approach:
66
+ - If dataset is empty/doesn't exist: Create proper HuggingFace dataset
67
+ - If dataset has data: Use CommitScheduler for efficient incremental updates
68
+
69
+ Key Features:
70
+ - Automatically detects empty datasets and bootstraps them
71
+ - Uses CommitScheduler for incremental updates on existing datasets
72
+ - Saves images as PNG files with unique names
73
+ - Stores metadata in JSONL format for file-based approach
74
+ - Thread-safe with scheduler locking
75
+
76
+ Args:
77
+ image_df: DataFrame with 'id', 'image' and 'annotated_image' columns (images as PIL Image objects or numpy arrays)
78
+ process_name: Name for logging and commit messages
79
+
80
+ Example:
81
+ img_df = pd.DataFrame([{"id": "img1", "image": pil_image, "annotated_image": annotated_pil_image}])
82
+ update_dataset_with_new_images(img_df)
83
+ """
84
+ print(f"\n[{process_name}] Starting image dataset update...")
85
+
86
+ # Validate input format for image datasets
87
+ if not hasattr(image_df, 'columns'):
88
+ raise ValueError(f"image_df must be a pandas DataFrame with 'id' and 'image' columns, got {type(image_df)}")
89
+
90
+ # Validate required columns
91
+ required_columns = ['id', 'image', 'annotated_image']
92
+ missing_columns = [col for col in required_columns if col not in image_df.columns]
93
+ if missing_columns:
94
+ raise ValueError(f"Missing required columns: {missing_columns}. Found columns: {list(image_df.columns)}")
95
+
96
+ print(f"[{process_name}] Validated DataFrame with {len(image_df)} entries and columns: {list(image_df.columns)}")
97
+
98
+ return _append_images_with_scheduler(image_df, process_name, scheduler, dataset_dir, jsonl_path)
99
+
100
+
101
+ def _append_images_with_scheduler(image_df: pd.DataFrame, process_name: str, scheduler, dataset_dir, jsonl_path):
102
+ """
103
+ Append images to existing dataset using CommitScheduler for efficient incremental updates.
104
+ """
105
+
106
+ print(f"[{process_name}] Using CommitScheduler for incremental updates...")
107
+
108
+ print(f"[IMAGE_SCHEDULER] Created CommitScheduler for {os.environ['IMAGE_REPO_ID']} with local path: {dataset_dir}")
109
+
110
+ # Process each image
111
+ saved_count = 0
112
+ failed_count = 0
113
+
114
+ for idx, row in image_df.iterrows():
115
+ try:
116
+ image_id = row['id']
117
+ image = row['image']
118
+ annotated_image = row['annotated_image']  # required; validated above
119
+
120
+ # Skip if image is None
121
+ if image is None:
122
+ print(f"[{process_name}] Skipping image {image_id}: image is None")
123
+ failed_count += 1
124
+ continue
125
+
126
+ if annotated_image is None:
127
+ print(f"[{process_name}] Warning: annotated_image is None for {image_id}")
128
+ failed_count += 1
129
+ continue
130
+
131
+ # Generate unique filename
132
+ unique_filename_orig = f"{uuid.uuid4()}_orig.png"
133
+ unique_filename_ann = f"{uuid.uuid4()}_ann.png"
134
+
135
+ image_path_orig = dataset_dir / unique_filename_orig
136
+ image_path_ann = dataset_dir / unique_filename_ann
137
+
138
+ # Save image and metadata with scheduler
139
+ with scheduler.lock:
140
+ # Save image file
141
+ if hasattr(image, 'save'):
142
+ # PIL Image object
143
+ image.save(image_path_orig, format='PNG')
144
+ elif hasattr(image, 'shape'):
145
+ # Numpy array
146
+ from PIL import Image as PILImage
147
+ PILImage.fromarray(image).save(image_path_orig, format='PNG')
148
+ else:
149
+ print(f"[{process_name}] Warning: Unsupported image type for {image_id}: {type(image)}")
150
+ failed_count += 1
151
+ continue
152
+
153
+ # Save annotated image file
154
+ if hasattr(annotated_image, 'save'):
155
+ annotated_image.save(image_path_ann, format='PNG')
156
+ elif hasattr(annotated_image, 'shape'):
157
+ from PIL import Image as PILImage
158
+ PILImage.fromarray(annotated_image).save(image_path_ann, format='PNG')
159
+ else:
160
+ print(f"[{process_name}] Warning: Unsupported annotated_image type for {image_id}: {type(annotated_image)}")
161
+ failed_count += 1
162
+ continue
163
+
164
+ # Append metadata to JSONL
165
+ metadata = {
166
+ "id": image_id,
167
+ "file_name": unique_filename_orig,
168
+ "annotated_file_name": unique_filename_ann
169
+ }
170
+
171
+ with jsonl_path.open("a", encoding="utf-8") as f:
172
+ json.dump(metadata, f, ensure_ascii=False)
173
+ f.write("\n")
174
+
175
+ saved_count += 1
176
+ print(f"[{process_name}] Saved image {saved_count}/{len(image_df)}: {image_id} -> {unique_filename_orig}")
177
+
178
+ except Exception as e:
179
+ print(f"[{process_name}] Error processing image {row.get('id', '<unknown>')}: {e}")
180
+ failed_count += 1
181
+ continue
182
+
183
+ print(f"[{process_name}] Finished image dataset update:")
184
+ print(f"[{process_name}] - Successfully saved: {saved_count} images")
185
+ print(f"[{process_name}] - Failed: {failed_count} images")
186
+ print(f"[{process_name}] - Images will be automatically committed to dataset repository")
187
+
188
+ if saved_count == 0:
189
+ print(f"[{process_name}] Warning: No images were successfully saved")
190
+
191
+ return saved_count, failed_count
192
+
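The metadata append above is plain JSON Lines: one `json.dump` plus a newline per record. A small round-trip sketch (the temp-file path is illustrative) shows how the scheduler-side file can be read back:

```python
import json
import tempfile
from pathlib import Path

tmp = Path(tempfile.mkdtemp()) / "metadata.jsonl"

# Append one record per image, exactly one JSON object per line.
records = [
    {"id": "img1", "file_name": "a_orig.png", "annotated_file_name": "a_ann.png"},
    {"id": "img2", "file_name": "b_orig.png", "annotated_file_name": "b_ann.png"},
]
for rec in records:
    with tmp.open("a", encoding="utf-8") as f:
        json.dump(rec, f, ensure_ascii=False)
        f.write("\n")

# Reading back: one json.loads per line reconstructs the records.
with tmp.open(encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]
print(len(rows), rows[0]["id"])  # 2 img1
```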
193
+
194
+ def _push_splits_to_hub(splits_to_push: list, repo_id: str, hf_token: str, process_name: str):
195
+ """
196
+ Helper function to push individual splits to Hub with retry mechanism.
197
+
198
+ Args:
199
+ splits_to_push: List of (split_name, dataset) tuples
200
+ repo_id: HuggingFace repository ID
201
+ hf_token: HuggingFace token
202
+ process_name: Process name for logging
203
+ """
204
+ max_retries = 5
205
+ successful_splits = []
206
+ failed_splits = []
207
+
208
+ for split_name, split_dataset in splits_to_push:
209
+ print(f"[{process_name}] Pushing split '{split_name}' with {len(split_dataset)} entries...")
210
+
211
+ for attempt in range(max_retries):
212
+ try:
213
+ print(f"[{process_name}] Pushing split '{split_name}' (Attempt {attempt + 1}/{max_retries})...")
214
+ split_dataset.push_to_hub(
215
+ repo_id=repo_id,
216
+ split=split_name, # This preserves existing splits
217
+ token=hf_token,
218
+ commit_message=f"feat: Add split '{split_name}' from {process_name} with {len(split_dataset)} entries"
219
+ )
220
+ print(f"[{process_name}] Split '{split_name}' pushed successfully on attempt {attempt + 1}.")
221
+ successful_splits.append(split_name)
222
+ break # Exit retry loop on success
223
+ except Exception as e:
224
+ print(f"[{process_name}] Split '{split_name}' push attempt {attempt + 1} failed: {e}")
225
+ if attempt < max_retries - 1:
226
+ wait_time = 5
227
+ print(f"[{process_name}] Waiting for {wait_time} seconds before retrying...")
228
+ time.sleep(wait_time)
229
+ else:
230
+ print(f"[{process_name}] All {max_retries} push attempts failed for split '{split_name}'.")
231
+ failed_splits.append(split_name)
232
+
233
+ # Report results
234
+ if successful_splits:
235
+ print(f"[{process_name}] Successfully pushed splits: {successful_splits}")
236
+ if failed_splits:
237
+ print(f"[{process_name}] Failed to push splits: {failed_splits}")
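The per-split retry loop in `_push_splits_to_hub` is a generic retry-with-fixed-wait pattern; a hedged sketch of the same control flow (the helper name and the zero-second wait in the demo are mine, not the repo's):

```python
import time

def push_with_retry(push_fn, max_retries=5, wait_seconds=5):
    """Call push_fn() until it succeeds or max_retries attempts are exhausted.
    Returns True on success, False if every attempt failed."""
    for attempt in range(1, max_retries + 1):
        try:
            push_fn()
            return True
        except Exception as e:
            print(f"Attempt {attempt}/{max_retries} failed: {e}")
            if attempt < max_retries:
                time.sleep(wait_seconds)
    return False

# Simulate a push that fails twice and then succeeds.
calls = {"n": 0}
def flaky_push():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient server error")

ok = push_with_retry(flaky_push, max_retries=5, wait_seconds=0)
print(ok)  # True
```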
py_files/gemini_analysis.py ADDED
@@ -0,0 +1,439 @@
1
+ """
2
+ Gemini AI analysis module for deceptive pattern detection.
3
+ Updated to match gemini_prompting_to_make_dp_csvs_genai.py structure.
4
+ """
5
+
6
+ import pandas as pd
7
+ import os
8
+ import time
9
+ import csv
10
+ from io import StringIO
11
+ import json
12
+ from glob import glob
13
+ from tqdm.auto import tqdm
14
+ import gradio as gr
15
+
16
+ try:
17
+ from google import genai
18
+ from google.genai import types
19
+ from google.genai.errors import ServerError
20
+ GENAI_AVAILABLE = True
21
+ except ImportError:
22
+ GENAI_AVAILABLE = False
23
+
24
+
25
+ def check_csv_format(df: pd.DataFrame) -> str:
26
+ """
27
+ Check that the generated csv file is in the expected format.
28
+ Expectation is that the csv file has 10 columns and the index is integer.
29
+ It is also expected that all the cells in the csv file are strings and not null.
30
+ If the csv file has only one column, it is considered as a bad file.
31
+ Args:
32
+ df: pandas DataFrame object that is read from the csv file.
33
+ Returns:
34
+ str: A string that indicates the status of the csv
35
+ """
36
+ if len(df.columns) == 1:
37
+ return "The CSV file has only one column."
38
+ elif len(df.columns) < 10:
39
+ return "The CSV file has less than 10 columns."
40
+ elif len(df.columns) > 10:
41
+ return "The CSV file has more than 10 columns."
42
+ elif not isinstance(df.index, pd.RangeIndex):
43
+ return "The CSV file has an incorrect index. Probably an issue with the PIPE (|) separator."
44
+ elif 'Text' in df.columns and df.Text.dtype != object:
45
+ return "The CSV file has non-string values in the Text column."
46
+
47
+ else:
48
+ return "The CSV file is in the correct format."
49
+
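As a quick sanity check of the validation above, a condensed re-statement (with the one-column branch tested first, so it is actually reachable) run against two small frames:

```python
import pandas as pd

def check_csv_format(df: pd.DataFrame) -> str:
    # Condensed re-statement of the validation in gemini_analysis.py.
    if len(df.columns) == 1:
        return "The CSV file has only one column."
    elif len(df.columns) < 10:
        return "The CSV file has less than 10 columns."
    elif len(df.columns) > 10:
        return "The CSV file has more than 10 columns."
    elif not isinstance(df.index, pd.RangeIndex):
        return "The CSV file has an incorrect index. Probably an issue with the PIPE (|) separator."
    elif 'Text' in df.columns and df.Text.dtype != object:
        return "The CSV file has non-string values in the Text column."
    return "The CSV file is in the correct format."

# A well-formed frame: 10 string columns, default RangeIndex.
good = pd.DataFrame([["a"] * 10], columns=[f"c{i}" for i in range(9)] + ["Text"])
print(check_csv_format(good))                       # correct format
print(check_csv_format(pd.DataFrame({"c0": [1]})))  # only one column
```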
50
+
51
+ # analyze_with_gemini function removed - using few_shots_generator instead
52
+
53
+
54
+ def few_shots_generator(eval_dir='./eval', files=None, api_key=None):
55
+ """
56
+ Generator version of few_shots that yields notifications in real-time.
57
+
58
+ Yields:
59
+ tuple: (status, message) where status is 'notification' or 'result'
60
+ """
61
+ print(f"[CONSOLE] few_shots_generator: Starting analysis...")
62
+ print(f"[CONSOLE] eval_dir: {eval_dir}")
63
+ print(f"[CONSOLE] files: {files}")
64
+ print(f"[CONSOLE] API key provided: {'Yes' if api_key else 'No'}")
65
+
66
+ if not api_key:
67
+ print(f"[CONSOLE] No API key provided, returning None")
68
+ yield ('notification', "❌ No API key provided for analysis")
69
+ raise gr.Error("No API key provided for analysis")
70
+
71
+ # Read system prompt from gradio-demo directory
72
+ try:
73
+ system_prompt_path = os.path.join(os.path.dirname(__file__), '..', 'system_prompt.txt')
74
+ with open(system_prompt_path, 'r', encoding='utf-8') as f:
75
+ textsi_1 = f.read()
76
+ print(f"[CONSOLE] System prompt loaded from: {system_prompt_path}")
77
+ except Exception as e:
78
+ print(f"[CONSOLE] Failed to load system prompt: {e}")
79
+ yield ('notification', "❌ Failed to load system prompt")
80
+ raise gr.Error(f"Failed to load system prompt: {str(e)}")
81
+
82
+ os.makedirs(f"{eval_dir}/gemini_fs", exist_ok=True)
83
+ print(f"[CONSOLE] Created gemini_fs directory: {eval_dir}/gemini_fs")
84
+
85
+ try:
86
+ client = genai.Client(api_key=api_key)
87
+ print(f"[CONSOLE] Gemini client initialized")
88
+ except Exception as e:
89
+ error_msg = f"❌ Failed to initialize Gemini client: {str(e)}"
90
+ yield ('notification', error_msg)
91
+ print(f"[CONSOLE] Client initialization failed: {e}")
92
+ raise gr.Error(f"Failed to initialize Gemini client: {str(e)}")
93
+
94
+ if files is None:
95
+ files = glob(os.path.join(f"{eval_dir}/csv_with_yolo", "*.csv"))
96
+ if not isinstance(files, list):
97
+ files = [files]
98
+
99
+ print(f"[CONSOLE] Processing {len(files)} files")
100
+
101
+ for f in files:
102
+ print(f"[CONSOLE] Processing file: {f}")
103
+ try:
104
+ data = pd.read_csv(f, index_col=0)
105
+ data.index = data.index.str.replace('|', '', regex=False)
106
+ data = data.to_csv()
107
+ print(f"[CONSOLE] Data loaded and converted to CSV format")
108
+ except Exception as e:
109
+ print(f"[CONSOLE] Failed to read the file: {f}, error: {e}")
110
+ raise gr.Error(f"Failed to read input file: {str(e)}")
111
+
112
+ try_cnt = 0
113
+ while try_cnt < 2:
114
+ try:
115
+ try_cnt += 1
116
+ yield ('notification', f"🤖 Calling Gemini AI for pattern analysis (attempt {try_cnt})...")
117
+ if try_cnt == 1:
118
+ gr.Info("🤖 Starting Gemini analysis...")
119
+ print(f"[CONSOLE] Attempt {try_cnt} - Calling Gemini API...")
120
+ response = client.models.generate_content(
121
+ model='gemini-2.5-pro',
122
+ contents=data,
123
+ config=types.GenerateContentConfig(
124
+ system_instruction=textsi_1,
125
+ temperature=0,
126
+ top_p=0.1,
127
+ top_k=1,
128
+ max_output_tokens=12288,
129
+ safety_settings=[
130
+ types.SafetySetting(category='HARM_CATEGORY_HARASSMENT', threshold='BLOCK_NONE'),
131
+ types.SafetySetting(category='HARM_CATEGORY_HATE_SPEECH', threshold='BLOCK_NONE'),
132
+ types.SafetySetting(category='HARM_CATEGORY_SEXUALLY_EXPLICIT', threshold='BLOCK_NONE'),
133
+ types.SafetySetting(category='HARM_CATEGORY_DANGEROUS_CONTENT', threshold='BLOCK_NONE'),
134
+ types.SafetySetting(category='HARM_CATEGORY_CIVIC_INTEGRITY', threshold='BLOCK_NONE')
135
+ ]
136
+ )
137
+ )
138
+ yield ('notification', f"✅ Gemini API call successful! Processing results...")
139
+ gr.Info("✅ Gemini analysis successful!")
140
+ print(f"[CONSOLE] Gemini API call successful")
141
+ break
142
+ except ServerError as e:
143
+ if try_cnt >= 2:
144
+ error_msg = f"❌ Failed to get response after {try_cnt} attempts"
145
+ yield ('notification', error_msg)
146
+ print(f"[CONSOLE] Failed to get response for {f} after {try_cnt} attempts")
147
+ raise gr.Error(f"Analysis failed after {try_cnt} attempts")
148
+
149
+ wait_msg = f"⚠️ Server error occurred. Retrying attempt {try_cnt + 1}/2 in 60 seconds..."
150
+ yield ('notification', wait_msg)
151
+ gr.Warning(f"⚠️ Server error. Retrying in 60 seconds... (attempt {try_cnt + 1}/2)")
152
+ print(f"[CONSOLE] Server error: {e.message}, sleeping for 60 seconds")
153
+ print(e)
154
+ time.sleep(60)
155
+ continue
156
+ except Exception as e:
157
+ # Handle non-server errors (API key issues, quota errors, etc.)
158
+ error_msg = f"❌ Gemini API error: {str(e)}"
159
+ print(f"[CONSOLE] Non-server error in Gemini API call: {e}")
160
+ yield ('notification', error_msg)
161
+ raise gr.Error(f"Gemini API error: {str(e)}")
162
+
163
+ try:
164
+ # Process the response
165
+ _f = os.path.join(f"{eval_dir}", "gemini_fs", os.path.basename(f))
166
+ df = pd.read_csv(StringIO(response.text.replace("```csv", '').replace("```", '').strip()), sep='|')
167
+ csv_with_yolo = pd.read_csv(f, index_col=0)
168
+ gemini_cols = df[["Deceptive Design Category", "Deceptive Design Subtype", "Reasoning"]]
169
+ csv_with_yolo.reset_index(inplace=True)
170
+ final_df = pd.concat([csv_with_yolo, gemini_cols], axis=1)
171
+ final_df.to_csv(_f, index=False, quoting=csv.QUOTE_ALL)
172
+ print(f"[CONSOLE] Results saved to: {_f}")
173
+
174
+ # Check if thinking is needed (if any deceptive patterns found)
175
+ if set(final_df['Deceptive Design Category'].tolist()) != {'non-deceptive'}:
176
+ yield ('notification', "🧠 Deceptive patterns detected! Running advanced thinking analysis...")
177
+ gr.Info("🧠 Deceptive patterns found! Running advanced analysis...")
178
+ print(f"[CONSOLE] Deceptive patterns found, running thinking analysis...")
179
+
180
+ # Use generator version of thinking
181
+ thinking_result = None
182
+ for thinking_status, thinking_data in thinking_generator(eval_dir, files=[_f], api_key=api_key):
183
+ if thinking_status == 'notification':
184
+ yield ('notification', thinking_data)
185
+ elif thinking_status == 'result':
186
+ thinking_result = thinking_data
187
+ break
188
+
189
+ if thinking_result is not None:
190
+ yield ('notification', "✅ Advanced thinking analysis completed successfully!")
191
+ gr.Info("✅ Advanced analysis completed!")
192
+ print(f"[CONSOLE] Thinking analysis completed, using refined results")
193
+ final_df = thinking_result
194
+ else:
195
+ yield ('notification', "⚠️ Advanced thinking analysis failed, using original results")
196
+ gr.Warning("⚠️ Advanced analysis failed, using basic results")
197
+ print(f"[CONSOLE] Thinking analysis failed, using original results")
198
+ else:
199
+ yield ('notification', "✅ No deceptive patterns found, analysis complete!")
200
+ gr.Info("✅ No deceptive patterns detected!")
201
+ print(f"[CONSOLE] No deceptive patterns found, skipping thinking analysis")
202
+
203
+ yield 'result', final_df
204
+ return
205
+
206
+ except Exception as e:
207
+ print(f"[CONSOLE] Error parsing with pipe separator, trying comma: {e}")
208
+ try:
209
+ df = pd.read_csv(StringIO(response.text.replace("```csv", '').replace("```", '').strip()), sep=',')
210
+ csv_with_yolo = pd.read_csv(f, index_col=0)
211
+ gemini_cols = df[["Deceptive Design Category", "Deceptive Design Subtype", "Reasoning"]]
212
+ csv_with_yolo.reset_index(inplace=True)
213
+ final_df = pd.concat([csv_with_yolo, gemini_cols], axis=1)
214
+ final_df.to_csv(_f, index=False, quoting=csv.QUOTE_ALL)
215
+ print(f"[CONSOLE] Results saved to: {_f} (comma separated)")
216
+
217
+ # Check if thinking is needed
218
+ if set(final_df['Deceptive Design Category'].tolist()) != {'non-deceptive'}:
219
+ yield ('notification', "🧠 Deceptive patterns detected! Running advanced thinking analysis...")
220
+ gr.Info("🧠 Deceptive patterns found! Running advanced analysis...")
221
+ print(f"[CONSOLE] Deceptive patterns found, running thinking analysis...")
222
+
223
+ # Use generator version of thinking
224
+ thinking_result = None
225
+ for thinking_status, thinking_data in thinking_generator(eval_dir, files=[_f], api_key=api_key):
226
+ if thinking_status == 'notification':
227
+ yield ('notification', thinking_data)
228
+ elif thinking_status == 'result':
229
+ thinking_result = thinking_data
230
+ break
231
+
232
+ if thinking_result is not None:
233
+ yield ('notification', "✅ Advanced thinking analysis completed successfully!")
234
+ gr.Info("✅ Advanced analysis completed!")
235
+ print(f"[CONSOLE] Thinking analysis completed, using refined results")
236
+ final_df = thinking_result
237
+ else:
238
+ yield ('notification', "⚠️ Advanced thinking analysis failed, using original results")
239
+ gr.Warning("⚠️ Advanced analysis failed, using basic results")
240
+ print(f"[CONSOLE] Thinking analysis failed, using original results")
241
+ else:
242
+ yield ('notification', "✅ No deceptive patterns found, analysis complete!")
243
+ gr.Info("✅ No deceptive patterns detected!")
244
+ print(f"[CONSOLE] No deceptive patterns found, skipping thinking analysis")
245
+
246
+ yield ('result', final_df)
247
+ return
248
+ except Exception as e2:
249
+ error_msg = f"❌ Error parsing Gemini response with both separators: {str(e2)}"
250
+ yield ('notification', error_msg)
251
+ print(f"[CONSOLE] FEW_SHOT Error with both separators: {e2}")
252
+ try:
253
+ error_file = _f.replace(".csv", "e1.txt")
254
+ with open(error_file, 'w') as _fs:
255
+ _fs.write(response.text)
256
+ print(f"[CONSOLE] Error response saved to: {error_file}")
257
+ except Exception as e3:
258
+ print(f"[CONSOLE] Failed to save error response: {e3}")
259
+ raise gr.Error(f"Failed to parse response: {str(e2)}")
260
+
261
+ yield ('result', None)
262
+
263
+
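The strip-then-parse dance used above (drop the code fence around the model's CSV reply, then read the remainder as a pipe-separated table via `StringIO`) can be exercised on a canned reply; the sample rows below are illustrative, not real model output:

```python
import pandas as pd
from io import StringIO

fence = "```"  # model replies typically arrive wrapped in a ```csv fence
reply = (fence + "csv\n"
         "Text|Deceptive Design Category|Deceptive Design Subtype|Reasoning\n"
         "Subscribe now!|urgency|false urgency|Countdown implies scarcity\n"
         "Maybe later|non-deceptive|none|Plain dismissal option\n"
         + fence)

# Same cleanup as in few_shots_generator: strip fences, parse with sep='|'.
clean = reply.replace(fence + "csv", "").replace(fence, "").strip()
df = pd.read_csv(StringIO(clean), sep="|")
print(df.shape)  # (2, 4)
```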
264
+ def thinking_generator(eval_dir="./eval", files=None, api_key=None):
265
+ """
266
+ Generator version of thinking that yields notifications in real-time.
267
+ """
268
+ print(f"[CONSOLE] thinking_generator: Starting thinking analysis...")
269
+ print(f"[CONSOLE] eval_dir: {eval_dir}")
270
+ print(f"[CONSOLE] files: {files}")
271
+
272
+ if not api_key:
273
+ print(f"[CONSOLE] No API key provided for thinking analysis")
274
+ raise gr.Error("No API key provided for thinking analysis")
275
+
276
+ # Read thinking system prompt from gradio-demo directory
277
+ try:
278
+ thinking_prompt_path = os.path.join(os.path.dirname(__file__), '..', 'system_prompt_thinking.txt')
279
+ with open(thinking_prompt_path, 'r', encoding='utf-8') as f:
280
+ textsi_1 = f.read()
281
+ print(f"[CONSOLE] Thinking system prompt loaded from: {thinking_prompt_path}")
282
+ except Exception as e:
283
+ print(f"[CONSOLE] Failed to load thinking system prompt: {e}")
284
+ raise gr.Error(f"Failed to load thinking system prompt: {str(e)}")
285
+
286
+ os.makedirs(f"{eval_dir}/gemini_fs", exist_ok=True)
287
+ try:
288
+ client = genai.Client(api_key=api_key, http_options={'api_version':'v1beta'})
289
+ print(f"[CONSOLE] Thinking client initialized with v1beta")
290
+ except Exception as e:
291
+ error_msg = f"❌ Failed to initialize thinking client: {str(e)}"
292
+ print(f"[CONSOLE] Thinking client initialization failed: {e}")
293
+ raise gr.Error(f"Failed to initialize thinking client: {str(e)}")
294
+
295
+ if files is None:
296
+ files = glob(os.path.join(f"{eval_dir}/gemini_fs", "*.csv"))
297
+ if not isinstance(files, list):
298
+ files = [files]
299
+
300
+ print(f"[CONSOLE] Processing {len(files)} files for thinking analysis")
301
+
302
+ for f in files:
303
+ print(f"[CONSOLE] Thinking analysis for file: {f}")
304
+ try:
305
+ data = pd.read_csv(f, index_col=0)
306
+ data.index = data.index.str.replace('|', '', regex=False)
307
+ data = data.to_csv()
308
+ print(f"[CONSOLE] Data prepared for thinking analysis")
309
+
310
+ # Make API call to Gemini with retry logic for thinking analysis
311
+ try_cnt = 0
312
+ response = None
313
+ while try_cnt < 2:
314
+ try:
315
+ try_cnt += 1
316
+ yield ('notification', f"🧠 Running advanced thinking analysis (attempt {try_cnt})...")
317
+ print(f"[CONSOLE] Attempt {try_cnt} - Calling Gemini API for thinking...")
318
+
319
+ response = client.models.generate_content(
320
+ model='gemini-2.5-pro',
321
+ contents=data,
322
+ config=types.GenerateContentConfig(
323
+ system_instruction=textsi_1,
324
+ temperature=0,
325
+ top_p=0.1,
326
+ top_k=1,
327
+ max_output_tokens=65536,
328
+ safety_settings=[
329
+ types.SafetySetting(category='HARM_CATEGORY_HARASSMENT', threshold='BLOCK_NONE'),
330
+ types.SafetySetting(category='HARM_CATEGORY_HATE_SPEECH', threshold='BLOCK_NONE'),
331
+ types.SafetySetting(category='HARM_CATEGORY_SEXUALLY_EXPLICIT', threshold='BLOCK_NONE'),
332
+ types.SafetySetting(category='HARM_CATEGORY_DANGEROUS_CONTENT', threshold='BLOCK_NONE'),
333
+ types.SafetySetting(category='HARM_CATEGORY_CIVIC_INTEGRITY', threshold='BLOCK_NONE')
334
+ ]
335
+ )
336
+ )
337
+ yield ('notification', f"✅ Advanced thinking analysis API call successful!")
338
+ print(f"[CONSOLE] Thinking API call successful")
339
+ break
340
+ except ServerError as e:
341
+ if try_cnt >= 2:
342
+ error_msg = f"❌ Failed to complete thinking analysis after {try_cnt} attempts"
343
+ yield ('notification', error_msg)
344
+ print(f"[CONSOLE] Failed to get thinking response after {try_cnt} attempts")
345
+ raise gr.Error(f"Advanced analysis failed after {try_cnt} attempts")
346
+
347
+ wait_msg = f"⚠️ Server error in thinking analysis. Retrying attempt {try_cnt + 1}/2 in 60 seconds..."
348
+ yield ('notification', wait_msg)
349
+ gr.Warning(f"⚠️ Thinking server error. Retrying in 60s... (attempt {try_cnt + 1}/2)")
350
+ print(f"[CONSOLE] Server error in thinking analysis: {e.message}, sleeping for 60 seconds")
351
+ print(e)
352
+ time.sleep(60)
353
+ continue
354
+ except Exception as e:
355
+ # Handle non-server errors in thinking analysis
356
+ error_msg = f"❌ Thinking analysis API error: {str(e)}"
357
+ yield ('notification', error_msg)
358
+ print(f"[CONSOLE] Non-server error in thinking API call: {e}")
359
+ raise gr.Error(f"Thinking analysis API error: {str(e)}")
360
+
361
+ output_csv = ""
362
+ thought_txt = ""
363
+ for part in response.candidates[0].content.parts:
364
+ if part.thought:
365
+ thought_txt = part.text
366
+ print(f"[CONSOLE] Extracted thought text ({len(thought_txt)} chars)")
367
+ else:
368
+ output_csv = part.text
369
+ print(f"[CONSOLE] Extracted output CSV ({len(output_csv)} chars)")
370
+
371
+             _f = os.path.join(f"{eval_dir}", "gemini_fs", os.path.basename(f))
+             _f_thought = os.path.join(f"{eval_dir}", "gemini_fs", os.path.basename(f).replace(".csv", "_thinking.txt"))
+
+             # Save thinking text
+             with open(_f_thought, 'w', encoding='utf-8') as _f_thought_file:
+                 _f_thought_file.write(thought_txt)
+             print(f"[CONSOLE] Thinking text saved to: {_f_thought}")
+
+             # Parse and save updated CSV with similar process as main analysis
+             try:
+                 # Parse the thinking response CSV
+                 df_thinking = pd.read_csv(StringIO(output_csv), sep='|')
+
+                 # Read the original CSV file to get the base data
+                 csv_with_yolo = pd.read_csv(f, index_col=0).drop(columns=["Deceptive Design Category", "Deceptive Design Subtype", "Reasoning"], errors='ignore')
+
+                 # Extract the thinking analysis columns (similar to main process)
+                 thinking_cols = df_thinking[["Deceptive Design Category", "Deceptive Design Subtype", "Reasoning"]]
+
+                 # Reset index and concatenate with original data
+                 csv_with_yolo.reset_index(inplace=True)
+                 final_df = pd.concat([csv_with_yolo, thinking_cols], axis=1)
+
+                 # Save the updated dataframe
+                 final_df.to_csv(_f, index=False, quoting=csv.QUOTE_ALL)
+                 print(f"[CONSOLE] Thinking results saved to: {_f} (pipe separated)")
+                 yield ('result', final_df)  # Return the updated dataframe
+                 return
+             except Exception as e:
+                 print(f"[CONSOLE] Error with pipe separator, trying comma: {e}")
+                 try:
+                     # Parse the thinking response CSV with comma separator
+                     df_thinking = pd.read_csv(StringIO(output_csv), sep=',')
+
+                     # Read the original CSV file to get the base data
+                     csv_with_yolo = pd.read_csv(f, index_col=0).drop(columns=["Deceptive Design Category", "Deceptive Design Subtype", "Reasoning"], errors='ignore')
+
+                     # Extract the thinking analysis columns (similar to main process)
+                     thinking_cols = df_thinking[["Deceptive Design Category", "Deceptive Design Subtype", "Reasoning"]]
+
+                     # Reset index and concatenate with original data
+                     csv_with_yolo.reset_index(inplace=True)
+                     final_df = pd.concat([csv_with_yolo, thinking_cols], axis=1)
+
+                     # Save the updated dataframe
+                     final_df.to_csv(_f, index=False, quoting=csv.QUOTE_ALL)
+                     print(f"[CONSOLE] Thinking results saved to: {_f} (comma separated)")
+                     yield ('result', final_df)  # Return the updated dataframe
+                     return
+                 except Exception as e2:
+                     error_msg = f"❌ Error parsing thinking analysis response with both separators: {str(e2)}"
+                     yield ('notification', error_msg)
+                     print(f"[CONSOLE] THINKING ERROR with both separators: {e2}")
+                     try:
+                         error_file = _f.replace(".csv", "e2.txt")
+                         with open(error_file, 'w') as _fs:
+                             _fs.write(output_csv)
+                         print(f"[CONSOLE] Thinking error response saved to: {error_file}")
+                     except Exception as e3:
+                         print(f"[CONSOLE] Failed to save thinking error response: {e3}")
+                     raise gr.Error(f"Failed to parse thinking response: {str(e2)}")
+
+         except Exception as e:
+             error_msg = f"❌ Error in thinking analysis: {str(e)}"
+             yield ('notification', error_msg)
+             print(f"[CONSOLE] Error in thinking analysis for {f}: {e}")
+             raise gr.Error(f"Thinking analysis error: {str(e)}")
+
+     yield ('result', None)  # Return None if no files processed
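The pipe-then-comma fallback above repeats the whole parse-and-merge block once per separator. The separator probing on its own can be factored into a small helper; a minimal stdlib-only sketch (the helper name `parse_analysis_csv` is illustrative, not part of the repo):

```python
import csv
import io

REQUIRED = {"Deceptive Design Category", "Deceptive Design Subtype", "Reasoning"}

def parse_analysis_csv(text, delimiters=("|", ",")):
    """Try each candidate delimiter in order; accept the first one whose
    header row contains the three analysis columns."""
    for sep in delimiters:
        rows = list(csv.reader(io.StringIO(text), delimiter=sep))
        if rows and REQUIRED.issubset({h.strip() for h in rows[0]}):
            return sep, rows
    raise ValueError("no known delimiter matched the response")

sep, rows = parse_analysis_csv(
    "Deceptive Design Category|Deceptive Design Subtype|Reasoning\n"
    "Urgency|Countdown Timer|A timer pressures the user"
)
```

A helper along these lines would let both branches share a single merge-and-save path instead of duplicating it per separator.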
py_files/ocr.py ADDED
@@ -0,0 +1,157 @@
+ """
+ OCR module adapted for HuggingFace Spaces.
+ Uses Google Cloud Vision API for text detection.
+ """
+
+ from PIL import Image, ImageDraw, ImageFilter
+ from google.cloud import vision
+ import numpy as np
+ import io
+ import os
+ import json
+ import tempfile
+ from py_files.bounding_clustering import QuadTree, Node
+
+
+ def change_contrast(img, level):
+     """Adjust image contrast for better OCR results."""
+     factor = (259 * (level + 255)) / (255 * (259 - level))
+
+     def contrast(c):
+         return 128 + factor * (c - 128)
+
+     return img.point(contrast)
+
+
+ def get_bounding_box_doc(blk):
+     """Extract bounding box coordinates from document text block."""
+     vertices = [int(blk.bounding_box.vertices[0].x), int(blk.bounding_box.vertices[0].y),
+                 int(blk.bounding_box.vertices[2].x), int(blk.bounding_box.vertices[2].y)]
+     return vertices
+
+
+ def get_text_from_image_doc(img, debug=False, get_response=False, resp=None, max_dist=20):
+     """
+     Extract text from image using Google Cloud Vision Document Text Detection.
+     Adapted for HuggingFace Spaces environment.
+     """
+     response = resp
+     if resp is None:
+         # Initialize the client with credentials from environment
+         try:
+             # Try to get credentials from environment variable
+             google_creds = os.environ.get('GOOGLE_CLOUD_CREDENTIALS')
+             if google_creds:
+                 # Create temporary credentials file
+                 creds_data = json.loads(google_creds)
+                 with tempfile.NamedTemporaryFile(mode='w', suffix='.json', delete=False) as f:
+                     json.dump(creds_data, f)
+                     creds_path = f.name
+                 os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = creds_path
+
+             client = vision.ImageAnnotatorClient()
+
+             # Enhance image contrast for better OCR
+             img = change_contrast(img, 20)
+
+             # Convert PIL image to bytes
+             imgByteArr = io.BytesIO()
+             img.save(imgByteArr, format='PNG')
+             image = vision.Image(content=imgByteArr.getvalue())
+
+             # Perform document text detection
+             response = client.document_text_detection(image=image)
+
+             # Clean up temporary credentials file
+             if google_creds and 'creds_path' in locals():
+                 try:
+                     os.unlink(creds_path)
+                 except OSError:
+                     pass
+
+         except Exception as e:
+             # Fallback: create a dummy response for demo purposes
+             print(f"Warning: Google Cloud Vision not available: {e}")
+             response = create_dummy_ocr_response(img)
+
+     # Process the response
+     word_boxes = []
+
+     if hasattr(response, 'full_text_annotation') and response.full_text_annotation:
+         for page in response.full_text_annotation.pages:
+             for block in page.blocks:
+                 if block.confidence < 0.9:
+                     continue
+                 if debug:
+                     print(f"\nBlock confidence: {block.confidence}")
+                     print(f"Block box: {get_bounding_box_doc(block)}")
+
+                 words = ""
+                 fonts = []
+                 for paragraph in block.paragraphs:
+                     for word in paragraph.words:
+                         word_text = "".join([symbol.text for symbol in word.symbols])
+                         words += word_text + " "
+                         word_bbox = get_bounding_box_doc(word)
+                         fonts.append(abs(word_bbox[3] - word_bbox[1]))
+
+                 if debug:
+                     print(f"Words: {words}")
+
+                 if fonts:  # Only add if we have font information
+                     word_boxes.append([words.strip()] + get_bounding_box_doc(block) + [sum(fonts) // len(fonts)])
+
+     # If no text was detected, create a minimal entry
+     if not word_boxes:
+         word_boxes.append(["No text detected", 0, 0, 100, 20, 12])
+
+     # Create QuadTree for clustering nearby text
+     tree = QuadTree(max_dist=max_dist)
+     for i in range(len(word_boxes)):
+         tree.insert(Node(*tuple(word_boxes[i])))
+
+     if get_response:
+         return tree, response
+     return tree, {}
+
+
+ def create_dummy_ocr_response(img):
+     """
+     Create a dummy OCR response for demo purposes when Google Cloud Vision is not available.
+     This allows the demo to work without requiring actual OCR credentials.
+     """
+     W, H = img.size
+
+     # Create a simple mock response object
+     class MockResponse:
+         def __init__(self):
+             self.full_text_annotation = None
+
+     # For demo purposes, we'll just return an empty response
+     # In a real scenario, you might want to use an alternative OCR library like pytesseract
+     return MockResponse()
+
+
+ def draw_boxes(img, bound, color, width=5):
+     """Draw bounding boxes on image for visualization."""
+     _img = img.copy()
+     draw = ImageDraw.Draw(_img)
+
+     x0 = min(bound[0], bound[2]) - 7
+     x1 = max(bound[0], bound[2]) + 10
+     y0 = min(bound[1], bound[3]) - 7
+     y1 = max(bound[1], bound[3]) + 10
+
+     draw.rectangle([x0, y0, x1, y1], outline=color, width=width)
+     return _img, x0, y0, x1, y1
+
+
+ def get_image_with_boxes_doc(image, color='red', width=5, get_response=False, response=None):
+     """Get image with OCR bounding boxes drawn on it."""
+     tree, resp = get_text_from_image_doc(image, get_response=get_response, resp=response)
+     bxs = tree.get_children(False)
+     for bx in bxs:
+         image, x0, y0, x1, y1 = draw_boxes(image, bx, color, width)
+     if get_response:
+         return image, resp
+     return image
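The contrast curve applied in `change_contrast` above maps each channel value `c` to `128 + factor * (c - 128)`, so `level=0` is the identity and positive levels stretch values away from mid-gray. A small stand-alone check of that mapping:

```python
def contrast_point(c, level):
    # Same per-pixel mapping as change_contrast() above: values are scaled
    # about mid-gray (128) by a factor derived from `level`.
    factor = (259 * (level + 255)) / (255 * (259 - level))
    return 128 + factor * (c - 128)
```

Mid-gray (128) is a fixed point at every level, which is why the adjustment brightens highlights and darkens shadows without shifting the overall exposure.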
py_files/pycolor.py ADDED
The diff for this file is too large to render. See raw diff
 
py_files/utils.py ADDED
@@ -0,0 +1,60 @@
+ import base64
+ import os
+
+ from cryptography.hazmat.primitives import hashes
+ from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
+ from cryptography.hazmat.backends import default_backend
+ from cryptography.fernet import Fernet, InvalidToken
+
+
+ def decrypt(password: str, token: bytes) -> bytes | None:
+     """Decrypts a token using a password."""
+     try:
+         salt = token[:16]
+         encrypted_data = token[16:]
+         kdf = PBKDF2HMAC(
+             algorithm=hashes.SHA256(),
+             length=32,
+             salt=salt,
+             iterations=480000,
+             backend=default_backend()
+         )
+         key = base64.urlsafe_b64encode(kdf.derive(password.encode()))
+         f = Fernet(key)
+         return f.decrypt(encrypted_data)
+     except InvalidToken:
+         return None
+
+
+ def decrypt_file(password: str, input_path: str, output_path: str):
+     """Reads an encrypted file, decrypts it, and saves the original content."""
+     try:
+         with open(input_path, 'rb') as f_in:
+             encrypted_data = f_in.read()
+
+         decrypted_data = decrypt(password, encrypted_data)
+
+         if decrypted_data is None:
+             print("❌ Decryption failed! Wrong password or corrupted file.")
+             return
+
+         with open(output_path, 'wb') as f_out:
+             f_out.write(decrypted_data)
+         print(f"✅ File '{input_path}' decrypted successfully to '{output_path}'.")
+     except FileNotFoundError:
+         print(f"❌ Error: Input file not found at '{input_path}'.")
+     except Exception as e:
+         print(f"❌ An unexpected error occurred during decryption: {e}")
+
+
+ def decrypt_system_prompts() -> bool:
+     """Decrypts the encrypted system prompt files on disk."""
+     password = os.environ.get("PROMPT_PASSWORD")
+     if not password:
+         print("❌ PROMPT_PASSWORD environment variable not set.")
+         return False
+     try:
+         decrypt_file(password, "./system_prompt.enc", "./system_prompt.txt")
+         decrypt_file(password, "./system_prompt_thinking.enc", "./system_prompt_thinking.txt")
+         return True
+     except Exception as e:
+         print(f"❌ An error occurred while decrypting system prompts: {e}")
+         return False
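The key derivation inside `decrypt` above can be reproduced with the standard library alone, which is handy for checking token compatibility without the `cryptography` package; a sketch using `hashlib.pbkdf2_hmac` with the same parameters (the function name `derive_key` is illustrative):

```python
import base64
import hashlib
import os

def derive_key(password: str, salt: bytes) -> bytes:
    # Mirrors decrypt() above: PBKDF2-HMAC-SHA256, 32-byte key,
    # 480,000 iterations, urlsafe-base64 encoded as Fernet requires.
    raw = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 480_000, dklen=32)
    return base64.urlsafe_b64encode(raw)

salt = os.urandom(16)  # decrypt() expects this salt as the first 16 bytes of the token
key = derive_key("example-password", salt)
```

A matching encryptor would prepend the 16-byte salt to `Fernet(key).encrypt(data)`, which is exactly the layout `decrypt` slices apart with `token[:16]` and `token[16:]`.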
py_files/yolo.py ADDED
@@ -0,0 +1,370 @@
+ import math
+ # from YoloVal import DetectionValidatorEnsemble
+ from argparse import ArgumentParser
+ from collections import deque
+
+ import cv2
+ import numpy as np
+ import torch
+ from torch import nn
+ from ultralytics import YOLO
+ from ultralytics.engine.results import Results
+ from ultralytics.models.yolo.detect import DetectionValidator
+ from ultralytics.nn.autobackend import AutoBackend
+ from ultralytics.utils import ops, nms
+
+
+ def do_rectangles_overlap(rect1, rect2, overlap_threshold=0.5):
+     # Rect1 coords
+     x1_min, y1_min, x1_max, y1_max = rect1
+     # Rect2 coords
+     x2_min, y2_min, x2_max, y2_max = rect2
+
+     # Check if one rectangle is to the left of the other
+     if x1_max < x2_min or x2_max < x1_min:
+         return False
+
+     # Check if one rectangle is above the other
+     if y1_max < y2_min or y2_max < y1_min:
+         return False
+
+     # Find the area of each rectangle
+     area_rect1 = (x1_max - x1_min) * (y1_max - y1_min)
+     area_rect2 = (x2_max - x2_min) * (y2_max - y2_min)
+
+     # Find the coordinates of the intersection rectangle
+     inter_x_min = max(x1_min, x2_min)
+     inter_x_max = min(x1_max, x2_max)
+     inter_y_min = max(y1_min, y2_min)
+     inter_y_max = min(y1_max, y2_max)
+
+     # Check if there is no intersection
+     if inter_x_max <= inter_x_min or inter_y_max <= inter_y_min:
+         return False
+
+     # Calculate the area of the intersection rectangle
+     inter_area = (inter_x_max - inter_x_min) * (inter_y_max - inter_y_min)
+
+     # Calculate the percentage of overlap relative to both rectangles
+     overlap_percentage_1 = inter_area / area_rect1
+     overlap_percentage_2 = inter_area / area_rect2
+
+     # Check for complete containment
+     contained = ((x1_min <= x2_min <= x1_max and x1_min <= x2_max <= x1_max) and
+                  (y1_min <= y2_min <= y1_max and y1_min <= y2_max <= y1_max)) or \
+                 ((x2_min <= x1_min <= x2_max and x2_min <= x1_max <= x2_max) and
+                  (y2_min <= y1_min <= y2_max and y2_min <= y1_max <= y2_max))
+
+     # Return True if the overlap meets the threshold
+     return overlap_percentage_1 >= overlap_threshold or overlap_percentage_2 >= overlap_threshold or contained
+
+
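A quick worked example of the geometry `do_rectangles_overlap` computes: two 10x10 boxes shifted horizontally by 5 intersect in a 5x10 strip, i.e. exactly half of each box's area, so they meet the default `overlap_threshold=0.5`:

```python
# Worked example for the overlap test above: a and b are 10x10 boxes
# (xmin, ymin, xmax, ymax) with b shifted right by 5.
a = (0, 0, 10, 10)
b = (5, 0, 15, 10)

inter_w = min(a[2], b[2]) - max(a[0], b[0])   # 10 - 5 = 5
inter_h = min(a[3], b[3]) - max(a[1], b[1])   # 10 - 0 = 10
inter_area = inter_w * inter_h
frac = inter_area / ((a[2] - a[0]) * (a[3] - a[1]))  # fraction of a's area
```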
+ import spaces
+
+
+ class YoloEnsemble:
+     def __init__(self, weights: list[str]):
+         self.models = [YOLO(weight) for weight in weights]
+
+     @spaces.GPU(duration=10)
+     def predict(self, img_path: str, conf: float = 0.25, verbose: bool = True):
+
+         import torch
+         import numpy as np
+         import random
+
+         seed = 42
+         torch.manual_seed(seed)
+         np.random.seed(seed)
+         random.seed(seed)
+         torch.cuda.manual_seed_all(seed)  # if you are using multi-GPU.
+
+         # For full reproducibility, you might also need this
+         torch.backends.cudnn.deterministic = True
+         torch.backends.cudnn.benchmark = False
+
+         predictions = [_model(img_path, conf=conf, verbose=verbose) for _model in self.models]
+
+         if len(self.models) > 1:
+             return self.ensemble(predictions)
+         return predictions[0]
+
+     def ensemble(self, predictions: list):
+         hits = None
+         orig_shape = None
+         names = None
+         orig_img = None
+         path = None
+         speed = None
+
+         for results in predictions:
+             for result in results:
+                 _hits = result.boxes.data.unsqueeze(dim=0)
+                 if hits is None:
+                     hits = _hits
+                 else:
+                     hits = torch.cat((hits, _hits), dim=1)
+
+                 if orig_shape is None:
+                     orig_shape = result.orig_shape
+                     names = result.names
+                     orig_img = result.orig_img
+                     path = result.path
+                     speed = result.speed
+
+         # hits = hits.unsqueeze(dim=0)
+         nms_hits = nms.non_max_suppression(hits, conf_thres=0.25, classes=[0, 1, 2, 3, 4, 5, 6])
+         boxes = deque(nms_hits[0].tolist())
+         non_overlapping_boxes = []
+         while len(boxes) > 0:
+             box = boxes.popleft()
+             overlappers = [box]
+             rem = []
+             for i, b in enumerate(boxes):
+                 if do_rectangles_overlap(box[:4], b[:4]):
+                     overlappers.append(b)
+                     rem.append(i)
+             for _i, _ in enumerate(rem):
+                 del boxes[_ - _i]
+             keep_box = sorted(overlappers, key=lambda x: x[4], reverse=True)[0]
+             non_overlapping_boxes.append(keep_box)
+         if len(non_overlapping_boxes) == 0:
+             return [Results(names=names, orig_img=orig_img, path=path, speed=speed)]
+         # result = Results(boxes=torch.Tensor(non_overlapping_boxes).to(nms_hits[0].get_device()), names=names, orig_img=orig_img, path=path, speed=speed)
+         return [Results(boxes=torch.Tensor(non_overlapping_boxes),  # .to(nms_hits[0].get_device()),
+                         names=names, orig_img=orig_img, path=path, speed=speed)]
+
+
+ class YoloEnsembleAutoBackend:
+     def __init__(self, weights: list[str], val=False, **kwargs):
+         if isinstance(weights, list):
+             self.models = [
+                 # AutoBackend(
+                 #     weights=weight,
+                 #     device=kwargs.get('device', None),
+                 #     dnn=kwargs.get('dnn', False),
+                 #     data=kwargs.get('data', None),
+                 #     fp16=kwargs.get('fp16', False),
+                 # ) for weight in weights
+                 YOLO(weight) for weight in weights
+             ]
+         else:
+             self.models = [
+                 AutoBackend(
+                     weights=weights,
+                     device=kwargs.get('device', None),
+                     dnn=kwargs.get('dnn', False),
+                     data=kwargs.get('data', None),
+                     fp16=kwargs.get('fp16', False),
+                 )
+             ]
+         model = AutoBackend(
+             weights=weights[0],
+             device=kwargs.get('device', None),
+             dnn=kwargs.get('dnn', False),
+             data=kwargs.get('data', None),
+             fp16=kwargs.get('fp16', False),
+         )
+
+         # self.models[0].val()
+
+         self.device = kwargs.get('device', None)
+         self.fp16 = model.fp16
+         self.stride = model.stride
+         self.pt = model.pt
+         self.jit = model.jit
+         self.engine = model.engine
+         self.val = val
+         self.names = model.names
+
+     def warmup(self, imgsz=(1, 3, 640, 640)):
+         pass
+
+     def eval(self):
+         for model in self.models:
+             model.eval()
+
+     def predict(self, imgs, conf=0.25, verbose=True):
+         predictions = [_model(imgs, conf=conf, verbose=verbose) for _model in self.models]
+         predictions = [list(x) for x in zip(*predictions)]
+         if len(self.models) > 1:
+             # return self.ensemble([torch.cat([p[0] for p in predictions], 1)])
+             return self.ensemble2(predictions)
+         if not self.val:
+             return predictions[0]
+         return predictions[0]
+
+     def ensemble(self, predictions: list):
+         final_preds = []
+         device = None
+
+         for ip, results in enumerate(predictions):
+             for ir, result in enumerate(results):
+                 device = result.device
+                 _array = deque(result.cpu().tolist())
+                 non_overlapping_boxes = []
+                 while len(_array) > 0:
+                     box = _array.popleft()
+                     overlappers = [box]
+                     rem = []
+                     for i, b in enumerate(_array):
+                         if do_rectangles_overlap(box[:4], b[:4]):
+                             overlappers.append(b)
+                             rem.append(i)
+                     for _i, _ in enumerate(rem):
+                         del _array[_ - _i]
+                     keep_box = sorted(overlappers, key=lambda x: x[4], reverse=True)[0]
+                     non_overlapping_boxes.append(keep_box)
+
+                 repeat = int(math.ceil(300 / len(non_overlapping_boxes)))
+                 non_overlapping_boxes = non_overlapping_boxes * repeat
+                 final_preds.append(non_overlapping_boxes[:300])
+
+         _new_preds = torch.tensor(final_preds, device=device)
+
+         return _new_preds
+
+     def ensemble2(self, predictions: list):
+         final_preds = []
+         device = None
+
+         preds = []
+         for ip, prediction in enumerate(predictions):  # for image i
+             model_preds = []
+             for ir, result in enumerate(prediction):  # for model r's prediction on image i
+                 if not device:
+                     device = result.boxes.xyxy.device
+                 boxes = np.array(result.boxes.xyxy.cpu().tolist())
+                 if len(boxes) == 0:
+                     continue
+                 _cls = np.array(result.boxes.cls.cpu().tolist())
+                 _cls = _cls.reshape(-1, 1)
+                 _conf = np.array(result.boxes.conf.cpu().tolist())
+                 _conf = _conf.reshape(-1, 1)
+                 boxes = np.hstack((boxes, _conf, _cls))
+                 boxes = boxes.tolist()
+                 model_preds.extend(boxes)
+             preds.append(model_preds)
+
+         for ip, pred in enumerate(preds):  # for image i
+             _array = deque(pred)
+             non_overlapping_boxes = []
+             while len(_array) > 0:
+                 box = _array.popleft()
+                 overlappers = [box]
+                 rem = []
+                 for i, b in enumerate(_array):
+                     if do_rectangles_overlap(box[:4], b[:4]):
+                         overlappers.append(b)
+                         rem.append(i)
+                 for _i, _ in enumerate(rem):
+                     del _array[_ - _i]
+                 keep_box = sorted(overlappers, key=lambda x: x[4], reverse=True)[0]
+                 non_overlapping_boxes.append(keep_box)
+             # increase to 100
+             if len(non_overlapping_boxes) != 0:
+                 repeat = int(math.ceil(100 / len(non_overlapping_boxes)))
+                 non_overlapping_boxes = non_overlapping_boxes * repeat
+                 final_preds.append(non_overlapping_boxes[:100])
+
+         _new_preds = torch.tensor(final_preds, device=device)
+
+         # for ip, results in enumerate(predictions):
+         #     per_img_preds = []
+         #     for ir, result in enumerate(results):
+         #         device = result.boxes.xyxy.device
+         #         boxes = np.array(result.boxes.xyxy.cpu().tolist())
+         #         _cls = np.array(result.boxes.cls.cpu().tolist())
+         #         _cls = _cls.reshape(-1, 1)
+         #         _conf = np.array(result.boxes.conf.cpu().tolist())
+         #         _conf = _conf.reshape(-1, 1)
+         #
+         #         boxes = np.hstack((boxes, _conf, _cls))
+         #         boxes = boxes.tolist()
+         #
+         #         _array = deque(boxes)
+         #         non_overlapping_boxes = []
+         #         while len(_array) > 0:
+         #             box = _array.popleft()
+         #             overlappers = [box]
+         #             rem = []
+         #             for i, b in enumerate(_array):
+         #                 if do_rectangles_overlap(box[:4], b[:4]):
+         #                     overlappers.append(b)
+         #                     rem.append(i)
+         #             for _i, _ in enumerate(rem):
+         #                 del _array[_ - _i]
+         #             keep_box = sorted(overlappers, key=lambda x: x[4], reverse=True)[0]
+         #             non_overlapping_boxes.append(keep_box)
+         #
+         #         # repeat = int(math.ceil(100 / len(non_overlapping_boxes)))
+         #         # non_overlapping_boxes = non_overlapping_boxes * repeat
+         #         # final_preds.append(non_overlapping_boxes[:100])
+         #         per_img_preds.extend(non_overlapping_boxes)
+         #
+         # _new_preds = torch.tensor(final_preds, device=device)
+         return _new_preds
+
+
+ class YoloPreprocess(nn.Module):
+     def __init__(self):
+         super(YoloPreprocess, self).__init__()
+
+     def pre_transform(self, img: np.ndarray):
+         img = img
+         shape = len(img), len(img[0])
+         new_shape = [1280, 1280]
+         r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
+         new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
+         dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
+         dw, dh = np.mod(dw, 32), np.mod(dh, 32)
+         dw /= 2
+         dh /= 2
+
+         if shape[::-1] != new_unpad:  # resize
+             img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
+         top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
+         left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
+         img = cv2.copyMakeBorder(
+             img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=(114, 114, 114)
+         )
+         return [img]
+
+     def forward(self, im):
+         im = np.stack(self.pre_transform(im))
+         im = im[..., ::-1].transpose((0, 3, 1, 2))  # BGR to RGB, BHWC to BCHW, (n, 3, h, w)
+         im = np.ascontiguousarray(im)  # contiguous
+         # im = torch.from_numpy(im)
+         _im = im / 255
+
+         return _im
+
+
+ if __name__ == '__main__':
+     parser = ArgumentParser()
+     parser.add_argument('--weights', nargs='+', help="Model paths", required=True)
+     args = parser.parse_args()
+
+     # img = cv2.imread('askubuntu2.png')
+     # x = YoloPreprocess()
+     # x(img)
+
+     # model = YoloEnsemble(args.weights)
+     # model = YOLO('./train16.pt').to('cuda')
+     # results = model.predict(['askubuntu2.png'], conf=0.7)
+     # for result in results:
+     #     boxes = result.boxes  # Boxes object for bounding box outputs
+     #     masks = result.masks  # Masks object for segmentation masks outputs
+     #     keypoints = result.keypoints  # Keypoints object for pose outputs
+     #     probs = result.probs  # Probs object for classification outputs
+     #     # result.show()  # display to screen
+     #     result.save(filename='result.jpg')
+
+     args = dict(model='./train16.pt', data='dataset/data.yaml')
+     validator = DetectionValidator(args=args)
+     validator()
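The suppression loop that `ensemble()` and `ensemble2()` both contain can be exercised in isolation. A stdlib-only sketch of the same greedy idea (the names and the simplified `simple_overlap` predicate are illustrative stand-ins, not the repo's `do_rectangles_overlap`):

```python
from collections import deque

def greedy_suppress(boxes, overlaps):
    """Greedy suppression in the spirit of the ensemble loops above: pop a
    box, collect every remaining box that overlaps it, and keep only the
    highest-confidence member of that group (index 4 holds confidence)."""
    queue = deque(boxes)
    kept = []
    while queue:
        box = queue.popleft()
        group, rest = [box], []
        for other in queue:
            (group if overlaps(box, other) else rest).append(other)
        queue = deque(rest)
        kept.append(max(group, key=lambda x: x[4]))
    return kept

def simple_overlap(p, q):
    # Plain intersection test; a stand-in for do_rectangles_overlap.
    return not (p[2] <= q[0] or q[2] <= p[0] or p[3] <= q[1] or q[3] <= p[1])

kept = greedy_suppress(
    [[0, 0, 10, 10, 0.9], [1, 1, 9, 9, 0.8], [20, 20, 30, 30, 0.7]],
    simple_overlap,
)
```

Rebuilding the queue from the non-overlapping remainder sidesteps the index-adjusted `del _array[_ - _i]` bookkeeping the in-repo loops use, while keeping the same keep-the-best-per-cluster behavior.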
requirements.txt ADDED
@@ -0,0 +1,19 @@
+ gradio>=4.44.0
+ pandas==2.0.3
+ numpy==1.24.3
+ Pillow>=10.0.0
+ extcolors==1.0.0
+ google-cloud-vision==3.4.4
+ google-genai>=0.3.0
+ ultralytics>=8.0.196
+ --extra-index-url https://download.pytorch.org/whl/cu113
+ torch
+ torchvision
+ pybboxes==0.1.6
+ tqdm>=4.66.1
+ playwright==1.40.0
+ requests>=2.31.0
+ selenium>=4.15.0
+ datasets
+ huggingface_hub[hf_xet]
+ cryptography
system_prompt.enc ADDED
The diff for this file is too large to render. See raw diff
 
system_prompt_thinking.enc ADDED
@@ -0,0 +1 @@
+ 2��ެoPt��v�gAAAAABo4AkWwV9Mh02cKBl8q-8O7D3gJF5k2vqltw99VgUgYOWDKaj1RGKEUnSOYM5Rnzeg7FonjdqUF5ZE5rqNPvAWau5MPIsCT9IzkB_LacwkjEUZsDDiUuQHGegLTFwm8ni6vG59L86Xzt9jnWJ-1btNcB2r7CdR_EZHGXceJPPCjWiVdp6Oa2ddFlebuQxR1io1jFvd2QuaZEr6O8oNYtUL_4v4itmMv2pl1tniSymqOcHyosJcEc8-MMzAl5DgEo7f0viRdp6L9_MjEEenXpr4fVKVbtmQhgdDyeZmfC32FVOcGZz4tt9lAZCeTsTT3hu2vJOQrevoy4EXBxYOqe8PS09rL6Nk8_lvvvn4U5fk85HtCR2zqxokIB0lOor4EhqUM4BqqAn3XSJTQeE7Q4WcVx6sEbiL52hg1WkvXVar-eezBwHZw23DZec_E3mjdHYsZ73kQnpi5auGk3AM5WATlqGc_qgynvbz8L8df5hc6xcaZOlHH9xVU0XXCGCuPZ1VSnzLqjvXq7f3XfMTnAPgETMu2YyPyZjbcL02SabNXN2KqBe4jhZ2BDBhTTiZ2yu0YgjE0ZTGSI07l7EAVvL-uKa03eobSrtlrLorKjeoPAALEhCbSr9oouACgcLntkmD8bEyOqguvsMY3io6Z6BxZOriP7010UmQ23AiY46c8E5H050MFe00WwyvxIvyqLBUO1GiRJEOl05BZngM2F3z88Yydi7O_CP5TvnLrhhMlnfwK372QSF3XS-0fzko-QnGEOwsJedtn8hoCnAhqRtD__DeNOXbaE_MAJD3iLJ2r7pvh2omiw1wlp8a7iU1c6MlvN_J2tS4s7FZicFjZdzJ5ETr1YIMYp1Nyq8svYCWlVvU_14aa0MCEOWpIs7DNRjfimgna3X59BWOkoJgLlVONOUm3jw1oUxnzQa8m7paa09bXE6wPT6wkwLPvdu1IdzghGhE3J2LzkfeY6l55TAipi3cAdfbbOKzZi_NFgnYSYkduejXDHD7qZssc3ReOL8qrg8mOtjtchJnHNBl8PlznwReN9q8SNNhuhqWq_rgNgF2JDkQOiIQFBWMDwMRnjh0lxp-O8G2w3t8yFUHfzbKfh9NUsU-ygA01ERMSvzJWVcBknE_U9Uevuk_g7Qbe2e5G3s7Y6SW5B1ogJA0YlvD8RsLgGfOwEzkq7y6AGjPC5_Sb-d7ipwK4EpHDrMa3sBVYqG01uWFsUl8M1aFkQUrvv9iNlCLsqVs6O9COQ2l052zEKKnNm_l1oTCAokXedmwDO70Y4SFrc0HU4vtGNk7XywmNhyKSXLCp0-02E2RXmPPUBP9FbbXSYnKbhVu3qIQJ4pmvIOnVmap8xEzzdT8MTQna2rXhJS77_CqDqxO0rUki0_OHkd7iQoMJR1ibLLh-kp5_s1_0fcsWCdDsKgPICW8G5JtM-m2E2SFlr5HRXpdyTZEv5jA6wyMEAKbC7-shI5cWmLz-g1zhbnvsI0raf-sNyQKoUGCNQDiw0kAX0dOSU_2Z5Um0dUIw77lgCf8rvizKdP6_Y4yH58Oio4kjYLx8_86UMDdnwmGhx4zqrLzb-6ICQZq4JKxI9UFn4OcyzspjKDBH74q5sjD5I3KpjqH7xy4dgiXOcKwlycnMfM6oSSnvkWZ_b8OiaUrQxgbTMB1kOKgQ_-Hxz9Jf3O5YfOrQXE70GDzRyTYQye0i6s0dL2QkOczAw6d0SgoLB3YWGKwP4W58MW2AS7bsXnkIt_ubKQpauPXy3Hsa8cjVrM4_LUnuESOfNFZRNioCbejSBvrsB7AM4R1nbJp184mJ9lgllaAohSgVT_RwsbU4PxNHL-zdBOdQZcBJc2cLF-eF7_TX2ZX35j9mYQeCVDIOtDK49ajOalkvThuhGl5OvtDAGexSnUwPStXmR3AI10mKnMjSU0C7Yybv4JkAV2fBDV7WX81oaKc2H_3MQSV9N6XWPra8lz-wa2_tUH
QNF4qvl6jiRbwCTjQVDozz_HRIcqTzNfhiKtOTo6O17gcWfA9w1VgGI7pajXBf_drWUMLSh7_N7thZcMR3mmn9D8UbhX6x1XsgT3B5HgyTpTriv4PmkCIHZFeqhB_ZpaEy_-bFVbybifsRmzi3ic90dimj7VrNySv3ME-5sBkfV6hf2ajpvq9ms89Pc18h6I6syENkW-wci2UYTfwvJWfjzfOlfZbJueVjMDX6xsr7VpdejKRiSi_EAN0UM03UxjVdeR_AH6DbS89QMHYvoLlUivlN0Wjb1bnsJpM9PnOexhivCJt12oKWJX2-ZaVhNiFax3h3JMXwchoJW5Rvzjj7EgYYBt_DZvNdmvIU81ysZNTm68_Ty0DrFAnbrtm48wWiilzWKXRCaaItr010j619fdok_bStCmrp5qwh3qj1EV65y28riNFR8S5LGB9NfxiC3bDrAS9KPrdDHLp_YhiJH2PbWsau01uqIAK7Gb430YEsDDn9woqbWXR0SIKt_QQEysUKXLi0IXXB8pFqSOfIIeX0kX08rhtJ2BG3PoJfwdWDRq9F10kLB7KeKwUF25_wElPeXb05eSH0SBF-1kxyTMpu4eQpqFgROHcm5TdknAnv3JxLtIpZxhl5mjzpPxsLNpfEQI0ghBG15vaPz145J7hnSL6_tsCfKDQ81rxzl5sCJ68pLoBidfz-HBgqQQaCNp9kP1XtiuMtbiCZtYjlYIo2ZcV02wb2X8uoCwJqEEtPKBdlH1qT2cWJ9VGtdlpPqpoo8c1VEZZPolSsxhl4hggdGbPl1esC8jywV59JMnDSMvsYVJsyceSh2IV7YpaUIOkRyz8Hv_664QiSapelq519nCx6Ww3UPPmC2M1bb8pTMGNsj0dplYv4EO9PcujE5bjwgMfrR5xHdVLnm1moedBqRWaMvJ6ONxmmsbhxN6u-peMP9v1p0Vb3jE05ZqOsgVBXL8UhCWFvcHmGu-Nr2UClyEPmyIDiJoSke-SSZ-K2fE6EknMFQVvtTOiQ76Z7ZIqOlV6-ksbag5d0TWW_jEKaE51AG3i24dcCrAQB_EBscmSFzPTxVmTk1zPbgcxqjzne1t3bgDMuJQc2hZGg4EMYoAaQLMAe6f8gSvJ5QHV-gqRaI_kDcwa_kpI0egW9bVRUfDHeqfr6mhwttPxnptl2gg-1HIhkC8fpcLTgHukDtBCobkmFY2M3V4Gj54f1O3rYs8fB_0EGBcWpNRvO0kvek-PoqLgljLCV7-bENIVJMIaNXvFyZX3bEWhnuFeFBJ8rf2W_SyHV4xnmetz-3sK-_Eliov4BWPyLwgSvccchtT8lgXi1av2FRfF6EmrVcx8srkrxu_PqOJyBOmwExMlcbZ1G4rdBwTVsDY8JXVFUomnGzejflmnayALVbRVcXhsC70ZlokTYTwB9XQQ601D7a7t9orLdjlvj9Ymp-hD1jQz_v9N2CumjOXfI-mpn_22ykYoqhkwgh58CIzZVUvkfmM69uBXKrjoYgyttgEFKfeXMabbUvQQVf8sTHu4mpLVhdyr2kbPGdrMLEEYGtrU55q18jJNRDgUArIK-QBScoVyexPV6nulwk9zDcP836WzFZPK0r_vI209H2HZkxtyHVs6-lGQtv1VBJhZ9ISU5bdz1mVFbJcoIEk17S25KUV1M5S2w0MYmqIhHBTUFYVTgD5UB0xqTiMVCAcIukOjarBZA8SJuaxtYiby0g0PFkO3-0zsLkMmVoq36k3eT_zMD8rVKBmxYNJIcZD3tvsatRP3kfgXyh1Z3Buqvc54MQ6iiea8OoM_24AAWQOi3jRlJngWoQIKbCHA8pwfnHD4K14P48wgKAxneMtMEdOLZfHkyzmV1XD9zU-CRoHmV-vnHdJ6xH0lkoNmfHamwQOfi8b_7W0H0QYdlsoaxE456sWpQmA9M7xGQprfEyzBwbGR9WN1qpYEzkc65mTG3VAq5n-qdh6qmsndNwQQJDMnGXh876JWs6mFMtL
_Yuc784qgHtAL3B2RZ5fvez4LeBwjZ7zdRGXXAyBRd3__zaeU7z7S1v3apXaqXh2FRpGoLdAAPBrexmU2nYQafR1wl6thxoqxEk9fcWpNyPTPEyfzvQ99rvVC_UZ8DNV5xcn5FuDWhKwhtBbfnpQpuhJL2DZMODa5WdSCf8TooHk534bf7IsAL9tZXZsmesWeW6vIftkmbOxunxTd7QwwSCWOkQjg3AQ_z6oqWNZz8QCv7fy6pFNBy38VC5TgqFWAIwupXtgDxHRv_MeVf400x8G5ol9k9xp1FSl2F-TWnKohcPSskdPU0M11igXCFrmqYBmRqczfoQHtGQ9omBvKliglLGlJcRHGRmPGbTFFQGAGu25vP8iR_e2tmB3CLiVoN13tlWc0WVH0BYn1jxu7KZDy_-ZUkBu7r5mKMgoyWiSv66vL13pU6PY3t7DDrI_Q8sc2Obrmk5hzUdFPCCPTiaCH3hhqkAXnjup_CPanu2iydarOkM_wWPk_qvkSH3KGPGISBZlo-Nu9_Gr9Fd-YXwpi1pabSikRlEJD6NU3QhJV4nqdo17mKPgB-6H2zqCSVNZK69-R6VuRcTU6Wg-unYEKLePj4S2rlkxoPb1Dn3zQWwVbnn2r2iT7pnnSgx1zbR4eT0xJLaDALaejKL20pv94QgUXpIu66D7UGphThaTfMszcDqnGd0HQfh-hZHRC5WJBRZQ3HeRHG0CuUPBOYnjJzLUrurRoG8Wg9O2v23goOIhrSj34vzW7PRfX71o8lX3bCPjpPfkqNX_d7bJ_R0ipkZz1vKTlp556JBLQ6TYKYt0F9mdMfg9O2zbTYIaezYRNRMCtK7y-HJJJXy-e2j0BZqYd2KP8t7IGQEesDX3-bRuv4xx2NF_hnZNSIbDP-ZDWOXuOlIUzXtozhtbSEfAl-KWFrc-zrI0wdqD6RztF1qV1s1bmkXiK2tJn0rqiZZvWcQLAXmKhbQ1EUB5IUTDsyYa939CtdoVEfWlrNuzmWYX46zDDOdvXFXLZ-0lTOfpy3wKW-Rq1jisp_1XyIWRa3amrpfcpqmGbEI8fgcbDdpSmdY1XJ-47F_qMyrnhHlEd2HoV5gi0hLmsanJjhrbkiEYh5xJtM11Wf-dozicKFTdLURmbDJMT3swi-bBURQZZ_a-O9ZOok8AEwLOUkTJwwTD0L64iBD6-ZemVKQ9IpsjO1Fly_RMrC5zMD_Zd8gXGViXeYjiFSBODxBB5RPt4GDbA2SEZhB8oWIFJyeR5mSpQ2d_ays0ESRhRol5pk7nnd8eDMh57Sf9BtO0IAzhG9suBQHuF1hojCOb03izrjzhf4cFEIpVaBG_5H01TwAC9MJWaGOylBmpqZcCP_GOPcINyhusz2G904RpSt7ZKLGeettQqaogSA1Ok95qJKpDSlyEUwP0nJXMcvkHDAr5OASxJvTh2EkJg74aaDwtYeMSaRyB7z7Rf-KBsL20GFgTDrGlVpZtC8ghF4zTijghIKrbUPxXhhtgm1U6B9d-bncn2qvz2Ch0nl0XggWcuufC9KJRGlEa1NnhWGmbnKpimkYdl9wobU9-zNQwn-mMKlZWa2yW0rvjDgCGX87nemysC5kyjXXLt3VPMfRe_2lqfTmkzY-OjIUfHRxalx1fda1oE_VMWhiP8A_EcO3n9deUlaQKp9ObTeap1EOYx9wiBCnEpZ-BmC4Xl_PUG0Mua521u-e6cH69uYM9cCRufWblHCaR6YQOJpjcVrA0YIeAZSxczEGOYFcyBI8xV0HDwufQpVEIA7s5EAlnZAHUg1x2DkxO9wAJklTvtnzt_dqPecxftU-PPBZN3Gl_UORA8OzuHhT63N2TFDXHzuOcWGqMB5Vy0Al3UB7uDq5Ca7DJU1pggF3rTF49dNN0ZcCojmY_Zg7gfpT1-CelWzyKMC6K-HJyf0KcPBB6ySyVJbmiOjlTJV0UM8Bn5Zpyy8bqbLlXZdY9SDVWhpcozyo---pIAAn5aSLL3cX5
jpgq9LgubhBp3biSlplSFCjJCw7kXnkOK_5O2iEFMndOO-iMxuYlqP3BeHkIRUzBISD9YYHRmutco5a4X3BEsgA9mFZeVCU73jc8r3qI9yiS8ncsl71h2G8M6wswAKnovCDqVPauKB7e_2XgVrj5cpL8ApwZa3uyG7zRcxfK28h7Y4DlAEYdS44FCmKFT2TII3Z2yixnY9AoiSpZhkqt97jTNrM_5JhF3xsBD1-UQR3X-WUocx9PgRWrQWS5gTCQ5yD8pFTstOlLE_McelGjnDWWQ72q3jawICCQ2ubTcs1sXAbWuUGA41Fk7uJe8TOIxhrR6QdMcgaFR0-j3rOh3skeejlmMmTmPO6yNjg-bjuH9sJ9asXQMdSiO_eNef6Cti0hPOD7tLzfQV99TH-7ZN19PLj3wOMN7RPK7rRbaa5OWIr46jUkmlKc4ri17fd_ATJ3kDJbQ1TU8dCvsvMSUmnvROD6By5GnxZMVpU3bzcPIoe_uLgIPf0a83WPE4HTwsFWkn6BB-c1I3AInuDRAMjC2761Z6rawH-QrnlUihHtCF_XRgPbG6r5u_onYVvHSAxbZKdjq0gX4iee5w-smXEVNCi0pj9V9tEA4ws7tXfAv6HkkkOGn7Z3-bBqZGG-hV2Q2tmuOICb6ZC7xmep1xfofnSJRa871dgi0cEg7eCl6fXFaud2uMpF3bRrXWMRzelMCtCRqT-TN_zXFGwm9KIv-DG0MAc23zPjJH-bLMZmrD9Dg_mTZ0rQqSe8Gb3hVpgtkNW9WeF-bTdbsFg8x25XMx9XzDOCG62zXgm1-99V--iDJaSi_KMeDzL0KZfqbTMsYkrXb9yqTYfxdwBWt0_EzZNBd9nZHFu0RoXVe9rN61M_3Gubo29KrFUosrgQy3dHcTCR9WhI3-lMXllG3EjCNBG5YUmYiGVKdLe_uHHI210XMLcd8Ssi9ozZaJ4qN9plMkUmP06t5bKM6wyLaRPpnMrYX2oNhVTkASkWfDqxYgFAWfYLSnJzCp4wtHsdCEHccJI7kpnnH9CIVLMIb0X0FzmYWq5SkJNRghDGjYBqC6oPUYe8ObPID1fK6gzkcNCTjiubOdSoaPrHdeL49CQN-YF5GCz4BPgpc7x-fL1h3pIZdO7ifCDrH4hVQa0Ey7w2QjmhyJHPqXnyCjTDI1I8W3yYP1UKYaUyMCaYSru2Hd0-DCVuwAl-MsKd4V1Uf5XVEtCziEO-jlUCOTjEdsRMdqMmaJmLAG_Mj91uRfzJlVVaWanpZCFjRdzV0QzayenhKsEpYJVxABoVah4HAn4mMRsPWRymdNBfwRBY5cs2kpBODrXo0z6U7s7YIyHUrSm1wWlg_TJvx8XdLKB23KLsxy_sv0TS0zf9GLEnsYcF_6zMJi7Cv9qhZ9MAj2-WHxz-EEfAGhtxlvi8hn9Zvab8ONC2EcTN_K1w7lLrtlRz2EC_CT1bG7W39MLhTaXMGQhPTw7g-SWr6IjJWV8LMCFg3DDknW_IU-469VPrYxPauaUxuJKYxUjb6mbqFqUe9mcavpLxar5VjTuy4lf8AW3tfHRHm1e2QhVTgLcMNfnphd2Ob-ct_oyqWmwQrC-qoT5-c-jaaRpl6bLH1hF51Ifiw88ee6RI3Fq-B0YmH4m95JNruL6kHuAWXc3HaUL06s4_2Iyj-la7DTx12i-BPTma91B6CheI_UUp5F7Pmj5p0_EC7A9M89iUvBiloRqZyvO3VqTIOD_8yuZffNTwd78NBww9W7ejyZ4fV3ab5fnLjwwVUCk4Gie5J5pst_oPotMhQCaDoXWgNC-RnFI6bjtsa1ajHG2BDtwQJ4R7o5EwymPU6ZmNrE77mVj-0z9JkMh44tIrq-eD0s_MXTxra1s5uPeou_rrqlxxU0EpfnhmbR3F9MnbT5Igak9j0sjCWgUSPNlOQu60NyupgfjzWymK77tpWG-MgmEX50eQtD2QDiDVihZ6QhzzA103VwLGbC6rostbgrfZFR02