duongtruongbinh committed on
Commit
2776a06
·
1 Parent(s): c8321d4

Init commit

.gitignore ADDED
@@ -0,0 +1,4 @@
+ __pycache__/
+ __MACOSX/
+
+ .DS_Store
README.md CHANGED
@@ -4,9 +4,73 @@ emoji: 📊
  colorFrom: red
  colorTo: blue
  sdk: gradio
- sdk_version: 5.49.1
+ sdk_version: 5.38.0
  app_file: app.py
+ short_description: Run Logistic Regression on datasets to predict outcomes
  pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Logistic Regression Demo
+
+ Interactive demonstration of Logistic Regression implemented from scratch using NumPy and gradient descent. Learn binary classification with sigmoid activation, binary cross-entropy loss, and an adjustable prediction threshold.
+
+ ## Features
+
+ - **Binary Classification**: Predicts one of two classes (0 or 1)
+ - **NumPy Implementation**: Vectorized matrix operations for fast computation
+ - **Sigmoid Activation**: Maps linear outputs to probabilities (0-1 range)
+ - **Binary Cross-Entropy Loss**: Loss function suited to binary classification
+ - **Adjustable Threshold**: Experiment with different probability thresholds to balance precision/recall
+ - **Mini-batch Gradient Descent**: Supports configurable batch sizes (powers of 2) or full batch
+ - **Feature Normalization**: Automatic standardization (zero mean, unit variance) for stable training
+ - **Training Visualization**: Track loss and accuracy over epochs for training and validation sets
+
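The feature-normalization and mini-batch bullets above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `src/logistic_regression.py` (its contents are not shown in this diff); the names `standardize` and `train_logreg`, and their default arguments, are hypothetical.

```python
import numpy as np


def standardize(X, eps=1e-8):
    """Scale each feature to zero mean and unit variance (hypothetical helper)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # eps guards against division by zero for constant features
    return (X - mean) / (std + eps), mean, std


def train_logreg(X, y, epochs=100, lr=0.01, batch_size=None, seed=0):
    """Mini-batch gradient descent on the mean binary cross-entropy loss."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    # batch_size=None (or anything larger than the data) means full batch
    if batch_size is None or batch_size > n_samples:
        batch_size = n_samples
    for _ in range(epochs):
        order = rng.permutation(n_samples)  # reshuffle every epoch
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            z = X[idx] @ w + b
            p = 1.0 / (1.0 + np.exp(-z))  # sigmoid
            error = p - y[idx]            # gradient of BCE w.r.t. z
            w -= lr * (X[idx].T @ error) / len(idx)
            b -= lr * error.mean()
    return w, b
```

Standardizing first keeps `z` in a moderate range, which is one reason a single learning rate can work across features with very different scales.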
+ ## Algorithm Details
+
+ **Activation Function**: Sigmoid σ(z) = 1/(1 + e^(-z))
+ **Loss Function**: Binary Cross-Entropy L = -[y·log(ŷ) + (1-y)·log(1-ŷ)]
+ **Classification**: Predict class 1 if probability ≥ threshold, else class 0
+ **Normalization**: Features standardized (zero mean, unit variance) for numerical stability
+
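The three formulas above translate almost directly into NumPy. The sketch below is illustrative only; the function names and the clipping constant `eps` are assumptions, and the Space's own implementation may differ.

```python
import numpy as np


def sigmoid(z):
    """σ(z) = 1 / (1 + e^(-z))"""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))


def bce_loss(y_true, y_prob, eps=1e-12):
    """Mean of -[y·log(ŷ) + (1-y)·log(1-ŷ)] over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities so log(0) never occurs (eps is an assumed constant)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)))


def predict_class(y_prob, threshold=0.5):
    """Class 1 if probability ≥ threshold, else class 0."""
    return (np.asarray(y_prob) >= threshold).astype(int)
```

For instance, a model that outputs 0.5 for every sample has a loss of log 2 (about 0.693) regardless of the labels, which is a useful baseline when reading the loss curves.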
+ ## Sample Datasets
+
+ 1. **Breast Cancer**: Wisconsin Breast Cancer dataset (binary classification)
+ 2. **Wine (Binary)**: Wine dataset converted to binary (class 0 vs others)
+ 3. **Synthetic**: Artificially generated binary classification dataset
+
+ ## How to Use
+
+ 1. **Select Data**: Choose a sample dataset or upload your own CSV/Excel file
+ 2. **Configure Target**: Select target column (must have exactly 2 unique values)
+ 3. **Set Training Parameters**:
+    - **Epochs**: Number of passes over the training data (recommended: 50-500)
+    - **Learning Rate**: Step size for gradient descent (recommended: 0.001-0.01)
+    - **Batch Size**: Samples per batch (powers of 2, or Full Batch)
+    - **Train/Validation Split**: Proportion for training (default: 80%)
+ 4. **Adjust Threshold**: Set probability threshold for classification (default: 0.5)
+ 5. **Enter Features**: Input feature values for prediction
+ 6. **Run Training**: Click "Run Training & Prediction" to train and visualize
+
+ ## Key Parameters
+
+ **Training Parameters**:
+ - **Epochs**: Complete passes through the data. More epochs can improve the fit but increase the risk of overfitting
+ - **Learning Rate**: Step size (0.001-0.01 recommended). Too high causes instability; too low makes training slow
+ - **Batch Size**: Samples processed before each update. Smaller = more frequent but noisier updates; larger = more stable
+ - **Train/Validation Split**: Data split ratio (default 80/20)
+
+ **Threshold Parameter** (Key Feature):
+ - **Default**: 0.5 (balanced classification)
+ - **Lower threshold** (e.g., 0.3): More class 1 predictions → higher recall, lower precision
+ - **Higher threshold** (e.g., 0.7): Fewer class 1 predictions → higher precision, lower recall
+ - **Experiment**: Adjust threshold to see how predictions and accuracy change in real-time
+ - **Use Case**: Balance precision vs recall based on your classification goals
+
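To make the threshold trade-off described above concrete, here is a small worked sketch. The helper `precision_recall` and the six toy labels and probabilities are invented for illustration; they are not taken from the demo's datasets.

```python
import numpy as np


def precision_recall(y_true, y_prob, threshold):
    """Precision and recall for class 1 at a given probability threshold."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data (hypothetical): three positives, three negatives
y_true = [0, 0, 1, 1, 1, 0]
y_prob = [0.2, 0.4, 0.35, 0.6, 0.9, 0.65]

for t in (0.3, 0.5, 0.7):
    p, r = precision_recall(y_true, y_prob, t)
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data, lowering the threshold to 0.3 catches every positive (recall 1.0) at the cost of two false positives (precision 0.6), while raising it to 0.7 yields no false positives (precision 1.0) but finds only one of the three positives.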
+ ## Requirements
+
+ - gradio >= 5.38.0
+ - pandas >= 1.5.0
+ - scikit-learn >= 1.3.0
+ - numpy >= 1.24.0
+ - plotly >= 5.15.0
+
app.py ADDED
@@ -0,0 +1,645 @@
+ import gradio as gr
+ import pandas as pd
+ import vlai_template
+
+ # Import Logistic Regression core
+ try:
+     from src import logistic_regression
+     LR_AVAILABLE = True
+ except ImportError as e:
+     print(f"❌ Logistic Regression module failed to load: {str(e)}")
+     LR_AVAILABLE = False
+     logistic_regression = None
+
+ vlai_template.configure(
+     project_name="Logistic Regression Demo",
+     year="2025",
+     module="06",
+     description="Interactive demonstration of Logistic Regression using NumPy and gradient descent. Learn binary classification with sigmoid activation, binary cross-entropy loss, and adjustable prediction threshold. Visualize training metrics and experiment with threshold values.",
+     colors={
+         "primary": "#1976D2",
+         "accent": "#7B1FA2",
+         "bg1": "#E3F2FD",
+         "bg2": "#BBDEFB",
+         "bg3": "#90CAF9",
+     },
+     font_family="'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif"
+ )
+
+ current_dataframe = None
+
+ def load_sample_data_fallback(dataset_choice="Breast Cancer"):
+     """Fallback data loading function when core module is not available"""
+     from sklearn.datasets import load_breast_cancer, load_wine, make_classification
+     import pandas as pd
+     import numpy as np
+
+     def sklearn_to_df(data):
+         df = pd.DataFrame(data.data, columns=getattr(data, "feature_names", None))
+         if df.columns.isnull().any():
+             df.columns = [f"feature_{i}" for i in range(df.shape[1])]
+         df["target"] = data.target
+         return df
+
+     def wine_to_binary_df(wine_data):
+         df = pd.DataFrame(wine_data.data, columns=wine_data.feature_names)
+         df["target"] = (wine_data.target == 0).astype(int)
+         return df
+
+     def synthetic_classification():
+         X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
+                                    n_redundant=5, n_classes=2, random_state=42)
+         df = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])
+         df["target"] = y
+         return df
+
+     datasets = {
+         "Breast Cancer": lambda: sklearn_to_df(load_breast_cancer()),
+         "Wine (Binary)": lambda: wine_to_binary_df(load_wine()),
+         "Synthetic": lambda: synthetic_classification(),
+     }
+
+     if dataset_choice not in datasets:
+         raise ValueError(f"Unknown dataset: {dataset_choice}")
+     return datasets[dataset_choice]()
+
+ def create_input_components_fallback(df, target_col):
+     """Fallback input components creation when core module is not available"""
+     feature_cols = [c for c in df.columns if c != target_col]
+     components = []
+     for col in feature_cols:
+         data = df[col]
+         if data.dtype == "object":
+             uniq = sorted(map(str, data.dropna().unique()))
+             if not uniq:
+                 uniq = ["N/A"]
+             components.append(
+                 {"name": col, "type": "dropdown", "choices": uniq, "value": uniq[0]}
+             )
+         else:
+             val = pd.to_numeric(data, errors="coerce").dropna().mean()
+             val = 0.0 if pd.isna(val) else float(val)
+             components.append(
+                 {
+                     "name": col,
+                     "type": "number",
+                     "value": round(val, 3),
+                     "minimum": None,
+                     "maximum": None,
+                 }
+             )
+     return components
+
+ SAMPLE_DATA_CONFIG = {
+     "Breast Cancer": {"target_column": "target", "problem_type": "classification"},
+     "Wine (Binary)": {"target_column": "target", "problem_type": "classification"},
+     "Synthetic": {"target_column": "target", "problem_type": "classification"},
+ }
+
+ force_light_theme_js = """
+ () => {
+   const params = new URLSearchParams(window.location.search);
+   if (!params.has('__theme')) {
+     params.set('__theme', 'light');
+     window.location.search = params.toString();
+   }
+ }
+ """
+
+ def validate_config(df, target_col):
+     if not target_col or target_col not in df.columns:
+         return False, "❌ Please select a valid target column from the dropdown.", None
+
+     target_series = df[target_col]
+     unique_vals = target_series.nunique()
+
+     # For logistic regression, we only support binary classification (2 classes)
+     problem_type = "classification"
+
+     if target_series.isnull().any():
+         return False, "⚠️ Target column has missing values. Please clean your data.", None
+
+     if target_series.dtype == "object":
+         return False, "⚠️ Target must be numeric for classification. Please select a numeric column.", None
+
+     if unique_vals != 2:
+         return False, f"⚠️ Target must have exactly 2 unique values for binary classification. Found {unique_vals} unique values.", None
+
+     # Check if values are 0 and 1
+     unique_values = sorted(target_series.unique())
+     if set(unique_values) != {0, 1}:
+         return True, f"\n✅ Configuration is valid! Target will be mapped to binary (0/1). Original values: {unique_values}", problem_type
+
+     return True, f"\n✅ Configuration is valid! Ready for binary classification with values {unique_values}.", problem_type
+
+
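`validate_config` reports that a two-valued numeric target which is not already {0, 1} "will be mapped to binary (0/1)", but the mapping itself lives in the core module, which this diff does not show. One plausible way to do it, assuming the smaller value becomes class 0 and the larger becomes class 1 (the helper name `map_binary_target` is hypothetical):

```python
import numpy as np


def map_binary_target(values):
    """Map a two-valued numeric target onto {0, 1}.

    Assumption: the smaller original value becomes 0, the larger becomes 1.
    """
    values = np.asarray(values)
    classes = np.unique(values)  # sorted unique values
    if classes.size != 2:
        raise ValueError(f"Expected exactly 2 unique values, found {classes.size}")
    return (values == classes[1]).astype(int), classes


labels, classes = map_binary_target([2, 4, 4, 2])
print(labels.tolist(), classes.tolist())
```

Returning `classes` alongside the mapped labels makes it possible to translate a predicted 0/1 back to the original value when displaying results.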
+ def get_status_message(is_sample, dataset_choice, target_col, problem_type, is_valid, validation_msg):
+     if is_sample:
+         return f"✅ **Selected Dataset**: {dataset_choice} | **Target**: {target_col} | **Type**: {problem_type.title()}"
+     elif target_col and problem_type:
+         status_icon = "✅" if is_valid else "⚠️"
+         return f"{status_icon} **Custom Data** | **Target**: {target_col} | **Type**: {problem_type.title()} | {validation_msg}"
+     else:
+         return "📁 **Custom data uploaded!** 👆 Please select target column above to continue."
+
+
+ def load_and_configure_data_simple(dataset_choice="Breast Cancer"):
+     global current_dataframe
+     try:
+         if not LR_AVAILABLE:
+             # Fallback data loading without core module
+             df = load_sample_data_fallback(dataset_choice)
+         else:
+             df = logistic_regression.load_data(None, dataset_choice)
+
+         current_dataframe = df
+
+         target_options = df.columns.tolist()
+         cfg = SAMPLE_DATA_CONFIG.get(dataset_choice, {})
+         target_col = cfg.get("target_column")
+         problem_type = cfg.get("problem_type")
+
+         if target_col and target_col in target_options:
+             is_valid, validation_msg, detected = validate_config(df, target_col)
+             if detected:
+                 problem_type = detected
+             status_msg = get_status_message(True, dataset_choice, target_col, problem_type, is_valid, validation_msg)
+         else:
+             # If target_col not in options, use first column as fallback
+             target_col = target_options[0] if target_options else None
+             status_msg = get_status_message(True, dataset_choice, target_col, problem_type, False, "")
+
+         return [df.head(5).round(2), gr.Dropdown(choices=target_options, value=target_col), status_msg]
+
+     except Exception as e:
+         current_dataframe = None
+         return [pd.DataFrame(), gr.Dropdown(choices=[], value=None), f"❌ **Error loading data**: {str(e)} | Please try a different dataset."]
+
+
+ def load_and_configure_data(file_obj=None, dataset_choice="Breast Cancer"):
+     global current_dataframe
+     try:
+         if not LR_AVAILABLE:
+             # Fallback data loading without core module
+             if file_obj is not None:
+                 # Handle file upload fallback
+                 if file_obj.name.endswith(".csv"):
+                     df = pd.read_csv(file_obj.name)
+                 elif file_obj.name.endswith((".xlsx", ".xls")):
+                     df = pd.read_excel(file_obj.name)
+                 else:
+                     raise ValueError("Unsupported format. Upload CSV or Excel files.")
+             else:
+                 df = load_sample_data_fallback(dataset_choice)
+         else:
+             df = logistic_regression.load_data(file_obj, dataset_choice)
+
+         current_dataframe = df
+
+         target_options = df.columns.tolist()
+         is_sample = file_obj is None
+
+         if is_sample:
+             cfg = SAMPLE_DATA_CONFIG.get(dataset_choice, {})
+             target_col = cfg.get("target_column")
+             problem_type = cfg.get("problem_type")
+         else:
+             target_col, problem_type = None, None
+
+         if target_col:
+             is_valid, validation_msg, detected = validate_config(df, target_col)
+             if detected:
+                 problem_type = detected
+             status_msg = get_status_message(is_sample, dataset_choice, target_col, problem_type, is_valid, validation_msg)
+         else:
+             status_msg = get_status_message(is_sample, dataset_choice, target_col, problem_type, False, "")
+
+         input_updates = [gr.update(visible=False)] * 40
+         inputs_visible = gr.update(visible=False)
+         input_status = "⚙️ Configure target column above to enable feature inputs."
+
+         if target_col and problem_type and (not is_sample or is_valid):
+             try:
+                 if LR_AVAILABLE:
+                     components_info = logistic_regression.create_input_components(df, target_col)
+                 else:
+                     components_info = create_input_components_fallback(df, target_col)
+                 for i in range(min(20, len(components_info))):
+                     comp = components_info[i]
+                     number_idx, dropdown_idx = i * 2, i * 2 + 1
+                     if comp["type"] == "number":
+                         upd = {"visible": True, "label": comp["name"], "value": comp["value"]}
+                         if comp["minimum"] is not None:
+                             upd["minimum"] = comp["minimum"]
+                         if comp["maximum"] is not None:
+                             upd["maximum"] = comp["maximum"]
+                         input_updates[number_idx] = gr.update(**upd)
+                         input_updates[dropdown_idx] = gr.update(visible=False)
+                     else:
+                         input_updates[number_idx] = gr.update(visible=False)
+                         input_updates[dropdown_idx] = gr.update(
+                             visible=True, label=comp["name"], choices=comp["choices"], value=comp["value"]
+                         )
+                 inputs_visible = gr.update(visible=True)
+                 input_status = f"📝 **Ready!** Enter values for {len(components_info)} features below, then click Run prediction. | {validation_msg}"
+             except Exception as e:
+                 input_status = f"❌ Error generating inputs: {str(e)}"
+
+         return [df.head(5).round(2), gr.Dropdown(choices=target_options, value=target_col), status_msg] + input_updates + [inputs_visible, input_status]
+
+     except Exception as e:
+         current_dataframe = None
+         empty = [pd.DataFrame(), gr.Dropdown(choices=[], value=None), f"❌ **Error loading data**: {str(e)} | Please try a different file or dataset."]
+         return empty + [gr.update(visible=False)] * 40 + [gr.update(visible=False), "No data loaded."]
+
+
+ def update_learning_rate_display(lr_power):
+     """Update the display to show what the current learning rate slider value represents"""
+     # Map slider value to actual learning rate
+     lr_values = [0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1.0]
+     lr_labels = ["1e-6", "1e-5", "1e-4", "1e-3", "1e-2", "1e-1", "1"]
+
+     idx = int(lr_power)
+     if 0 <= idx < len(lr_values):
+         return f"**Current Learning Rate:** {lr_values[idx]} ({lr_labels[idx]})"
+     else:
+         return "**Current Learning Rate:** N/A"
+
+
+ def update_batch_size_display(batch_size_power, train_split):
+     """Update the display to show what the current batch size slider value represents"""
+     global current_dataframe
+     df = current_dataframe
+
+     if df is None or df.empty:
+         return "**Current Batch Size:** N/A"
+
+     # Calculate training set size
+     train_size = int(len(df) * train_split)
+
+     # Determine max power of 2 that fits in training size
+     import math
+     max_power = int(math.log2(train_size)) if train_size > 0 else 0
+
+     # Convert slider value to batch size
+     if batch_size_power >= max_power + 1:
+         return f"**Current Batch Size:** Full Batch ({train_size} samples)"
+     else:
+         actual_batch_size = 2 ** int(batch_size_power)
+         return f"**Current Batch Size:** {actual_batch_size} samples (2^{int(batch_size_power)})"
+
+
+ def update_batch_size_slider(df_preview, target_col, train_split):
+     """Update batch size slider max based on training data size"""
+     global current_dataframe
+     df = current_dataframe
+
+     if df is None or df.empty:
+         return gr.update(maximum=10, value=10)
+
+     # Calculate training set size
+     train_size = int(len(df) * train_split)
+
+     # Determine max power of 2 that fits in training size
+     import math
+     max_power = int(math.log2(train_size)) if train_size > 0 else 0
+
+     # Slider goes from 0 to max_power+1 (where max_power+1 = Full Batch)
+     new_max = max_power + 1
+
+     # Set value to Full Batch by default
+     return gr.update(maximum=new_max, value=new_max)
+
+
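The slider arithmetic in the two functions above is easier to follow with a worked case. The standalone sketch below restates the same rule without Gradio; the helper name `slider_to_batch_size` is hypothetical, and the row count 569 is the size of scikit-learn's Breast Cancer dataset.

```python
import math


def slider_to_batch_size(power, n_rows, train_split):
    """Translate a slider position into (label, batch size).

    Positions 0..max_power select 2**power samples; the position
    max_power + 1 (the slider maximum) selects the full training set.
    """
    train_size = int(n_rows * train_split)
    max_power = int(math.log2(train_size)) if train_size > 0 else 0
    if power >= max_power + 1:
        return "Full Batch", train_size
    batch_size = 2 ** int(power)
    return str(batch_size), batch_size


# 569 rows at an 80% split give 455 training rows; floor(log2(455)) = 8,
# so the slider runs from 0 to 9 and position 9 means Full Batch.
print(slider_to_batch_size(5, 569, 0.8))
print(slider_to_batch_size(9, 569, 0.8))
```

Because the slider maximum depends on the training-set size, changing the train/validation split can move the Full Batch position, which is why the event wiring recomputes the slider whenever the split changes.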
+ def update_configuration(df_preview, target_col):
+     global current_dataframe
+     df = current_dataframe
+
+     if df is None or df.empty:
+         return [gr.update(visible=False)] * 40 + [gr.update(visible=False), "No data available.", "No data available."]
+     if not target_col:
+         return [gr.update(visible=False)] * 40 + [gr.update(visible=False), "Select target column.", "Select target column."]
+
+     try:
+         is_valid, validation_msg, problem_type = validate_config(df, target_col)
+         if not is_valid:
+             return [gr.update(visible=False)] * 40 + [gr.update(visible=False), f"⚠️ {validation_msg}", f"⚠️ {validation_msg}"]
+
+         if LR_AVAILABLE:
+             components_info = logistic_regression.create_input_components(df, target_col)
+         else:
+             components_info = create_input_components_fallback(df, target_col)
+         input_updates = [gr.update(visible=False)] * 40
+         for i in range(min(20, len(components_info))):
+             comp = components_info[i]
+             number_idx, dropdown_idx = i * 2, i * 2 + 1
+             if comp["type"] == "number":
+                 upd = {"visible": True, "label": comp["name"], "value": comp["value"]}
+                 if comp["minimum"] is not None:
+                     upd["minimum"] = comp["minimum"]
+                 if comp["maximum"] is not None:
+                     upd["maximum"] = comp["maximum"]
+                 input_updates[number_idx] = gr.update(**upd)
+                 input_updates[dropdown_idx] = gr.update(visible=False)
+             else:
+                 input_updates[number_idx] = gr.update(visible=False)
+                 input_updates[dropdown_idx] = gr.update(
+                     visible=True, label=comp["name"], choices=comp["choices"], value=comp["value"]
+                 )
+         input_status = f"📝 Enter values for {len(components_info)} features | {validation_msg}"
+         status_msg = f"✅ **Selected Dataset**: Custom Data | **Target**: {target_col} | **Type**: {problem_type.title()}"
+         return input_updates + [gr.update(visible=True), input_status, status_msg]
+
+     except Exception as e:
+         return [gr.update(visible=False)] * 40 + [gr.update(visible=False), f"❌ Error: {str(e)}", f"❌ Error: {str(e)}"]
+
+
+ # Logistic Regression prediction function
+
+ def execute_prediction(df_preview, target_col, epochs, learning_rate_power, batch_size_power, train_test_split_ratio, threshold, *input_values):
+     global current_dataframe
+     df = current_dataframe
+
+     EMPTY_PLOT = None
+     EMPTY_HTML = ""
+     error_style = "<div style='background:#FFEBEE;border-left:6px solid #C62828;padding:14px 16px;border-radius:10px;'><strong>📊 Logistic Regression</strong><br><br>{}</div>"
+
+     # Check if Logistic Regression core is available
+     if not LR_AVAILABLE:
+         return (EMPTY_PLOT, EMPTY_PLOT, error_style.format("❌ Logistic Regression module is not available!<br><br>Please check the installation."))
+
+     if df is None or df.empty:
+         return (EMPTY_PLOT, EMPTY_PLOT, error_style.format("No data available."))
+     if not target_col:
+         return (EMPTY_PLOT, EMPTY_PLOT, error_style.format("Configuration incomplete."))
+
+     is_valid, validation_msg, problem_type = validate_config(df, target_col)
+     if not is_valid:
+         return (EMPTY_PLOT, EMPTY_PLOT, error_style.format("Configuration issue."))
+
+     try:
+         if LR_AVAILABLE:
+             components_info = logistic_regression.create_input_components(df, target_col)
+         else:
+             components_info = create_input_components_fallback(df, target_col)
+
+         new_point_dict = {}
+         for i, comp in enumerate(components_info):
+             # Each feature owns a (Number, Dropdown) pair; read the slot that matches its type
+             value_idx = i * 2 if comp["type"] == "number" else i * 2 + 1
+             v = input_values[value_idx] if value_idx < len(input_values) and input_values[value_idx] is not None else comp["value"]
+             new_point_dict[comp["name"]] = v
+
+         # Convert learning rate slider value to actual learning rate
+         lr_values = [0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1.0]
+         idx = int(learning_rate_power)
+         if 0 <= idx < len(lr_values):
+             lr_float = lr_values[idx]
+         else:
+             lr_float = 0.01  # Default fallback
+
+         # Convert batch_size_power to actual batch size string
+         train_size = int(len(df) * train_test_split_ratio)
+         import math
+         max_power = int(math.log2(train_size)) if train_size > 0 else 0
+
+         if batch_size_power >= max_power + 1:
+             batch_size_str = "Full Batch"
+         else:
+             actual_batch_size = 2 ** int(batch_size_power)
+             batch_size_str = str(actual_batch_size)
+
+         train_loss_fig, val_loss_fig, results_display, prediction = logistic_regression.run_logistic_regression_and_visualize(
+             df, target_col, new_point_dict, epochs, lr_float, batch_size_str, train_test_split_ratio, threshold
+         )
+
+         return (train_loss_fig, val_loss_fig, results_display)
+
+     except Exception as e:
+         print(f"Execution error: {str(e)}")  # For debugging
+         import traceback
+         traceback.print_exc()
+         return (EMPTY_PLOT, EMPTY_PLOT, error_style.format(f"Execution error: {str(e)}"))
+
+
+ # No tree visualization needed for logistic regression
+
+
+ with gr.Blocks(theme="gstaff/sketch", css=vlai_template.custom_css, fill_width=True, js=force_light_theme_js) as demo:
+     vlai_template.create_header()
+
+     gr.HTML(vlai_template.render_info_card(
+         icon="📊",
+         title="About this Logistic Regression Demo",
+         description="Interactive demonstration of Logistic Regression using NumPy and gradient descent. Learn binary classification with sigmoid activation, binary cross-entropy loss, and adjustable prediction threshold. Visualize training metrics and experiment with different threshold values."
+     ))
+
+     gr.Markdown("### 📊 **How to Use**: Select binary classification data → Configure target (must have 2 classes) → Set training parameters → Adjust threshold → Enter feature values → Run training!")
+
+     with gr.Row(equal_height=False, variant="panel"):
+         with gr.Column(scale=45):
+             with gr.Accordion("📊 Data & Configuration", open=True):
+                 with gr.Row():
+                     with gr.Column(scale=1):
+                         gr.Markdown("Start with sample datasets or upload your own CSV/Excel files.")
+                         file_upload = gr.File(label="📁 Upload Your Data", file_types=[".csv", ".xlsx", ".xls"])
+                     with gr.Column(scale=3):
+                         sample_dataset = gr.Dropdown(choices=list(SAMPLE_DATA_CONFIG.keys()), value="Breast Cancer", label="🗂️ Sample Datasets")
+
+                 with gr.Row():
+                     target_column = gr.Dropdown(choices=[], label="🎯 Target Column", interactive=True)
+
+                 status_message = gr.Markdown("🔄 Loading sample data...")
+                 data_preview = gr.DataFrame(label="📋 Data Preview (First 5 Rows)", row_count=5, interactive=False, max_height=250)
+
+             with gr.Accordion("📊 Training Parameters & Input", open=True):
+                 gr.Markdown("**📊 Logistic Regression Parameters**")
+                 with gr.Row():
+                     epochs = gr.Number(
+                         label="Number of Epochs",
+                         value=100, minimum=1, maximum=1000, precision=0,
+                         info="Number of training iterations"
+                     )
+                     learning_rate_slider = gr.Slider(
+                         label="Learning Rate (Power of 10)",
+                         value=4, minimum=0, maximum=6, step=1,
+                         info="0=1e-6, 1=1e-5, 2=1e-4, 3=1e-3, 4=1e-2, 5=1e-1, 6=1"
+                     )
+                 learning_rate_display = gr.Markdown("**Current Learning Rate:** 0.01")
+                 batch_size_slider = gr.Slider(
+                     label="Batch Size (Power of 2)",
+                     value=10, minimum=0, maximum=10, step=1,
+                     info="Slide to select: 0=1, 1=2, 2=4, 3=8, ... Max=Full Batch"
+                 )
+                 batch_size_display = gr.Markdown("**Current Batch Size:** Full Batch")
+
+                 gr.Markdown("**📊 Data Split Configuration**")
+                 with gr.Row():
+                     train_test_split_ratio = gr.Slider(
+                         label="Train/Validation Split Ratio",
+                         value=0.8, minimum=0.6, maximum=0.9, step=0.05,
+                         info="Proportion of data used for training (e.g., 0.8 = 80% train, 20% validation)"
+                     )
+
+                 gr.Markdown("**🎯 Prediction Threshold Configuration**")
+                 with gr.Row():
+                     threshold = gr.Slider(
+                         label="Classification Threshold",
+                         value=0.5, minimum=0.0, maximum=1.0, step=0.01,
+                         info="Probability threshold for binary classification. Predict class 1 if probability ≥ threshold, else class 0. Adjust to balance precision/recall."
+                     )
+                 threshold_display = gr.Markdown("**Current Threshold:** 0.50")
+
+                 inputs_group = gr.Group(visible=False)
+                 with inputs_group:
+                     input_status = gr.Markdown("Configure inputs above.")
+                     gr.Markdown("**📝 New Data Point** - Enter feature values for prediction:")
+                     input_components = []
+                     for row in range(5):
+                         with gr.Row():
+                             for col in range(4):
+                                 idx = row * 4 + col
+                                 if idx < 20:
+                                     number_comp = gr.Number(label=f"Feature {idx+1}", visible=False)
+                                     dropdown_comp = gr.Dropdown(label=f"Feature {idx+1}", visible=False)
+                                     input_components.extend([number_comp, dropdown_comp])
+
+             run_prediction_btn = gr.Button("📊 Run Training & Prediction", variant="primary", size="lg")
+
+         with gr.Column(scale=55):
+             gr.Markdown("### 📊 **Logistic Regression Results & Visualization**")
+
+             train_loss_chart = gr.Plot(label="Training Loss & Accuracy Over Epochs", visible=True)
+             val_loss_chart = gr.Plot(label="Validation Loss & Accuracy Over Epochs", visible=True)
+             results_display = gr.HTML("**📊 Logistic Regression Results**<br><br>Training details will appear here showing model performance, learned parameters, and predictions with current threshold.", label="📊 Results & Predictions")
+
+     gr.Markdown("""📊 **Logistic Regression Guide**:
+
+ **📈 Training Metrics**:
+ - **Loss (BCE)**: Binary Cross-Entropy loss decreases as model learns. Lower loss indicates better fit.
+ - **Accuracy**: Classification accuracy improves during training. Monitor both training and validation accuracy.
+
+ **🔧 Training Parameters**:
+ - **Epochs**: Number of complete passes through training data. More epochs = better learning, but watch for overfitting.
+ - **Learning Rate**: Step size for gradient descent. Recommended: 0.001 to 0.01. Too high may cause instability.
+ - **Batch Size**: Samples processed before updating parameters. Powers of 2: 1, 2, 4, 8... or Full Batch. Smaller = faster updates but noisier. Larger = more stable.
+ - **Train/Validation Split**: Proportion of data for training vs validation. Default 80/20 split.
+
+ **🎯 Threshold Parameter**:
+ - **Threshold**: Probability cutoff for binary classification. If predicted probability ≥ threshold → class 1, else → class 0.
+ - **Default**: 0.5 (balanced)
+ - **Lower threshold** (e.g., 0.3): More predictions of class 1 → higher recall, lower precision
+ - **Higher threshold** (e.g., 0.7): Fewer predictions of class 1 → higher precision, lower recall
+ - **Experiment**: Adjust threshold to see how predictions and accuracy change!
+
+ **🧮 Algorithm Details**:
+ - **Sigmoid Activation**: Maps linear output to probability (0-1 range)
+ - **Binary Cross-Entropy Loss**: Optimized for binary classification tasks
+ - **Feature Normalization**: Automatic standardization (zero mean, unit variance) for stable training
+
+ **💡 Tips**:
+ - Start with default parameters (100 epochs, learning rate 0.01, threshold 0.5)
+ - Monitor validation metrics to detect overfitting
+ - Adjust threshold based on your classification goals (precision vs recall)
+ - Use batch size = Full Batch for most stable training
+     """)
+
+     vlai_template.create_footer()
+
+     load_evt = demo.load(
+         fn=lambda: load_and_configure_data(None, "Breast Cancer"),
+         outputs=[data_preview, target_column, status_message] + input_components + [inputs_group, input_status],
+     ).then(
+         fn=update_batch_size_slider,
+         inputs=[data_preview, target_column, train_test_split_ratio],
+         outputs=[batch_size_slider],
+     ).then(
+         fn=update_batch_size_display,
+         inputs=[batch_size_slider, train_test_split_ratio],
+         outputs=[batch_size_display],
+     ).then(
+         fn=update_learning_rate_display,
+         inputs=[learning_rate_slider],
+         outputs=[learning_rate_display],
+     )
+     upload_evt = file_upload.upload(
+         fn=lambda file: load_and_configure_data(file, "Breast Cancer"),
+         inputs=[file_upload],
+         outputs=[data_preview, target_column, status_message] + input_components + [inputs_group, input_status],
+     ).then(
+         fn=update_batch_size_slider,
+         inputs=[data_preview, target_column, train_test_split_ratio],
+         outputs=[batch_size_slider],
+     ).then(
+         fn=update_batch_size_display,
+         inputs=[batch_size_slider, train_test_split_ratio],
+         outputs=[batch_size_display],
+     )
+
+     sample_dataset.change(
+         fn=lambda choice: load_and_configure_data_simple(choice),
+         inputs=[sample_dataset],
+         outputs=[data_preview, target_column, status_message],
+     ).then(
+         fn=update_configuration, inputs=[data_preview, target_column],
+         outputs=input_components + [inputs_group, input_status, status_message],
+     ).then(
+         fn=update_batch_size_slider,
+         inputs=[data_preview, target_column, train_test_split_ratio],
+         outputs=[batch_size_slider],
+     ).then(
+         fn=update_batch_size_display,
+         inputs=[batch_size_slider, train_test_split_ratio],
+         outputs=[batch_size_display],
+     )
+
+     target_column.change(
+         fn=update_configuration, inputs=[data_preview, target_column],
+         outputs=input_components + [inputs_group, input_status, status_message],
+     ).then(
+         fn=update_batch_size_slider,
+         inputs=[data_preview, target_column, train_test_split_ratio],
+         outputs=[batch_size_slider],
+     ).then(
+         fn=update_batch_size_display,
+         inputs=[batch_size_slider, train_test_split_ratio],
+         outputs=[batch_size_display],
+     )
+
608
+ # Update batch size display when slider or train/test split changes
609
+ batch_size_slider.change(
610
+ fn=update_batch_size_display,
611
+ inputs=[batch_size_slider, train_test_split_ratio],
612
+ outputs=[batch_size_display],
613
+ )
614
+
615
+ train_test_split_ratio.change(
616
+ fn=update_batch_size_slider,
617
+ inputs=[data_preview, target_column, train_test_split_ratio],
618
+ outputs=[batch_size_slider],
619
+ ).then(
620
+ fn=update_batch_size_display,
621
+ inputs=[batch_size_slider, train_test_split_ratio],
622
+ outputs=[batch_size_display],
623
+ )
624
+
625
+ # Update learning rate display when slider changes
626
+ learning_rate_slider.change(
627
+ fn=update_learning_rate_display,
628
+ inputs=[learning_rate_slider],
629
+ outputs=[learning_rate_display],
630
+ )
631
+
632
+ threshold.change(
633
+ fn=lambda t: f"**Current Threshold:** {t:.2f}",
634
+ inputs=[threshold],
635
+ outputs=[threshold_display],
636
+ )
637
+
638
+ run_prediction_btn.click(
639
+ fn=execute_prediction,
640
+ inputs=[data_preview, target_column, epochs, learning_rate_slider, batch_size_slider, train_test_split_ratio, threshold] + input_components,
641
+ outputs=[train_loss_chart, val_loss_chart, results_display],
642
+ )
643
+
644
+ if __name__ == "__main__":
645
+ demo.launch(allowed_paths=["static/aivn_logo.png", "static/vlai_logo.png", "static"])
packages.txt ADDED
@@ -0,0 +1,2 @@
+ graphviz
+ fonts-liberation
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ gradio>=5.38.0
+ pandas>=1.5.0
+ scikit-learn>=1.3.0
+ numpy>=1.24.0
+ plotly>=5.15.0
src/__init__.py ADDED
File without changes
src/logistic_regression.py ADDED
@@ -0,0 +1,494 @@
+ import pandas as pd
+ import numpy as np
+ from sklearn.datasets import load_breast_cancer, load_wine, make_classification
+ from sklearn.model_selection import train_test_split
+ from plotly.subplots import make_subplots
+ import plotly.graph_objects as go
+ import time
+
+ _current_model_params = None
+
+ def _get_current_model():
+     return _current_model_params
+
+ def _set_current_model(params):
+     global _current_model_params
+     _current_model_params = params
+
+
+ def load_data(file_obj=None, dataset_choice="Breast Cancer"):
+     """Load binary classification datasets"""
+     if file_obj is not None:
+         if file_obj.name.endswith(".csv"):
+             encodings = ["utf-8", "latin-1", "iso-8859-1", "cp1252"]
+             for encoding in encodings:
+                 try:
+                     return pd.read_csv(file_obj.name, encoding=encoding)
+                 except UnicodeDecodeError:
+                     continue
+             # Last resort: decode as UTF-8 and replace undecodable bytes.
+             # read_csv takes `encoding_errors` (pandas >= 1.3), not `errors`.
+             return pd.read_csv(file_obj.name, encoding="utf-8", encoding_errors="replace")
+         elif file_obj.name.endswith((".xlsx", ".xls")):
+             return pd.read_excel(file_obj.name)
+         else:
+             raise ValueError("Unsupported format. Upload CSV or Excel files.")
+
+     datasets = {
+         "Breast Cancer": lambda: _sklearn_to_df(load_breast_cancer()),
+         "Wine (Binary)": lambda: _wine_to_binary_df(load_wine()),
+         "Synthetic": lambda: _synthetic_classification(),
+     }
+     if dataset_choice not in datasets:
+         raise ValueError(f"Unknown dataset: {dataset_choice}")
+     return datasets[dataset_choice]()
+
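The encoding fallback in `load_data` can be illustrated without pandas. The following is a stdlib-only sketch, not the app's code; `decode_with_fallback` is a hypothetical helper introduced for illustration. It also shows that `latin-1` accepts every byte sequence, so any encoding listed after it is never reached.

```python
def decode_with_fallback(raw: bytes, encodings=("utf-8", "latin-1", "cp1252")):
    """Return (text, encoding_used) for the first encoding that decodes `raw`."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Last resort: decode as UTF-8 and substitute undecodable bytes.
    return raw.decode("utf-8", errors="replace"), "utf-8 (lossy)"


# A Latin-1 encoded "é" is not valid UTF-8, so the second encoding is used.
text, used = decode_with_fallback("café".encode("latin-1"))
print(text, used)
```

Because `latin-1` maps all 256 byte values to characters, it never raises `UnicodeDecodeError`; placing it second makes it the effective catch-all.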
+
+ def _sklearn_to_df(data):
+     """Convert sklearn dataset to DataFrame"""
+     df = pd.DataFrame(data.data, columns=getattr(data, "feature_names", None))
+     if df.columns.isnull().any():
+         df.columns = [f"feature_{i}" for i in range(df.shape[1])]
+     df["target"] = data.target
+     return df
+
+
+ def _wine_to_binary_df(wine_data):
+     """Convert wine dataset to binary classification (class 0 vs others)"""
+     df = pd.DataFrame(wine_data.data, columns=wine_data.feature_names)
+     df["target"] = (wine_data.target == 0).astype(int)
+     return df
+
+
+ def _synthetic_classification():
+     """Generate synthetic binary classification dataset"""
+     X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
+                                n_redundant=5, n_classes=2, random_state=42)
+     df = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])
+     df["target"] = y
+     return df
+
+
+ def create_input_components(df, target_col):
+     """Create input components for feature values"""
+     feature_cols = [c for c in df.columns if c != target_col]
+     components = []
+     for col in feature_cols:
+         data = df[col]
+         val = pd.to_numeric(data, errors="coerce").dropna().mean()
+         val = 0.0 if pd.isna(val) else float(val)
+         components.append(
+             {
+                 "name": col,
+                 "type": "number",
+                 "value": round(val, 3),
+                 "minimum": None,
+                 "maximum": None,
+             }
+         )
+     return components
+
+
+ def preprocess_data(df, target_col, new_point_dict):
+     """Preprocess data for logistic regression"""
+     feature_cols = [c for c in df.columns if c != target_col]
+     X = df[feature_cols].copy()
+     y = df[target_col].copy()
+
+     # Convert to numeric
+     for col in feature_cols:
+         X[col] = pd.to_numeric(X[col], errors="coerce").fillna(0.0)
+
+     # Ensure binary target (0 or 1)
+     unique_vals = sorted(y.unique())
+     if len(unique_vals) != 2:
+         raise ValueError(f"Target must be binary (0/1). Found {len(unique_vals)} unique values: {unique_vals}")
+
+     # Map to 0/1 if needed
+     y_mapped = y.copy()
+     if set(unique_vals) != {0, 1}:
+         mapping = {unique_vals[0]: 0, unique_vals[1]: 1}
+         y_mapped = y.map(mapping)
+
+     # Prepare new point
+     new_point = []
+     for col in feature_cols:
+         if col in new_point_dict:
+             try:
+                 new_point.append(float(new_point_dict[col]))
+             except Exception:
+                 new_point.append(0.0)
+         else:
+             new_point.append(0.0)
+
+     new_point = np.array(new_point, dtype=float).reshape(1, -1)
+
+     return X.values, np.array(y_mapped, dtype=int), new_point, feature_cols
+
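The target handling in `preprocess_data` maps any two distinct label values onto 0 and 1 in sorted order. A minimal pure-Python sketch of that rule, assuming plain lists instead of a pandas Series (`map_binary_labels` is a hypothetical name, not part of the module):

```python
def map_binary_labels(labels):
    """Map exactly two distinct label values to 0 and 1, in sorted order."""
    unique_vals = sorted(set(labels))
    if len(unique_vals) != 2:
        raise ValueError(
            f"Target must be binary. Found {len(unique_vals)} unique values: {unique_vals}"
        )
    # The smaller value (or alphabetically first string) becomes class 0.
    mapping = {unique_vals[0]: 0, unique_vals[1]: 1}
    return [mapping[v] for v in labels], mapping


encoded, mapping = map_binary_labels(["malignant", "benign", "benign"])
print(encoded, mapping)
```

One consequence worth keeping in mind: which original label ends up as class 1 depends only on sort order, so the "positive" class may not be the one a user expects.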
+
+ def add_bias(X):
+     """Add bias column to feature matrix"""
+     return np.c_[np.ones(X.shape[0]), X]
+
+
+ def sigmoid(z):
+     """Sigmoid activation function: σ(z) = 1 / (1 + exp(-z))"""
+     z = np.clip(z, -500, 500)
+     return 1 / (1 + np.exp(-z))
+
+
+ def predict_proba(X, theta):
+     """Make probability predictions: y_hat = sigmoid(X @ theta)"""
+     z = X.dot(theta)
+     return sigmoid(z)
+
+
+ def predict_class(X, theta, threshold=0.5):
+     """Make binary class predictions using threshold"""
+     proba = predict_proba(X, theta)
+     return (proba >= threshold).astype(int)
+
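The effect of the threshold in `predict_class` is easy to see on a handful of fixed probabilities. This is a small stdlib-only illustration (the `classify` helper and the sample probabilities are made up for the example), using the same `>=` comparison:

```python
def classify(probabilities, threshold=0.5):
    """Label a probability as class 1 when it is at or above the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


probs = [0.10, 0.35, 0.50, 0.65, 0.90]
for t in (0.3, 0.5, 0.7):
    print(f"threshold={t}: {classify(probs, t)}")
```

Lowering the threshold turns more borderline points into class 1 (typically higher recall, lower precision); raising it does the opposite.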
+
+ def compute_loss(y_hat, y):
+     """Compute Binary Cross-Entropy loss: -[y*log(ŷ) + (1-y)*log(1-ŷ)]"""
+     eps = 1e-15
+     y_hat = np.clip(y_hat, eps, 1 - eps)
+     loss = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
+     return np.mean(loss)
+
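The clipping step in `compute_loss` exists because `log(0)` is undefined. The sketch below restates the same loss with the standard library only (the `bce` name and the sample inputs are illustrative assumptions) to show that a completely wrong, fully confident prediction still yields a finite loss:

```python
import math


def bce(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy with probabilities clipped to [eps, 1 - eps]."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep log() away from 0
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


print(bce([1, 0], [0.5, 0.5]))  # uninformative predictions
print(bce([1], [0.0]))          # confident and wrong: large but finite
```

With `eps = 1e-15` the per-sample loss is capped at roughly `-log(1e-15)`, so a single bad prediction cannot turn the epoch loss into `inf` or `nan`.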
+
+ def compute_gradient(y_hat, y, X):
+     """Compute gradient: X.T @ (y_hat - y) / N"""
+     N = len(y)
+     return X.T.dot(y_hat - y) / N
+
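For reference, the expression in `compute_gradient` follows from the chain rule applied to the mean binary cross-entropy. The derivation below uses the standard notation $z_i = x_i^\top\theta$ and $\hat{y}_i = \sigma(z_i)$; these symbols are assumed here rather than taken from the code.

```latex
\begin{aligned}
L(\theta) &= -\frac{1}{N}\sum_{i=1}^{N}\Big[y_i\log\hat{y}_i + (1-y_i)\log(1-\hat{y}_i)\Big],
\qquad \hat{y}_i = \sigma(z_i),\quad z_i = x_i^\top\theta \\
\frac{\partial L}{\partial \hat{y}_i} &= -\frac{1}{N}\left[\frac{y_i}{\hat{y}_i} - \frac{1-y_i}{1-\hat{y}_i}\right]
= \frac{1}{N}\,\frac{\hat{y}_i - y_i}{\hat{y}_i(1-\hat{y}_i)} \\
\frac{\partial \hat{y}_i}{\partial z_i} &= \sigma(z_i)\big(1-\sigma(z_i)\big) = \hat{y}_i(1-\hat{y}_i) \\
\frac{\partial L}{\partial z_i} &= \frac{1}{N}\,(\hat{y}_i - y_i) \\
\nabla_\theta L &= \sum_{i=1}^{N}\frac{\partial L}{\partial z_i}\,x_i = \frac{1}{N}\,X^\top(\hat{y} - y)
\end{aligned}
```

The sigmoid derivative cancels the denominator of the loss derivative, which is why the gradient has the same simple form as in linear regression.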
+
+ def update_theta(theta, gradient, lr):
+     """Update parameters using gradient descent"""
+     return theta - lr * gradient
+
+
+ def compute_accuracy(y_true, y_pred):
+     """Compute classification accuracy"""
+     return np.mean(y_true == y_pred)
+
+
+ def normalize_features(X_train, X_val=None, X_test=None):
+     """Normalize features using standardization (zero mean, unit variance)"""
+     mean = np.mean(X_train, axis=0)
+     std = np.std(X_train, axis=0)
+     std[std == 0] = 1
+
+     X_train_norm = (X_train - mean) / std
+     X_val_norm = (X_val - mean) / std if X_val is not None else None
+     X_test_norm = (X_test - mean) / std if X_test is not None else None
+
+     return X_train_norm, X_val_norm, X_test_norm, mean, std
+
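`normalize_features` fits the mean and standard deviation on the training split only and reuses them for validation data. A one-column, stdlib-only sketch of the same idea follows; `fit_scaler` and `transform` are hypothetical names, and `statistics.pstdev` is used because it computes the population standard deviation, which is what `np.std` returns by default.

```python
from statistics import fmean, pstdev


def fit_scaler(column):
    """Return (mean, std) from training values; a zero std is replaced by 1."""
    mean = fmean(column)
    std = pstdev(column)  # population std, like np.std with ddof=0
    return mean, (std if std != 0 else 1.0)


def transform(column, mean, std):
    """Standardize values using statistics fitted elsewhere."""
    return [(x - mean) / std for x in column]


train = [2.0, 4.0, 6.0]
val = [8.0]
mean, std = fit_scaler(train)
print(transform(train, mean, std))
print(transform(val, mean, std))  # scaled with the training statistics
```

Reusing the training statistics keeps validation data from leaking into preprocessing, and the zero-std guard avoids a division by zero on constant columns.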
+
+ def train_logistic_regression_with_validation(X_train, y_train, X_val, y_val, epochs, learning_rate, batch_size=None):
+     """
+     Train logistic regression with mini-batch gradient descent
+
+     Returns:
+         theta, train_losses, val_losses, train_accuracies, val_accuracies, X_mean, X_std
+     """
+     X_train_norm, X_val_norm, _, X_mean, X_std = normalize_features(X_train, X_val)
+
+     X_train_bias = add_bias(X_train_norm)
+     X_val_bias = add_bias(X_val_norm)
+
+     np.random.seed(42)
+     theta = np.random.randn(X_train_bias.shape[1]) * 0.01
+
+     train_losses = []
+     val_losses = []
+     train_accuracies = []
+     val_accuracies = []
+
+     n_samples = X_train_bias.shape[0]
+
+     if batch_size is None or batch_size >= n_samples:
+         actual_batch_size = n_samples
+     else:
+         actual_batch_size = batch_size
+
+     for epoch in range(epochs):
+         if actual_batch_size < n_samples:
+             indices = np.random.permutation(n_samples)
+             X_train_shuffled = X_train_bias[indices]
+             y_train_shuffled = y_train[indices]
+         else:
+             X_train_shuffled = X_train_bias
+             y_train_shuffled = y_train
+
+         for i in range(0, n_samples, actual_batch_size):
+             X_batch = X_train_shuffled[i:i+actual_batch_size]
+             y_batch = y_train_shuffled[i:i+actual_batch_size]
+
+             y_batch_hat = predict_proba(X_batch, theta)
+             gradient = compute_gradient(y_batch_hat, y_batch, X_batch)
+             theta = update_theta(theta, gradient, learning_rate)
+
+         y_train_hat = predict_proba(X_train_bias, theta)
+         train_loss = compute_loss(y_train_hat, y_train)
+         train_losses.append(train_loss)
+
+         y_train_pred = predict_class(X_train_bias, theta)
+         train_acc = compute_accuracy(y_train, y_train_pred)
+         train_accuracies.append(train_acc)
+
+         y_val_hat = predict_proba(X_val_bias, theta)
+         val_loss = compute_loss(y_val_hat, y_val)
+         val_losses.append(val_loss)
+
+         y_val_pred = predict_class(X_val_bias, theta)
+         val_acc = compute_accuracy(y_val, y_val_pred)
+         val_accuracies.append(val_acc)
+
+     return theta, train_losses, val_losses, train_accuracies, val_accuracies, X_mean, X_std
+
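The training loop above (shuffle, slice into batches, one gradient step per batch, record the loss per epoch) can be reproduced on a toy problem without NumPy. The following is a dependency-free sketch under stated assumptions, not the module's implementation: the helper names, the zero initialization, the hyperparameters, and the four-point dataset are all chosen for illustration.

```python
import math
import random


def sigmoid(z):
    z = max(-500.0, min(500.0, z))  # same clipping idea as the NumPy version
    return 1.0 / (1.0 + math.exp(-z))


def predict(row, theta):
    return sigmoid(sum(w * x for w, x in zip(theta, row)))


def bce(rows, labels, theta, eps=1e-15):
    total = 0.0
    for row, y in zip(rows, labels):
        p = min(max(predict(row, theta), eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(rows)


def train(rows, labels, epochs=50, lr=0.5, batch_size=2, seed=42):
    """Mini-batch gradient descent; rows already include a leading 1.0 bias term."""
    rng = random.Random(seed)
    theta = [0.0] * len(rows[0])
    history = []
    for _ in range(epochs):
        order = list(range(len(rows)))
        rng.shuffle(order)  # new sample order every epoch
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            grad = [0.0] * len(theta)
            for i in batch:
                err = predict(rows[i], theta) - labels[i]  # y_hat - y
                for j, x in enumerate(rows[i]):
                    grad[j] += err * x / len(batch)
            theta = [w - lr * g for w, g in zip(theta, grad)]
        history.append(bce(rows, labels, theta))  # full-data loss per epoch
    return theta, history


# Each row is [bias, feature]; negative features belong to class 0.
rows = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
labels = [0, 0, 1, 1]
theta, history = train(rows, labels)
print(f"first epoch loss: {history[0]:.4f}, last epoch loss: {history[-1]:.4f}")
```

As in the NumPy code, the loss is evaluated on the full training set after each epoch rather than averaged over batches, so the curve reflects the parameters at the end of the epoch.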
+
+ def run_logistic_regression_and_visualize(df, target_col, new_point_dict,
+                                           epochs, learning_rate, batch_size_str="Full Batch",
+                                           train_test_split_ratio=0.8, threshold=0.5):
+     """Run logistic regression training and generate visualizations"""
+     X, y, new_point, feature_cols = preprocess_data(df, target_col, new_point_dict)
+
+     # Error returns use the same 4-tuple shape as the success path:
+     # (train_loss_fig, val_loss_fig, results_display, prediction_proba)
+     if epochs < 1:
+         return None, None, "Number of epochs must be ≥ 1.", None
+     if learning_rate <= 0:
+         return None, None, "Learning rate must be > 0.", None
+
+     test_size = 1.0 - train_test_split_ratio
+     X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=test_size, random_state=42, stratify=y)
+
+     if batch_size_str == "Full Batch":
+         batch_size = None
+     else:
+         batch_size = int(batch_size_str)
+
+     start_time = time.time()
+     theta, train_losses, val_losses, train_accuracies, val_accuracies, X_mean, X_std = train_logistic_regression_with_validation(
+         X_train, y_train, X_val, y_val, epochs, learning_rate, batch_size
+     )
+     training_time = time.time() - start_time
+
+     _set_current_model({
+         "theta": theta,
+         "feature_cols": feature_cols,
+         "X_mean": X_mean,
+         "X_std": X_std
+     })
+
+     # Prepare normalized data for prediction with threshold
+     X_train_norm, X_val_norm, _, _, _ = normalize_features(X_train, X_val)
+     X_train_bias = add_bias(X_train_norm)
+     X_val_bias = add_bias(X_val_norm)
+
+     # Make prediction with threshold
+     new_point_norm = (new_point - X_mean) / X_std
+     new_point_bias = add_bias(new_point_norm)
+     prediction_proba = predict_proba(new_point_bias, theta)[0]
+     prediction_class = predict_class(new_point_bias, theta, threshold)[0]
+
+     # Compute metrics with threshold
+     y_train_pred_thresh = predict_class(X_train_bias, theta, threshold)
+     y_val_pred_thresh = predict_class(X_val_bias, theta, threshold)
+     train_acc_thresh = compute_accuracy(y_train, y_train_pred_thresh)
+     val_acc_thresh = compute_accuracy(y_val, y_val_pred_thresh)
+
+     final_train_loss = train_losses[-1]
+     final_val_loss = val_losses[-1]
+     final_train_acc = train_accuracies[-1]
+     final_val_acc = val_accuracies[-1]
+
+     train_loss_fig = create_training_loss_chart(train_losses, train_accuracies)
+     val_loss_fig = create_validation_loss_chart(val_losses, val_accuracies)
+
+     results_display = create_results_display(
+         theta, prediction_proba, prediction_class, feature_cols, epochs, learning_rate, threshold,
+         split_info={
+             "train_size": len(X_train),
+             "val_size": len(X_val),
+             "train_ratio": train_test_split_ratio,
+             "val_ratio": 1.0 - train_test_split_ratio,
+             "train_loss": final_train_loss,
+             "val_loss": final_val_loss,
+             "train_acc": final_train_acc,
+             "val_acc": final_val_acc,
+             "train_acc_thresh": train_acc_thresh,
+             "val_acc_thresh": val_acc_thresh,
+             "batch_size": batch_size_str,
+             "training_time": training_time
+         }
+     )
+
+     return train_loss_fig, val_loss_fig, results_display, prediction_proba
+
+
+ def create_training_loss_chart(train_losses, train_accuracies):
+     """Create training loss and accuracy visualization"""
+     if not train_losses or len(train_losses) == 0:
+         return None
+
+     epochs = list(range(1, len(train_losses) + 1))
+     valid_losses = [loss if not (np.isinf(loss) or np.isnan(loss)) else None for loss in train_losses]
+
+     fig = make_subplots(
+         rows=2, cols=1,
+         subplot_titles=("Training Loss (Binary Cross-Entropy)", "Training Accuracy"),
+         vertical_spacing=0.15,
+         row_heights=[0.5, 0.5]
+     )
+
+     fig.add_trace(
+         go.Scatter(
+             x=epochs,
+             y=valid_losses,
+             mode='lines+markers',
+             name='Training Loss',
+             line=dict(color='#1976D2', width=3),
+             marker=dict(size=6),
+             showlegend=True
+         ),
+         row=1, col=1
+     )
+
+     if train_accuracies and len(train_accuracies) == len(train_losses):
+         valid_accuracies = [acc * 100 if not (np.isinf(acc) or np.isnan(acc)) else None for acc in train_accuracies]
+         fig.add_trace(
+             go.Scatter(
+                 x=epochs,
+                 y=valid_accuracies,
+                 mode='lines+markers',
+                 name='Training Accuracy',
+                 line=dict(color='#42A5F5', width=3),
+                 marker=dict(size=6),
+                 showlegend=True
+             ),
+             row=2, col=1
+         )
+
+     fig.update_xaxes(title_text="Epoch", row=1, col=1, showgrid=True, gridwidth=1, gridcolor='lightgray')
+     fig.update_yaxes(title_text="Loss", row=1, col=1, showgrid=True, gridwidth=1, gridcolor='lightgray')
+     fig.update_xaxes(title_text="Epoch", row=2, col=1, showgrid=True, gridwidth=1, gridcolor='lightgray')
+     fig.update_yaxes(title_text="Accuracy (%)", row=2, col=1, showgrid=True, gridwidth=1, gridcolor='lightgray', range=[0, 100])
+
+     fig.update_layout(
+         title="Training Metrics Over Epochs",
+         plot_bgcolor="white",
+         height=600,
+         margin=dict(l=40, r=40, t=80, b=40)
+     )
+
+     return fig
+
+
+ def create_validation_loss_chart(val_losses, val_accuracies):
+     """Create validation loss and accuracy visualization"""
+     if not val_losses or len(val_losses) == 0:
+         return None
+
+     epochs = list(range(1, len(val_losses) + 1))
+     valid_losses = [loss if not (np.isinf(loss) or np.isnan(loss)) else None for loss in val_losses]
+
+     fig = make_subplots(
+         rows=2, cols=1,
+         subplot_titles=("Validation Loss (Binary Cross-Entropy)", "Validation Accuracy"),
+         vertical_spacing=0.15,
+         row_heights=[0.5, 0.5]
+     )
+
+     fig.add_trace(
+         go.Scatter(
+             x=epochs,
+             y=valid_losses,
+             mode='lines+markers',
+             name='Validation Loss',
+             line=dict(color='#7B1FA2', width=3),
+             marker=dict(size=6),
+             showlegend=True
+         ),
+         row=1, col=1
+     )
+
+     if val_accuracies and len(val_accuracies) == len(val_losses):
+         valid_accuracies = [acc * 100 if not (np.isinf(acc) or np.isnan(acc)) else None for acc in val_accuracies]
+         fig.add_trace(
+             go.Scatter(
+                 x=epochs,
+                 y=valid_accuracies,
+                 mode='lines+markers',
+                 name='Validation Accuracy',
+                 line=dict(color='#BA68C8', width=3),
+                 marker=dict(size=6),
+                 showlegend=True
+             ),
+             row=2, col=1
+         )
+
+     fig.update_xaxes(title_text="Epoch", row=1, col=1, showgrid=True, gridwidth=1, gridcolor='lightgray')
+     fig.update_yaxes(title_text="Loss", row=1, col=1, showgrid=True, gridwidth=1, gridcolor='lightgray')
+     fig.update_xaxes(title_text="Epoch", row=2, col=1, showgrid=True, gridwidth=1, gridcolor='lightgray')
+     fig.update_yaxes(title_text="Accuracy (%)", row=2, col=1, showgrid=True, gridwidth=1, gridcolor='lightgray', range=[0, 100])
+
+     fig.update_layout(
+         title="Validation Metrics Over Epochs",
+         plot_bgcolor="white",
+         height=600,
+         margin=dict(l=40, r=40, t=80, b=40)
+     )
+
+     return fig
+
+
+ def create_results_display(theta, prediction_proba, prediction_class, feature_cols, epochs, learning_rate, threshold, split_info):
+     """Create HTML display showing model results"""
+
+     # Format all parameters (bias first) to four decimal places
+     theta_str = "[" + ", ".join(f"{w:.4f}" for w in theta) + "]"
+
+     html_content = f"""
+     <div style='background:#E3F2FD;border-left:6px solid #1976D2;padding:14px 16px;border-radius:10px;'>
+         <strong style='color:#0D47A1;'>📊 Logistic Regression Results</strong><br><br>
+
+         <div style='margin:8px 0;'>
+             <strong style='color:#1976D2;'>🔧 Model Configuration:</strong><br>
+             • Epochs: {epochs} | Learning Rate: {learning_rate}<br>
+             • Batch Size: {split_info.get('batch_size', 'Full Batch')} | Features: {len(feature_cols)}<br>
+             • Normalization: Standardized | Activation: Sigmoid | Loss: Binary Cross-Entropy<br>
+         </div>
+
+         <div style='margin:8px 0;'>
+             <strong style='color:#1976D2;'>📊 Data Split:</strong><br>
+             • Training: {split_info['train_size']} samples ({split_info['train_ratio']:.1%})<br>
+             • Validation: {split_info['val_size']} samples ({split_info['val_ratio']:.1%})<br>
+         </div>
+
+         <div style='margin:8px 0;'>
+             <strong style='color:#1976D2;'>📈 Performance Metrics:</strong><br>
+             • Training Loss (BCE): <span style='background:#BBDEFB;padding:2px 6px;border-radius:4px;'><strong>{split_info['train_loss']:.4f}</strong></span><br>
+             • Validation Loss (BCE): <span style='background:#C5CAE9;padding:2px 6px;border-radius:4px;'><strong>{split_info['val_loss']:.4f}</strong></span><br>
+             • Training Accuracy (threshold={threshold:.2f}): <span style='background:#BBDEFB;padding:2px 6px;border-radius:4px;'><strong>{split_info['train_acc_thresh']*100:.2f}%</strong></span><br>
+             • Validation Accuracy (threshold={threshold:.2f}): <span style='background:#C5CAE9;padding:2px 6px;border-radius:4px;'><strong>{split_info['val_acc_thresh']*100:.2f}%</strong></span><br>
+             • Training Time: <span style='background:#E1BEE7;padding:2px 6px;border-radius:4px;'><strong>{split_info['training_time']:.4f}s</strong></span><br>
+         </div>
+
+         <div style='margin:8px 0;'>
+             <strong style='color:#1976D2;'>🎯 Learned Parameters (θ):</strong><br>
+             • Theta = <code style='background:#F3E5F5;padding:2px 6px;border-radius:4px;'>{theta_str}</code><br>
+             • Bias (θ₀) = {theta[0]:.4f}<br>
+         </div>
+
+         <div style='margin:8px 0;'>
+             <strong style='color:#1976D2;'>🔮 Prediction (Threshold = {threshold:.2f}):</strong><br>
+             • Probability: <span style='background:#DCEDC8;padding:2px 6px;border-radius:4px;'><strong>{prediction_proba:.4f}</strong></span> ({(prediction_proba*100):.2f}%)<br>
+             • Predicted Class: <span style='background:#DCEDC8;padding:2px 6px;border-radius:4px;'><strong>{prediction_class}</strong></span> (0 = Class 0, 1 = Class 1)<br>
+             <em style='font-size:0.9em;color:#424242;'>* Adjust threshold to see how predictions change. Lower threshold → more predictions of class 1</em><br>
+         </div>
+     </div>
+     """
+
+     return html_content
+
static/aivn_logo.png ADDED
static/vlai_logo.png ADDED
vlai_template.py ADDED
@@ -0,0 +1,250 @@
+ import os
+ import base64
+ import gradio as gr
+
+ # Theming (can be overridden by the host app)
+ PRIMARY_COLOR = "#0F6CBD"  # medical calm blue
+ ACCENT_COLOR = "#C4314B"  # medical alert red
+ SUCCESS_COLOR = "#2E7D32"  # positive/ok
+ BG1 = "#F0F7FF"
+ BG2 = "#E8F0FA"
+ BG3 = "#DDE7F8"
+ FONT_FAMILY = "'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, 'Noto Sans', 'Liberation Sans', sans-serif"
+
+ # App metadata (overridable)
+ PROJECT_NAME = "Demo Project"
+ AIO_YEAR = "2025"
+ AIO_MODULE = "00"
+ PROJECT_DESCRIPTION = ""
+ META_INFO = []  # list of (label, value)
+
+ def set_colors(primary: str = None, accent: str = None, bg1: str = None, bg2: str = None, bg3: str = None):
+     """Allow host app to set theme colors dynamically."""
+     global PRIMARY_COLOR, ACCENT_COLOR, BG1, BG2, BG3, custom_css
+     if primary:
+         PRIMARY_COLOR = primary
+     if accent:
+         ACCENT_COLOR = accent
+     if bg1:
+         BG1 = bg1
+     if bg2:
+         BG2 = bg2
+     if bg3:
+         BG3 = bg3
+     # Rebuild CSS with new colors
+     custom_css = _build_custom_css()
+
+ def set_font(font_family: str):
+     """Allow host app to set a custom font stack (e.g., 'Inter', system fallbacks)."""
+     global FONT_FAMILY, custom_css
+     if font_family and isinstance(font_family, str):
+         FONT_FAMILY = font_family
+         custom_css = _build_custom_css()
+
+ def set_meta(project_name: str = None, year: str = None, module: str = None, description: str = None, meta_items: list = None):
+     """Set project metadata used across the header and info sections."""
+     global PROJECT_NAME, AIO_YEAR, AIO_MODULE, PROJECT_DESCRIPTION, META_INFO
+     if project_name is not None:
+         PROJECT_NAME = project_name
+     if year is not None:
+         AIO_YEAR = year
+     if module is not None:
+         AIO_MODULE = module
+     if description is not None:
+         PROJECT_DESCRIPTION = description
+     if meta_items is not None:
+         META_INFO = meta_items
+
+ def configure(project_name: str = None, year: str = None, module: str = None, description: str = None,
+               colors: dict = None, font_family: str = None, meta_items: list = None):
+     """One-call configuration for meta, theme, and font."""
+     if colors:
+         set_colors(
+             primary=colors.get("primary"),
+             accent=colors.get("accent"),
+             bg1=colors.get("bg1"),
+             bg2=colors.get("bg2"),
+             bg3=colors.get("bg3"),
+         )
+     if font_family:
+         set_font(font_family)
+     set_meta(project_name, year, module, description, meta_items)
+
+
+ def image_to_base64(image_path: str):
+     # Construct the absolute path to the image
+     current_dir = os.path.dirname(os.path.abspath(__file__))
+     full_image_path = os.path.join(current_dir, image_path)
+     with open(full_image_path, "rb") as f:
+         return base64.b64encode(f.read()).decode("utf-8")
+
+ def create_header():
+     with gr.Row():
+         with gr.Column(scale=2):
+             logo_base64 = image_to_base64("static/aivn_logo.png")
+             gr.HTML(
+                 f"""<img src="data:image/png;base64,{logo_base64}"
+                         alt="Logo"
+                         style="height:120px;width:auto;margin:0 auto;margin-bottom:16px; display:block;">"""
+             )
+         with gr.Column(scale=2):
+             gr.HTML(f"""
+             <div style="display:flex;justify-content:flex-start;align-items:center;gap:30px;">
+                 <div>
+                     <h1 style="margin-bottom:0; color: {PRIMARY_COLOR}; font-size: 2.5em; font-weight: bold;"> {PROJECT_NAME} </h1>
+                     <h3 style="color: #888; font-style: italic"> AIO{AIO_YEAR}: Module {AIO_MODULE}. </h3>
+                 </div>
+             </div>
+             """)
+
+ def create_footer():
+     logo_base64_vlai = image_to_base64("static/vlai_logo.png")
+     footer_html = """
+     <style>
+         .sticky-footer{position:fixed;bottom:0px;left:0;width:100%;background:#E8F5E8;
+             padding:10px;box-shadow:0 -2px 10px rgba(0,0,0,0.1);z-index:1000;}
+         .content-wrap{padding-bottom:60px;}
+     </style>""" + f"""
+     <div class="sticky-footer">
+         <div style="text-align:center;font-size:18px; color: #888">
+             Created by
+             <a href="https://vlai.work" target="_blank" style="color:#465C88;text-decoration:none;font-weight:bold; display:inline-flex; align-items:center;"> VLAI
+                 <img src="data:image/png;base64,{logo_base64_vlai}" alt="Logo" style="height:20px; width:auto;">
+             </a> from <a href="https://aivietnam.edu.vn/" target="_blank" style="color:#355724;text-decoration:none;font-weight:bold">AI VIET NAM</a>
+         </div>
+     </div>
+     """
+     return gr.HTML(footer_html)
+
+ def _build_custom_css() -> str:
+     return f"""
+     @import url('https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&display=swap');
+
+     .gradio-container {{
+         min-height: 100vh !important;
+         width: 100vw !important;
+         margin: 0 !important;
+         padding: 0px !important;
+         background: linear-gradient(135deg, {BG1} 0%, {BG2} 50%, {BG3} 100%);
+         background-size: 600% 600%;
+         animation: gradientBG 7s ease infinite;
+     }}
+
+     /* Global font setup */
+     body, .gradio-container, .gr-block, .gr-markdown, .gr-button, .gr-input,
+     .gr-dropdown, .gr-number, .gr-plot, .gr-dataframe, .gr-accordion, .gr-form,
+     .gr-textbox, .gr-html, table, th, td, label, h1, h2, h3, h4, h5, h6, p, span, div {{
+         font-family: {FONT_FAMILY} !important;
+     }}
+
+     @keyframes gradientBG {{
+         0% {{background-position: 0% 50%;}}
+         50% {{background-position: 100% 50%;}}
+         100% {{background-position: 0% 50%;}}
+     }}
+
+     /* Minimize spacing and padding */
+     .content-wrap {{
+         padding: 2px !important;
+         margin: 0 !important;
+     }}
+
+     /* Reduce component spacing */
+     .gr-row {{
+         gap: 5px !important;
+         margin: 2px 0 !important;
+     }}
+
+     .gr-column {{
+         gap: 4px !important;
+         padding: 4px !important;
+     }}
+
+     /* Accordion optimization */
+     .gr-accordion {{
+         margin: 4px 0 !important;
+     }}
+
+     .gr-accordion .gr-accordion-content {{
+         padding: 2px !important;
+     }}
+
+     /* Form elements spacing */
+     .gr-form {{
+         gap: 2px !important;
+     }}
+
+     /* Button styling */
+     .gr-button {{
+         margin: 2px 0 !important;
+     }}
+
+     /* DataFrame optimization */
+     .gr-dataframe {{
+         margin: 4px 0 !important;
+     }}
+
+     /* Keep wide data previews scrollable inside their container */
+     .gr-dataframe .wrap {{
+         overflow-x: auto !important;
+         max-width: 100% !important;
+     }}
+
+     /* Plot optimization */
+     .gr-plot {{
+         margin: 4px 0 !important;
+     }}
+
+     /* Reduce markdown margins */
+     .gr-markdown {{
+         margin: 2px 0 !important;
+     }}
+
+     /* Footer positioning */
+     .sticky-footer {{
+         position: fixed;
+         bottom: 0px;
+         left: 0;
+         width: 100%;
+         background: {BG1};
+         padding: 6px !important;
+         box-shadow: 0 -2px 10px rgba(0,0,0,0.1);
+         z-index: 1000;
+     }}
+     """
+
+ # Initialize CSS using defaults
+ custom_css = _build_custom_css()
+
+ def render_info_card(description: str = None, meta_items: list = None, icon: str = "🧠", title: str = "About this demo") -> str:
+     desc = description if description is not None else PROJECT_DESCRIPTION
+     items = meta_items if meta_items is not None else META_INFO
+     meta_html = " · ".join([f"<span><strong>{k}</strong>: {v}</span>" for k, v in items]) if items else ""
+     return f"""
+     <div style="margin: 8px 0 8px 0;">
+         <div style="background:#F5F9FF;border-left:6px solid {PRIMARY_COLOR};padding:14px 16px;border-radius:10px;box-shadow:0 1px 3px rgba(0,0,0,0.06);">
+             <div style="display:flex;gap:14px;align-items:flex-start;">
+                 <div style="font-size:22px;">{icon}</div>
+                 <div>
+                     <div style="font-weight:700;color:{PRIMARY_COLOR};margin-bottom:4px;">{title}</div>
+                     <div style="color:#000;font-size:14px;line-height:1.5;">{desc}</div>
+                     <div style="margin-top:8px;color:#000;font-size:13px;">{meta_html}</div>
+                 </div>
+             </div>
+         </div>
+     </div>
+     """
+
+ def render_disclaimer(text: str, icon: str = "⚠️", title: str = "Educational Use Only") -> str:
+     return f"""
+     <div style="margin: 8px 0 6px 0;">
+         <div style="background:#FFF4F4;border-left:6px solid {ACCENT_COLOR};padding:12px 16px;border-radius:8px;box-shadow:0 1px 3px rgba(0,0,0,0.06);">
+             <div style="display:flex;gap:10px;align-items:flex-start;color:#000;">
+                 <span style="font-size:20px">{icon}</span>
+                 <div>
+                     <div style="font-weight:700; margin-bottom:4px;">{title}</div>
+                     <div style="font-size:14px; line-height:1.4;">{text}</div>
+                 </div>
+             </div>
+         </div>
+     </div>
+     """