heffnt committed on
Commit a44add2 · 1 Parent(s): 3453dc9

Add initial project structure and configuration files

- Created .dockerignore to exclude unnecessary files from Docker builds.
- Added .env.example for environment variable configuration.
- Updated .gitignore to include .env and virtual environment directories.
- Introduced app.py with a detailed description and initial setup for the Smart Confidant chatbot.
- Added deploy.sh for automated deployment to a remote server.
- Created environment.yml for managing dependencies with micromamba.
- Included PROJECT_REPORT.md for documentation of the deployment process and challenges.
- Established pyproject.toml for project metadata and dependencies.
- Enhanced README.md with setup instructions and features overview.
- Removed requirements.txt as dependencies are now managed in pyproject.toml.
- Added restart.sh for quick application restarts on the server.

Files changed (11)
  1. .dockerignore +47 -0
  2. .env.example +0 -0
  3. .gitignore +18 -1
  4. README.md +93 -17
  5. app.py +288 -146
  6. deploy.sh +196 -0
  7. env.example +5 -0
  8. environment.yml +10 -0
  9. pyproject.toml +17 -0
  10. requirements.txt +0 -3
  11. restart.sh +90 -0
.dockerignore ADDED
@@ -0,0 +1,47 @@
+ # Git
+ .git
+ .gitignore
+ .gitattributes
+
+ # Documentation
+ *.md
+ !README.md
+
+ # Python
+ __pycache__/
+ *.py[cod]
+ *.pyo
+ *.pyd
+ *.egg-info/
+ .pytest_cache/
+
+ # Virtual environments
+ venv/
+ .venv/
+ ENV/
+ env/
+
+ # IDEs
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *~
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Temporary files
+ tmp/
+ *.log
+ *.tmp
+
+ # Development files
+ .python-version
+ deploy.sh
+ docker-compose.yml
+
+ # Test files
+ tests/
+
.env.example ADDED
File without changes
.gitignore CHANGED
@@ -5,4 +5,21 @@ __pycache__/
  *.db
  *.sqlite3
  *.log
- *.env
+ .env
+
+ # Virtual environments
+ venv/
+ .venv/
+ ENV/
+ env/
+
+ # Conda
+ .conda/
+ *.egg-info/
+
+ # uv
+ .python-version
+ uv.lock
+
+ # Temporary files
+ tmp/
README.md CHANGED
@@ -1,17 +1,93 @@
- ---
- title: CSDS553 Demo
- emoji: 💬
- colorFrom: yellow
- colorTo: purple
- sdk: gradio
- sdk_version: 5.44.1
- app_file: app.py
- pinned: false
- hf_oauth: true
- hf_oauth_scopes:
-   - inference-api
- ---
-
- An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
-
- Increment this to force push to Github: 1243
+ # 🎓🧙🏻‍♂️ Smart Confidant 🧙🏻‍♂️🎓
+
+ An AI chatbot assistant for Magic: The Gathering, built with [Gradio](https://gradio.app) and Hugging Face models.
+
+ ## Features
+
+ - 🎨 Custom themed UI with Magic: The Gathering aesthetics
+ - 🤖 Multiple model support (local and API-based)
+ - 💬 Chat history with custom avatars
+ - ⚙️ Configurable generation parameters (temperature, max tokens, top-p)
+ - 📊 Resource monitoring (CPU, memory usage)
+
+ ## Setup
+
+ ### Local Development (Windows/Mac/Linux)
+
+ ```bash
+ # 1. Set up environment variables (for API models):
+ cp env.example .env
+ # Edit .env and add your HuggingFace token
+
+ # 2. Create conda environment
+ conda env create -f environment.yml
+
+ # 3. Activate environment
+ conda activate smart-confidant
+
+ # 4. Install dependencies with uv
+ pip install uv
+ uv pip install -e .
+
+ # 5. Run the application
+ python app.py
+ ```
+
+ The app will be available at `http://localhost:8012`
+
+ ### Linux Deployment
+
+ Deploy to a remote server in one command:
+ ```bash
+ # 1. Set up your HuggingFace token (for API models):
+ cp env.example .env
+ # Edit .env and add your token
+
+ # 2. Deploy:
+ ./deploy.sh
+ ```
+
+ This script will:
+ - Load HF_TOKEN from `.env` file (if present)
+ - Handle SSH key authentication
+ - Copy your code to the server
+ - Install micromamba
+ - Set up environment
+ - Install dependencies with uv
+ - Start the application
+ - Pass HF_TOKEN to enable API models
+
+ The app will be available at `http://your-server:8012`
+
+ **Note:** To use API models, you need a HuggingFace API token:
+ 1. Go to https://huggingface.co/settings/tokens
+ 2. Create a new token (read access is sufficient)
+ 3. Copy `env.example` to `.env` and add your token: `HF_TOKEN=hf_...`
+ 4. The `.env` file is git-ignored for security
+
+ ## Available Models
+
+ ### API Models (require HF_TOKEN)
+ - **HuggingFaceH4/zephyr-7b-beta** (7B params) - Recommended: Best quality for chat
+ - **google/gemma-2-2b-it** (2B params) - Instruction-tuned, good balance
+ - **distilgpt2** (82M params) - Very small and fast (older generation)
+ - **gpt2** (124M params) - Reliable baseline (older generation)
+
+ ### Local Models (run on your device)
+ - **arnir0/Tiny-LLM** - Very small model for testing
+
+ API models are recommended as they're free with HuggingFace's Inference API and don't require local compute resources. Start with **zephyr-7b-beta** or **gemma-2-2b-it** for best results.
+
+ ## Configuration
+
+ Key configuration variables at the top of `app.py`:
+ - `LOCAL_MODELS`: List of local models to use
+ - `API_MODELS`: List of API models to use (all free with HF Inference API)
+ - `DEFAULT_SYSTEM_MESSAGE`: Default system prompt
+
+ ## Requirements
+
+ - Conda/Mamba (for local development)
+ - Git Bash (for running `deploy.sh` on Windows)
+
+ Python dependencies are managed in `pyproject.toml`.
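The `(local)` / `(api)` suffix convention mentioned in the README's Configuration section — and used by `app.py` to decide whether a request runs through transformers or the Inference API — can be sketched in isolation. This is a minimal illustration; the model lists below are placeholders, not the app's full configuration:

```python
# Sketch of the model-label convention behind app.py's radio selector.
# Model lists here are placeholders for illustration only.
LOCAL_MODELS = ["arnir0/Tiny-LLM"]
API_MODELS = ["google/gemma-2-2b-it"]

# Build labeled options, mirroring how app.py populates its gr.Radio choices.
MODEL_OPTIONS = [f"{m} (local)" for m in LOCAL_MODELS] + [f"{m} (api)" for m in API_MODELS]

def parse_selection(selected: str) -> tuple[str, bool]:
    """Recover the bare model name and whether it should run locally."""
    is_local = selected.endswith("(local)")
    name = selected.replace(" (local)", "").replace(" (api)", "")
    return name, is_local

print(parse_selection(MODEL_OPTIONS[0]))  # → ('arnir0/Tiny-LLM', True)
```

Parsing the suffix back off the label is what lets a single radio control drive both the local and the API code paths.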
app.py CHANGED
@@ -1,130 +1,190 @@
  import gradio as gr
  from huggingface_hub import InferenceClient
  import os
  import base64
  from pathlib import Path
- import threading
- import time
- import logging

  # Configuration
- LOCAL_MODELS = ["tiiuae/Falcon-H1-0.5B-Instruct"]
- API_MODELS = ["openai/gpt-oss-20b"]
  DEFAULT_SYSTEM_MESSAGE = "You are an expert assistant for Magic: The Gathering. Your name is Smart Confidant, but people tend to call you Bob."
  TITLE = "🎓🧙🏻‍♂️ Smart Confidant 🧙🏻‍♂️🎓"

- # Resource logging configuration
- RESOURCE_LOGGING_ENABLED = True
- RESOURCE_LOG_INTERVAL_SEC = 15
-
- # Create model options with labels
  MODEL_OPTIONS = []
  for model in LOCAL_MODELS:
      MODEL_OPTIONS.append(f"{model} (local)")
  for model in API_MODELS:
      MODEL_OPTIONS.append(f"{model} (api)")

  pipe = None
  stop_inference = False

  ASSETS_DIR = Path(__file__).parent / "assets"
  BACKGROUND_IMAGE_PATH = ASSETS_DIR / "confidant_pattern.png"
  try:
      with open(BACKGROUND_IMAGE_PATH, "rb") as _img_f:
          _encoded_img = base64.b64encode(_img_f.read()).decode("ascii")
          BACKGROUND_DATA_URL = f"data:image/png;base64,{_encoded_img}"
  except Exception as e:
-     print(f"Error loading background image: {e}")
      BACKGROUND_DATA_URL = ""

- # Fancy styling
  fancy_css = f"""
- html, body, #root {{
-     background-image: url('{BACKGROUND_DATA_URL}');
-     background-repeat: repeat;
-     background-size: auto;
-     background-color: transparent;
  }}
  .gradio-container {{
-     max-width: 700px;
-     margin: 0 auto;
-     padding: 20px;
-     background-color: #2d2d2d;
-     box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
-     border-radius: 10px;
-     font-family: 'Arial', sans-serif;
  }}
- .gr-button {{
-     background-color: #4CAF50;
-     color: white;
-     border: none;
-     border-radius: 5px;
-     padding: 10px 20px;
-     cursor: pointer;
-     transition: background-color 0.3s ease;
  }}
- .gr-button:hover {{
-     background-color: #45a049;
  }}
- .gr-slider input {{
-     color: #4CAF50;
  }}
- .gr-chat {{
-     font-size: 16px;
  }}
- #title {{
-     text-align: center;
-     font-size: 2em;
-     margin-bottom: 20px;
-     color: #333;
  }}
  """

- def _configure_basic_logging():
-     if len(logging.getLogger().handlers) == 0:
-         logging.basicConfig(
-             level=logging.INFO,
-             format="%(asctime)s [%(levelname)s] %(message)s",
-         )
-
-
- def _resource_logger_worker(interval_seconds: int):
-     try:
-         import psutil
-
-         process = psutil.Process(os.getpid())
-         # Prime CPU percent calculations
-         psutil.cpu_percent(interval=None)
-         process.cpu_percent(interval=None)
-
-         while True:
-             system_cpu_percent = psutil.cpu_percent(interval=None)
-             system_mem_percent = psutil.virtual_memory().percent
-             process_rss_mb = process.memory_info().rss / (1024 * 1024)
-             process_cpu_percent = process.cpu_percent(interval=None)
-
-             logging.info(
-                 f"System CPU: {system_cpu_percent:.1f}%, System Mem: {system_mem_percent:.1f}%, "
-                 f"Process RSS: {process_rss_mb:.1f} MB, Process CPU: {process_cpu_percent:.1f}%"
-             )
-
-             time.sleep(interval_seconds)
-     except ImportError:
-         logging.warning("psutil not installed; resource logging disabled.")
-     except Exception as e:
-         logging.exception(f"Resource logger stopped due to error: {e}")
-
-
- def start_resource_logger():
-     _configure_basic_logging()
-     thread = threading.Thread(
-         target=_resource_logger_worker,
-         args=(RESOURCE_LOG_INTERVAL_SEC,),
-         daemon=True,
-         name="resource-logger",
-     )
-     thread.start()
-     return thread

  def respond(
      message,
@@ -133,78 +193,151 @@ def respond(
      max_tokens,
      temperature,
      top_p,
-     hf_token: gr.OAuthToken,
      selected_model: str,
  ):
      global pipe

-     # Build messages from history
-     messages = [{"role": "system", "content": system_message}]
-     messages.extend(history)
-     messages.append({"role": "user", "content": message})

-     # Determine if model is local or API and extract model name
-     is_local = selected_model.endswith("(local)")
-     model_name = selected_model.replace(" (local)", "").replace(" (api)", "")
-
-     response = ""
-
-     if is_local:
-         print(f"[MODE] local - {model_name}")
-         from transformers import pipeline
-         import torch
-         if pipe is None or pipe.model.name_or_path != model_name:
-             pipe = pipeline("text-generation", model=model_name)
-
-         # Build prompt as plain text
-         prompt = "\n".join([f"{m['role']}: {m['content']}" for m in messages])
-
-         outputs = pipe(
-             prompt,
-             max_new_tokens=max_tokens,
-             do_sample=True,
-             temperature=temperature,
-             top_p=top_p,
-         )

-         response = outputs[0]["generated_text"][len(prompt):]
-         yield response.strip()

-     else:
-         print(f"[MODE] api - {model_name}")

-         if hf_token is None or not getattr(hf_token, "token", None):
-             yield "⚠️ Please log in with your Hugging Face account first."
-             return

-         client = InferenceClient(token=hf_token.token, model=model_name)

-         for chunk in client.chat_completion(
-             messages,
-             max_tokens=max_tokens,
-             stream=True,
-             temperature=temperature,
-             top_p=top_p,
-         ):
-             choices = chunk.choices
-             token = ""
-             if len(choices) and choices[0].delta.content:
-                 token = choices[0].delta.content
-             response += token
-             yield response


- with gr.Blocks(css=fancy_css) as demo:
-     gr.LoginButton()
-     gr.Markdown(f"<h1 style='text-align: center;'>{TITLE}</h1>")

-     # Create custom chatbot with avatar images
      chatbot = gr.Chatbot(
          type="messages",
          avatar_images=(str(ASSETS_DIR / "monster_icon.png"), str(ASSETS_DIR / "smart_confidant_icon.png"))
      )

-     # Create additional inputs in an accordion
      with gr.Accordion("⚙️ Additional Settings", open=False):
          system_message = gr.Textbox(value=DEFAULT_SYSTEM_MESSAGE, label="System message")
          max_tokens = gr.Slider(minimum=1, maximum=2048, value=512, step=1, label="Max new tokens")
@@ -212,7 +345,7 @@ with gr.Blocks(css=fancy_css) as demo:
          top_p = gr.Slider(minimum=0.1, maximum=1.0, value=0.95, step=0.05, label="Top-p (nucleus sampling)")
          selected_model = gr.Radio(choices=MODEL_OPTIONS, label="Select Model", value=MODEL_OPTIONS[0])

-     # Create ChatInterface with the custom chatbot and pre-rendered additional inputs
      gr.ChatInterface(
          fn=respond,
          chatbot=chatbot,
@@ -226,7 +359,16 @@ with gr.Blocks(css=fancy_css) as demo:
          type="messages",
      )

  if __name__ == "__main__":
-     if RESOURCE_LOGGING_ENABLED:
-         start_resource_logger()
-     demo.launch()
+ """
+ Smart Confidant - A Magic: The Gathering chatbot with support for local and API-based LLMs.
+ Supports both local transformers models and HuggingFace API models with custom theming.
+ """
+
  import gradio as gr
+ from gradio.themes.base import Base
  from huggingface_hub import InferenceClient
  import os
  import base64
  from pathlib import Path
+ import traceback
+ from datetime import datetime
+ from threading import Lock

+ # ============================================================================
  # Configuration
+ # ============================================================================
+
+ LOCAL_MODELS = ["arnir0/Tiny-LLM"]
+ API_MODELS = ["google/gemma-2-2b-it", "HuggingFaceH4/zephyr-7b-beta"]
  DEFAULT_SYSTEM_MESSAGE = "You are an expert assistant for Magic: The Gathering. Your name is Smart Confidant, but people tend to call you Bob."
  TITLE = "🎓🧙🏻‍♂️ Smart Confidant 🧙🏻‍♂️🎓"

+ # Create labeled model options for the radio selector
  MODEL_OPTIONS = []
  for model in LOCAL_MODELS:
      MODEL_OPTIONS.append(f"{model} (local)")
  for model in API_MODELS:
      MODEL_OPTIONS.append(f"{model} (api)")

+ # Global state for local model pipeline (cached across requests)
  pipe = None
  stop_inference = False

+ # Debug logging setup with thread-safe access
+ debug_logs = []
+ debug_lock = Lock()
+ MAX_LOG_LINES = 100
+
+ # ============================================================================
+ # Debug Logging Functions
+ # ============================================================================
+
+ def log_debug(message, level="INFO"):
+     """Add timestamped message to debug log (thread-safe, rotating buffer)."""
+     timestamp = datetime.now().strftime("%H:%M:%S")
+     log_entry = f"[{timestamp}] [{level}] {message}"
+     with debug_lock:
+         debug_logs.append(log_entry)
+         if len(debug_logs) > MAX_LOG_LINES:
+             debug_logs.pop(0)
+     print(log_entry)
+     return "\n".join(debug_logs)
+
+ def get_debug_logs():
+     """Retrieve all debug logs as a single string."""
+     with debug_lock:
+         return "\n".join(debug_logs)
+
+ # ============================================================================
+ # Asset Loading & Theme Configuration
+ # ============================================================================
+
+ # Load background image as base64 data URL for CSS injection
  ASSETS_DIR = Path(__file__).parent / "assets"
  BACKGROUND_IMAGE_PATH = ASSETS_DIR / "confidant_pattern.png"
  try:
      with open(BACKGROUND_IMAGE_PATH, "rb") as _img_f:
          _encoded_img = base64.b64encode(_img_f.read()).decode("ascii")
          BACKGROUND_DATA_URL = f"data:image/png;base64,{_encoded_img}"
+     log_debug("Background image loaded successfully")
  except Exception as e:
+     log_debug(f"Error loading background image: {e}", "ERROR")
      BACKGROUND_DATA_URL = ""

+ class TransparentTheme(Base):
+     """Custom Gradio theme with transparent body background to show tiled image."""
+     def __init__(self):
+         super().__init__()
+         super().set(
+             body_background_fill="*neutral_950",
+             body_background_fill_dark="*neutral_950",
+         )
+
+ # Custom CSS for dark theme with tiled background image
+ # Uses aggressive selectors to override Gradio's default styling
  fancy_css = f"""
+ /* Tiled background image on page body */
+ body {{
+     background-image: url('{BACKGROUND_DATA_URL}') !important;
+     background-repeat: repeat !important;
+     background-size: auto !important;
+     background-attachment: fixed !important;
+     background-color: #1a1a1a !important;
  }}
+
+ /* Make Gradio wrapper divs transparent to show background */
+ gradio-app,
+ .gradio-container,
+ .gradio-container > div,
+ .gradio-container > div > div,
+ .main,
+ .contain,
+ [class*="svelte"] > div,
+ div[class*="wrap"]:not(.gr-button):not([class*="input"]):not([class*="textbox"]):not([class*="bubble"]):not([class*="message"]),
+ div[class*="container"]:not([class*="input"]):not([class*="button"]) {{
+     background: transparent !important;
+     background-color: transparent !important;
+     background-image: none !important;
+ }}
+
+ /* Center and constrain main container */
  .gradio-container {{
+     max-width: 700px !important;
+     margin: 0 auto !important;
+     padding: 20px !important;
+     box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1) !important;
+     border-radius: 10px !important;
+     font-family: 'Arial', sans-serif !important;
  }}
+
+ /* Green title banner */
+ #title {{
+     text-align: center !important;
+     font-size: 2em !important;
+     margin-bottom: 20px !important;
+     color: #ffffff !important;
+     background-color: #4CAF50 !important;
+     padding: 20px !important;
+     border-radius: 10px !important;
+     box-shadow: 0 2px 4px rgba(0, 0, 0, 0.3) !important;
+ }}
+
+ /* Dark grey backgrounds for chatbot and settings components */
+ .block.svelte-12cmxck {{
+     background-color: rgba(60, 60, 60, 0.95) !important;
+     border-radius: 10px !important;
  }}
+
+ div[class*="bubble-wrap"],
+ div[class*="message-wrap"] {{
+     background-color: rgba(60, 60, 60, 0.95) !important;
+     border-radius: 10px !important;
+     padding: 15px !important;
  }}
+
+ .label-wrap,
+ div[class*="accordion"] {{
+     background-color: rgba(60, 60, 60, 0.95) !important;
+     border-radius: 10px !important;
  }}
+
+ /* White text for readability on dark backgrounds */
+ .block.svelte-12cmxck,
+ .block.svelte-12cmxck *,
+ div[class*="bubble-wrap"] *,
+ div[class*="message-wrap"] *,
+ .label-wrap,
+ .label-wrap * {{
+     color: #ffffff !important;
  }}
+
+ /* Green buttons with hover effect */
+ .gr-button,
+ button {{
+     background-color: #4CAF50 !important;
+     background-image: none !important;
+     color: white !important;
+     border: none !important;
+     border-radius: 5px !important;
+     padding: 10px 20px !important;
+     cursor: pointer !important;
+     transition: background-color 0.3s ease !important;
+ }}
+ .gr-button:hover,
+ button:hover {{
+     background-color: #45a049 !important;
+ }}
+ .gr-slider input {{
+     color: #4CAF50 !important;
  }}
  """

+ # ============================================================================
+ # Chat Response Handler
+ # ============================================================================

  def respond(
      message,
      max_tokens,
      temperature,
      top_p,
      selected_model: str,
  ):
+     """
+     Handle chat responses using either local transformers models or HuggingFace API.
+
+     Args:
+         message: User's input message
+         history: List of previous messages in conversation
+         system_message: System prompt to guide model behavior
+         max_tokens: Maximum tokens to generate
+         temperature: Sampling temperature (higher = more random)
+         top_p: Nucleus sampling threshold
+         selected_model: Model identifier with "(local)" or "(api)" suffix
+
+     Yields:
+         str: Generated response text or error message
+     """
      global pipe
+
+     try:
+         log_debug(f"New message received: '{message[:50]}...'")
+         log_debug(f"Selected model: {selected_model}")
+         log_debug(f"Parameters - max_tokens: {max_tokens}, temp: {temperature}, top_p: {top_p}")

+         # Build complete message history with system prompt
+         messages = [{"role": "system", "content": system_message}]
+         messages.extend(history)
+         messages.append({"role": "user", "content": message})
+         log_debug(f"Message history length: {len(messages)}")

+         # Parse model type and name from selection
+         is_local = selected_model.endswith("(local)")
+         model_name = selected_model.replace(" (local)", "").replace(" (api)", "")
+
+         response = ""
+
+         if is_local:
+             # ===== LOCAL MODEL PATH =====
+             log_debug(f"Using LOCAL mode with model: {model_name}")
+             try:
+                 from transformers import pipeline
+                 import torch
+                 log_debug("Transformers imported successfully")
+
+                 # Load or reuse cached pipeline
+                 if pipe is None or pipe.model.name_or_path != model_name:
+                     log_debug(f"Loading model pipeline for: {model_name}")
+                     pipe = pipeline("text-generation", model=model_name)
+                     log_debug("Model pipeline loaded successfully")
+                 else:
+                     log_debug("Using cached model pipeline")

+                 # Format conversation as plain text prompt
+                 prompt = "\n".join([f"{m['role']}: {m['content']}" for m in messages])
+                 log_debug(f"Prompt length: {len(prompt)} characters")

+                 # Run inference
+                 log_debug("Starting inference...")
+                 outputs = pipe(
+                     prompt,
+                     max_new_tokens=max_tokens,
+                     do_sample=True,
+                     temperature=temperature,
+                     top_p=top_p,
+                 )
+                 log_debug("Inference completed")

+                 # Extract new tokens only (strip original prompt)
+                 response = outputs[0]["generated_text"][len(prompt):]
+                 log_debug(f"Response length: {len(response)} characters")
+                 yield response.strip()

+             except ImportError as e:
+                 error_msg = f"Import error: {str(e)}"
+                 log_debug(error_msg, "ERROR")
+                 log_debug(traceback.format_exc(), "ERROR")
+                 yield f"❌ Import Error: {str(e)}\n\nPlease check log.txt for details."
+             except Exception as e:
+                 error_msg = f"Local model error: {str(e)}"
+                 log_debug(error_msg, "ERROR")
+                 log_debug(traceback.format_exc(), "ERROR")
+                 yield f"❌ Local Model Error: {str(e)}\n\nPlease check log.txt for details."

+         else:
+             # ===== API MODEL PATH =====
+             log_debug(f"Using API mode with model: {model_name}")
+
+             try:
+                 # Check for HuggingFace API token
+                 hf_token = os.environ.get("HF_TOKEN", None)
+                 if hf_token:
+                     log_debug("HF_TOKEN found in environment")
+                 else:
+                     log_debug("No HF_TOKEN in environment - API call will likely fail", "WARN")
+
+                 # Create HuggingFace Inference client
+                 log_debug("Creating InferenceClient...")
+                 client = InferenceClient(
+                     provider="auto",
+                     api_key=hf_token,
+                 )
+                 log_debug("InferenceClient created successfully")

+                 # Call chat completion API
+                 log_debug("Starting chat completion...")
+                 completion = client.chat.completions.create(
+                     model=model_name,
+                     messages=messages,
+                     max_tokens=max_tokens,
+                     temperature=temperature,
+                     top_p=top_p,
+                 )
+
+                 response = completion.choices[0].message.content
+                 log_debug(f"Completion received. Response length: {len(response)} characters")
+                 yield response
+
+             except Exception as e:
+                 error_msg = f"API error: {str(e)}"
+                 log_debug(error_msg, "ERROR")
+                 log_debug(traceback.format_exc(), "ERROR")
+                 yield f"❌ API Error: {str(e)}\n\nPlease check log.txt for details."

+     except Exception as e:
+         error_msg = f"Unexpected error in respond function: {str(e)}"
+         log_debug(error_msg, "ERROR")
+         log_debug(traceback.format_exc(), "ERROR")
+         yield f"❌ Unexpected Error: {str(e)}\n\nPlease check log.txt for details."
+
+
+ # ============================================================================
+ # Gradio UI Definition
+ # ============================================================================
+
+ with gr.Blocks(theme=TransparentTheme(), css=fancy_css) as demo:
+     # Title banner
+     gr.Markdown(f"<h1 id='title' style='text-align: center;'>{TITLE}</h1>")

+     # Chatbot component with custom avatar icons
      chatbot = gr.Chatbot(
          type="messages",
          avatar_images=(str(ASSETS_DIR / "monster_icon.png"), str(ASSETS_DIR / "smart_confidant_icon.png"))
      )

+     # Collapsible settings panel
      with gr.Accordion("⚙️ Additional Settings", open=False):
          system_message = gr.Textbox(value=DEFAULT_SYSTEM_MESSAGE, label="System message")
          max_tokens = gr.Slider(minimum=1, maximum=2048, value=512, step=1, label="Max new tokens")
          top_p = gr.Slider(minimum=0.1, maximum=1.0, value=0.95, step=0.05, label="Top-p (nucleus sampling)")
          selected_model = gr.Radio(choices=MODEL_OPTIONS, label="Select Model", value=MODEL_OPTIONS[0])

+     # Wire up chat interface with response handler
      gr.ChatInterface(
          fn=respond,
          chatbot=chatbot,
          type="messages",
      )

+ # ============================================================================
+ # Application Entry Point
+ # ============================================================================
+
  if __name__ == "__main__":
+     log_debug("="*50)
+     log_debug("Smart Confidant Application Starting")
+     log_debug(f"Available models: {MODEL_OPTIONS}")
+     log_debug(f"HF_TOKEN present: {'Yes' if os.environ.get('HF_TOKEN') else 'No'}")
+     log_debug("="*50)
+
+     # Launch on all interfaces for VM/container deployment, with Gradio share link
+     demo.launch(server_name="0.0.0.0", server_port=8012, share=True)
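The thread-safe rotating debug buffer this commit introduces in `app.py` is easy to exercise standalone. The sketch below mirrors the committed `log_debug` logic, with `MAX_LOG_LINES` shrunk from 100 to 3 so the rotation is visible, and the `print` side effect omitted to keep output tidy:

```python
from datetime import datetime
from threading import Lock

debug_logs = []
debug_lock = Lock()
MAX_LOG_LINES = 3  # the commit uses 100; shrunk here so rotation shows up

def log_debug(message, level="INFO"):
    """Append a timestamped entry; drop the oldest once the buffer is full."""
    timestamp = datetime.now().strftime("%H:%M:%S")
    log_entry = f"[{timestamp}] [{level}] {message}"
    with debug_lock:
        debug_logs.append(log_entry)
        if len(debug_logs) > MAX_LOG_LINES:
            debug_logs.pop(0)  # rotate out the oldest entry
        return "\n".join(debug_logs)

for i in range(5):
    log_debug(f"event {i}")

# Only the three most recent entries survive the rotation
print([entry.split("] ")[-1] for entry in debug_logs])  # → ['event 2', 'event 3', 'event 4']
```

Using `list.pop(0)` on a list capped at 100 entries is cheap enough here; a `collections.deque(maxlen=MAX_LOG_LINES)` would be the idiomatic drop-in if the buffer ever grew.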
deploy.sh ADDED
@@ -0,0 +1,196 @@
+ #! /bin/bash
+
+ # Configuration
+ PORT=22012
+ MACHINE=paffenroth-23.dyn.wpi.edu
+ MY_KEY_PATH=$HOME/.ssh/mlopskey  # Path to your personal SSH key
+ STUDENT_ADMIN_KEY_PATH=$HOME/.ssh/student-admin_key  # Path to student-admin fallback key
+
+ # Load environment variables from .env file if it exists
+ if [ -f .env ]; then
+     echo "Loading environment variables from .env file..."
+     export $(grep -v '^#' .env | xargs)
+ fi
+
+ # Clean up from previous runs
+ ssh-keygen -f "$HOME/.ssh/known_hosts" -R "[$MACHINE]:$PORT" 2>/dev/null
+ rm -rf tmp
+
+ # Create a temporary directory
+ mkdir tmp
+
+ # Change the permissions of the directory
+ chmod 700 tmp
+
+ # Change to the temporary directory
+ cd tmp
+
+ echo "Checking if personal key works..."
+ # Try connecting with personal key
+ if ssh -i ${MY_KEY_PATH} -p ${PORT} -o StrictHostKeyChecking=no -o ConnectTimeout=10 student-admin@${MACHINE} "echo 'success'" > /dev/null 2>&1; then
+     echo "✓ Personal key works! No update needed."
+     MY_KEY=${MY_KEY_PATH}
+ else
+     echo "✗ Personal key failed. Updating with student-admin key..."
+
+     # Check if the keys exist
+     if [ ! -f "${MY_KEY_PATH}.pub" ]; then
+         echo "Error: Personal public key not found at ${MY_KEY_PATH}.pub"
+         echo "Creating a new key pair..."
+         ssh-keygen -f ${MY_KEY_PATH} -t ed25519 -N ""
+     fi
+
+     if [ ! -f "${STUDENT_ADMIN_KEY_PATH}" ]; then
+         echo "Error: Student-admin key not found at ${STUDENT_ADMIN_KEY_PATH}"
+         exit 1
+     fi
+
+     # Read the public key content
+     MY_PUB_KEY=$(cat ${MY_KEY_PATH}.pub)
+
+     # Update authorized_keys on the server using student-admin key
+     echo "Connecting with student-admin key to update authorized_keys..."
+     ssh -i ${STUDENT_ADMIN_KEY_PATH} -p ${PORT} -o StrictHostKeyChecking=no student-admin@${MACHINE} << EOF
+ mkdir -p ~/.ssh
+ chmod 700 ~/.ssh
+ touch ~/.ssh/authorized_keys
+ chmod 600 ~/.ssh/authorized_keys
+ # Remove any old keys from this machine
+ grep -v 'rcpaffenroth@paffenroth-23' ~/.ssh/authorized_keys > ~/.ssh/authorized_keys.tmp 2>/dev/null || true
+ mv ~/.ssh/authorized_keys.tmp ~/.ssh/authorized_keys 2>/dev/null || true
+ # Add the new key
+ echo '${MY_PUB_KEY}' >> ~/.ssh/authorized_keys
+ echo 'Key updated'
+ EOF
+
+     if [ $? -ne 0 ]; then
+         echo "Failed to update key with student-admin key"
+         exit 1
+     fi
+
+     # Verify the personal key now works
+     echo "Verifying personal key..."
+     sleep 2
+
+     if ssh -i ${MY_KEY_PATH} -p ${PORT} -o StrictHostKeyChecking=no student-admin@${MACHINE} "echo 'success'" > /dev/null 2>&1; then
+         echo "✓ Success! Personal key is now working."
+         MY_KEY=${MY_KEY_PATH}
+     else
+         echo "✗ Personal key still not working after update"
+         exit 1
+     fi
+ fi
+
+ # Add the key to the ssh-agent
+ eval "$(ssh-agent -s)"
+ ssh-add ${MY_KEY}
+
+ # Check the key file on the server
+ echo "Checking authorized_keys on server:"
+ ssh -i ${MY_KEY} -p ${PORT} -o StrictHostKeyChecking=no student-admin@${MACHINE} "cat ~/.ssh/authorized_keys"
+
+ # Clone or copy the repo
+ # If using git:
+ # git clone https://github.com/yourusername/Smart_Confidant
+ # Or just copy the local directory:
+ echo "Copying Smart_Confidant code..."
+ mkdir -p Smart_Confidant
+ # Copy all files except tmp and .git directories
+ for item in ../*; do
+     base=$(basename "$item")
+     if [ "$base" != "tmp" ] && [ "$base" != ".git" ]; then
+         cp -r "$item" Smart_Confidant/
+     fi
+ done
+
+ # Copy the files to the server
+ echo "Uploading code to server..."
+ scp -i ${MY_KEY} -P ${PORT} -o StrictHostKeyChecking=no -r Smart_Confidant student-admin@${MACHINE}:~/
+
+ if [ $? -eq 0 ]; then
+     echo "✓ Code successfully uploaded to server"
+ else
+     echo "✗ Failed to upload code"
+     exit 1
+ fi
+
+ # Define SSH command for subsequent steps using the confirmed key
+ COMMAND="ssh -i ${MY_KEY} -p ${PORT} -o StrictHostKeyChecking=no student-admin@${MACHINE}"
+
+ # Run all setup in a single SSH session
+ echo "Setting up environment on remote server..."
+ # Pass HF_TOKEN to the remote session
+ ${COMMAND} bash -s << ENDSSH
+ set -e
+ export HF_TOKEN='${HF_TOKEN}'
+
+ # Stop old process
+ echo "→ Stopping old process if running..."
+ pkill -f 'python.*app.py' || true
+
+ # Check if micromamba is installed
+ if [ ! -f ~/bin/micromamba ]; then
+     echo "→ Installing micromamba..."
+     curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj -C ~/ bin/micromamba
+     mkdir -p ~/micromamba
+     export MAMBA_ROOT_PREFIX=~/micromamba
+     echo 'export MAMBA_ROOT_PREFIX=~/micromamba' >> ~/.bashrc
+     echo 'eval "$(~/bin/micromamba shell hook -s bash)"' >> ~/.bashrc
+     echo "✓ Micromamba installed"
+ else
+     echo "✓ Micromamba already installed"
+     export MAMBA_ROOT_PREFIX=~/micromamba
+ fi
+
+ eval "$(~/bin/micromamba shell hook -s bash)" 2>/dev/null || true
+
+ cd Smart_Confidant
+
+ # Check if environment exists
+ if ~/bin/micromamba env list | grep -q "smart-confidant"; then
+     echo "→ Updating existing environment..."
+     ~/bin/micromamba install -n smart-confidant -f environment.yml -y
+ else
+     echo "→ Creating new environment..."
+     ~/bin/micromamba create -f environment.yml -y
+ fi
+
+ # Check if uv is installed
+ if ! ~/bin/micromamba run -n smart-confidant which uv &>/dev/null; then
+     echo "→ Installing uv..."
+     ~/bin/micromamba run -n smart-confidant pip install uv
+ else
+     echo "✓ uv already installed"
+ fi
+
+ # Install/update dependencies
+ echo "→ Installing/updating dependencies..."
+ ~/bin/micromamba run -n smart-confidant uv pip install -e .
+
+ # Start application
+ echo "→ Starting application..."
+ # Pass HF_TOKEN if it exists
+ if [ ! -z "$HF_TOKEN" ]; then
+     echo "→ HF_TOKEN provided, API models will be available"
+     nohup ~/bin/micromamba run -n smart-confidant -e HF_TOKEN="$HF_TOKEN" python -u app.py > ~/log.txt 2>&1 &
+ else
+     echo "⚠ HF_TOKEN not set - API models will not work"
+     nohup ~/bin/micromamba run -n smart-confidant python -u app.py > ~/log.txt 2>&1 &
179
+ fi
180
+
181
+ # Wait for the app to start
182
+ sleep 5
183
+
184
+ echo "✓ Setup complete"
185
+ ENDSSH
186
+
187
+ # Extract the Gradio share link from the remote log file
188
+ SHARE_LINK=$(${COMMAND} "grep -oP 'https://[a-z0-9]+\.gradio\.live' ~/log.txt | tail -1" 2>/dev/null)
189
+
190
+ echo ""
191
+ echo "=========================================="
192
+ echo "Deployment complete!"
193
+ echo "Public Gradio Share Link: ${SHARE_LINK}"
194
+ echo "==========================================="
195
+
196
+
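The deploy script scrapes the public Gradio URL out of the remote log with `grep -oP`. A minimal local sketch of that extraction, with a made-up log file standing in for `~/log.txt` on the server (the URL here is a placeholder):

```shell
# Hypothetical log content standing in for ~/log.txt on the server.
LOG=$(mktemp)
printf 'Running on local URL:  http://127.0.0.1:7860\n' > "$LOG"
printf 'Running on public URL: https://abc123def456.gradio.live\n' >> "$LOG"

# Same extraction as in deploy.sh: keep only the gradio.live URL, last match wins.
SHARE_LINK=$(grep -oP 'https://[a-z0-9]+\.gradio\.live' "$LOG" | tail -1)
echo "$SHARE_LINK"   # → https://abc123def456.gradio.live

rm -f "$LOG"
```

Note that `-P` requires GNU grep; for this particular pattern `-oE` would behave identically.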
env.example ADDED
@@ -0,0 +1,5 @@
+ # HuggingFace API Token
+ # Get your token from: https://huggingface.co/settings/tokens
+ # Copy this file to .env and add your actual token
+ HF_TOKEN=your_huggingface_token_here
+
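The scripts load the real counterpart of this template with a `grep`/`xargs` one-liner. A minimal sketch of the intended workflow, assuming the simple `KEY=VALUE` format above (the token value is a placeholder):

```shell
# One-time setup: turn the template into a real .env, then edit in your token.
cp env.example .env

# The same loader used by restart.sh: export every non-comment KEY=VALUE line.
# Note: this simple idiom assumes values contain no spaces or quotes.
export $(grep -v '^#' .env | xargs)
echo "$HF_TOKEN"
```

For values that may contain spaces, `set -a; . ./.env; set +a` is a more robust alternative.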
environment.yml ADDED
@@ -0,0 +1,10 @@
+ name: smart-confidant
+ channels:
+   - pytorch
+   - conda-forge
+ dependencies:
+   - python=3.10
+   - pytorch=2.3.0
+   - cpuonly
+   - pip
+
pyproject.toml ADDED
@@ -0,0 +1,17 @@
+ [project]
+ name = "smart-confidant"
+ version = "0.1.0"
+ description = "An AI chatbot assistant for Magic: The Gathering"
+ readme = "README.md"
+ requires-python = ">=3.10"
+ dependencies = [
+     "huggingface-hub>=0.27.0",
+     "gradio>=4.43.0",
+     "transformers>=4.43.0",
+     "accelerate>=0.33.0",
+     "pydantic>=2.6.0",
+     "psutil>=5.9.0",
+     "sentencepiece>=0.1.99",
+     "protobuf>=3.20.0",
+ ]
+
requirements.txt DELETED
@@ -1,3 +0,0 @@
- transformers
- torch
- psutil
restart.sh ADDED
@@ -0,0 +1,90 @@
+ #!/bin/bash
+
+ # Configuration
+ PORT=22012
+ MACHINE=paffenroth-23.dyn.wpi.edu
+ MY_KEY_PATH=$HOME/.ssh/mlopskey  # Path to your personal SSH key
+
+ # Load environment variables from the .env file if it exists
+ if [ -f .env ]; then
+     echo "Loading environment variables from .env file..."
+     export $(grep -v '^#' .env | xargs)
+ fi
+
+ # Define the SSH command
+ COMMAND="ssh -i ${MY_KEY_PATH} -p ${PORT} -o StrictHostKeyChecking=no student-admin@${MACHINE}"
+
+ # Clean up from previous runs
+ rm -rf tmp
+
+ # Create a temporary directory
+ mkdir tmp
+
+ # Change the permissions of the directory
+ chmod 700 tmp
+
+ # Change to the temporary directory
+ cd tmp
+
+ # Copy the Smart_Confidant code
+ echo "Copying Smart_Confidant code..."
+ mkdir -p Smart_Confidant
+ # Copy all files except the tmp and .git directories
+ for item in ../*; do
+     base=$(basename "$item")
+     if [ "$base" != "tmp" ] && [ "$base" != ".git" ]; then
+         cp -r "$item" Smart_Confidant/
+     fi
+ done
+
+ # Copy the files to the server
+ echo "Uploading code to server..."
+ scp -i ${MY_KEY_PATH} -P ${PORT} -o StrictHostKeyChecking=no -r Smart_Confidant student-admin@${MACHINE}:~/
+
+ if [ $? -eq 0 ]; then
+     echo "✓ Code successfully uploaded to server"
+ else
+     echo "✗ Failed to upload code"
+     exit 1
+ fi
+
+ echo "Restarting application on remote server..."
+
+ # Restart the application in a single SSH session
+ ${COMMAND} bash -s << ENDSSH
+ set -e
+ export HF_TOKEN='${HF_TOKEN}'
+
+ # Stop the old process
+ echo "→ Stopping old process if running..."
+ pkill -f 'python.*app.py' || true
+
+ # Change to the app directory
+ cd Smart_Confidant
+
+ # Start the application
+ echo "→ Starting application..."
+ # Pass HF_TOKEN if it exists
+ if [ ! -z "$HF_TOKEN" ]; then
+     echo "→ HF_TOKEN provided, API models will be available"
+     nohup ~/bin/micromamba run -n smart-confidant -e HF_TOKEN="$HF_TOKEN" python -u app.py > ~/log.txt 2>&1 &
+ else
+     echo "⚠ HF_TOKEN not set - API models will not work"
+     nohup ~/bin/micromamba run -n smart-confidant python -u app.py > ~/log.txt 2>&1 &
+ fi
+
+ # Wait for the app to start
+ sleep 20
+
+ echo "✓ Restart complete"
+ ENDSSH
+
+ # Extract the Gradio share link from the remote log file
+ SHARE_LINK=$(${COMMAND} "grep -oP 'https://[a-z0-9]+\.gradio\.live' ~/log.txt | tail -1" 2>/dev/null)
+
+ echo ""
+ echo "=========================================="
+ echo "Restart complete!"
+ echo "Public Gradio Share Link: ${SHARE_LINK}"
+ echo "=========================================="
+
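Both scripts stop the old server with `pkill ... || true`, and the `|| true` is load-bearing: `pkill` exits non-zero when no process matches, which would abort the heredoc under `set -e` on a machine where the app is not yet running. A minimal sketch (the process name is made up):

```shell
#!/bin/bash
set -e

# pkill returns 1 when nothing matches; without `|| true`, set -e would
# terminate the script right here on the very first deploy.
pkill -f 'some-process-name-that-does-not-exist' || true

echo "script continues past pkill"   # → script continues past pkill
```

The same guard is worth keeping on any cleanup command whose failure is expected and harmless.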