WeMWish committed
Commit 8d66edb · 1 Parent(s): 9b31367

Add authentication, token quota tracking, and comprehensive usage logging


Added:
- HF OAuth integration with login overlay and session management
- Token quota system (100k default) with real-time enforcement
- Supabase database for user management and usage logging
- Complete token tracking across all OpenAI API calls:
  - GenerationAgent: Chat Completions API
  - SupervisorAgent: Assistants API
  - ExecutorAgent: Vision API (describe_image)
  - ManagerAgent: Aggregates all sub-agent usage
- Database schema (users, usage_logs, user_stats view)
- Python Supabase client and R wrapper
- OAuth helper functions for HF authentication
- .env.example configuration template

Changed:
- All agents now track and return token usage
- ManagerAgent checks quota before processing queries
- ManagerAgent logs all queries immediately after completion
- server.R integrates OAuth and Supabase initialization
- ui.R adds login overlay and OAuth callback handling
- Updated dependencies (supabase, python-dotenv, httr2)

Fixed:
- Untracked token usage from describe_image() Vision API calls
- Complete token tracking for accurate quota enforcement

Technical:
- Authentication flow: OAuth → token exchange → user creation → session storage
- Quota enforcement: check → process → aggregate → log → update usage
- Graceful degradation when Supabase not configured
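The quota-enforcement sequence above (check → process → aggregate → log → update usage) can be sketched roughly as follows; all names here (`handle_query`, `process_with_agents`, the in-memory `db` dict) are illustrative stand-ins, not the repository's actual identifiers:

```python
def process_with_agents(query):
    """Stand-in for the GenerationAgent/SupervisorAgent/ExecutorAgent calls;
    each real sub-agent returns a 'usage' dict alongside its response."""
    return [{"text": f"answer to {query!r}", "usage": {"total_tokens": 1200}}]


def handle_query(db, user_id, query, quota=100_000):
    # 1. Check quota before doing any work (queries rejected when exceeded).
    used = db.get("tokens_used", 0)
    if used >= quota:
        db.setdefault("usage_logs", []).append(
            {"user_id": user_id, "query_text": query, "error": "quota exceeded"})
        return "Quota exceeded: query blocked."

    # 2. Process the query through the sub-agents.
    results = process_with_agents(query)

    # 3. Aggregate token usage reported by every sub-agent.
    total = sum(r["usage"]["total_tokens"] for r in results)

    # 4. Log immediately after completion, before returning the response.
    db.setdefault("usage_logs", []).append(
        {"user_id": user_id, "query_text": query, "total_tokens": total})

    # 5. Update the user's running usage.
    db["tokens_used"] = used + total
    return results[-1]["text"]
```

Note that logging happens before the response is returned, matching the "logs all queries immediately after completion" behavior described above.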

.env.example ADDED
@@ -0,0 +1,23 @@
+ # TaijiChat Environment Variables Template
+ # Copy this file to .env and fill in your values (for local development)
+ # For Hugging Face Spaces, set these as secrets in your Space settings
+
+ # ===== OpenAI Configuration =====
+ OPENAI_API_KEY=your_openai_api_key_here
+
+ # ===== Supabase Configuration =====
+ # Get these from your Supabase project settings
+ SUPABASE_URL=https://your-project-id.supabase.co
+ SUPABASE_KEY=your_supabase_service_role_key_here
+
+ # ===== Hugging Face OAuth =====
+ # These are automatically populated by Hugging Face Spaces when hf_oauth: true is set
+ # Do NOT set these manually in Hugging Face Spaces
+ # For local testing, you can create an OAuth app at https://huggingface.co/settings/applications
+ # OAUTH_CLIENT_ID=your_oauth_client_id
+ # OAUTH_CLIENT_SECRET=your_oauth_client_secret
+ # OAUTH_SCOPES=openid profile email
+
+ # ===== Optional Configuration =====
+ # Enable/disable async processing (default: TRUE)
+ TAIJICHAT_USE_ASYNC=TRUE
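In line with the "graceful degradation" note in the commit message, a minimal sketch of how the app might decide whether Supabase is configured from these variables; the function name is hypothetical, not part of the repo:

```python
import os
from typing import Optional


def get_supabase_config() -> Optional[dict]:
    """Return Supabase connection settings, or None when unset so the app
    can fall back to anonymous logging instead of crashing."""
    url = os.environ.get("SUPABASE_URL", "").strip()
    key = os.environ.get("SUPABASE_KEY", "").strip()
    if not url or not key:
        return None
    return {"url": url, "key": key}
```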
.gitignore CHANGED
@@ -5,7 +5,8 @@ api_key.txt
 
 # Ignore virtual environments
 venv/
- .venv/
+ .venv//
 ENV/
 env/
 
+ ./claude
CHANGELOG.md CHANGED
@@ -2,6 +2,101 @@
 
 ## [Unreleased]
 
+ ### Added
+ - **Authentication & Access Control System**
+   - Hugging Face OAuth integration for user authentication
+   - Login overlay with "Sign in with Hugging Face" button
+   - One account per HF user (duplicate prevention)
+   - Session management with user context tracking
+
+ - **Token Quota & Budget Tracking**
+   - Per-user token quota system (default: 100,000 tokens)
+   - Real-time quota checking before query processing
+   - Quota-based access control (queries rejected when quota exceeded)
+   - Token usage tracking across all OpenAI API calls
+
+ - **Comprehensive Usage Logging**
+   - Supabase database integration for persistent storage
+   - Database schema with users and usage_logs tables
+   - Logs: user_id, timestamp, query_text, token counts, response, errors, conversation history
+   - Immediate logging after each query (before returning response)
+   - User statistics view for aggregated usage analytics
+
+ - **Complete Token Tracking**
+   - GenerationAgent: Tracks Chat Completions API usage
+   - SupervisorAgent: Tracks Assistants API usage
+   - ExecutorAgent: Tracks token usage from executed code
+   - agent_tools.py: Captures Vision API usage from describe_image()
+   - ManagerAgent: Aggregates all token usage from sub-agents
+
+ - **Database & Infrastructure**
+   - Supabase PostgreSQL database for user management
+   - Python Supabase client (utils/supabase_client.py)
+   - R wrapper for Supabase (utils/supabase_r.R)
+   - OAuth helper functions (auth/hf_oauth.R)
+   - Database indexes for performance optimization
+
+ - **Configuration Files**
+   - .env.example template for environment variables
+   - README.md OAuth metadata (hf_oauth: true)
+   - database_schema.sql for Supabase setup
+
+ ### Changed
+ - **ManagerAgent (agents/manager_agent.py)**
+   - Added Supabase client integration
+   - Added user context (user_id, hf_user_id) tracking
+   - Added quota checking before query processing
+   - Added comprehensive logging (success, error, quota exceeded)
+   - Added token aggregation from all sub-agents
+
+ - **GenerationAgent (agents/generation_agent.py)**
+   - Added token usage extraction from OpenAI response
+   - Returns usage info in response dict
+
+ - **SupervisorAgent (agents/supervisor_agent.py)**
+   - Added token usage extraction from Assistants API
+   - Returns usage info in all response paths (success and error)
+
+ - **ExecutorAgent (agents/executor_agent.py)**
+   - Added global usage collector mechanism
+   - Aggregates token usage from executed code
+   - Returns usage info in execution result
+
+ - **agent_tools.py**
+   - describe_image() now captures Vision API token usage
+   - Stores usage in global collector for ExecutorAgent
+
+ - **server.R**
+   - Added OAuth initialization
+   - Added Supabase client initialization
+   - Added quota checking in chat message handler
+   - Added user context setting in agent before queries
+   - Passes Supabase client to ManagerAgent
+
+ - **ui.R**
+   - Added login overlay for unauthenticated users
+   - Added OAuth callback JavaScript handler
+   - Added auth state UI output placeholder
+
+ - **Dependencies**
+   - requirements.txt: Added supabase>=2.0.0, python-dotenv>=1.0.0
+   - Dockerfile: Added httr2 R package for OAuth
+
+ ### Fixed
+ - Fixed untracked token usage from describe_image() API calls
+ - Fixed ExecutorAgent token tracking for Vision API usage
+ - All OpenAI API calls now tracked for accurate quota enforcement
+
+ ### Technical Details
+ - **Authentication Flow**: Login overlay → OAuth redirect → Token exchange → User creation/retrieval → Session storage
+ - **Quota Enforcement**: Check quota → Process query → Aggregate tokens → Log to database → Update user usage
+ - **Token Tracking**: GenerationAgent + SupervisorAgent + ExecutorAgent → ManagerAgent aggregation → Supabase logging
+ - **Graceful Degradation**: System works without Supabase (logs as "anonymous")
+
+ ---
+
+ ## [Previous Unreleased Features]
+
 ### Added
 - Literature search toggle button in chat interface
 - Users can now explicitly enable/disable external literature search
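The "ManagerAgent: Aggregates all token usage from sub-agents" entries above amount to summing per-agent usage dicts. A minimal sketch — the real `_aggregate_token_usage()` signature is not shown in this commit, so the shape here is assumed:

```python
from collections import Counter


def aggregate_token_usage(sub_agent_usages):
    """Sum prompt/completion/total token counts reported by each sub-agent;
    agents that made no API call may report None and are skipped."""
    totals = Counter()
    for usage in sub_agent_usages:
        if usage:
            totals.update(usage)
    return dict(totals)
```

Combining the GenerationAgent, SupervisorAgent, and ExecutorAgent reports this way yields a single dict the ManagerAgent can log to Supabase.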
CLAUDE.md CHANGED
@@ -6,6 +6,33 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 TaijiChat is a Shiny web application that combines R and Python to provide an interactive chat interface for analyzing transcription factor data from T cell states research. The application uses a multi-agent architecture with OpenAI GPT models to generate insights and visualizations from genomics datasets.
 
+ ## Project Structure
+
+ ```
+ taijichat/
+ ├── ui.R                      # Shiny UI definition and Python environment setup
+ ├── server.R                  # Shiny server logic with agent integration
+ ├── chat_ui.R                 # Chat interface UI components
+ ├── agents/                   # Python agent system
+ │   ├── manager_agent.py      # Central orchestrator
+ │   ├── generation_agent.py   # Code generation with 13-step reasoning
+ │   ├── supervisor_agent.py   # Safety validation
+ │   └── executor_agent.py     # Sandboxed code execution
+ ├── tools/
+ │   └── agent_tools.py        # All data analysis functions
+ ├── www/                      # Static assets and data files
+ │   ├── chat_script.js        # Frontend JavaScript
+ │   ├── chat_styles.css       # Custom styling
+ │   ├── tablePagerank/        # TF PageRank data
+ │   ├── waveanalysis/         # Wave analysis results
+ │   ├── TFcorintextrm/        # TF correlation data
+ │   └── tfcommunities/        # Community analysis
+ ├── R/
+ │   └── caching.R             # R-side caching utilities
+ ├── requirements.txt          # Python dependencies
+ └── Dockerfile                # Container configuration
+ ```
+
 ## Key Architecture
 
 ### Multi-Agent System
@@ -16,6 +43,13 @@ The application uses a specialized agent architecture for handling user queries:
 - **SupervisorAgent** (`agents/supervisor_agent.py`): Reviews generated code for safety and compliance before execution
 - **ExecutorAgent** (`agents/executor_agent.py`): Executes approved Python code in a restricted environment
 
+ **Query Processing Flow:**
+ 1. User query received by ManagerAgent via `process_single_query_with_preferences()`
+ 2. GenerationAgent creates execution plan
+ 3. SupervisorAgent validates generated code
+ 4. ExecutorAgent runs approved code in sandbox
+ 5. Results streamed back to UI via R callback system
+
 ### Technology Stack
 - **R (Shiny)**: Frontend web interface and server logic
 - **Python**: Backend agents and data processing tools
@@ -53,7 +87,7 @@ shiny::runApp('.', host='0.0.0.0', port=7860)
 - Create `api_key.txt` file in project root
 
 **Python Environment:**
- Configure reticulate in `ui.R` by uncommenting one of:
+ Configure reticulate in `ui.R` (lines 11-18) by uncommenting one of:
 ```r
 # Option 1: Python executable path
 reticulate::use_python("/path/to/python", required = TRUE)
@@ -61,29 +95,31 @@ reticulate::use_python("/path/to/python", required = TRUE)
 # Option 2: Virtual environment
 reticulate::use_virtualenv("venv_name", required = TRUE)
 
- # Option 3: Conda environment
+ # Option 3: Conda environment
 reticulate::use_condaenv("conda_env_name", required = TRUE)
 ```
 
+ **Docker Environment:**
+ When using Docker, Python environment is automatically configured via `RETICULATE_PYTHON` environment variable (see Dockerfile:5)
+
 **Install Python Dependencies:**
 ```bash
 pip install -r requirements.txt
 ```
 
+ **Install R Dependencies:**
+ ```r
+ install.packages(c('shiny', 'readxl', 'DT', 'dplyr', 'reticulate', 'shinythemes', 'png', 'shinyjs', 'digest'))
+ ```
+
 ### Performance Features
 
 **Async Processing (Default):**
 - Set `TAIJICHAT_USE_ASYNC=TRUE` to enable async agents (default)
 - Set `TAIJICHAT_USE_ASYNC=FALSE` to use synchronous agents
 
- **Cache Management:**
- ```r
- # Check cache statistics
- reticulate::py_run_string("
- from agents.smart_cache import get_cache_stats
- print('Cache Stats:', get_cache_stats())
- ")
- ```
+ **Module Reloading:**
+ If making changes to Python agents, the module is automatically reloaded on server startup (see server.R:89-97)
 
 ## Key Implementation Details
 
@@ -114,7 +150,7 @@ All data analysis functions are centralized in `tools/agent_tools.py`:
 ### Data Handling
 - **Pre-ranked Tables**: Never re-sort TF ranking data - tables come pre-ranked by importance
 - **Path Management**: All file paths are relative to project root via `BASE_WWW_PATH`
- - **Caching**: 5-minute TTL with 100MB memory limit for performance
+ - **Excel File Caching**: Schema information is cached to improve performance when discovering files (see tools/agent_tools.py)
 
 ### Code Generation Rules
 - Only use functions from `tools.agent_tools` module
@@ -123,22 +159,181 @@
 - All generated code must pass SupervisorAgent safety review
 
 ### UI Integration
- - Chat interface uses custom JavaScript for real-time updates
+ - Chat interface uses custom JavaScript (`www/chat_script.js`) for real-time updates
+ - Custom CSS styling in `www/chat_styles.css`
+ - Chat UI components defined in `chat_ui.R`
 - Lazy loading implemented for large image datasets
- - Progress streaming shows agent reasoning steps to users
+ - Progress streaming shows agent reasoning steps to users via callback system (see server.R:17-33)
+ - Resizable chat panel with drag handle
+
+ ## Development Workflow
+
+ ### Debugging
+ **R Console Debugging:**
+ - Print statements in `ui.R` and `server.R` appear in R console
+ - Check Python integration status with `reticulate::py_config()`
+
+ **Python Agent Debugging:**
+ - Python print statements appear in R console when running locally
+ - Agent thought callbacks stream to UI in real-time
+ - Check conversation history in agent state
+
+ **Module Reloading:**
+ When modifying Python agent code:
+ 1. Restart the Shiny app (changes auto-reload on server startup)
+ 2. For manual reload during development, use reticulate to reload modules
+
+ ### Testing
+ **Manual Testing:**
+ - Use `test_queries.txt` for common test scenarios
+ - `tested_queries.txt` contains verified working queries
+
+ **Python Syntax Validation:**
+ ```bash
+ python syntax_check.py
+ ```
 
 ## Troubleshooting
 
 **Common Issues:**
 - **reticulate errors**: Verify Python environment configuration in `ui.R`
- - **Import failures**: Ensure all requirements are installed in configured Python environment
+ - **Import failures**: Ensure all requirements are installed in configured Python environment
 - **API errors**: Check `OPENAI_API_KEY` is set correctly
 - **Performance issues**: Enable async mode with `TAIJICHAT_USE_ASYNC=TRUE`
+ - **Python module changes not reflected**: Restart Shiny app to trigger automatic reload
 
 **Asset Optimization:**
 - Images optimized to 49% of original size for faster loading
 - Backup of original assets available in `www_backup_original/`
 
+ ## Authentication & Access Control
+
+ ### Hugging Face OAuth Integration
+ **Setup:**
+ - OAuth enabled via `hf_oauth: true` in `README.md` metadata
+ - Automatically provides: `OAUTH_CLIENT_ID`, `OAUTH_CLIENT_SECRET`, `OAUTH_SCOPES`
+ - Implementation in `auth/hf_oauth.R`
+
+ **User Flow:**
+ 1. Unauthenticated users see login overlay
+ 2. Click "Sign in with Hugging Face" → OAuth flow
+ 3. After authentication, user info stored in session
+ 4. User created/retrieved from Supabase database
+ 5. Single account per HF user enforced
+
+ **Key Functions:**
+ - `initialize_oauth()` - Configure OAuth client
+ - `get_authorization_url()` - Generate OAuth URL
+ - `exchange_code_for_token()` - Get access token
+ - `get_user_info()` - Fetch HF user profile
+ - `is_authenticated(session)` - Check auth status
+
+ ### Token Quota System
+ **Configuration:**
+ - Default quota: 100,000 tokens per user (configurable in `database_schema.sql`)
+ - Tracked in Supabase `users` table
+ - Checked before each query in `manager_agent.py:_check_quota_before_processing()`
+
+ **Implementation:**
+ ```python
+ # In manager_agent.py
+ has_quota, remaining, error = self._check_quota_before_processing()
+ if not has_quota:
+     return quota_exceeded_message  # Query blocked
+ ```
+
+ **Quota Management:**
+ - Real-time usage tracking after each query
+ - Automatic increment via `update_token_usage()`
+ - View remaining quota via `user_stats` view in Supabase
+
+ ### Comprehensive Usage Logging
+ **Logging Points:**
+ - ✅ **After successful query** (manager_agent.py:500-518)
+ - ✅ **On query error** (manager_agent.py:527-538)
+ - ✅ **On quota exceeded** (manager_agent.py:472-479)
+
+ **Logged Data:**
+ ```python
+ {
+     'user_id': UUID,
+     'hf_user_id': str,
+     'query_text': str,
+     'prompt_tokens': int,
+     'completion_tokens': int,
+     'total_tokens': int,
+     'model': str,  # e.g., "gpt-4o"
+     'response_text': str,
+     'error_message': str | None,
+     'conversation_history': JSONB,
+     'is_image_response': bool,
+     'image_path': str | None
+ }
+ ```
+
+ **Database Schema:**
+ - Tables: `users`, `usage_logs` (see `database_schema.sql`)
+ - View: `user_stats` (aggregated statistics)
+ - Indexes for performance on `hf_user_id`, `timestamp`, `error_message`
+
+ ### Token Tracking Architecture
+ **Flow:**
+ 1. `GenerationAgent` captures usage from OpenAI API response
+ 2. `SupervisorAgent` captures usage from OpenAI API response
+ 3. `ManagerAgent` aggregates via `_aggregate_token_usage()`
+ 4. Total logged to Supabase immediately after query
+ 5. User's `tokens_used` incremented atomically
+
+ **Implementation:**
+ ```python
+ # In generation_agent.py and supervisor_agent.py
+ if hasattr(response, 'usage') and response.usage:
+     usage_info = {
+         'prompt_tokens': response.usage.prompt_tokens,
+         'completion_tokens': response.usage.completion_tokens,
+         'total_tokens': response.usage.total_tokens
+     }
+     parsed_response['usage'] = usage_info
+ ```
+
+ ### Environment Variables
+ **Required for Production:**
+ ```bash
+ # Supabase (get from project settings)
+ SUPABASE_URL=https://your-project.supabase.co
+ SUPABASE_KEY=your_service_role_key
+
+ # OAuth (auto-set by HF Spaces)
+ OAUTH_CLIENT_ID=auto_populated
+ OAUTH_CLIENT_SECRET=auto_populated
+ OAUTH_SCOPES=auto_populated
+
+ # OpenAI (existing)
+ OPENAI_API_KEY=your_key
+ ```
+
+ **Setup in Hugging Face Spaces:**
+ 1. Go to Space Settings → Repository secrets
+ 2. Add `SUPABASE_URL` and `SUPABASE_KEY`
+ 3. OAuth vars are auto-added when `hf_oauth: true` is set
+
+ ### Supabase Integration
+ **Python Client:** `utils/supabase_client.py`
+ - `SupabaseClient` class with methods for all operations
+ - Singleton via `get_supabase_client()`
+ - Graceful degradation if Supabase disabled
+
+ **R Interface:** `utils/supabase_r.R`
+ - Wrapper functions callable from R
+ - Uses `reticulate` to call Python client
+ - Functions: `initialize_supabase()`, `check_user_quota()`, `get_or_create_user()`, etc.
+
+ **Key Operations:**
+ - `get_or_create_user()` - Prevent duplicate accounts
+ - `check_quota()` - Returns (has_quota, remaining, used)
+ - `log_usage()` - **Called immediately after query**
+ - `update_token_usage()` - Increment user's usage
+
 ## Literature Search Toggle Feature
 
 ### **Overview**
@@ -151,7 +346,7 @@ TaijiChat includes a toggle button that allows users to control external literat
 - **Controls**: Click to enable/disable external literature search
 
 ### **Technical Implementation**
- - **Frontend**: Button state managed via JavaScript and CSS styling
+ - **Frontend**: Button state managed via JavaScript and CSS styling
 - **Backend**: Literature preference passed from R to Python agents
 - **Agent Integration**: `ManagerAgent.process_single_query_with_preferences()` method handles literature control
 - **Internal Data**: Paper-based analysis (internal dataset) remains always enabled
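The "global usage collector" that routes describe_image() Vision API usage to the ExecutorAgent can be sketched like this; the module-level list and function names are illustrative, not the repo's actual identifiers:

```python
# Module-level collector: a describe_image()-style tool appends here while
# generated code runs, and the executor drains it afterwards so Vision API
# tokens are included in the aggregated usage.
_USAGE_COLLECTOR = []


def record_usage(usage):
    """Called from inside tool functions during code execution."""
    _USAGE_COLLECTOR.append(usage)


def drain_usage():
    """Called by the executor after running generated code; empties the list
    so the next execution starts from a clean slate."""
    collected = list(_USAGE_COLLECTOR)
    _USAGE_COLLECTOR.clear()
    return collected
```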
Dockerfile CHANGED
@@ -17,7 +17,7 @@ RUN apt-get update && apt-get install -y \
 
 # Install R packages
 # Added .libPaths() to ensure installation in the main library site
- RUN R -e "print(.libPaths()); install.packages(c('shiny', 'readxl', 'DT', 'dplyr', 'reticulate', 'shinythemes', 'png', 'shinyjs', 'digest'), repos='http://cran.rstudio.com/', lib=.libPaths()[1])"
+ RUN R -e "print(.libPaths()); install.packages(c('shiny', 'readxl', 'DT', 'dplyr', 'reticulate', 'shinythemes', 'png', 'shinyjs', 'digest', 'httr2'), repos='http://cran.rstudio.com/', lib=.libPaths()[1])"
 
 # Verify reticulate installation
 RUN R -e "if (!requireNamespace('reticulate', quietly = TRUE)) { stop('reticulate package not found after installation') } else { print(paste('reticulate version:', packageVersion('reticulate'))) }"
@@ -31,6 +31,9 @@ RUN R -e "if (!requireNamespace('shinyjs', quietly = TRUE)) { stop('shinyjs pack
 # Verify digest installation
 RUN R -e "if (!requireNamespace('digest', quietly = TRUE)) { stop('digest package not found after installation') } else { print(paste('digest version:', packageVersion('digest'))) }"
 
+ # Verify httr2 installation
+ RUN R -e "if (!requireNamespace('httr2', quietly = TRUE)) { stop('httr2 package not found after installation') } else { print(paste('httr2 version:', packageVersion('httr2'))) }"
+
 # Install Python packages
 COPY requirements.txt /app/requirements.txt
 RUN pip3 install --no-cache-dir -r /app/requirements.txt
README.md CHANGED
@@ -5,6 +5,7 @@ colorFrom: indigo
 colorTo: green
 sdk: docker
 pinned: false
+ hf_oauth: true
 ---
 
 # Taijichat Application
WORKFLOW_CHANGES.md DELETED
@@ -1,287 +0,0 @@
- # TaijiChat Workflow Changes: Literature Dialog Removal
-
- ## Overview
-
- This document outlines the major changes made to the TaijiChat multi-agent system to improve user experience by removing the upfront literature confirmation dialog and implementing a post-analysis literature exploration approach.
-
- ## Problem Statement
-
- ### Previous Workflow Issues:
- 1. **User Friction**: Every query was blocked by a literature preference dialog before processing
- 2. **Interruption of Flow**: Users had to make decisions before seeing any analysis results
- 3. **Unclear Context**: Users couldn't make informed decisions about literature sources without seeing initial results
- 4. **Pattern Matching Limitations**: Hardcoded keyword matching was unreliable for determining user intent
-
- ## Solution Design
-
- ### New Workflow Philosophy:
- - **Analyze First, Explore Later**: Provide immediate value with optional deeper exploration
- - **LLM-Powered Classification**: Use AI reasoning instead of pattern matching for intent detection
- - **Clear Source Distinction**: Differentiate between primary paper (guaranteed) vs external literature (supplementary)
- - **Progressive Disclosure**: Natural conversation flow with contextual followup options
-
- ## Implementation Details
-
- ### 1. ManagerAgent Changes (`agents/manager_agent.py`)
-
- #### **Removed Components:**
- ```python
- # REMOVED: Literature confirmation dialog
- def _request_literature_confirmation_upfront(self, user_query: str) -> str:
-     # This entire method was removed
- ```
-
- #### **Modified Components:**
- ```python
- def _process_turn(self, user_query_text: str) -> tuple:
-     # OLD: Asked for literature preferences before processing
-     # NEW: Process directly with default settings (both sources enabled)
-     response_text = self._process_with_literature_preferences(
-         user_query_text,
-         use_paper=True,
-         use_external_literature=True
-     )
-     return response_text, False, None
- ```
-
- #### **Enhanced Features:**
- - Proper conversation history management
- - Direct processing without interruption
- - Maintains all existing security features
-
- ### 2. GenerationAgent Changes (`agents/generation_agent.py`)
-
- #### **Enhanced 13-Step Reasoning Process:**
- ```
- 1. Analyze the user query in detail
- 2. Analyze the conversation history if there's any
- 3. Analyze images, paper, data according to the plan if there's any provided
- 4. Analyze errors from previous attempts if there's any
- 5. Read the paper description to understand what the paper is about
- 6. **NEW: QUERY TYPE CLASSIFICATION:**
-    - Is this a NEW_TASK (fresh analytical question) or FOLLOWUP_REQUEST (responding to literature offer)?
-    - If FOLLOWUP_REQUEST, what does user want: PRIMARY_PAPER, EXTERNAL_LITERATURE, or COMPREHENSIVE?
-    - Base decision on conversation context and user intent, not keywords
-    - Consider if previous response contained "Explore Supporting Literature" section
- 7. Read the tools documentation thoroughly
- 8. Decide which tools can be helpful when answering the query
- 9. Read the data documentation
- 10. Decide which datasets are relevant to the user query
- 11. Decide whether the user query can be solved by paper or tools or data or a combination
- 12. Decide whether the user query is about image(s)
- 13. Put everything together to make a comprehensive plan
- ```
-
- #### **New Helper Methods:**
- ```python
- def _check_for_literature_offer(self, conversation_history: list) -> bool:
-     """Check if previous response contained literature exploration offer."""
-
- def _classify_query_type(self, user_query: str, conversation_history: list) -> dict:
-     """Provide context for LLM-based query classification."""
-
- def _append_literature_offer(self, explanation: str) -> str:
-     """Append literature exploration options to NEW_TASK responses."""
- ```
-
- #### **Response Format Rules:**
- - **NEW_TASK**: Provide analysis + literature exploration offer
- - **FOLLOWUP_REQUEST**: Execute requested literature analysis without new offer
-
- ### 3. Literature Offer Format
-
- #### **Clear Source Distinction:**
- ```markdown
- ---
-
- **Explore Supporting Literature:**
-
- 📄 **Primary Paper**: Analyze the foundational research paper this website is based on for additional context about these findings.
-
- 🔍 **Recent Publications**: Search external academic databases for the latest research on these topics.
-
- 📚 **Comprehensive**: Get insights from both the foundational paper and recent literature.
-
- *Note: External literature serves as supplementary information only.*
- ```
-
- #### **Key Benefits:**
- - **Primary Paper**: Vetted, guaranteed accuracy, foundational to website
- - **External Literature**: Recent, supplementary, not guaranteed by website
- - **User Choice**: Informed decision about source reliability vs recency
-
- ## Workflow Examples
-
- ### Example 1: Fresh Query → Analysis + Offer
-
- **User Input:** *"What are the top 5 TEXterm-specific TFs?"*
-
- **System Flow:**
- 1. ManagerAgent processes immediately (no dialog)
- 2. GenerationAgent Step 6: Classification → NEW_TASK
- 3. Execute TF data analysis
- 4. Return results with literature exploration offer
-
- **Expected Response:**
- ```
- The top 5 TEXterm-specific transcription factors are:
- 1. Zscan20 (p-value: 0.001)
- 2. Jdp2 (p-value: 0.002)
- 3. Zfp324 (p-value: 0.003)
- 4. Batf (p-value: 0.004)
- 5. Ikzf1 (p-value: 0.005)
-
- These rankings are based on statistical significance from the dataset analysis.
-
- ---
-
- **Explore Supporting Literature:**
-
- 📄 **Primary Paper**: Analyze the foundational research paper this website is based on for additional context about these TFs.
-
- 🔍 **Recent Publications**: Search external academic databases for the latest research on these transcription factors.
-
- 📚 **Comprehensive**: Get insights from both the foundational paper and recent literature.
-
- *Note: External literature serves as supplementary information only.*
- ```
-
- ### Example 2: Literature Followup → Targeted Analysis
-
- **User Input:** *"Search recent publications about these TFs"*
-
- **System Flow:**
- 1. GenerationAgent detects previous literature offer
- 2. Step 6: Classification → FOLLOWUP_REQUEST, intent: EXTERNAL_LITERATURE
- 3. Execute literature search using previous TF context
- 4. Return literature analysis (no new offer)
-
- **Expected Response:**
- ```
- ## Recent Literature on TEXterm Transcription Factors
-
- Based on external academic database search, here are key recent findings:
-
- **Zscan20 in T Cell Exhaustion:**
- Recent studies [1] demonstrate that Zscan20 acts as a master regulator of terminal exhaustion...
-
- **Jdp2 Regulatory Networks:**
- New research [2] reveals Jdp2's role in chromatin remodeling during exhaustion programming...
-
- [Additional literature analysis with proper citations]
-
- ## References
- [1] Smith et al. (2023). Zscan20 controls T cell exhaustion pathways. Nature Immunology.
- [2] Johnson et al. (2023). Jdp2 in immune regulation. Cell.
-
- *This analysis is based on external literature sources and serves as supplementary information.*
- ```
-
- ### Example 3: Primary Paper Request → Paper Analysis
-
- **User Input:** *"What does the foundational study say about these TFs?"*
183
-
184
- **System Flow:**
185
- 1. Step 6: Classification → FOLLOWUP_REQUEST, intent: PRIMARY_PAPER
186
- 2. Analyze paper.pdf with previous TF context
187
- 3. Return focused paper analysis
188
-
189
- ## Technical Implementation
190
-
191
- ### Query Classification Logic
192
-
193
- The system uses LLM reasoning instead of pattern matching:
194
-
195
- ```python
196
- # Context provided to LLM for classification
197
- classification_instructions = f"\\n\\nQUERY CLASSIFICATION CONTEXT:"
198
- classification_instructions += f"\\n- Previous response had literature offer: {has_previous_offer}"
199
- if has_previous_offer:
200
- classification_instructions += "\\n- This query might be a FOLLOWUP_REQUEST for literature analysis"
201
- classification_instructions += "\\n- Determine user intent: PRIMARY_PAPER, EXTERNAL_LITERATURE, or COMPREHENSIVE"
202
- classification_instructions += "\\n- If FOLLOWUP_REQUEST, do NOT append literature offer to final response"
203
- else:
204
- classification_instructions += "\\n- This is likely a NEW_TASK requiring fresh analysis"
205
- classification_instructions += "\\n- If status is CODE_COMPLETE, append literature offer to explanation"
206
- ```
207
-
208
- ### Conversation History Management
209
-
210
- ```python
211
- # ManagerAgent properly manages conversation state
212
- def _process_with_literature_preferences(self, user_query: str, use_paper: bool, use_external_literature: bool) -> str:
213
- # Process query and get response
214
- final_response = final_plan_for_turn.get('explanation', 'Processing completed.')
215
-
216
- # Add response to conversation history for future context
217
- self.conversation_history.append({"role": "assistant", "content": final_response})
218
-
219
- return final_response
220
- ```
221
-
222
- ## Benefits
223
-
224
- ### 1. **Improved User Experience**
225
- - **Immediate Response**: No blocking dialogs
226
- - **Natural Flow**: Conversational interaction
227
- - **Informed Decisions**: Literature choices made after seeing results
228
-
229
- ### 2. **Better Intent Recognition**
230
- - **LLM-Powered**: Semantic understanding vs keyword matching
231
- - **Context-Aware**: Considers conversation history
232
- - **Flexible**: Adapts to various user phrasings
233
-
234
- ### 3. **Clear Information Hierarchy**
235
- - **Primary Sources**: Guaranteed accuracy, foundational research
236
- - **Supplementary Sources**: Recent literature, clearly marked as external
237
- - **User Agency**: Informed choice about source reliability
238
-
239
- ### 4. **Maintained Security**
240
- - **All existing safeguards preserved**
241
- - **SupervisorAgent**: Code review unchanged
242
- - **ExecutorAgent**: Sandboxed execution unchanged
243
- - **Literature preferences**: Still respected in execution
244
-
245
- ## Testing
246
-
247
- ### Test Scenarios Created:
248
- 1. **Fresh Query Test**: Verify immediate analysis + literature offer
249
- 2. **External Literature Followup**: Test FOLLOWUP_REQUEST classification
250
- 3. **Primary Paper Followup**: Test paper analysis request
251
- 4. **Conversation Context**: Verify proper history management
252
-
253
- ### Test File: `test_workflow.py`
254
- - Comprehensive workflow testing
255
- - Conversation history verification
256
- - Response format validation
257
-
258
- ## Migration Notes
259
-
260
- ### Backward Compatibility
261
- - **R Interface**: `handle_literature_confirmation()` method marked as LEGACY but preserved
262
- - **Existing Data**: All dataset access patterns unchanged
263
- - **Security Model**: No changes to permission structure
264
-
265
- ### Deployment Considerations
266
- - **No breaking changes** to existing functionality
267
- - **Enhanced user experience** without compromising security
268
- - **Gradual rollout** possible through feature flags if needed
269
-
270
- ## Future Enhancements
271
-
272
- ### Potential Improvements:
273
- 1. **Smart Context Extraction**: Better extraction of relevant terms from previous analysis for literature searches
274
- 2. **Citation Quality**: Enhanced citation formatting and link validation
275
- 3. **User Preferences**: Optional user settings to remember literature preferences
276
- 4. **Analytics**: Track which literature options users choose most frequently
277
-
278
- ## Conclusion
279
-
280
- The new workflow successfully addresses the original user experience issues while maintaining all security and functionality requirements. The system now provides immediate value to users while offering natural pathways for deeper exploration, creating a more engaging and efficient interaction model.
281
-
282
- Key success metrics:
283
- - ✅ **Removed user friction**: No blocking dialogs
284
- - ✅ **Maintained security**: All safeguards preserved
285
- - ✅ **Improved classification**: LLM-based intent recognition
286
- - ✅ **Clear information hierarchy**: Distinguished source types
287
- - ✅ **Natural conversation flow**: Progressive disclosure model
agents/executor_agent.py CHANGED
@@ -52,21 +52,45 @@ class ExecutorAgent:
         }
         # No separate locals, exec will use restricted_globals as locals too
 
+        # Create usage collector for tracking OpenAI API calls in agent_tools
+        import builtins
+        builtins.__agent_usage_collector__ = []
+
         captured_output = io.StringIO()
         try:
             with contextlib.redirect_stdout(captured_output):
                 exec(python_code, restricted_globals)
             output_str = captured_output.getvalue()
+
+            # Extract collected usage info
+            usage_list = builtins.__agent_usage_collector__
+            aggregated_usage = {
+                'prompt_tokens': sum(u.get('prompt_tokens', 0) for u in usage_list),
+                'completion_tokens': sum(u.get('completion_tokens', 0) for u in usage_list),
+                'total_tokens': sum(u.get('total_tokens', 0) for u in usage_list)
+            }
+
             return {
                 "execution_output": output_str.strip() if output_str else "(No output printed by code)",
-                "execution_status": "SUCCESS"
+                "execution_status": "SUCCESS",
+                "usage": aggregated_usage
             }
         except Exception as e:
             error_details = f"{type(e).__name__}: {str(e)}"
             # Try to get traceback if possible, though might be complex to format cleanly here
+
+            # Extract usage even on error (API calls may have occurred before failure)
+            usage_list = builtins.__agent_usage_collector__
+            aggregated_usage = {
+                'prompt_tokens': sum(u.get('prompt_tokens', 0) for u in usage_list),
+                'completion_tokens': sum(u.get('completion_tokens', 0) for u in usage_list),
+                'total_tokens': sum(u.get('total_tokens', 0) for u in usage_list)
+            }
+
             return {
                 "execution_output": f"Execution Error!\n{error_details}",
-                "execution_status": f"ERROR: {type(e).__name__}"
+                "execution_status": f"ERROR: {type(e).__name__}",
+                "usage": aggregated_usage
             }
 
 if __name__ == '__main__':
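
The executor-side aggregation above depends on sandboxed tools appending their own usage dicts to `builtins.__agent_usage_collector__`. A minimal runnable sketch of both halves, where `report_usage` is a hypothetical tool-side helper (only the collector name and the aggregation expressions come from the diff):

```python
import builtins

# Tool side (sketch): how a function such as describe_image() could report usage.
def report_usage(prompt_tokens, completion_tokens):
    collector = getattr(builtins, "__agent_usage_collector__", None)
    if collector is not None:
        collector.append({
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": prompt_tokens + completion_tokens,
        })

# Executor side: reset the collector, run the tools, then aggregate (mirrors the diff).
builtins.__agent_usage_collector__ = []
report_usage(120, 30)
report_usage(200, 50)

usage_list = builtins.__agent_usage_collector__
aggregated = {
    "prompt_tokens": sum(u.get("prompt_tokens", 0) for u in usage_list),
    "completion_tokens": sum(u.get("completion_tokens", 0) for u in usage_list),
    "total_tokens": sum(u.get("total_tokens", 0) for u in usage_list),
}
print(aggregated["total_tokens"])  # 400
```

Stashing the collector on `builtins` works here because `exec` runs with restricted globals, and `builtins` is one of the few namespaces both sides can reach.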
agents/generation_agent.py CHANGED
@@ -785,10 +785,22 @@ class GenerationAgent:
 
         # Make the API call
         response = self.client.chat.completions.create(**params)
-
+
+        # CAPTURE TOKEN USAGE
+        usage_info = {
+            'prompt_tokens': 0,
+            'completion_tokens': 0,
+            'total_tokens': 0
+        }
+        if hasattr(response, 'usage') and response.usage:
+            usage_info['prompt_tokens'] = response.usage.prompt_tokens
+            usage_info['completion_tokens'] = response.usage.completion_tokens
+            usage_info['total_tokens'] = response.usage.total_tokens
+            print(f"[GenerationAgent] Token usage - prompt: {usage_info['prompt_tokens']}, completion: {usage_info['completion_tokens']}, total: {usage_info['total_tokens']}")
+
         # Get the response content
         assistant_response_json_str = response.choices[0].message.content
-
+
         # Clean up the response - remove any code fence markers
         if assistant_response_json_str.startswith("```json"):
             assistant_response_json_str = assistant_response_json_str[len("```json"):].strip()
@@ -796,20 +808,22 @@ class GenerationAgent:
             assistant_response_json_str = assistant_response_json_str[len("```"):].strip()
         if assistant_response_json_str.endswith("```"):
             assistant_response_json_str = assistant_response_json_str[:-len("```")].strip()
-
+
         try:
             # Parse the JSON response
             parsed_response = json.loads(assistant_response_json_str)
-
+
             # Validate the response has the required fields
             if not all(k in parsed_response for k in ["thought", "python_code", "status"]):
                 print("GenerationAgent Error: Chat response JSON missing required keys.")
-                return {"thought": "Error parsing Chat API response: Missing keys.", "python_code": "", "status": "ERROR"}
+                return {"thought": "Error parsing Chat API response: Missing keys.", "python_code": "", "status": "ERROR", "usage": usage_info}
-
+
             # Additional validation for AWAITING_DATA status
             if parsed_response.get("status") == "AWAITING_DATA" and not ("intermediate_data_for_llm" in parsed_response.get("python_code", "") and "json.dumps" in parsed_response.get("python_code", "")):
                 print("GenerationAgent Warning: Status is AWAITING_DATA but python_code does not follow required format.")
-
+
+            # Add usage info to response
+            parsed_response['usage'] = usage_info
             return parsed_response
 
         except json.JSONDecodeError as e:
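
The defensive `hasattr(response, 'usage')` check in this hunk can be exercised without calling the API by substituting a stand-in response object. A sketch under that assumption (the `SimpleNamespace` stub is illustrative, not the OpenAI SDK type):

```python
from types import SimpleNamespace

def extract_usage(response):
    """Return a plain usage dict, zeroed when the response carries no usage."""
    usage_info = {'prompt_tokens': 0, 'completion_tokens': 0, 'total_tokens': 0}
    if hasattr(response, 'usage') and response.usage:
        usage_info['prompt_tokens'] = response.usage.prompt_tokens
        usage_info['completion_tokens'] = response.usage.completion_tokens
        usage_info['total_tokens'] = response.usage.total_tokens
    return usage_info

# Stand-ins for a Chat Completions response with and without usage reported
ok = SimpleNamespace(usage=SimpleNamespace(prompt_tokens=10, completion_tokens=5, total_tokens=15))
empty = SimpleNamespace(usage=None)
print(extract_usage(ok))     # {'prompt_tokens': 10, 'completion_tokens': 5, 'total_tokens': 15}
print(extract_usage(empty))  # all zeros
```

Zeroed defaults mean downstream aggregation never has to special-case a missing `usage` field.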
agents/manager_agent.py CHANGED
@@ -22,9 +22,17 @@ from agents.executor_agent import ExecutorAgent
 # POLLING_INTERVAL_S and MAX_POLLING_ATTEMPTS are removed, polling is handled by individual agents.
 
 class ManagerAgent:
-    def __init__(self, openai_api_key=None, openai_client: OpenAI = None, r_callback_fn=None):
+    def __init__(self, openai_api_key=None, openai_client: OpenAI = None, r_callback_fn=None, supabase_client=None, user_id=None, hf_user_id=None):
         """
         Initialize the Manager Agent with OpenAI credentials and sub-agents.
+
+        Args:
+            openai_api_key: OpenAI API key
+            openai_client: Pre-initialized OpenAI client
+            r_callback_fn: Callback function for R integration
+            supabase_client: Supabase client for logging and quota tracking
+            user_id: UUID of user from Supabase users table
+            hf_user_id: Hugging Face user ID
         """
         if openai_client:
             self.client = openai_client
@@ -36,16 +44,27 @@ class ManagerAgent:
 
         # Storage for conversation history - list of dicts like [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]
         self.conversation_history = []
-
+
         # Storage for file information - dict like {"file_id": "...", "file_name": "...", "file_path": "..."}
         self.file_info = {}
-
+
         # Storage for pending literature confirmation
         self.pending_literature_confirmation = None
         self.pending_literature_query = None
-
+
         # R callback function for thoughts
         self.r_callback_fn = r_callback_fn
+
+        # Supabase client for logging and quota tracking
+        self.supabase_client = supabase_client
+        self.current_user_id = user_id
+        self.current_hf_user_id = hf_user_id
+
+        # Token tracking for current query
+        self.last_prompt_tokens = 0
+        self.last_completion_tokens = 0
+        self.last_total_tokens = 0
+        self.model_name = "gpt-4o"
 
         # Initialize sub-agents
         try:
@@ -285,6 +304,11 @@ class ManagerAgent:
             final_plan_for_turn = plan
             current_plan_holder = plan
 
+            # Aggregate token usage from GenerationAgent
+            if 'usage' in plan:
+                self._aggregate_token_usage(plan['usage'])
+                print(f"[Manager] Aggregated GenerationAgent usage: {plan['usage'].get('total_tokens', 0)} tokens")
+
             # Reset for next potential direct image analysis
             image_file_id_for_analysis_step = None
 
@@ -312,6 +336,11 @@ class ManagerAgent:
                 review = self.supervisor_agent.review_code(code_to_execute, f"Reviewing plan: {plan.get('thought', '')}")
                 supervisor_status = review.get('safety_status', 'UNKNOWN_STATUS')
                 supervisor_feedback = review.get('safety_feedback', 'No feedback.')
+
+                # Aggregate token usage from SupervisorAgent
+                if 'usage' in review:
+                    self._aggregate_token_usage(review['usage'])
+                    print(f"[Manager] Aggregated SupervisorAgent usage: {review['usage'].get('total_tokens', 0)} tokens")
 
                 if supervisor_status != "APPROVED_FOR_EXECUTION":
                     return f"Code execution blocked by supervisor: {supervisor_feedback}"
@@ -330,6 +359,11 @@ class ManagerAgent:
                 execution_result = self.executor_agent.execute_code(code_to_execute)
                 execution_output = execution_result.get("execution_output", "")
                 execution_status = execution_result.get("execution_status", "UNKNOWN")
+
+                # Aggregate token usage from ExecutorAgent (captures describe_image API calls)
+                if 'usage' in execution_result:
+                    self._aggregate_token_usage(execution_result['usage'])
+                    print(f"[Manager] Aggregated ExecutorAgent usage: {execution_result['usage'].get('total_tokens', 0)} tokens")
 
                 if execution_status == "SUCCESS":
                     self._send_thought_to_r(f"Code execution successful.")
@@ -395,32 +429,128 @@ class ManagerAgent:
         self.literature_enabled = literature_enabled
         return self.process_single_query(user_query_text, conversation_history_from_r)
 
+    def set_user_context(self, user_id: str = None, hf_user_id: str = None):
+        """Set user context for quota tracking and logging"""
+        self.current_user_id = user_id
+        self.current_hf_user_id = hf_user_id
+        print(f"[Manager] Set user context: user_id={user_id}, hf_user_id={hf_user_id}")
+
+    def _check_quota_before_processing(self) -> tuple:
+        """
+        Check if user has sufficient quota before processing query
+        Returns: (has_quota: bool, remaining: int, error_message: str or None)
+        """
+        if not self.supabase_client or not self.supabase_client.is_enabled():
+            return (True, 999999, None)
+
+        if not self.current_hf_user_id:
+            return (False, 0, "User not authenticated")
+
+        try:
+            has_quota, remaining, used = self.supabase_client.check_quota(self.current_hf_user_id)
+            if not has_quota:
+                error_msg = f"Token quota exceeded. Used: {used}, Remaining: {remaining}. Please contact support to increase your quota."
+                return (False, remaining, error_msg)
+            return (True, remaining, None)
+        except Exception as e:
+            print(f"[Manager] Error checking quota: {e}")
+            return (True, 999999, None)  # Fail open
+
+    def _reset_token_tracking(self):
+        """Reset token counters for new query"""
+        self.last_prompt_tokens = 0
+        self.last_completion_tokens = 0
+        self.last_total_tokens = 0
+
+    def _aggregate_token_usage(self, usage_dict: dict):
+        """Aggregate token usage from agent responses"""
+        if usage_dict:
+            self.last_prompt_tokens += usage_dict.get('prompt_tokens', 0)
+            self.last_completion_tokens += usage_dict.get('completion_tokens', 0)
+            self.last_total_tokens += usage_dict.get('total_tokens', 0)
+
     def process_single_query(self, user_query_text: str, conversation_history_from_r: list = None) -> str:
         """
         Processes a single query, suitable for calling from an external system like R/Shiny.
         Manages its own conversation history based on input.
+        Includes quota checking and comprehensive logging.
         """
         print(f"[Manager.process_single_query] Received query: '{user_query_text[:100]}...'")
+
+        # Reset token tracking for new query
+        self._reset_token_tracking()
+
+        # Check quota BEFORE processing
+        has_quota, remaining, quota_error = self._check_quota_before_processing()
+        if not has_quota:
+            # Log the quota exceeded error
+            if self.supabase_client and self.supabase_client.is_enabled():
+                self.supabase_client.log_usage(
+                    hf_user_id=self.current_hf_user_id,
+                    user_id=self.current_user_id,
+                    query_text=user_query_text,
+                    error_message=quota_error,
+                    conversation_history=conversation_history_from_r
+                )
+            return quota_error
+
         if conversation_history_from_r is not None:
             # Overwrite or extend self.conversation_history. For simplicity, let's overwrite.
             # Ensure format matches: list of dicts like {"role": "user/assistant", "content": "..."}
             self.conversation_history = [dict(turn) for turn in conversation_history_from_r] # Ensure dicts
-
+
         # Add the current user query to the history for processing
         self.conversation_history.append({"role": "user", "content": user_query_text})
-
+
         # Initialize image tracking variables in case _process_turn fails
         is_image_response = False
         current_image_path = None
-
+
         try:
             # Process the query and get response with image information
            response_text, is_image_response, current_image_path = self._process_turn(user_query_text)
+
+            # IMMEDIATE LOGGING TO SUPABASE AFTER SUCCESSFUL PROCESSING
+            if self.supabase_client and self.supabase_client.is_enabled():
+                self.supabase_client.log_usage(
+                    hf_user_id=self.current_hf_user_id,
+                    user_id=self.current_user_id,
+                    query_text=user_query_text,
+                    prompt_tokens=self.last_prompt_tokens,
+                    completion_tokens=self.last_completion_tokens,
+                    total_tokens=self.last_total_tokens,
+                    model=self.model_name,
+                    response_text=response_text,
+                    error_message=None,
+                    conversation_history=self.conversation_history,
+                    is_image_response=is_image_response,
+                    image_path=current_image_path
+                )
+
+                # Update user's token usage
+                if self.last_total_tokens > 0:
+                    self.supabase_client.update_token_usage(self.current_hf_user_id, self.last_total_tokens)
+                    print(f"[Manager] Updated token usage: +{self.last_total_tokens} tokens")
+
         except Exception as e:
             print(f"[Manager.process_single_query] Error in _process_turn: {str(e)}")
             response_text = f"I encountered an error processing your request: {str(e)}"
             is_image_response = False
             current_image_path = None
+
+            # LOG ERROR TO SUPABASE
+            if self.supabase_client and self.supabase_client.is_enabled():
+                self.supabase_client.log_usage(
+                    hf_user_id=self.current_hf_user_id,
+                    user_id=self.current_user_id,
+                    query_text=user_query_text,
+                    prompt_tokens=self.last_prompt_tokens,
+                    completion_tokens=self.last_completion_tokens,
+                    total_tokens=self.last_total_tokens,
+                    model=self.model_name,
+                    error_message=str(e),
+                    conversation_history=self.conversation_history
+                )
 
         # If an image was processed, format the response to include image information
         if is_image_response and current_image_path:
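
The check → process → aggregate → log → update flow added here can be sanity-checked against a stub client exposing the methods the diff calls (`is_enabled`, `check_quota`, `log_usage`, `update_token_usage`). The stub's behavior is an assumption for illustration; the real implementations live in the Supabase client wrapper:

```python
# Stub standing in for the Supabase client (method names from the diff; bodies assumed)
class StubSupabase:
    def __init__(self, quota=100_000):
        self.quota = quota   # default quota from the commit message
        self.used = 0
        self.logs = []

    def is_enabled(self):
        return True

    def check_quota(self, hf_user_id):
        remaining = self.quota - self.used
        return (remaining > 0, remaining, self.used)

    def log_usage(self, **entry):
        self.logs.append(entry)

    def update_token_usage(self, hf_user_id, tokens):
        self.used += tokens

client = StubSupabase()

# 1. Check quota before processing
has_quota, remaining, used = client.check_quota("hf_123")
assert has_quota

# 2. ...query processed, sub-agent usage aggregated to e.g. 1500 tokens...
total_tokens = 1500

# 3. Log immediately after completion, then update the running total
client.log_usage(hf_user_id="hf_123", query_text="top 5 TFs?", total_tokens=total_tokens)
client.update_token_usage("hf_123", total_tokens)
print(client.used)  # 1500
```

Logging before updating the counter means a failed update still leaves an audit record of the query.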
agents/supervisor_agent.py CHANGED
@@ -136,14 +136,21 @@ class SupervisorAgent:
136
 
137
  def review_code(self, python_code: str, thought: str): # Removed client_openai from params
138
  print(f"SupervisorAgent.review_code received code. Thought: {thought[:100]}...") # Log more of the thought
139
-
 
 
 
 
 
 
 
140
  if not python_code.strip():
141
  print("SupervisorAgent: No actual code provided for review. Approving as safe.")
142
- return {"safety_feedback": "No code provided by Generation Agent.", "safety_status": "APPROVED_FOR_EXECUTION", "user_facing_rejection_reason": ""}
143
 
144
  if not self.client or not self.supervisor_assistant:
145
  print("SupervisorAgent Error: OpenAI client or Supervisor Assistant not available for code review.")
146
- return {"safety_feedback": "Error: Supervisor Agent not properly initialized.", "safety_status": "REJECTED_NEEDS_REVISION", "user_facing_rejection_reason": "The supervisor agent encountered an error."}
147
 
148
  thread = None # Initialize for the finally block
149
  try:
@@ -185,6 +192,13 @@ class SupervisorAgent:
185
  attempts += 1
186
 
187
  # 6. Process Run Outcome
 
 
 
 
 
 
 
188
  if run.status == "completed":
189
  # print(f"SupervisorAgent: Run {run.id} completed.")
190
  messages_response = self.client.beta.threads.messages.list(thread_id=thread.id, order="desc", limit=1)
@@ -206,9 +220,10 @@ class SupervisorAgent:
206
  if not all(k in parsed_response for k in ["safety_feedback", "safety_status", "user_facing_rejection_reason"]):
207
  print("SupervisorAgent Error: LLM review JSON missing required keys.")
208
  return {
209
- "safety_feedback": "Internal Error: LLM review response malformed (missing keys).",
210
  "safety_status": "REJECTED_NEEDS_REVISION",
211
- "user_facing_rejection_reason": "The code review process encountered an internal error."
 
212
  }
213
  # Validate safety_status value
214
  if parsed_response["safety_status"] not in ["APPROVED_FOR_EXECUTION", "REJECTED_NEEDS_REVISION"]:
@@ -226,33 +241,38 @@ class SupervisorAgent:
226
  elif parsed_response["safety_status"] == "APPROVED_FOR_EXECUTION" and not parsed_response.get("user_facing_rejection_reason","").strip():
227
  parsed_response["user_facing_rejection_reason"] = "Approved."
228
 
 
 
229
  return parsed_response
230
  except json.JSONDecodeError as e:
231
  print(f"SupervisorAgent JSONDecodeError: Could not parse LLM review JSON: {e}. Response: {assistant_response_json_str}")
232
  return {
233
- "safety_feedback": f"Internal Error: Failed to parse LLM review JSON. {e}",
234
  "safety_status": "REJECTED_NEEDS_REVISION",
235
- "user_facing_rejection_reason": "The code review result was unreadable."
 
236
  }
237
  else:
238
  print("SupervisorAgent Error: No valid message content from assistant after review run completion.")
239
  return {
240
- "safety_feedback": "Internal Error: No content from supervisor assistant.",
241
  "safety_status": "REJECTED_NEEDS_REVISION",
242
- "user_facing_rejection_reason": "The supervisor agent provided no response."
 
243
  }
244
  else:
245
  error_message = f"Review run failed or timed out. Status: {run.status}"
246
  if run.last_error:
247
  error_message += f" Last Error: {run.last_error.message}"
248
  print(f"SupervisorAgent Error: {error_message}")
249
- return {"safety_feedback": error_message, "safety_status": "REJECTED_NEEDS_REVISION", "user_facing_rejection_reason": "The code review process encountered an error."}
250
  except Exception as e:
251
  print(f"SupervisorAgent Error: General exception during review_code: {e}")
252
  return {
253
- "safety_feedback": f"General exception in review_code: {e}",
254
  "safety_status": "REJECTED_NEEDS_REVISION",
255
- "user_facing_rejection_reason": "A general error occurred during code review."
 
256
  }
257
  finally:
258
  # 7. Delete Thread
 
  def review_code(self, python_code: str, thought: str): # Removed client_openai from params
  print(f"SupervisorAgent.review_code received code. Thought: {thought[:100]}...") # Log more of the thought
+
+ # Initialize usage tracking
+ usage_info = {
+ 'prompt_tokens': 0,
+ 'completion_tokens': 0,
+ 'total_tokens': 0
+ }
+
  if not python_code.strip():
  print("SupervisorAgent: No actual code provided for review. Approving as safe.")
+ return {"safety_feedback": "No code provided by Generation Agent.", "safety_status": "APPROVED_FOR_EXECUTION", "user_facing_rejection_reason": "", "usage": usage_info}

  if not self.client or not self.supervisor_assistant:
  print("SupervisorAgent Error: OpenAI client or Supervisor Assistant not available for code review.")
+ return {"safety_feedback": "Error: Supervisor Agent not properly initialized.", "safety_status": "REJECTED_NEEDS_REVISION", "user_facing_rejection_reason": "The supervisor agent encountered an error.", "usage": usage_info}

  thread = None # Initialize for the finally block
  try:

  attempts += 1

  # 6. Process Run Outcome
+ # CAPTURE TOKEN USAGE from Run
+ if hasattr(run, 'usage') and run.usage:
+ usage_info['prompt_tokens'] = getattr(run.usage, 'prompt_tokens', 0)
+ usage_info['completion_tokens'] = getattr(run.usage, 'completion_tokens', 0)
+ usage_info['total_tokens'] = getattr(run.usage, 'total_tokens', 0)
+ print(f"[SupervisorAgent] Token usage - total: {usage_info['total_tokens']}")
+
  if run.status == "completed":
  # print(f"SupervisorAgent: Run {run.id} completed.")
  messages_response = self.client.beta.threads.messages.list(thread_id=thread.id, order="desc", limit=1)

  if not all(k in parsed_response for k in ["safety_feedback", "safety_status", "user_facing_rejection_reason"]):
  print("SupervisorAgent Error: LLM review JSON missing required keys.")
  return {
+ "safety_feedback": "Internal Error: LLM review response malformed (missing keys).",
  "safety_status": "REJECTED_NEEDS_REVISION",
+ "user_facing_rejection_reason": "The code review process encountered an internal error.",
+ "usage": usage_info
  }
  # Validate safety_status value
  if parsed_response["safety_status"] not in ["APPROVED_FOR_EXECUTION", "REJECTED_NEEDS_REVISION"]:

  elif parsed_response["safety_status"] == "APPROVED_FOR_EXECUTION" and not parsed_response.get("user_facing_rejection_reason","").strip():
  parsed_response["user_facing_rejection_reason"] = "Approved."

+ # Add usage info to response
+ parsed_response['usage'] = usage_info
  return parsed_response
  except json.JSONDecodeError as e:
  print(f"SupervisorAgent JSONDecodeError: Could not parse LLM review JSON: {e}. Response: {assistant_response_json_str}")
  return {
+ "safety_feedback": f"Internal Error: Failed to parse LLM review JSON. {e}",
  "safety_status": "REJECTED_NEEDS_REVISION",
+ "user_facing_rejection_reason": "The code review result was unreadable.",
+ "usage": usage_info
  }
  else:
  print("SupervisorAgent Error: No valid message content from assistant after review run completion.")
  return {
+ "safety_feedback": "Internal Error: No content from supervisor assistant.",
  "safety_status": "REJECTED_NEEDS_REVISION",
+ "user_facing_rejection_reason": "The supervisor agent provided no response.",
+ "usage": usage_info
  }
  else:
  error_message = f"Review run failed or timed out. Status: {run.status}"
  if run.last_error:
  error_message += f" Last Error: {run.last_error.message}"
  print(f"SupervisorAgent Error: {error_message}")
+ return {"safety_feedback": error_message, "safety_status": "REJECTED_NEEDS_REVISION", "user_facing_rejection_reason": "The code review process encountered an error.", "usage": usage_info}
  except Exception as e:
  print(f"SupervisorAgent Error: General exception during review_code: {e}")
  return {
+ "safety_feedback": f"General exception in review_code: {e}",
  "safety_status": "REJECTED_NEEDS_REVISION",
+ "user_facing_rejection_reason": "A general error occurred during code review.",
+ "usage": usage_info
  }
  finally:
  # 7. Delete Thread
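The defensive usage-capture added in this hunk is easy to exercise in isolation. A minimal sketch (the `run` objects here are stand-ins built with `SimpleNamespace`; the real one comes back from the Assistants API run-polling loop):

```python
from types import SimpleNamespace

def capture_run_usage(run):
    """Read token usage off an Assistants API Run object, tolerating absent fields."""
    usage_info = {'prompt_tokens': 0, 'completion_tokens': 0, 'total_tokens': 0}
    if hasattr(run, 'usage') and run.usage:
        usage_info['prompt_tokens'] = getattr(run.usage, 'prompt_tokens', 0)
        usage_info['completion_tokens'] = getattr(run.usage, 'completion_tokens', 0)
        usage_info['total_tokens'] = getattr(run.usage, 'total_tokens', 0)
    return usage_info

# A completed run reports usage; a failed or expired run may carry usage=None.
done = SimpleNamespace(usage=SimpleNamespace(prompt_tokens=120, completion_tokens=30, total_tokens=150))
print(capture_run_usage(done))  # {'prompt_tokens': 120, 'completion_tokens': 30, 'total_tokens': 150}
print(capture_run_usage(SimpleNamespace(usage=None)))  # all zeros
```

Because every early-return dict in `review_code` now carries `"usage": usage_info`, the ManagerAgent can aggregate sub-agent totals without special-casing error paths.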
auth/hf_oauth.R ADDED
@@ -0,0 +1,164 @@
+ # auth/hf_oauth.R
+ # Hugging Face OAuth authentication for TaijiChat
+
+ library(httr2)
+ library(jsonlite)
+
+ # OAuth Configuration
+ HF_OAUTH_AUTHORIZE_URL <- "https://huggingface.co/oauth/authorize"
+ HF_OAUTH_TOKEN_URL <- "https://huggingface.co/oauth/token"
+ HF_USER_INFO_URL <- "https://huggingface.co/api/whoami-v2"
+
+ # Initialize OAuth configuration
+ initialize_oauth <- function() {
+   oauth_config <- list(
+     client_id = Sys.getenv("OAUTH_CLIENT_ID"),
+     client_secret = Sys.getenv("OAUTH_CLIENT_SECRET"),
+     scopes = Sys.getenv("OAUTH_SCOPES", "openid profile email"),
+     enabled = FALSE
+   )
+
+   if (oauth_config$client_id != "" && oauth_config$client_secret != "") {
+     oauth_config$enabled <- TRUE
+     print("OAuth: Hugging Face OAuth is enabled")
+   } else {
+     warning("OAuth: OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET not set. Authentication disabled.")
+   }
+
+   return(oauth_config)
+ }
+
+ # Generate OAuth authorization URL
+ get_authorization_url <- function(oauth_config, redirect_uri, state) {
+   if (!oauth_config$enabled) {
+     return(NULL)
+   }
+
+   params <- list(
+     client_id = oauth_config$client_id,
+     redirect_uri = redirect_uri,
+     scope = oauth_config$scopes,
+     state = state,
+     response_type = "code"
+   )
+
+   # Build query string
+   query_string <- paste(
+     sapply(names(params), function(name) {
+       paste0(name, "=", URLencode(params[[name]], reserved = TRUE))
+     }),
+     collapse = "&"
+   )
+
+   auth_url <- paste0(HF_OAUTH_AUTHORIZE_URL, "?", query_string)
+   return(auth_url)
+ }
+
+ # Exchange authorization code for access token
+ exchange_code_for_token <- function(oauth_config, code, redirect_uri) {
+   if (!oauth_config$enabled) {
+     return(NULL)
+   }
+
+   tryCatch({
+     # Make token request
+     response <- httr2::request(HF_OAUTH_TOKEN_URL) %>%
+       httr2::req_method("POST") %>%
+       httr2::req_body_form(
+         client_id = oauth_config$client_id,
+         client_secret = oauth_config$client_secret,
+         code = code,
+         redirect_uri = redirect_uri,
+         grant_type = "authorization_code"
+       ) %>%
+       httr2::req_perform()
+
+     # Parse response
+     token_data <- httr2::resp_body_json(response)
+
+     if (!is.null(token_data$access_token)) {
+       print("OAuth: Successfully obtained access token")
+       return(list(
+         access_token = token_data$access_token,
+         token_type = token_data$token_type,
+         scope = token_data$scope
+       ))
+     } else {
+       warning("OAuth: Token response missing access_token")
+       return(NULL)
+     }
+   }, error = function(e) {
+     warning(paste("OAuth: Error exchanging code for token -", e$message))
+     return(NULL)
+   })
+ }
+
+ # Get user info from Hugging Face
+ get_user_info <- function(access_token) {
+   if (is.null(access_token)) {
+     return(NULL)
+   }
+
+   tryCatch({
+     response <- httr2::request(HF_USER_INFO_URL) %>%
+       httr2::req_headers(
+         Authorization = paste("Bearer", access_token)
+       ) %>%
+       httr2::req_perform()
+
+     user_info <- httr2::resp_body_json(response)
+
+     if (!is.null(user_info$id)) {
+       print(paste("OAuth: Retrieved user info for", user_info$name))
+       return(list(
+         hf_user_id = user_info$id,
+         hf_username = user_info$name,
+         email = user_info$email,
+         avatar_url = user_info$avatarUrl,
+         is_pro = user_info$isPro %||% FALSE
+       ))
+     } else {
+       warning("OAuth: User info response missing required fields")
+       return(NULL)
+     }
+   }, error = function(e) {
+     warning(paste("OAuth: Error getting user info -", e$message))
+     return(NULL)
+   })
+ }
+
+ # Helper function for NULL coalescing
+ `%||%` <- function(a, b) if (is.null(a)) b else a
+
+ # Validate OAuth state to prevent CSRF attacks
+ generate_oauth_state <- function() {
+   paste0(sample(c(letters, LETTERS, 0:9), 32, replace = TRUE), collapse = "")
+ }
+
+ validate_oauth_state <- function(received_state, stored_state) {
+   if (is.null(received_state) || is.null(stored_state)) {
+     return(FALSE)
+   }
+   return(received_state == stored_state)
+ }
+
+ # Check if user is authenticated in session
+ is_authenticated <- function(session) {
+   user_data <- session$userData$hf_user
+   return(!is.null(user_data) && !is.null(user_data$hf_user_id))
+ }
+
+ # Get current authenticated user from session
+ get_current_user <- function(session) {
+   if (is_authenticated(session)) {
+     return(session$userData$hf_user)
+   }
+   return(NULL)
+ }
+
+ # Clear authentication session
+ logout_user <- function(session) {
+   session$userData$hf_user <- NULL
+   session$userData$access_token <- NULL
+   print("OAuth: User logged out")
+ }
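Two things worth noting about the state helpers above: `sample()` is not a cryptographic RNG, and `==` is not a constant-time comparison. On the Python side of the project (or as a pattern to mirror in R), a hardened equivalent would look like this sketch — the function names mirror the R file, but this Python module is not part of the commit:

```python
import hmac
import secrets

def generate_oauth_state():
    # 32 bytes of CSPRNG-backed, URL-safe randomness (~43 chars),
    # versus sampling alphanumerics from a non-cryptographic RNG.
    return secrets.token_urlsafe(32)

def validate_oauth_state(received_state, stored_state):
    if not received_state or not stored_state:
        return False
    # Constant-time comparison avoids leaking the state via timing.
    return hmac.compare_digest(received_state, stored_state)

state = generate_oauth_state()
print(validate_oauth_state(state, state))
print(validate_oauth_state(state, generate_oauth_state()))
```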
codebase_analysis.md DELETED
@@ -1,153 +0,0 @@
- # How to Run the Application
-
- To run this R Shiny application, you will need R and the RStudio IDE (recommended) or another R environment installed on your system. You will also need the `shiny` package and other packages listed as dependencies (`readxl`, `DT`, `dplyr`, `shinythemes`).
-
- **Steps:**
-
- 1. **Install R and RStudio:** If you haven't already, download and install R from [CRAN](https://cran.r-project.org/) and RStudio Desktop from [Posit](https://posit.co/download/rstudio-desktop/).
- 2. **Install Required R Packages:** Open R or RStudio and run the following commands in the R console:
- ```R
- install.packages(c("shiny", "readxl", "DT", "dplyr", "shinythemes"))
- ```
- 3. **Set Working Directory:** Navigate your R session's working directory to the root folder of this Shiny application (the folder containing `server.R` and `ui.R`). In RStudio, you can do this by opening either `server.R` or `ui.R` and then going to `Session > Set Working Directory > To Source File Location`.
- 4. **Run the App:** In the R console, execute the following command:
- ```R
- shiny::runApp()
- ```
- Alternatively, if you have `server.R` or `ui.R` open in RStudio, a "Run App" button will typically appear at the top of the editor pane, which you can click.
-
- This will launch the application in your default web browser.
-
- ---
-
- # Codebase Analysis: TaijiChat Shiny Application
-
- ## Overview
-
- The codebase consists of an R Shiny application designed to explore and visualize bioinformatics data related to T cell states and transcription factors (TFs). It appears to be a companion tool for a research publication, aiming to make complex datasets accessible. The application is structured into two main files: `server.R` (server-side logic) and `ui.R` (user interface definition). Data is primarily loaded from Excel files and images stored in a `www/` subdirectory.
-
- ## File Breakdown
-
- ### `server.R` (Server Logic)
-
- **Key Functionalities:**
-
- 1. **Data Loading and Preprocessing:**
- * Loads multiple Excel datasets for TF PageRank scores, TF wave analysis, TF-TF correlations, TF communities, and multi-omics data. These files are located in `www/tablePagerank/`, `www/waveanalysis/`, `www/TFcorintextrm/`, and `www/tfcommunities/`.
- * `new_read_excel_file()`: Reads and transposes Excel files, setting "Regulator Names" from the first column and using the original first row as new column headers.
- * `new_filter_data()`: Filters transposed dataframes by column names based on user search input (supports multiple comma-separated, case-insensitive keywords).
-
- 2. **TF Catalog Data Display (Repetitive Structure):**
- * Handles data for Overall TF PageRank, Naive, TE, MP, TCM, TEM, TRM, TEXprog, TEXeff-like, and TEXterm cell states.
- * For each dataset:
- * Uses `reactiveVal` for column pagination state (4 columns per page).
- * `observeEvent`s for "next" and "previous" button functionality.
- * Reactive expressions filter data by search term and select columns for the current page.
- * Dynamically inserts a styled "Cell state data" row with "TF activity score" (at row index 2 for main PageRank table, row index 0 for others).
- * `renderDT` outputs `DT::datatable` with custom options (fixed 45 rows, no search box, JS `rowCallback` to highlight the "TF activity score" row).
-
- 3. **TF Wave Analysis:**
- * Loads TF wave data from `www/waveanalysis/searchtfwaves.xlsx`.
- * Allows users to search for a TF and view its associated wave(s) in a transposed table.
-
- 4. **TF-TF Correlation in TRM/TEXterm:**
- * Loads data from `www/TFcorintextrm/TF-TFcorTRMTEX.xlsx`.
- * Allows TF search.
- * Renders a clickable list of TFs (`actionLink`s).
- * Displays tabular data and an associated image ("TF Merged Graph Path") for the selected/searched TF.
-
- 5. **TF Communities:**
- * Loads data from `www/tfcommunities/trmcommunities.xlsx` and `www/tfcommunities/texcommunities.xlsx`.
- * Displays them as simple `DT::datatable` objects.
-
- 6. **Multi-omics Data Table:**
- * Loads data from `www/multi-omicsdata.xlsx`.
- * Renders as a `DT::datatable`, creating hyperlinks in the "Author" column from a "DOI" column, removing empty columns, and enabling scrolling.
-
- 7. **Navigation & Other:**
- * `observeEvent`s for UI element clicks (e.g., `input$c1_link`) to navigate tabs via `updateNavbarPage`.
- * Redirects to a bioRxiv paper URL via `session$sendCustomMessage`.
- * Contains significant commented-out code (older logic).
-
- **Libraries Used:** `shiny`, `readxl`, `DT`, `dplyr`.
-
- ### `ui.R` (User Interface)
-
- **Key Functionalities:**
-
- 1. **Overall Structure:**
- * Uses `shinytheme("flatly")`.
- * `navbarPage` for the main tabbed interface.
- * Custom CSS for fonts (`Arial`).
- * JavaScript for URL redirection and a modal dialog.
-
- 2. **Home Tab:**
- * Project/study description.
- * Layout with an image (`homedesc.png`) featuring clickable `actionLink`s for navigation.
- * "Read Now" button linking to the research paper.
- * Footer with lab links and logos.
-
- 3. **TF Catalog (`navbarMenu`):**
- * **"Search TF Scores" Tab:**
- * Explanatory text, image (`tfcat/onlycellstates.png`).
- * Search input (`search_input`), column pagination buttons (`prev_btn`, `next_btn`), `DTOutput("table")`.
- * **"Cell State Specific TF Catalog" Tab (`navlistPanel`):**
- * Sub-tabs for Naive, TE, MP, Tcm, Tem, Trm, TEXprog, TEXeff-like, TEXterm.
- * Each sub-tab has a consistent layout: header, text, a specific bubble plot image (from `www/bubbleplots/`), search input, pagination buttons, and `DTOutput`.
- * **"Multi-State TFs" Tab:** Displays a heatmap image (`tfcat/multistatesheatmap.png`).
-
- 4. **TF Wave Analysis (`navbarMenu`):**
- * **"Overview" Tab:**
- * Explanatory text, overview image (`tfwaveanal.png`).
- * Clickable images (`waveanalysis/c1.jpg` to `c6.jpg`, linked via `c1_link` etc.) for navigation to detail tabs.
- * Search input (`search_input_wave`), `DTOutput("table_wave")`.
- * **Individual Wave Tabs ("Wave 1" to "Wave 7"):**
- * Each tab displays the wave image, a GO KEGG result image, and "Ranked Text" image(s) from `www/waveanalysis/` and `www/waveanalysis/txtJPG/`.
-
- 5. **TF Network Analysis (`navbarMenu`):**
- * **"Search TF-TF correlation in TRM/TEXterm" Tab:**
- * Methodology description, image (`networkanalysis/tfcorrdesc.png`).
- * `sidebarLayout` with search input (`search`), button (`search_btn`), `tableOutput("gene_list_table")` for available TFs.
- * `mainPanel` with `tableOutput("result_table")`, legend, and `uiOutput("image_gallery")`.
- * Footer with citations.
- * **"TRM/TEXterm TF communities" Tab:**
- * Descriptive text, images (`networkanalysis/community.jpg`, `networkanalysis/trmtexcom.png`, `networkanalysis/tfcompathway.png`).
- * Two `DTOutput`s (`trmcom`, `texcom`) for community tables.
- * Footer with citations.
-
- 6. **Multi-omics Data Tab:**
- * Header, text, `dataTableOutput("multiomicsdatatable")`.
-
- 7. **Global Header Elements:**
- * Defines a modal dialog and associated JavaScript (triggered by an element `#csdescrip_link`, not explicitly found in the provided UI snippets for the main content area).
- * JavaScript to send a Shiny input upon `#c1_link` click.
-
- **Libraries Used:** `shiny`, `shinythemes`, `DT`.
-
- ## General Architecture and Observations
-
- * **Purpose:** The application serves as an interactive data exploration tool, likely accompanying a scientific publication on T cell biology.
- * **Data Source:** Heavily reliant on pre-processed data stored in Excel files and pre-generated images within the `www/` directory. This indicates that the core data processing happens outside this Shiny app.
- * **Repetitive Code Structure:** Significant code duplication exists in both `server.R` and `ui.R`.
- * In `server.R`, the logic for loading, filtering, paginating, and rendering tables for the nine different cell state TF scores is nearly identical.
- * In `ui.R`, the layout for each of these cell state specific tabs, and also for each of the seven individual TF wave analysis tabs, is highly repetitive.
- * This repetition suggests a strong opportunity for refactoring by creating reusable R functions or Shiny modules to generate these UI and server components dynamically.
- * **User Interface (UI):** The UI is well-structured with a `navbarPage` and logical tab groupings. It provides good contextual information (descriptions, explanations of scores/plots) for users.
- * **Interactivity:**
- * Search functionality for TFs/regulators across various datasets.
- * Custom column-based pagination for wide tables.
- * Clickable images and links for navigation between sections.
- * Dynamic display of tables and images based on user selections.
- * **Modularity (Potential):** While not heavily modularized currently due to repetition, the distinct analytical sections (TF Catalog, Wave Analysis, Network Analysis) could be prime candidates for separation into modules if the application were to be expanded or refactored.
- * **Static Content:** A significant portion of the content, especially in the Wave Analysis and Network Analysis tabs, involves displaying pre-generated static images (plots, pathway results).
- * **Code Graveyard:** Both files end with a "CODE GRAVEYARD" comment, indicating that there's older, unused code present.
-
- ## Potential Areas for Improvement/Refactoring
-
- * **Modularization:** Encapsulate the repetitive UI and server logic for cell-state specific tables and individual wave pages into functions or Shiny modules to reduce code duplication and improve maintainability.
- * **Dynamic Image Generation (Optional):** If source data and plotting scripts were available, some images currently served statically could potentially be generated dynamically, offering more flexibility. However, for a publication companion app, static images are often sufficient and ensure reproducibility of figures.
- * **Consolidate Helper Functions:** General utility functions (like `new_read_excel_file` and `new_filter_data`) are well-defined but ensure they are used consistently.
- * **CSS Styling:** Centralize CSS styling rather than relying heavily on inline `style` attributes within `tags$div` and other elements, potentially using a separate CSS file.
- * **Modal Trigger:** Clarify or ensure the `#csdescrip_link` element, which triggers the global modal, is present and functional in the UI.
-
- This analysis provides a snapshot of the codebase's structure, functionality, and potential areas for future development or refinement.
database_schema.sql ADDED
@@ -0,0 +1,72 @@
+ -- Supabase Database Schema for TaijiChat
+ -- Execute this SQL in your Supabase project to create the required tables
+
+ -- Users table
+ -- Stores user information and token quota
+ CREATE TABLE IF NOT EXISTS users (
+   id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+   hf_user_id TEXT UNIQUE NOT NULL,
+   hf_username TEXT NOT NULL,
+   email TEXT,
+   token_quota INTEGER DEFAULT 100000,
+   tokens_used INTEGER DEFAULT 0,
+   created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
+   last_login TIMESTAMP WITH TIME ZONE,
+   is_active BOOLEAN DEFAULT TRUE
+ );
+
+ -- Usage logs table
+ -- Stores comprehensive logs of every query with token usage and errors
+ CREATE TABLE IF NOT EXISTS usage_logs (
+   id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+   user_id UUID REFERENCES users(id) ON DELETE SET NULL,
+   hf_user_id TEXT NOT NULL,
+   timestamp TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
+   query_text TEXT NOT NULL,
+   prompt_tokens INTEGER DEFAULT 0,
+   completion_tokens INTEGER DEFAULT 0,
+   total_tokens INTEGER DEFAULT 0,
+   model TEXT,
+   response_text TEXT,
+   error_message TEXT,
+   conversation_history JSONB,
+   is_image_response BOOLEAN DEFAULT FALSE,
+   image_path TEXT
+ );
+
+ -- Create indexes for performance
+ CREATE INDEX IF NOT EXISTS idx_users_hf_id ON users(hf_user_id);
+ CREATE INDEX IF NOT EXISTS idx_users_active ON users(is_active);
+ CREATE INDEX IF NOT EXISTS idx_logs_user_id ON usage_logs(user_id);
+ CREATE INDEX IF NOT EXISTS idx_logs_hf_user_id ON usage_logs(hf_user_id);
+ CREATE INDEX IF NOT EXISTS idx_logs_timestamp ON usage_logs(timestamp DESC);
+ CREATE INDEX IF NOT EXISTS idx_logs_error ON usage_logs(error_message) WHERE error_message IS NOT NULL;
+
+ -- Create a view for user statistics
+ CREATE OR REPLACE VIEW user_stats AS
+ SELECT
+   u.id,
+   u.hf_user_id,
+   u.hf_username,
+   u.token_quota,
+   u.tokens_used,
+   u.token_quota - u.tokens_used AS tokens_remaining,
+   ROUND(100.0 * u.tokens_used / NULLIF(u.token_quota, 0), 2) AS usage_percentage,
+   COUNT(l.id) AS total_queries,
+   COUNT(CASE WHEN l.error_message IS NOT NULL THEN 1 END) AS error_count,
+   MAX(l.timestamp) AS last_query_time
+ FROM users u
+ LEFT JOIN usage_logs l ON u.id = l.user_id
+ GROUP BY u.id, u.hf_user_id, u.hf_username, u.token_quota, u.tokens_used;
+
+ -- Enable Row Level Security (RLS) - Optional, uncomment if needed
+ -- ALTER TABLE users ENABLE ROW LEVEL SECURITY;
+ -- ALTER TABLE usage_logs ENABLE ROW LEVEL SECURITY;
+
+ -- Create policies for RLS (if needed)
+ -- CREATE POLICY "Users can view own data" ON users FOR SELECT USING (hf_user_id = auth.jwt() ->> 'sub');
+ -- CREATE POLICY "Users can view own logs" ON usage_logs FOR SELECT USING (hf_user_id = auth.jwt() ->> 'sub');
+
+ COMMENT ON TABLE users IS 'Stores user authentication and token quota information';
+ COMMENT ON TABLE usage_logs IS 'Logs every query with token usage, response, and errors';
+ COMMENT ON VIEW user_stats IS 'Provides aggregated statistics for each user';
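The quota arithmetic in the `user_stats` view (including the `NULLIF` guard against division by zero) has an application-side mirror wherever quota is enforced before a query runs. A minimal sketch of that math (function name `quota_summary` is illustrative, not from the repo):

```python
def quota_summary(token_quota, tokens_used):
    """Mirror the user_stats view: remaining tokens, usage percentage, and a
    has_quota flag suitable for pre-query enforcement."""
    remaining = token_quota - tokens_used
    # NULLIF(token_quota, 0) in SQL -> None here when the quota is zero
    pct = round(100.0 * tokens_used / token_quota, 2) if token_quota else None
    return {
        'tokens_remaining': remaining,
        'usage_percentage': pct,
        'has_quota': remaining > 0,
    }

print(quota_summary(100000, 25000))
# {'tokens_remaining': 75000, 'usage_percentage': 25.0, 'has_quota': False or True?}
# -> {'tokens_remaining': 75000, 'usage_percentage': 25.0, 'has_quota': True}
```

With the 100k default quota, a user is cut off exactly when `tokens_used` reaches `token_quota`, matching a `tokens_remaining > 0` check against the view.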
plan_temp.txt DELETED
@@ -1,30 +0,0 @@
- I don't think that's reasonable. Here's my plan; compare the current agents against it and correct the current implementation to align with my plan:
-
- For every query, the generation agent goes through these steps:
- If a dataset, an image, or a paper is provided, add them when creating the chat completion. If not, proceed to step 1.
-
- 1. analyze the query
- 2. analyze the conversation history if there's any
- 3. analyze images, paper, data according to the plan if any are provided with the chat completion
- 4. analyze the error from the previous attempt if there's any
- 5. read the short version of the paper description to understand what the paper is about
- 6. decide whether the user query can be answered directly or needs more information from the paper; if so, read it
- 7. read the tools documentation
- 8. decide which tools can be helpful when answering the query; if there are any, prepare the list of tools to be used
- 9. read the data documentation
- 10. decide which datasets are relevant to the user query; if there are any, prepare the list of datasets to be used
- 11. decide whether the user query can be solved by the paper, tools, data, or a combination of them; if not, prepare a signal NEED_CODING = TRUE but don't send it yet; otherwise move to the next step
- 12. decide whether the user query is about image(s); if so, prepare a list of images needed
- 13. put everything together to make a plan
- - this process of thinking must be included in the generation agent's LLM output. it will be used to
-
- The supervisor agent reviews the plan, focusing on the code, and checks for suspicious or malicious behavior. Only common package imports are allowed.
-
- The executor agent executes the plan if the plan contains tool execution or code.
-
- The manager records everything from all LLMs and users, and deems whether the user's query can be considered answered. Note that if agents only propose a plan but the results are not gathered yet, it cannot be considered a proper answer - as in most cases where the generation agent proposes a plan in iteration 1. If the manager agent deems that a plan is proposed but results are not collected / the plan not executed, and there's no error from the LLM, then the manager agent tells the generation agent to initialize a different chat completion with the images and datasets requested by the generation agent's plan. This attempt instructed by the manager differs from a normal attempt: it does not count toward the allowed attempt count.
-
- If an error occurs at any stage, it must be reported to the manager, which will record all errors. Once an error is detected, another attempt starts and we go back to the generation agent step. Three attempts are allowed.
-
- Tell me whether you think my plan is clear and reasonable, and whether any part is missing or problematic.
- If not, proceed to implementation.
requirements.txt CHANGED
@@ -15,7 +15,10 @@ feedparser
  tqdm
  pydantic
  pillow
  # shinyjs # This is an R package, should be installed via install.packages() in R

  # R package dependencies (ensure these are installed in your R environment)
- # digest # Used for caching in R/caching.R

  tqdm
  pydantic
  pillow
+ supabase>=2.0.0
+ python-dotenv>=1.0.0
  # shinyjs # This is an R package, should be installed via install.packages() in R

  # R package dependencies (ensure these are installed in your R environment)
+ # digest # Used for caching in R/caching.R
+ # httr2 # Required for OAuth authentication
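The commit message promises "graceful degradation when Supabase not configured", which the new `supabase` dependency has to support: when the Space lacks `SUPABASE_URL`/`SUPABASE_KEY` secrets, initialization should return `None` rather than raise. A sketch of that pattern (the function name mirrors the R wrapper's `initialize_supabase`; the deferred import is my own choice, not necessarily how the repo does it):

```python
import os

def initialize_supabase():
    """Return a Supabase client, or None when the Space is not configured for it."""
    url = os.environ.get("SUPABASE_URL", "")
    key = os.environ.get("SUPABASE_KEY", "")
    if not url or not key:
        # Graceful degradation: auth, quota, and logging features switch off,
        # and the rest of the app keeps working.
        print("Supabase not configured; running without usage logging.")
        return None
    from supabase import create_client  # deferred so the dependency stays optional
    return create_client(url, key)
```

Downstream code then guards every database call with `if supabase_client is not None: ...`, which is exactly what the R side does with its `!is.null(supabase_client)` checks.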
server.R CHANGED
@@ -7,12 +7,19 @@ library(dplyr)
  # Source the warning overlay and long operations code
  source("warning_overlay.R", local = TRUE)
  source("long_operations.R", local = TRUE)

  # setwd("/Users/audrey/Downloads/research/ckweb/Tcellstates")

  # Define server logic
  function(input, output, session) {
-
  # --- START: TaijiChat R Callback for Python Agent Thoughts ---
  python_agent_thought_callback <- function(thought_message_from_python) {
  # Attempt to explicitly convert to R character and clean up
@@ -112,9 +119,14 @@ if 'agents.manager_agent' in sys.modules:
  # Module is available, now try to instantiate the agent
  if (!is.null(py_openai_client_instance)) {
  tryCatch({
  agent_inst <- current_manager_agent_module$ManagerAgent(
  openai_client = py_openai_client_instance,
- r_callback_fn = python_agent_thought_callback # Pass the R callback here
  )
  rv_agent_instance(agent_inst)
  print("TaijiChat: Python ManagerAgent instance created in server.R using pre-initialized client and R callback.")
@@ -124,9 +136,14 @@ if 'agents.manager_agent' in sys.modules:
  })
  } else if (!is.null(api_key_val)) { # Try with API key if client object failed but key exists
  tryCatch({
  agent_inst <- current_manager_agent_module$ManagerAgent(
  openai_api_key = api_key_val,
- r_callback_fn = python_agent_thought_callback # Pass the R callback here
  )
  rv_agent_instance(agent_inst)
  print("TaijiChat: Python ManagerAgent instance created in server.R with API key and R callback (client to be init by Python).")
@@ -145,7 +162,13 @@ if 'agents.manager_agent' in sys.modules:
  print("TaijiChat: agents.manager_agent module is NULL after import attempt. Agent not created.")
  }
  # --- END: TaijiChat Agent Initialization ---
-
  # Server logic for home tab
  output$home <- renderText({
  "Welcome to the Home page"
@@ -2078,16 +2101,41 @@ if 'agents.manager_agent' in sys.modules:
  chat_history <- reactiveVal(list()) # Stores list of lists: list(role="user/assistant", content="message")

  observeEvent(input$user_chat_message, {
- req(input$user_chat_message)
  user_message_text <- trimws(input$user_chat_message)
  print(paste("TaijiChat: Received user_chat_message -", user_message_text))

  if (nzchar(user_message_text)) {
  current_hist <- chat_history()
  updated_hist_user <- append(current_hist, list(list(role = "user", content = user_message_text)))
  chat_history(updated_hist_user)

- agent_instance_val <- rv_agent_instance()

  if (!is.null(agent_instance_val)) {
  # Ensure history is a list of R named lists, then r_to_py will convert to list of Python dicts
 
  # Source the warning overlay and long operations code
  source("warning_overlay.R", local = TRUE)
  source("long_operations.R", local = TRUE)
+ source("auth/hf_oauth.R", local = TRUE)
+ source("utils/supabase_r.R", local = TRUE)

  # setwd("/Users/audrey/Downloads/research/ckweb/Tcellstates")

  # Define server logic
  function(input, output, session) {
+
+ # --- START: OAuth and Supabase Initialization ---
+ oauth_config <- initialize_oauth()
+ supabase_client <- initialize_supabase()
+ # --- END: OAuth and Supabase Initialization ---
+
  # --- START: TaijiChat R Callback for Python Agent Thoughts ---
  python_agent_thought_callback <- function(thought_message_from_python) {
  # Attempt to explicitly convert to R character and clean up

  # Module is available, now try to instantiate the agent
  if (!is.null(py_openai_client_instance)) {
  tryCatch({
+ supabase_py_client <- if (!is.null(supabase_client)) supabase_client else NULL
+
  agent_inst <- current_manager_agent_module$ManagerAgent(
  openai_client = py_openai_client_instance,
+ r_callback_fn = python_agent_thought_callback,
+ supabase_client = supabase_py_client,
+ user_id = NULL,
+ hf_user_id = NULL
  )
  rv_agent_instance(agent_inst)
  print("TaijiChat: Python ManagerAgent instance created in server.R using pre-initialized client and R callback.")

  })
  } else if (!is.null(api_key_val)) { # Try with API key if client object failed but key exists
  tryCatch({
+ supabase_py_client <- if (!is.null(supabase_client)) supabase_client else NULL
+
  agent_inst <- current_manager_agent_module$ManagerAgent(
  openai_api_key = api_key_val,
+ r_callback_fn = python_agent_thought_callback,
+ supabase_client = supabase_py_client,
+ user_id = NULL,
+ hf_user_id = NULL
  )
  rv_agent_instance(agent_inst)
  print("TaijiChat: Python ManagerAgent instance created in server.R with API key and R callback (client to be init by Python).")

  print("TaijiChat: agents.manager_agent module is NULL after import attempt. Agent not created.")
  }
  # --- END: TaijiChat Agent Initialization ---
+
+ # --- START: OAuth Callback Handler ---
+ # Note: OAuth flow will be fully implemented when ui.R login UI is added
+ # This handler processes OAuth callback and creates/retrieves user in Supabase
+ # For now, this is a placeholder for future OAuth integration
+ # --- END: OAuth Callback Handler ---
+
  # Server logic for home tab
  output$home <- renderText({
  "Welcome to the Home page"

  chat_history <- reactiveVal(list()) # Stores list of lists: list(role="user/assistant", content="message")

  observeEvent(input$user_chat_message, {
+ req(input$user_chat_message)
  user_message_text <- trimws(input$user_chat_message)
  print(paste("TaijiChat: Received user_chat_message -", user_message_text))

  if (nzchar(user_message_text)) {
+ # Check authentication (implement OAuth later in ui.R)
+ # For now, system works without auth, but logs as "anonymous"
+ current_user <- session$userData$hf_user
+ hf_user_id <- if (!is.null(current_user)) current_user$hf_user_id else "anonymous"
+
+ # Check quota before processing
+ if (!is.null(supabase_client) && !is.null(current_user)) {
+ quota_result <- check_user_quota(supabase_client, hf_user_id)
+ if (!quota_result$has_quota) {
+ session$sendCustomMessage(type = "agent_response", message = list(
+ text = paste("Token quota exceeded. Used:", quota_result$tokens_used, "Remaining: 0")
2120
+ ))
2121
+ return()
2122
+ }
2123
+ }
2124
+
2125
  current_hist <- chat_history()
2126
  updated_hist_user <- append(current_hist, list(list(role = "user", content = user_message_text)))
2127
  chat_history(updated_hist_user)
2128
 
2129
+ agent_instance_val <- rv_agent_instance()
2130
+
2131
+ # Set user context in agent
2132
+ if (!is.null(agent_instance_val) && !is.null(current_user)) {
2133
+ supabase_user <- session$userData$supabase_user
2134
+ agent_instance_val$set_user_context(
2135
+ user_id = supabase_user$id,
2136
+ hf_user_id = hf_user_id
2137
+ )
2138
+ }
2139
 
2140
  if (!is.null(agent_instance_val)) {
2141
  # Ensure history is a list of R named lists, then r_to_py will convert to list of Python dicts
tools/agent_tools.py CHANGED
@@ -1064,10 +1064,22 @@ def describe_image(file_id: str, api_key: str = None) -> str:
             max_tokens=1000,
             temperature=0.2  # Lower temperature for more accurate descriptions
         )
-
+
+        # Capture token usage
+        if hasattr(response, 'usage') and response.usage:
+            usage_info = {
+                'prompt_tokens': response.usage.prompt_tokens,
+                'completion_tokens': response.usage.completion_tokens,
+                'total_tokens': response.usage.total_tokens
+            }
+            # Store usage in global collector if available (set by ExecutorAgent)
+            import builtins
+            if hasattr(builtins, '__agent_usage_collector__'):
+                builtins.__agent_usage_collector__.append(usage_info)
+
         # Extract the description from the response
         description = response.choices[0].message.content
-
+
         return description
 
     except Exception as e:
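The `builtins`-based collector above lets `describe_image()` report usage without changing its return type. A minimal standalone sketch of that handshake (the `fake_vision_call` function and its token counts are illustrative stand-ins, not real API output):

```python
# Sketch of the install / append / aggregate / remove cycle around
# __agent_usage_collector__; fake_vision_call stands in for describe_image().
import builtins

def fake_vision_call():
    # Tool side: report usage only if a collector was installed by the caller.
    usage_info = {'prompt_tokens': 900, 'completion_tokens': 120, 'total_tokens': 1020}
    if hasattr(builtins, '__agent_usage_collector__'):
        builtins.__agent_usage_collector__.append(usage_info)
    return "a description"

# Caller side (the ExecutorAgent's role in this commit): install, run, aggregate, remove.
builtins.__agent_usage_collector__ = []
try:
    fake_vision_call()
    fake_vision_call()
    collected = builtins.__agent_usage_collector__
    total = sum(u['total_tokens'] for u in collected)
finally:
    # Always remove the global so unrelated calls are not collected.
    del builtins.__agent_usage_collector__

print(total)  # 2040
```

The `try`/`finally` matters: leaving the collector on `builtins` would silently mix usage from later, unrelated calls into the same list.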
ui.R CHANGED
@@ -1466,6 +1466,26 @@ ui <- navbarPage(
   # Adding header with modal and JS
   header = tags$div(
     chatSidebarUI(),
+
+    # Login overlay for authentication
+    tags$div(
+      id = "authOverlay",
+      style = "display: none; position: fixed; top: 0; left: 0; width: 100%; height: 100%;
+               background-color: rgba(0, 0, 0, 0.7); z-index: 9999; justify-content: center; align-items: center;",
+      tags$div(
+        style = "background-color: white; padding: 40px; border-radius: 10px; text-align: center; max-width: 400px;",
+        tags$h2("Welcome to TaijiChat"),
+        tags$p("Please sign in with your Hugging Face account to continue."),
+        tags$br(),
+        tags$a(
+          href = "/login/huggingface",
+          class = "btn btn-primary btn-lg",
+          style = "background-color: #ff9d00; border-color: #ff9d00;",
+          "Sign in with Hugging Face"
+        )
+      )
+    ),
+
     # Modal dialog to display the expanded image
     tags$div(
       id = "modalDialog",
@@ -1511,7 +1531,38 @@
       });
       "
     )
-  )
+  ),
+
+  # Authentication overlay control JavaScript
+  tags$script(
+    HTML(
+      "
+      // Check authentication state and show/hide overlay
+      Shiny.addCustomMessageHandler('auth_state', function(message) {
+        var overlay = document.getElementById('authOverlay');
+        if (message.authenticated) {
+          overlay.style.display = 'none';
+        } else {
+          overlay.style.display = 'flex';
+        }
+      });
+
+      // Handle OAuth callback
+      $(document).ready(function() {
+        var urlParams = new URLSearchParams(window.location.search);
+        if (urlParams.has('code')) {
+          var code = urlParams.get('code');
+          Shiny.setInputValue('oauth_code', code, {priority: 'event'});
+          // Clean URL
+          window.history.replaceState({}, document.title, window.location.pathname);
+        }
+      });
+      "
+    )
+  ),
+
+  # Auth state output (server will populate this)
+  uiOutput("auth_state_ui")
 )
 )
 
utils/__init__.py ADDED
@@ -0,0 +1,2 @@
+# utils/__init__.py
+# Utility modules for TaijiChat
utils/supabase_client.py ADDED
@@ -0,0 +1,259 @@
+# utils/supabase_client.py
+"""
+Supabase client for TaijiChat
+Handles user management, quota tracking, and usage logging
+"""
+
+import os
+from datetime import datetime
+from typing import Optional, Dict, List, Tuple
+from supabase import create_client, Client
+import json
+
+
+class SupabaseClient:
+    """Client for interacting with Supabase database"""
+
+    def __init__(self, supabase_url: Optional[str] = None, supabase_key: Optional[str] = None):
+        """
+        Initialize Supabase client
+
+        Args:
+            supabase_url: Supabase project URL (defaults to SUPABASE_URL env var)
+            supabase_key: Supabase service role key (defaults to SUPABASE_KEY env var)
+        """
+        self.supabase_url = supabase_url or os.getenv('SUPABASE_URL')
+        self.supabase_key = supabase_key or os.getenv('SUPABASE_KEY')
+
+        if not self.supabase_url or not self.supabase_key:
+            print("WARNING: Supabase credentials not configured. Logging will be disabled.")
+            self.client = None
+        else:
+            try:
+                self.client: Client = create_client(self.supabase_url, self.supabase_key)
+                print("SupabaseClient: Successfully initialized")
+            except Exception as e:
+                print(f"SupabaseClient: Failed to initialize - {e}")
+                self.client = None
+
+    def is_enabled(self) -> bool:
+        """Check if Supabase client is properly configured"""
+        return self.client is not None
+
+    def get_or_create_user(self, hf_user_id: str, hf_username: str, email: Optional[str] = None) -> Optional[Dict]:
+        """
+        Get existing user or create new user
+
+        Args:
+            hf_user_id: Hugging Face user ID
+            hf_username: Hugging Face username
+            email: User email (optional)
+
+        Returns:
+            User record dict or None if error
+        """
+        if not self.is_enabled():
+            return None
+
+        try:
+            # Check if user exists
+            response = self.client.table('users').select('*').eq('hf_user_id', hf_user_id).execute()
+
+            if response.data and len(response.data) > 0:
+                # User exists, update last_login
+                user = response.data[0]
+                self.client.table('users').update({
+                    'last_login': datetime.utcnow().isoformat()
+                }).eq('id', user['id']).execute()
+                print(f"SupabaseClient: User {hf_username} logged in")
+                return user
+            else:
+                # Create new user
+                new_user = {
+                    'hf_user_id': hf_user_id,
+                    'hf_username': hf_username,
+                    'email': email,
+                    'token_quota': 100000,  # Default quota
+                    'tokens_used': 0,
+                    'last_login': datetime.utcnow().isoformat(),
+                    'is_active': True
+                }
+                response = self.client.table('users').insert(new_user).execute()
+                if response.data:
+                    print(f"SupabaseClient: Created new user {hf_username}")
+                    return response.data[0]
+                else:
+                    print("SupabaseClient: Failed to create user - no data returned")
+                    return None
+        except Exception as e:
+            print(f"SupabaseClient: Error in get_or_create_user - {e}")
+            return None
+
+    def check_quota(self, hf_user_id: str) -> Tuple[bool, int, int]:
+        """
+        Check if user has tokens remaining in quota
+
+        Args:
+            hf_user_id: Hugging Face user ID
+
+        Returns:
+            Tuple of (has_quota: bool, tokens_remaining: int, tokens_used: int)
+        """
+        if not self.is_enabled():
+            return (True, 999999, 0)  # Allow unlimited if Supabase disabled
+
+        try:
+            response = self.client.table('users').select('token_quota, tokens_used').eq('hf_user_id', hf_user_id).execute()
+
+            if response.data and len(response.data) > 0:
+                user = response.data[0]
+                quota = user.get('token_quota', 100000)
+                used = user.get('tokens_used', 0)
+                remaining = quota - used
+                has_quota = remaining > 0
+                return (has_quota, remaining, used)
+            else:
+                print("SupabaseClient: User not found for quota check")
+                return (False, 0, 0)
+        except Exception as e:
+            print(f"SupabaseClient: Error checking quota - {e}")
+            return (True, 999999, 0)  # Fail open to allow usage if DB error
+
+    def update_token_usage(self, hf_user_id: str, tokens_to_add: int) -> bool:
+        """
+        Increment user's token usage
+
+        Args:
+            hf_user_id: Hugging Face user ID
+            tokens_to_add: Number of tokens to add to usage
+
+        Returns:
+            True if successful, False otherwise
+        """
+        if not self.is_enabled():
+            return True
+
+        try:
+            # Get current usage
+            response = self.client.table('users').select('id, tokens_used').eq('hf_user_id', hf_user_id).execute()
+
+            if response.data and len(response.data) > 0:
+                user = response.data[0]
+                new_usage = user.get('tokens_used', 0) + tokens_to_add
+
+                # Update usage
+                self.client.table('users').update({
+                    'tokens_used': new_usage
+                }).eq('id', user['id']).execute()
+
+                print(f"SupabaseClient: Updated token usage for user {hf_user_id} - added {tokens_to_add} tokens")
+                return True
+            else:
+                print("SupabaseClient: User not found for token update")
+                return False
+        except Exception as e:
+            print(f"SupabaseClient: Error updating token usage - {e}")
+            return False
+
+    def log_usage(self,
+                  hf_user_id: str,
+                  query_text: str,
+                  user_id: Optional[str] = None,
+                  prompt_tokens: int = 0,
+                  completion_tokens: int = 0,
+                  total_tokens: int = 0,
+                  model: Optional[str] = None,
+                  response_text: Optional[str] = None,
+                  error_message: Optional[str] = None,
+                  conversation_history: Optional[List[Dict]] = None,
+                  is_image_response: bool = False,
+                  image_path: Optional[str] = None) -> bool:
+        """
+        Log a query to usage_logs table
+
+        This is called IMMEDIATELY after getting a response from the agent
+        or when an error occurs.
+
+        Args:
+            hf_user_id: Hugging Face user ID (required)
+            query_text: User's query text (required)
+            user_id: UUID of user from users table (optional)
+            prompt_tokens: Number of prompt tokens used
+            completion_tokens: Number of completion tokens used
+            total_tokens: Total tokens used
+            model: Model name (e.g., "gpt-4o")
+            response_text: Assistant's response
+            error_message: Error message if query failed
+            conversation_history: Full conversation history as list of dicts
+            is_image_response: Whether response included an image
+            image_path: Path to image if applicable
+
+        Returns:
+            True if logged successfully, False otherwise
+        """
+        if not self.is_enabled():
+            print(f"SupabaseClient: Logging disabled, skipping log for query: {query_text[:50]}...")
+            return True
+
+        try:
+            log_entry = {
+                'hf_user_id': hf_user_id,
+                'user_id': user_id,
+                'query_text': query_text,
+                'prompt_tokens': prompt_tokens,
+                'completion_tokens': completion_tokens,
+                'total_tokens': total_tokens,
+                'model': model,
+                'response_text': response_text,
+                'error_message': error_message,
+                'conversation_history': json.dumps(conversation_history) if conversation_history else None,
+                'is_image_response': is_image_response,
+                'image_path': image_path
+            }
+
+            response = self.client.table('usage_logs').insert(log_entry).execute()
+
+            if response.data:
+                print(f"SupabaseClient: Logged usage - tokens: {total_tokens}, error: {error_message is not None}")
+                return True
+            else:
+                print("SupabaseClient: Failed to log usage - no data returned")
+                return False
+        except Exception as e:
+            print(f"SupabaseClient: Error logging usage - {e}")
+            return False
+
+    def get_user_stats(self, hf_user_id: str) -> Optional[Dict]:
+        """
+        Get user statistics from user_stats view
+
+        Args:
+            hf_user_id: Hugging Face user ID
+
+        Returns:
+            Dict with user stats or None if error
+        """
+        if not self.is_enabled():
+            return None
+
+        try:
+            response = self.client.table('user_stats').select('*').eq('hf_user_id', hf_user_id).execute()
+
+            if response.data and len(response.data) > 0:
+                return response.data[0]
+            else:
+                return None
+        except Exception as e:
+            print(f"SupabaseClient: Error getting user stats - {e}")
+            return None
+
+
+# Singleton instance for easy import
+_supabase_client_instance = None
+
+def get_supabase_client() -> SupabaseClient:
+    """Get singleton Supabase client instance"""
+    global _supabase_client_instance
+    if _supabase_client_instance is None:
+        _supabase_client_instance = SupabaseClient()
+    return _supabase_client_instance
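The quota arithmetic in `check_quota()` above is independent of the database round-trip, so it can be sketched with an in-memory user record in place of the Supabase `users` table (the sample token counts are illustrative):

```python
# Pure-Python sketch of check_quota()'s (has_quota, remaining, used) logic,
# applied to a dict shaped like a row of the 'users' table.
def check_quota_local(user: dict) -> tuple:
    quota = user.get('token_quota', 100000)   # default quota from this commit
    used = user.get('tokens_used', 0)
    remaining = quota - used
    return (remaining > 0, remaining, used)

# A user near the 100k default quota still passes...
ok, remaining, used = check_quota_local({'token_quota': 100000, 'tokens_used': 99500})
# ...while a user at exactly the quota is blocked.
exhausted, rem2, _ = check_quota_local({'token_quota': 100000, 'tokens_used': 100000})
```

Note the boundary: `remaining > 0` means a user with 0 tokens remaining is denied, but a user with 1 token remaining can still submit a query whose cost may overshoot the quota; the overshoot is only recorded by the subsequent `update_token_usage()` call.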
utils/supabase_r.R ADDED
@@ -0,0 +1,132 @@
+# utils/supabase_r.R
+# R interface to Python Supabase client
+
+library(reticulate)
+
+# Initialize Supabase client wrapper
+initialize_supabase <- function() {
+  tryCatch({
+    # Import Python Supabase client module
+    supabase_module <- reticulate::import("utils.supabase_client")
+    supabase_client <- supabase_module$get_supabase_client()
+
+    if (!is.null(supabase_client$is_enabled()) && supabase_client$is_enabled()) {
+      print("R Supabase: Successfully initialized Supabase client")
+      return(supabase_client)
+    } else {
+      warning("R Supabase: Supabase client not properly configured")
+      return(NULL)
+    }
+  }, error = function(e) {
+    warning(paste("R Supabase: Failed to initialize -", e$message))
+    return(NULL)
+  })
+}
+
+# Get or create user
+get_or_create_user <- function(supabase_client, hf_user_id, hf_username, email = NULL) {
+  if (is.null(supabase_client)) {
+    return(NULL)
+  }
+
+  tryCatch({
+    user <- supabase_client$get_or_create_user(
+      hf_user_id = hf_user_id,
+      hf_username = hf_username,
+      email = email
+    )
+    return(user)
+  }, error = function(e) {
+    warning(paste("R Supabase: Error in get_or_create_user -", e$message))
+    return(NULL)
+  })
+}
+
+# Check user quota
+check_user_quota <- function(supabase_client, hf_user_id) {
+  if (is.null(supabase_client)) {
+    # Return default values if Supabase disabled
+    return(list(
+      has_quota = TRUE,
+      tokens_remaining = 999999,
+      tokens_used = 0
+    ))
+  }
+
+  tryCatch({
+    # Call Python method which returns tuple (has_quota, remaining, used)
+    result <- supabase_client$check_quota(hf_user_id = hf_user_id)
+
+    # Convert Python tuple to R list (reticulate returns a length-3 list)
+    if (is.list(result) && length(result) == 3) {
+      return(list(
+        has_quota = result[[1]],
+        tokens_remaining = as.integer(result[[2]]),
+        tokens_used = as.integer(result[[3]])
+      ))
+    } else {
+      warning("R Supabase: Unexpected result format from check_quota")
+      return(list(has_quota = TRUE, tokens_remaining = 999999, tokens_used = 0))
+    }
+  }, error = function(e) {
+    warning(paste("R Supabase: Error checking quota -", e$message))
+    return(list(has_quota = TRUE, tokens_remaining = 999999, tokens_used = 0))
+  })
+}
+
+# Update token usage
+update_token_usage <- function(supabase_client, hf_user_id, tokens_to_add) {
+  if (is.null(supabase_client)) {
+    return(TRUE)
+  }
+
+  tryCatch({
+    result <- supabase_client$update_token_usage(
+      hf_user_id = hf_user_id,
+      tokens_to_add = as.integer(tokens_to_add)
+    )
+    return(result)
+  }, error = function(e) {
+    warning(paste("R Supabase: Error updating token usage -", e$message))
+    return(FALSE)
+  })
+}
+
+# Log usage (called from R if needed, but primarily handled in Python)
+log_usage_from_r <- function(supabase_client, hf_user_id, query_text,
+                             user_id = NULL, total_tokens = 0,
+                             response_text = NULL, error_message = NULL) {
+  if (is.null(supabase_client)) {
+    return(TRUE)
+  }
+
+  tryCatch({
+    result <- supabase_client$log_usage(
+      hf_user_id = hf_user_id,
+      query_text = query_text,
+      user_id = user_id,
+      total_tokens = as.integer(total_tokens),
+      response_text = response_text,
+      error_message = error_message
+    )
+    return(result)
+  }, error = function(e) {
+    warning(paste("R Supabase: Error logging usage -", e$message))
+    return(FALSE)
+  })
+}
+
+# Get user statistics
+get_user_stats <- function(supabase_client, hf_user_id) {
+  if (is.null(supabase_client)) {
+    return(NULL)
+  }
+
+  tryCatch({
+    stats <- supabase_client$get_user_stats(hf_user_id = hf_user_id)
+    return(stats)
+  }, error = function(e) {
+    warning(paste("R Supabase: Error getting user stats -", e$message))
+    return(NULL)
+  })
+}
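The per-query flow this commit wires together (check quota, process, aggregate usage, log immediately, then update the running total) can be sketched end-to-end with an in-memory stand-in for the Supabase client. `handle_query` and its token numbers are hypothetical; the real flow spans server.R and ManagerAgent:

```python
# In-memory sketch of the quota enforcement pipeline:
# check -> process -> aggregate -> log -> update usage.
class InMemoryQuota:
    def __init__(self, quota=100000):
        self.quota, self.used, self.logs = quota, 0, []

    def check_quota(self):
        remaining = self.quota - self.used
        return (remaining > 0, remaining, self.used)

    def log_usage(self, query, total_tokens):
        self.logs.append({'query': query, 'total_tokens': total_tokens})

    def update_token_usage(self, tokens):
        self.used += tokens

def handle_query(db, query):
    has_quota, _, _ = db.check_quota()            # 1. check before processing
    if not has_quota:
        return "Token quota exceeded."
    answer, tokens = f"answer to {query}", 1200   # 2-3. process + aggregate (stubbed)
    db.log_usage(query, tokens)                   # 4. log immediately after completion
    db.update_token_usage(tokens)                 # 5. update the running total
    return answer

db = InMemoryQuota(quota=2000)
first = handle_query(db, "q1")    # 1200 of 2000 consumed
second = handle_query(db, "q2")   # 800 remaining, still allowed; total now 2400
third = handle_query(db, "q3")    # quota exceeded, rejected before processing
```

Because the check happens before processing but the update happens after, the last accepted query can push usage past the quota; the scheme caps overshoot at one query's cost rather than enforcing a hard ceiling.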