.gitignore CHANGED
@@ -9,5 +9,4 @@ __pycache__/
  *.log
 
  agentic_implementation/*.json
- agentic_implementation/*.db
- logs/
+ agentic_implementation/*.db
README.md CHANGED
@@ -1,19 +1,17 @@
  ---
- tag: "mcp-server-track"
  title: MailQuery
  emoji: 💬
  colorFrom: yellow
  colorTo: purple
  sdk: gradio
  sdk_version: 5.0.1
- app_file: agentic_implementation/email_mcp_server_oauth.py
+ app_file: app.py
  pinned: false
  short_description: Answer any questions you have about the content of your mail
  ---
 
+ An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
 
- Our Google authentication verification is currently pending. If you would like to proceed with testing, we can add you as a test user. To facilitate this, please complete the Google Form provided below.
 
- https://forms.gle/FRWwhhMeKaiXDAJQ9
 
- Link to the vide demo: https://www.youtube.com/watch?v=Ie8kdGU6bjY
+ uvicorn main:app --reload
agentic_implementation/README_OAuth.md DELETED
@@ -1,258 +0,0 @@
- # Gmail MCP Server with OAuth Authentication
-
- This is an enhanced version of the Gmail MCP (Model Context Protocol) server that uses **OAuth 2.0 authentication** instead of requiring users to provide email credentials for each query.
-
- ## 🚀 Key Features
-
- - **OAuth 2.0 Authentication**: Secure authentication flow using Google's OAuth system
- - **One-time Setup**: Authenticate once, use anywhere
- - **Automatic Token Refresh**: Handles token expiration automatically
- - **Encrypted Storage**: Credentials are encrypted and stored securely
- - **No More Password Sharing**: No need to provide email/password to Claude
-
- ## 📋 Prerequisites
-
- 1. **Google Account**: You need a Gmail account
- 2. **Google Cloud Project**: Free to create
- 3. **Python 3.8+**: Required for running the server
-
- ## 🛠️ Setup Instructions
-
- ### Step 1: Install Dependencies
-
- ```bash
- pip install -r requirements_oauth.txt
- ```
-
- ### Step 2: Run the Interactive Setup
-
- The setup script will guide you through the entire process:
-
- ```bash
- python setup_oauth.py
- ```
-
- This will walk you through:
- 1. Creating a Google Cloud project
- 2. Enabling the Gmail API
- 3. Setting up OAuth consent screen
- 4. Creating OAuth credentials
- 5. Testing the authentication flow
-
- ### Step 3: Start the MCP Server
-
- ```bash
- python email_mcp_server_oauth.py
- ```
-
- The server will start and show you:
- - Authentication status
- - MCP endpoint URL
- - Web interface URL
-
- ## 🔧 Claude Desktop Configuration
-
- Add this configuration to your Claude Desktop MCP settings:
-
- ```json
- {
-   "mcpServers": {
-     "gmail-oauth": {
-       "command": "npx",
-       "args": [
-         "mcp-remote",
-         "http://localhost:7860/gradio_api/mcp/sse"
-       ]
-     }
-   }
- }
- ```
-
- ## 🔍 Available Tools
-
- ### 1. search_emails
- Search your emails using natural language queries - **no credentials needed!**
-
- **Parameters:**
- - `query`: Natural language query (e.g., "show me emails from amazon last week")
-
- **Example Usage in Claude:**
- > "Can you search my emails for messages from Swiggy in the last week?"
-
- ### 2. get_email_details
- Get full details of a specific email by message ID.
-
- **Parameters:**
- - `message_id`: Message ID from search results
-
- ### 3. analyze_email_patterns
- Analyze email patterns from a specific sender over time.
-
- **Parameters:**
- - `sender_keyword`: Sender to analyze (e.g., "amazon", "google")
- - `days_back`: Number of days to analyze (default: "30")
-
- ### 4. authenticate_user
- Trigger the OAuth authentication flow from Claude Desktop.
-
- **Parameters:** None
-
- ### 5. get_authentication_status
- Check current authentication status.
-
- **Parameters:** None
-
- ## 🔐 Security Features
-
- ### Encrypted Storage
- - All credentials are encrypted using Fernet encryption
- - Encryption keys are stored securely with proper permissions
- - No plaintext credentials are ever stored
-
- ### OAuth Benefits
- - No need to share Gmail passwords
- - Granular permission control
- - Easy revocation from Google Account settings
- - Automatic token refresh
-
- ### Local Storage
- - All data stored locally on your machine
- - No cloud storage of credentials
- - You maintain full control
-
- ## 🔧 Advanced Usage
-
- ### Command Line Tools
-
- Check authentication status:
- ```bash
- python setup_oauth.py --status
- ```
-
- Re-authenticate:
- ```bash
- python setup_oauth.py --auth
- ```
-
- Clear stored credentials:
- ```bash
- python setup_oauth.py --clear
- ```
-
- Show help:
- ```bash
- python setup_oauth.py --help
- ```
-
- ### Web Interface
-
- When the server is running, you can access the web interface at:
- ```
- http://localhost:7860
- ```
-
- Use this interface to:
- - Check authentication status
- - Trigger authentication flow
- - Test email search functionality
-
- ## 🆚 Comparison: OAuth vs App Passwords
-
- | Feature | App Password (Old) | OAuth (New) |
- |---------|-------------------|-------------|
- | **Setup Complexity** | Simple | One-time setup required |
- | **Security** | Share app password | No password sharing |
- | **User Experience** | Enter credentials each time | Authenticate once |
- | **Revocation** | Change app password | Revoke from Google Account |
- | **Token Management** | Manual | Automatic refresh |
- | **Scope Control** | Full Gmail access | Granular permissions |
-
- ## 🐛 Troubleshooting
-
- ### Authentication Issues
-
- **"OAuth not configured" error:**
- ```bash
- python setup_oauth.py
- ```
-
- **"Not authenticated" error:**
- ```bash
- python setup_oauth.py --auth
- ```
-
- **Authentication timeout:**
- - Check if port 8080 is available
- - Try disabling firewall temporarily
- - Ensure browser can access localhost:8080
-
- ### Common Issues
-
- **"No module named 'google.auth'" error:**
- ```bash
- pip install -r requirements_oauth.txt
- ```
-
- **"Permission denied" on credential files:**
- ```bash
- # Check permissions
- ls -la ~/.mailquery_oauth/
- # Should show restricted permissions (600/700)
- ```
-
- **Browser doesn't open:**
- - Copy the authorization URL manually
- - Paste it in your browser
- - Complete the flow manually
-
- ### Getting Help
-
- 1. Check authentication status: `python setup_oauth.py --status`
- 2. Review server logs for detailed error messages
- 3. Ensure Google Cloud project is properly configured
- 4. Verify OAuth consent screen is set up correctly
-
- ## 📁 File Structure
-
- ```
- ~/.mailquery_oauth/
- ├── client_secret.json # OAuth client configuration
- ├── token.pickle # Encrypted access/refresh tokens
- └── key.key # Encryption key (secure permissions)
- ```
-
- ## 🔄 Migration from App Password Version
-
- If you're migrating from the app password version:
-
- 1. Run the new OAuth setup: `python setup_oauth.py`
- 2. Update your Claude Desktop configuration to use the new server
- 3. The old environment variables (EMAIL_ID, APP_PASSWORD) are no longer needed
-
- ## 📞 Support
-
- For issues or questions:
- 1. Check the troubleshooting section above
- 2. Review the setup script output for specific guidance
- 3. Ensure all prerequisites are met
- 4. Verify Google Cloud project configuration
-
- ## 🎯 Example Queries for Claude
-
- Once set up, you can ask Claude:
-
- - "Search my emails for messages from Amazon in the last month"
- - "Show me emails from my bank from last week"
- - "Analyze my LinkedIn email patterns over the last 60 days"
- - "Find emails from Swiggy today"
- - "Get details of the email with ID xyz123"
-
- Claude will automatically use the OAuth-authenticated tools without asking for credentials!
-
- ## 🔒 Privacy & Data
-
- - **No data leaves your machine**: All processing happens locally
- - **Google only provides**: Access to your Gmail via official APIs
- - **We store**: Encrypted authentication tokens only
- - **We never store**: Email content, passwords, or personal data
- - **You control**: Access can be revoked anytime from Google Account settings
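Reviewer note: the troubleshooting entry in the removed README checks credential permissions with `ls -la` and expects 600/700. A rough Python equivalent is sketched below, assuming POSIX permission semantics. The helper name `is_owner_only` is hypothetical and does not come from this repository, and the demonstration runs against a throwaway temporary directory rather than the real `~/.mailquery_oauth/`.

```python
import os
import stat
import tempfile


def is_owner_only(path: str) -> bool:
    """Return True when group and others have no permission bits on path.

    Hypothetical helper for illustration; not part of the repository.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


# Demonstrate on a temporary file instead of real credentials.
with tempfile.TemporaryDirectory() as tmp:
    key_file = os.path.join(tmp, "key.key")
    with open(key_file, "wb") as fh:
        fh.write(b"placeholder")

    os.chmod(key_file, 0o600)  # owner read/write only
    restricted = is_owner_only(key_file)

    os.chmod(key_file, 0o644)  # readable by group and others
    too_open = is_owner_only(key_file)
```

A check like this could run at server start-up so that a world-readable key file is reported before any token is decrypted.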
agentic_implementation/agent.py ADDED
@@ -0,0 +1,141 @@
+ # agent.py
+
+ import json
+ from typing import Dict, Any
+
+ from re_act import (
+     get_plan_from_llm,
+     think,
+     act,
+     store_name_email_mapping,
+     extract_sender_info,
+     client
+ )
+ from schemas import PlanStep
+ from logger import logger  # from logger.py
+
+
+ def run_agent():
+     """
+     Main REPL loop for the email agent.
+     """
+     logger.info("Starting Email Agent REPL...")
+     print("🤖 Email Agent ready. Type 'exit' to quit.\n")
+
+     while True:
+         try:
+             user_query = input("🗨 You: ").strip()
+             logger.info("Received user input: %s", user_query)
+
+             if user_query.lower() in ("exit", "quit"):
+                 logger.info("Exit command detected, shutting down agent.")
+                 print("👋 Goodbye!")
+                 break
+
+             # 1) Generate plan
+             try:
+                 plan = get_plan_from_llm(user_query)
+                 logger.debug("Generated plan: %s", plan)
+             except Exception as e:
+                 logger.error("Failed to generate plan: %s", e)
+                 print(f"❌ Could not generate a plan: {e}")
+                 continue
+
+             # print plan for user transparency
+             print("\n\nplan:")
+             print(plan)
+             print("\n\n")
+
+             results: Dict[str, Any] = {}
+
+             # 2) Execute each plan step
+             for step in plan.plan:
+                 logger.info("Processing step: %s", step.action)
+
+                 if step.action == "done":
+                     logger.info("Encountered 'done' action. Plan complete.")
+                     print("✅ Plan complete.")
+                     break
+
+                 try:
+                     should_run, updated_step, user_prompt = think(step, results, user_query)
+                     logger.debug(
+                         "Think outcome for '%s': should_run=%s, updated_step=%s, user_prompt=%s",
+                         step.action, should_run, updated_step, user_prompt
+                     )
+                 except Exception as e:
+                     logger.error("Error in think() for step '%s': %s", step.action, e)
+                     print(f"❌ Error in planning step '{step.action}': {e}")
+                     break
+
+                 # Handle user prompt (e.g., missing email mapping)
+                 if user_prompt:
+                     logger.info("User prompt required: %s", user_prompt)
+                     print(f"❓ {user_prompt}")
+                     user_input = input("📧 Email: ").strip()
+
+                     try:
+                         sender_info = extract_sender_info(user_query)
+                         sender_intent = sender_info.get("sender_intent", "")
+                         store_name_email_mapping(sender_intent, user_input)
+                         logger.info("Stored mapping: %s -> %s", sender_intent, user_input)
+                         print(f"✅ Stored mapping: {sender_intent} → {user_input}")
+
+                         # Retry current step
+                         should_run, updated_step, _ = think(step, results, user_query)
+                         logger.debug(
+                             "Post-mapping think outcome: should_run=%s, updated_step=%s",
+                             should_run, updated_step
+                         )
+                     except Exception as e:
+                         logger.error("Error storing mapping or retrying step '%s': %s", step.action, e)
+                         print(f"❌ Error storing mapping or retrying step: {e}")
+                         break
+
+                 if not should_run:
+                     logger.info("Skipping step: %s", step.action)
+                     print(f"⏭️ Skipping `{step.action}`")
+                     continue
+
+                 # Execute the action
+                 try:
+                     output = act(updated_step)
+                     results[updated_step.action] = output
+                     logger.info("Action '%s' executed successfully.", updated_step.action)
+                     print(f"🔧 Ran `{updated_step.action}`")
+                 except Exception as e:
+                     logger.error("Error executing action '%s': %s", updated_step.action, e)
+                     print(f"❌ Error running `{updated_step.action}`: {e}")
+                     break
+
+             # 3) Summarize results via LLM
+             try:
+                 summary_rsp = client.chat.completions.create(
+                     model="gpt-4o-mini",
+                     temperature=0.0,
+                     messages=[
+                         {"role": "system", "content": "Summarize these results for the user in a friendly way."},
+                         {"role": "assistant", "content": json.dumps(results)}
+                     ],
+                 )
+                 summary = summary_rsp.choices[0].message.content
+                 logger.info("Summary generated successfully.")
+                 print("\n📋 Summary:\n" + summary)
+             except Exception as e:
+                 logger.error("Failed to generate summary: %s", e)
+                 print("\n❌ Failed to generate summary.")
+
+             print("\nAnything else I can help you with?\n")
+
+         except KeyboardInterrupt:
+             logger.info("KeyboardInterrupt received, shutting down.")
+             print("\n👋 Goodbye!")
+             break
+         except Exception as e:
+             logger.exception("Unexpected error in REPL loop: %s", e)
+             print(f"❌ Unexpected error: {e}")
+             continue
+
+
+ if __name__ == "__main__":
+     run_agent()
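Reviewer note: the control flow added in `agent.py` (plan, then `think`, then `act`, stopping at a `done` step) can be exercised without an LLM. The sketch below is a simplified stand-in and not the code in this commit. `Step`, `run_plan`, `demo_think`, and `demo_act` are hypothetical names, and `think` here returns only two values instead of the three used above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Step:
    """Hypothetical stand-in for schemas.PlanStep."""
    action: str
    params: Dict[str, Any] = field(default_factory=dict)


def run_plan(
    steps: List[Step],
    think: Callable[[Step, Dict[str, Any]], Tuple[bool, Step]],
    act: Callable[[Step], Any],
) -> Dict[str, Any]:
    """Run steps in order, skipping those think() rejects, stopping at 'done'."""
    results: Dict[str, Any] = {}
    for step in steps:
        if step.action == "done":
            break
        should_run, updated = think(step, results)
        if not should_run:
            continue
        results[updated.action] = act(updated)
    return results


def demo_think(step: Step, results: Dict[str, Any]) -> Tuple[bool, Step]:
    # Skip analysis when no emails were fetched by an earlier step.
    if step.action == "analyze" and not results.get("fetch"):
        return False, step
    return True, step


def demo_act(step: Step) -> Any:
    if step.action == "fetch":
        return ["mail-1", "mail-2"]
    return f"ran {step.action}"


plan = [Step("fetch"), Step("analyze"), Step("done"), Step("never")]
results = run_plan(plan, demo_think, demo_act)
```

Keeping `think` and `act` as injected callables is one way to unit-test the loop separately from the OpenAI client and the console prompts.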
agentic_implementation/email_db.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "agarwal.27@iitj.ac.in": {
+     "emails": [
+       {
+         "date": "07-Jun-2025",
+         "time": "16:42:51",
+         "subject": "testing",
+         "content": "hi bro",
+         "message_id": "<CAPziNCaSuVqpqNNfsRjhVbx22jN_vos3EGK_Odt-8WiD0HRKKQ@mail.gmail.com>"
+       }
+     ],
+     "last_scraped": "08-Jun-2025"
+   }
+ }
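Reviewer note: `email_db.json` keys emails by address and stores dates as `DD-Mon-YYYY` strings. A minimal sketch of reading that shape follows, assuming the default C/English locale for month abbreviations. The function `emails_since`, the placeholder address, and the inline sample data are hypothetical and only mirror the structure shown in the diff.

```python
import json
from datetime import datetime
from typing import Any, Dict, List

DATE_FORMAT = "%d-%b-%Y"  # matches values such as "07-Jun-2025"

# Inline sample with the same structure as email_db.json (placeholder values).
DB_TEXT = """
{
  "user@example.com": {
    "emails": [
      {
        "date": "07-Jun-2025",
        "time": "16:42:51",
        "subject": "testing",
        "content": "hi bro",
        "message_id": "<id-1@example.com>"
      }
    ],
    "last_scraped": "08-Jun-2025"
  }
}
"""


def emails_since(db: Dict[str, Any], address: str, since: str) -> List[Dict[str, Any]]:
    """Return the emails stored for address dated on or after since."""
    cutoff = datetime.strptime(since, DATE_FORMAT)
    entry = db.get(address, {})
    return [
        email
        for email in entry.get("emails", [])
        if datetime.strptime(email["date"], DATE_FORMAT) >= cutoff
    ]


db = json.loads(DB_TEXT)
recent = emails_since(db, "user@example.com", "01-Jun-2025")
none_found = emails_since(db, "user@example.com", "08-Jun-2025")
```

Because the dates are strings, comparisons have to go through `datetime.strptime`; comparing the raw strings would sort by day of month rather than by date.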
agentic_implementation/email_mcp_server_oauth.py DELETED
@@ -1,786 +0,0 @@
- #!/usr/bin/env python3
- """
- Gmail MCP Server with OAuth Authentication and Multi-Account Support
- """
-
- import gradio as gr
- import json
- import base64
- from email.mime.text import MIMEText
- from googleapiclient.errors import HttpError
- import os
- from typing import Dict, List
- from datetime import datetime, timedelta
- from dotenv import load_dotenv
- from fastapi.responses import HTMLResponse
- from fastapi import FastAPI,Request
- from fastapi.routing import APIRoute
- # Import OAuth-enabled modules
- # from tools import extract_query_info, analyze_emails
- from gmail_api_scraper import GmailAPIScraper
- from oauth_manager import oauth_manager
- from logger import logger
-
- load_dotenv()
-
- if not oauth_manager.client_secrets_file.exists():
-     print("Setupy")
-     oauth_manager.setup_client_secrets(
-         os.environ["GOOGLE_CLIENT_ID"],
-         os.environ["GOOGLE_CLIENT_SECRET"]
-     )
-
- # Initialize Gmail API scraper
- gmail_scraper = GmailAPIScraper()
-
- def check_authentication() -> tuple[bool, str]:
-     """Check if user is authenticated and return status"""
-     current_account = oauth_manager.get_current_account()
-     if current_account and oauth_manager.is_authenticated():
-         return True, current_account
-     else:
-         return False, "Not authenticated"
-
- def simple_analyze_emails(emails) -> dict:
-     """
-     Simple email analysis without OpenAI - just basic statistics and patterns
-     """
-     if not emails:
-         return {"summary": "No emails to analyze.", "insights": []}
-
-     # Basic statistics
-     total_count = len(emails)
-
-     # Group by sender
-     senders = {}
-     subjects = []
-     dates = []
-
-     for email in emails:
-         sender = email.get("from", "Unknown")
-         # Extract just the email domain for grouping
-         if "<" in sender and ">" in sender:
-             email_part = sender.split("<")[1].split(">")[0]
-         else:
-             email_part = sender
-
-         domain = email_part.split("@")[-1] if "@" in email_part else sender
-
-         senders[domain] = senders.get(domain, 0) + 1
-         subjects.append(email.get("subject", ""))
-         dates.append(email.get("date", ""))
-
-     # Create insights
-     insights = []
-     insights.append(f"Found {total_count} emails total")
-
-     if senders:
-         top_sender = max(senders.items(), key=lambda x: x[1])
-         insights.append(f"Most emails from: {top_sender[0]} ({top_sender[1]} emails)")
-
-         if len(senders) > 1:
-             insights.append(f"Emails from {len(senders)} different domains")
-
-     # Date range
-     if dates:
-         unique_dates = list(set(dates))
-         if len(unique_dates) > 1:
-             insights.append(f"Spanning {len(unique_dates)} different days")
-
-     # Subject analysis
-     if subjects:
-         # Count common words in subjects (simple approach)
-         all_words = []
-         for subject in subjects:
-             words = subject.lower().split()
-             all_words.extend([w for w in words if len(w) > 3])  # Only words longer than 3 chars
-
-         if all_words:
-             word_counts = {}
-             for word in all_words:
-                 word_counts[word] = word_counts.get(word, 0) + 1
-
-             if word_counts:
-                 common_word = max(word_counts.items(), key=lambda x: x[1])
-                 if common_word[1] > 1:
-                     insights.append(f"Common subject word: '{common_word[0]}' appears {common_word[1]} times")
-
-     summary = f"Analysis of {total_count} emails from {len(senders)} sender(s)"
-
-     return {
-         "summary": summary,
-         "insights": insights
-     }
-
- def authenticate_user() -> str:
-     """
-     Start OAuth authentication flow for Gmail access.
-     Opens a browser window for user to authenticate with Google.
-
-     Returns:
-         str: JSON string containing authentication result
-     """
-     try:
-         logger.info("Starting OAuth authentication flow...")
-
-         if oauth_manager.is_authenticated():
-             user_email = oauth_manager.get_current_account()
-             return json.dumps({
-                 "success": True,
-                 "message": "Already authenticated!",
-                 "user_email": user_email,
-                 "instructions": [
-                     "You are already authenticated and ready to use email tools",
-                     f"Currently authenticated as: {user_email}"
-                 ]
-             }, indent=2)
-         # Check if OAuth is configured
-         if not oauth_manager.client_secrets_file.exists():
-             return json.dumps({
-                 "error": "OAuth not configured",
-                 "message": "Please run 'python setup_oauth.py' first to configure OAuth credentials.",
-                 "success": False
-             }, indent=2)
-
-         # Start authentication
-         success = oauth_manager.authenticate_interactive()
-
-         if success:
-             user_email = oauth_manager.get_current_account()
-             result = {
-                 "success": True,
-                 "message": "Authentication successful! You can now use the email tools.",
-                 "user_email": user_email,
-                 "instructions": [
-                     "Authentication completed successfully",
-                     "You can now search emails, get email details, and analyze patterns",
-                     f"Currently authenticated as: {user_email}"
-                 ]
-             }
-         else:
-             # Authentication not completed, provide manual instructions
-             auth_url = oauth_manager.get_pending_auth_url()
-             callback_url = oauth_manager.get_hf_redirect_uri()
-
-             if auth_url:
-                 result = {
-                     "success": False,
-                     "message": "Manual authentication required",
-                     "auth_url": auth_url,
-                     "callback_url": callback_url,
-                     "instructions": [
-                         "Authentication URL has been generated",
-                         "Please click the link below to authenticate:",
-                         "1. Open the authentication URL(auth_url) in a new browser tab",
-                         "2. Sign in with your Google account",
-                         "3. Grant Gmail access permissions",
-                         "4. You'll be redirected back automatically",
-                         "5. Try clicking 'Submit' again after completing authentication"
-                     ],
-                     "note": "After completing authentication in the popup, click Submit again to verify"
-                 }
-             else:
-                 result = {
-                     "success": False,
-                     "error": "Failed to generate authentication URL",
-                     "message": "Could not start authentication process. Check your OAuth configuration."
-                 }
-
-         return json.dumps(result, indent=2)
-
-     except Exception as e:
-         logger.error("Error in authenticate_user: %s", e)
-         error_result = {
-             "success": False,
-             "error": str(e),
-             "message": "Authentication failed due to an error."
-         }
-         return json.dumps(error_result, indent=2)
-
-
- def handle_oauth_callback(auth_code: str) -> str:
-     """Handle OAuth callback for Hugging Face Spaces
-
-     Args:
-         auth_code: Authorization code from OAuth callback
-
-     Returns:
-         HTML response string
-     """
-     try:
-         if not auth_code:
-             return """
-             <html>
-             <head><title>OAuth Error</title></head>
-             <body style="font-family: Arial, sans-serif; text-align: center; padding: 50px;">
-             <h1 style="color: #d32f2f;">Authentication Error</h1>
-             <p>No authorization code received.</p>
-             <button onclick="window.close()" style="padding: 10px 20px; margin: 20px; background: #1976d2; color: white; border: none; border-radius: 4px; cursor: pointer;">Close Window</button>
-             </body>
-             </html>
-             """
-         print(f"Received OAuth callback with code: {auth_code}")
-         success = oauth_manager.complete_hf_spaces_auth(auth_code)
-
-         if success:
-             user_email = oauth_manager.get_current_account()
-             return f"""
-             <html>
-             <head><title>OAuth Success</title></head>
-             <body style="font-family: Arial, sans-serif; text-align: center; padding: 50px;">
-             <h1 style="color: #2e7d32;">🎉 Authentication Successful!</h1>
-             <p>You are now authenticated as:</p>
-             <p style="font-weight: bold; font-size: 18px; color: #1976d2;">{user_email}</p>
-             <p>You can now close this window and return to the main application.</p>
-             <p style="color: #666; font-size: 14px;">This window will close automatically in 5 seconds...</p>
-             <button onclick="window.close()" style="padding: 10px 20px; margin: 20px; background: #2e7d32; color: white; border: none; border-radius: 4px; cursor: pointer;">Close Window</button>
-             <script>
-             setTimeout(function() {{
-                 window.close();
-             }}, 5000);
-             </script>
-             </body>
-             </html>
-             """
-         else:
-             return """
-             <html>
-             <head><title>OAuth Error</title></head>
-             <body style="font-family: Arial, sans-serif; text-align: center; padding: 50px;">
-             <h1 style="color: #d32f2f;">Authentication Failed</h1>
-             <p>Unable to complete authentication. Please try again.</p>
-             <p>Make sure you granted all required permissions.</p>
-             <button onclick="window.close()" style="padding: 10px 20px; margin: 20px; background: #d32f2f; color: white; border: none; border-radius: 4px; cursor: pointer;">Close Window</button>
-             </body>
-             </html>
-             """
-
-     except Exception as e:
-         logger.error(f"Error handling OAuth callback: {e}")
-         return f"""
-         <html>
-         <head><title>OAuth Error</title></head>
-         <body style="font-family: Arial, sans-serif; text-align: center; padding: 50px;">
-         <h1 style="color: #d32f2f;">Authentication Error</h1>
-         <p>An error occurred during authentication:</p>
-         <pre style="background: #f5f5f5; padding: 10px; border-radius: 4px; text-align: left; max-width: 500px; margin: 0 auto;">{str(e)}</pre>
-         <button onclick="window.close()" style="padding: 10px 20px; margin: 20px; background: #d32f2f; color: white; border: none; border-radius: 4px; cursor: pointer;">Close Window</button>
-         </body>
-         </html>
-         """
-
-
- def switch_account(target_email: str) -> str:
-     """
-     Switch to a different authenticated Gmail account.
-
-     Args:
-         target_email (str): Email address to switch to
-
-     Returns:
-         str: JSON string containing switch result
-     """
-     try:
-         logger.info("Switching to account: %s", target_email)
-
-         # Check if target account is authenticated
-         if not oauth_manager.is_authenticated(target_email):
-             return json.dumps({
-                 "error": "Account not authenticated",
-                 "message": f"Account '{target_email}' is not authenticated. Please authenticate first.",
-                 "target_email": target_email,
-                 "authenticated_accounts": list(oauth_manager.list_accounts().keys())
-             }, indent=2)
-
-         # Switch account
-         success = oauth_manager.switch_account(target_email)
-
-         if success:
-             result = {
-                 "success": True,
-                 "message": f"Successfully switched to account: {target_email}",
-                 "current_account": oauth_manager.get_current_account(),
-                 "previous_account": None  # Could track this if needed
-             }
-         else:
-             result = {
-                 "success": False,
-                 "error": "Failed to switch account",
-                 "message": f"Could not switch to account: {target_email}",
-                 "current_account": oauth_manager.get_current_account()
-             }
-
-         return json.dumps(result, indent=2)
-
-     except Exception as e:
-         logger.error("Error switching account: %s", e)
-         error_result = {
-             "success": False,
-             "error": str(e),
-             "message": f"Failed to switch to account: {target_email}"
-         }
-         return json.dumps(error_result, indent=2)
-
- def list_accounts() -> str:
-     """
-     List all authenticated Gmail accounts and their status.
-
-     Returns:
-         str: JSON string containing all accounts and their authentication status
-     """
-     try:
-         logger.info("Listing all accounts")
-
-         accounts = oauth_manager.list_accounts()
-         current_account = oauth_manager.get_current_account()
-
-         result = {
-             "accounts": accounts,
-             "current_account": current_account,
-             "total_accounts": len(accounts),
-             "authenticated_accounts": [email for email, is_auth in accounts.items() if is_auth],
-             "message": f"Found {len(accounts)} stored accounts, currently using: {current_account or 'None'}"
-         }
-
-         return json.dumps(result, indent=2)
-
-     except Exception as e:
-         logger.error("Error listing accounts: %s", e)
-         error_result = {
-             "error": str(e),
-             "message": "Failed to list accounts"
-         }
-         return json.dumps(error_result, indent=2)
-
- def remove_account(email_to_remove: str) -> str:
-     """
-     Remove an authenticated Gmail account and its stored credentials.
-
-     Args:
-         email_to_remove (str): Email address to remove
-
-     Returns:
-         str: JSON string containing removal result
-     """
-     try:
-         logger.info("Removing account: %s", email_to_remove)
-
-         # Check if account exists
-         accounts = oauth_manager.list_accounts()
-         if email_to_remove not in accounts:
-             return json.dumps({
-                 "error": "Account not found",
-                 "message": f"Account '{email_to_remove}' not found in stored accounts.",
-                 "available_accounts": list(accounts.keys())
-             }, indent=2)
-
-         # Remove account
-         oauth_manager.remove_account(email_to_remove)
-
-         result = {
-             "success": True,
-             "message": f"Successfully removed account: {email_to_remove}",
-             "removed_account": email_to_remove,
-             "current_account": oauth_manager.get_current_account(),
-             "remaining_accounts": list(oauth_manager.list_accounts().keys())
-         }
-
-         return json.dumps(result, indent=2)
-
-     except Exception as e:
-         logger.error("Error removing account: %s", e)
-         error_result = {
-             "success": False,
-             "error": str(e),
-             "message": f"Failed to remove account: {email_to_remove}"
-         }
-         return json.dumps(error_result, indent=2)
-
- def search_emails(sender_keyword: str, start_date: str = "", end_date: str = "") -> str:
-     """
-     Search for emails from a specific sender within a date range using OAuth authentication.
-
-     Args:
-         sender_keyword (str): The sender/company keyword to search for (e.g., "apple", "amazon")
-         start_date (str): Start date in DD-MMM-YYYY format (e.g., "01-Jan-2025"). If empty, defaults to 7 days ago.
-         end_date (str): End date in DD-MMM-YYYY format (e.g., "07-Jan-2025"). If empty, defaults to today.
-
-     Returns:
-         str: JSON string containing email search results and analysis
-     """
-     try:
-         logger.info("OAuth Email search tool called with sender: %s, dates: %s to %s", sender_keyword, start_date, end_date)
-
-         # Check authentication
-         is_auth, auth_info = check_authentication()
-         if not is_auth:
-             return json.dumps({
-                 "error": "Not authenticated",
-                 "message": "Please authenticate first using the authenticate_user tool or run 'python setup_oauth.py'",
-                 "auth_status": auth_info
-             }, indent=2)
-
-         # Set default date range if not provided
-         if not start_date or not end_date:
-             today = datetime.today()
-             if not end_date:
-                 end_date = today.strftime("%d-%b-%Y")
-             if not start_date:
-                 start_date = (today - timedelta(days=7)).strftime("%d-%b-%Y")
-
-         logger.info(f"Searching for emails with keyword '{sender_keyword}' between {start_date} and {end_date}")
-
-         # Use Gmail API scraper with OAuth
-         full_emails = gmail_scraper.search_emails(sender_keyword, start_date, end_date)
-
-         if not full_emails:
-             result = {
-                 "sender_keyword": sender_keyword,
-                 "date_range": f"{start_date} to {end_date}",
-                 "email_summary": [],
-                 "analysis": {"summary": f"No emails found for '{sender_keyword}' in the specified date range.", "insights": []},
-                 "email_count": 0,
-                 "user_email": auth_info
-             }
-             return json.dumps(result, indent=2)
-
-         # Create summary version without full content
-         email_summary = []
-         for email in full_emails:
-             summary_email = {
-                 "date": email.get("date"),
-                 "time": email.get("time"),
-                 "subject": email.get("subject"),
-                 "from": email.get("from", "Unknown Sender"),
-                 "message_id": email.get("message_id"),
-                 "gmail_id": email.get("gmail_id")
-             }
-             email_summary.append(summary_email)
-
-         # Auto-analyze the emails for insights (no OpenAI)
-         analysis = simple_analyze_emails(full_emails)
-
-         # Return summary info with analysis
-         result = {
-             "sender_keyword": sender_keyword,
-             "date_range": f"{start_date} to {end_date}",
-             "email_summary": email_summary,
-             "analysis": analysis,
-             "email_count": len(full_emails),
-             "user_email": auth_info
-         }
-
-         return json.dumps(result, indent=2)
-
-     except Exception as e:
-         logger.error("Error in search_emails: %s", e)
-         error_result = {
-             "error": str(e),
-             "sender_keyword": sender_keyword,
-             "message": "Failed to search emails."
-         }
-         return json.dumps(error_result, indent=2)
-
484
- def get_email_details(message_id: str) -> str:
485
- """
486
- Get full details of a specific email by its message ID using OAuth authentication.
487
-
488
- Args:
489
- message_id (str): The message ID of the email to retrieve
490
-
491
- Returns:
492
- str: JSON string containing the full email details
493
- """
494
- try:
495
- logger.info("Getting email details for message_id: %s", message_id)
496
-
497
- # Check authentication
498
- is_auth, auth_info = check_authentication()
499
- if not is_auth:
500
- return json.dumps({
501
- "error": "Not authenticated",
502
- "message": "Please authenticate first using the authenticate_user tool or run 'python setup_oauth.py'",
503
- "auth_status": auth_info
504
- }, indent=2)
505
-
506
- # Get email using Gmail API
507
- email = gmail_scraper.get_email_by_id(message_id)
508
-
509
- if email:
510
- email["user_email"] = auth_info
511
- return json.dumps(email, indent=2)
512
- else:
513
- error_result = {
514
- "error": f"No email found with message_id '{message_id}'",
515
- "message": "Email may not exist or you may not have access to it.",
516
- "user_email": auth_info
517
- }
518
- return json.dumps(error_result, indent=2)
519
-
520
- except Exception as e:
521
- logger.error("Error in get_email_details: %s", e)
522
- error_result = {
523
- "error": str(e),
524
- "message_id": message_id,
525
- "message": "Failed to retrieve email details."
526
- }
527
- return json.dumps(error_result, indent=2)
528
-
529
- def analyze_email_patterns(sender_keyword: str, days_back: str = "30") -> str:
530
- """
531
- Analyze email patterns from a specific sender over a given time period using OAuth authentication.
532
-
533
- Args:
534
- sender_keyword (str): The sender/company keyword to analyze (e.g., "amazon", "google")
535
- days_back (str): Number of days to look back (default: "30")
536
-
537
- Returns:
538
- str: JSON string containing email pattern analysis
539
- """
540
- try:
541
- logger.info("Analyzing email patterns for sender: %s, days_back: %s", sender_keyword, days_back)
542
-
543
- # Check authentication
544
- is_auth, auth_info = check_authentication()
545
- if not is_auth:
546
- return json.dumps({
547
- "error": "Not authenticated",
548
- "message": "Please authenticate first using the authenticate_user tool or run 'python setup_oauth.py'",
549
- "auth_status": auth_info
550
- }, indent=2)
551
-
552
- # Calculate date range
553
- days_int = int(days_back)
554
- end_date = datetime.today()
555
- start_date = end_date - timedelta(days=days_int)
556
-
557
- start_date_str = start_date.strftime("%d-%b-%Y")
558
- end_date_str = end_date.strftime("%d-%b-%Y")
559
-
560
- # Search for emails using Gmail API
561
- full_emails = gmail_scraper.search_emails(sender_keyword, start_date_str, end_date_str)
562
-
563
- if not full_emails:
564
- result = {
565
- "sender_keyword": sender_keyword,
566
- "date_range": f"{start_date_str} to {end_date_str}",
567
- "analysis": {"summary": f"No emails found from '{sender_keyword}' in the last {days_back} days.", "insights": []},
568
- "email_count": 0,
569
- "user_email": auth_info
570
- }
571
- return json.dumps(result, indent=2)
572
-
573
- # Analyze the emails (no OpenAI)
574
- analysis = simple_analyze_emails(full_emails)
575
-
576
- result = {
577
- "sender_keyword": sender_keyword,
578
- "date_range": f"{start_date_str} to {end_date_str}",
579
- "analysis": analysis,
580
- "email_count": len(full_emails),
581
- "user_email": auth_info
582
- }
583
-
584
- return json.dumps(result, indent=2)
585
-
586
- except Exception as e:
587
- logger.error("Error in analyze_email_patterns: %s", e)
588
- error_result = {
589
- "error": str(e),
590
- "sender_keyword": sender_keyword,
591
- "message": "Failed to analyze email patterns."
592
- }
593
- return json.dumps(error_result, indent=2)
594
-
595
- def get_authentication_status() -> str:
596
- """
597
- Get current authentication status and account information.
598
-
599
- Returns:
600
- str: JSON string containing authentication status
601
- """
602
- try:
603
- current_account = oauth_manager.get_current_account()
604
- is_auth = oauth_manager.is_authenticated() if current_account else False
605
- all_accounts = oauth_manager.list_accounts()
606
-
607
- result = {
608
- "authenticated": is_auth,
609
- "current_account": current_account,
610
- "status": "authenticated" if is_auth else "not_authenticated",
611
- "message": f"Current account: {current_account}" if is_auth else "No account selected or not authenticated",
612
- "all_accounts": all_accounts,
613
- "total_accounts": len(all_accounts),
614
- "authenticated_accounts": [email for email, auth in all_accounts.items() if auth]
615
- }
616
-
617
- if not is_auth and not oauth_manager.client_secrets_file.exists():
618
- result["setup_required"] = True
619
- result["message"] = "OAuth not configured. Please run 'python setup_oauth.py' first."
620
- elif not is_auth and current_account:
621
- result["message"] = f"Account {current_account} needs re-authentication"
622
- elif not current_account and all_accounts:
623
- result["message"] = "Accounts available but none selected. Use switch_account to select one."
624
-
625
- return json.dumps(result, indent=2)
626
-
627
- except Exception as e:
628
- logger.error("Error checking authentication status: %s", e)
629
- return json.dumps({
630
- "error": str(e),
631
- "message": "Failed to check authentication status"
632
- }, indent=2)
633
-
634
- def send_email(recipient: str, subject: str, body: str) -> str:
635
- """
636
- Send a plain-text email via the authenticated Gmail account.
637
- Returns JSON with either:
638
- {"success": true, "message_id": "..."}
639
- or
640
- {"success": false, "error": "..."}
641
- """
642
- # Use the correct method on your OAuth manager:
643
- service = oauth_manager.get_gmail_service()
644
- if service is None:
645
- return json.dumps(
646
- {"success": False, "error": "Not authenticated or failed to build service."},
647
- indent=2,
648
- )
649
-
650
- # Build the MIME message
651
- mime_msg = MIMEText(body, "plain", "utf-8")
652
- mime_msg["to"] = recipient
653
- mime_msg["subject"] = subject
654
-
655
- # Base64-encode and send
656
- raw_msg = base64.urlsafe_b64encode(mime_msg.as_bytes()).decode()
657
- try:
658
- sent = (
659
- service.users()
660
- .messages()
661
- .send(userId="me", body={"raw": raw_msg})
662
- .execute()
663
- )
664
- return json.dumps(
665
- {"success": True, "message_id": sent.get("id")}, indent=2
666
- )
667
- except HttpError as err:
668
- logger.error(f"Error sending email: {err}")
669
- # err.error_details may be None; fallback to string
670
- error_detail = getattr(err, "error_details", None) or str(err)
671
- return json.dumps(
672
- {"success": False, "error": error_detail},
673
- indent=2,
674
- )
675
-
676
-
677
- # Create Gradio interfaces
678
- search_interface = gr.Interface(
679
- fn=search_emails,
680
- inputs=[
681
- gr.Textbox(label="Sender Keyword", placeholder="apple, amazon, etc."),
682
- gr.Textbox(label="Start Date (Optional)", placeholder="01-Jan-2025 (leave empty for last 7 days)"),
683
- gr.Textbox(label="End Date (Optional)", placeholder="07-Jan-2025 (leave empty for today)")
684
- ],
685
- outputs=gr.Textbox(label="Search Results", lines=20),
686
- title="Email Search (OAuth)",
687
- description="Search your emails by sender keyword and date range with OAuth authentication"
688
- )
689
-
690
- details_interface = gr.Interface(
691
- fn=get_email_details,
692
- inputs=[
693
- gr.Textbox(label="Message ID", placeholder="Email message ID from search results")
694
- ],
695
- outputs=gr.Textbox(label="Email Details", lines=20),
696
- title="Email Details (OAuth)",
697
- description="Get full details of a specific email by message ID with OAuth authentication"
698
- )
699
-
700
- analysis_interface = gr.Interface(
701
- fn=analyze_email_patterns,
702
- inputs=[
703
- gr.Textbox(label="Sender Keyword", placeholder="amazon, google, linkedin, etc."),
704
- gr.Textbox(label="Days Back", value="30", placeholder="Number of days to analyze")
705
- ],
706
- outputs=gr.Textbox(label="Analysis Results", lines=20),
707
- title="Email Pattern Analysis (OAuth)",
708
- description="Analyze email patterns from a specific sender over time with OAuth authentication"
709
- )
710
-
711
- auth_interface = gr.Interface(
712
- fn=authenticate_user,
713
- inputs=[],
714
- outputs=gr.Textbox(label="Authentication Result", lines=10),
715
- title="Authenticate with Gmail",
716
- description="Click Submit to start OAuth authentication flow with Gmail"
717
- )
718
-
719
- status_interface = gr.Interface(
720
- fn=get_authentication_status,
721
- inputs=[],
722
- outputs=gr.Textbox(label="Authentication Status", lines=15),
723
- title="Authentication Status",
724
- description="Check current authentication status and view all accounts"
725
- )
726
-
727
- switch_interface = gr.Interface(
728
- fn=switch_account,
729
- inputs=[
730
- gr.Textbox(label="Target Email", placeholder="email@gmail.com")
731
- ],
732
- outputs=gr.Textbox(label="Switch Result", lines=10),
733
- title="Switch Account",
734
- description="Switch to a different authenticated Gmail account"
735
- )
736
-
737
- accounts_interface = gr.Interface(
738
- fn=list_accounts,
739
- inputs=[],
740
- outputs=gr.Textbox(label="Accounts List", lines=15),
741
- title="List All Accounts",
742
- description="View all authenticated Gmail accounts and their status"
743
- )
744
-
745
- remove_interface = gr.Interface(
746
- fn=remove_account,
747
- inputs=[
748
- gr.Textbox(label="Email to Remove", placeholder="email@gmail.com")
749
- ],
750
- outputs=gr.Textbox(label="Removal Result", lines=10),
751
- title="Remove Account",
752
- description="Remove an authenticated Gmail account and its credentials"
753
- )
754
-
755
- send_interface = gr.Interface(
756
- fn=send_email,
757
- inputs=[
758
- gr.Textbox(label="Recipient Email", placeholder="recipient@example.com"),
759
- gr.Textbox(label="Subject", placeholder="Email subject"),
760
- gr.Textbox(label="Body", placeholder="Email body text", lines=5)
761
- ],
762
- outputs=gr.Textbox(label="Send Result", lines=10),
763
- title="✉️ Send Email",
764
- description="Send an email via Gmail using OAuth authenticated account"
765
- )
766
-
767
- # Combine interfaces into a tabbed interface
768
- demo = gr.TabbedInterface(
769
- [auth_interface, status_interface, accounts_interface, switch_interface, remove_interface, search_interface, details_interface, analysis_interface, send_interface],
770
- ["🔐 Authenticate", "📊 Status", "👥 All Accounts", "🔄 Switch Account", "🗑️ Remove Account", "📧 Email Search", "📄 Email Details", "📈 Pattern Analysis", "✉️ Send Email"],
771
- title="📧 Gmail Assistant MCP Server (Multi-Account OAuth)"
772
- )
773
-
774
- app = FastAPI()
775
- # Add your OAuth callback route
776
- @app.get("/oauth2callback")
777
- async def google_oauth_cb(request: Request):
778
- code = request.query_params.get("code")
779
- print("code:", code)
780
- return HTMLResponse(handle_oauth_callback(code))
781
-
782
- app = gr.mount_gradio_app(app, demo, path="/")
783
-
784
- if __name__ == "__main__":
785
- import uvicorn
786
- uvicorn.run(app, host="0.0.0.0", port=7860)
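Note on the removed `send_email` tool: it built the request body by base64url-encoding a MIME message before calling `messages().send()`. Below is a minimal sketch of just that encoding step. The helper names `build_raw_message` and `decode_raw_message` are hypothetical and do not appear in the removed file, and no Gmail API call is made.

```python
import base64
from email import message_from_bytes
from email.mime.text import MIMEText


def build_raw_message(recipient: str, subject: str, body: str) -> dict:
    """Build a {"raw": ...} payload of the shape the removed code passed as body=."""
    mime_msg = MIMEText(body, "plain", "utf-8")
    mime_msg["to"] = recipient
    mime_msg["subject"] = subject
    # URL-safe base64 of the full serialized message, decoded to str for JSON.
    raw = base64.urlsafe_b64encode(mime_msg.as_bytes()).decode()
    return {"raw": raw}


def decode_raw_message(payload: dict):
    """Inverse of build_raw_message, useful for checking the payload locally."""
    return message_from_bytes(base64.urlsafe_b64decode(payload["raw"]))
```

Round-tripping the payload through `decode_raw_message` is one way to check the headers and body locally before any network call.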
agentic_implementation/email_scraper.py ADDED
@@ -0,0 +1,469 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Enhanced Email Scraper with Intelligent Caching
4
+ """
5
+
6
+ import os
7
+ import imaplib
8
+ import json
9
+ from email import message_from_bytes
10
+ from bs4 import BeautifulSoup
11
+ from datetime import datetime, timedelta
12
+ from dotenv import load_dotenv
13
+ from zoneinfo import ZoneInfo
14
+ from email.utils import parsedate_to_datetime
15
+ from typing import List, Dict
16
+
17
+ load_dotenv()
18
+
19
+ # Email credentials
20
+ APP_PASSWORD = os.getenv("APP_PASSWORD")
21
+ EMAIL_ID = os.getenv("EMAIL_ID")
22
+ print("EMAIL_ID: ", EMAIL_ID)
23
+ EMAIL_DB_FILE = "email_db.json"
24
+
25
+ def validate_email_setup():
26
+ """Validate email setup and credentials"""
27
+ print("=== Email Setup Validation ===")
28
+
29
+ # Check .env file existence
30
+ # env_file_exists = os.path.exists('.env')
31
+ # print(f".env file exists: {'✅ Yes' if env_file_exists else '❌ No'}")
32
+
33
+ # if not env_file_exists:
34
+ # print("❌ No .env file found! Create one with:")
35
+ # print(" EMAIL_ID=your_email@gmail.com")
36
+ # print(" APP_PASSWORD=your_16_char_app_password")
37
+ # print(" OPENAI_API_KEY=your_openai_key")
38
+ # return False
39
+
40
+ # Check environment variables
41
+ issues = []
42
+
43
+ # if not EMAIL_ID:
44
+ # issues.append("EMAIL_ID not set or empty")
45
+ # elif '@' not in EMAIL_ID:
46
+ # issues.append("EMAIL_ID doesn't look like an email address")
47
+ # elif not EMAIL_ID.endswith('@gmail.com'):
48
+ # issues.append("EMAIL_ID should be a Gmail address (@gmail.com)")
49
+
50
+ # if not APP_PASSWORD:
51
+ # issues.append("APP_PASSWORD not set or empty")
52
+ # elif len(APP_PASSWORD) != 16:
53
+ # issues.append(f"APP_PASSWORD should be 16 characters, got {len(APP_PASSWORD)}")
54
+ # elif ' ' in APP_PASSWORD:
55
+ # issues.append("APP_PASSWORD should not contain spaces (remove spaces from app password)")
56
+
57
+ if not os.getenv("OPENAI_API_KEY"):
58
+ issues.append("OPENAI_API_KEY not set (needed for query processing)")
59
+
60
+ if issues:
61
+ print("❌ Issues found:")
62
+ for issue in issues:
63
+ print(f" - {issue}")
64
+ return False
65
+ else:
66
+ print("✅ All credentials look good!")
67
+ return True
68
+
69
+ def _imap_connect():
70
+ """Connect to Gmail IMAP server"""
71
+ print("=== IMAP Connection Debug ===")
72
+
73
+ # Check if environment variables are loaded
74
+ print(f"EMAIL_ID loaded: {'✅ Yes' if EMAIL_ID else '❌ No (None/Empty)'}")
75
+ print(f"APP_PASSWORD loaded: {'✅ Yes' if APP_PASSWORD else '❌ No (None/Empty)'}")
76
+
77
+ if EMAIL_ID:
78
+ print(f"Email ID: {EMAIL_ID[:5]}...@{EMAIL_ID.split('@')[1] if '@' in EMAIL_ID else 'INVALID'}")
79
+ # if APP_PASSWORD:
80
+ # print(f"App Password length: {len(APP_PASSWORD)} characters")
81
+ # print(f"App Password format: {'✅ Looks correct (16 chars)' if len(APP_PASSWORD) == 16 else f'❌ Expected 16 chars, got {len(APP_PASSWORD)}'}")
82
+
83
+ if not EMAIL_ID or not APP_PASSWORD:
84
+ error_msg = "Missing credentials in environment variables!"
85
+ print(f"❌ {error_msg}")
86
+ raise Exception(error_msg)
87
+
88
+ try:
89
+ print("🔄 Attempting IMAP SSL connection to imap.gmail.com:993...")
90
+ mail = imaplib.IMAP4_SSL("imap.gmail.com")
91
+ print("✅ SSL connection established")
92
+
93
+ print("🔄 Attempting login...")
94
+ result = mail.login(EMAIL_ID, APP_PASSWORD)
95
+ print(f"✅ Login successful: {result}")
96
+
97
+ print("🔄 Selecting mailbox: [Gmail]/All Mail...")
98
+ result = mail.select('"[Gmail]/All Mail"')
99
+ print(f"✅ Mailbox selected: {result}")
100
+
101
+ print("=== IMAP Connection Successful ===")
102
+ return mail
103
+
104
+ except imaplib.IMAP4.error as e:
105
+ print(f"❌ IMAP Error: {e}")
106
+ print("💡 Possible causes:")
107
+ print(" - App Password is incorrect or expired")
108
+ print(" - 2FA not enabled on Gmail account")
109
+ print(" - IMAP access not enabled in Gmail settings")
110
+ print(" - Gmail account locked or requires security verification")
111
+ raise
112
+ except Exception as e:
113
+ print(f"❌ Connection Error: {e}")
114
+ print("💡 Possible causes:")
115
+ print(" - Network connectivity issues")
116
+ print(" - Gmail IMAP server temporarily unavailable")
117
+ print(" - Firewall blocking IMAP port 993")
118
+ raise
119
+
120
+ def _email_to_clean_text(msg):
121
+ """Extract clean text from email message"""
122
+ # Try HTML first
123
+ html_content = None
124
+ text_content = None
125
+
126
+ if msg.is_multipart():
127
+ for part in msg.walk():
128
+ content_type = part.get_content_type()
129
+ if content_type == "text/html":
130
+ try:
131
+ html_content = part.get_payload(decode=True).decode(errors="ignore")
132
+ except Exception:
133
+ continue
134
+ elif content_type == "text/plain":
135
+ try:
136
+ text_content = part.get_payload(decode=True).decode(errors="ignore")
137
+ except Exception:
138
+ continue
139
+ else:
140
+ # Non-multipart message
141
+ content_type = msg.get_content_type()
142
+ try:
143
+ content = msg.get_payload(decode=True).decode(errors="ignore")
144
+ if content_type == "text/html":
145
+ html_content = content
146
+ else:
147
+ text_content = content
148
+ except Exception:
149
+ pass
150
+
151
+ # Clean HTML content
152
+ if html_content:
153
+ soup = BeautifulSoup(html_content, "html.parser")
154
+ # Remove script and style elements
155
+ for script in soup(["script", "style"]):
156
+ script.decompose()
157
+ return soup.get_text(separator=' ', strip=True)
158
+ elif text_content:
159
+ return text_content.strip()
160
+ else:
161
+ return ""
162
+
163
+ def _load_email_db() -> Dict:
164
+ """Load email database from file"""
165
+ if not os.path.exists(EMAIL_DB_FILE):
166
+ return {}
167
+ try:
168
+ with open(EMAIL_DB_FILE, "r") as f:
169
+ return json.load(f)
170
+ except (json.JSONDecodeError, IOError):
171
+ print(f"Warning: Could not load {EMAIL_DB_FILE}, starting with empty database")
172
+ return {}
173
+
174
+ def _save_email_db(db: Dict):
175
+ """Save email database to file"""
176
+ try:
177
+ with open(EMAIL_DB_FILE, "w") as f:
178
+ json.dump(db, f, indent=2)
179
+ except IOError as e:
180
+ print(f"Error saving database: {e}")
181
+ raise
182
+
183
+ def _date_to_imap_format(date_str: str) -> str:
184
+ """Convert DD-MMM-YYYY to IMAP date format"""
185
+ try:
186
+ dt = datetime.strptime(date_str, "%d-%b-%Y")
187
+ return dt.strftime("%d-%b-%Y")
188
+ except ValueError:
189
+ raise ValueError(f"Invalid date format: {date_str}. Expected DD-MMM-YYYY")
190
+
191
+ def _is_date_in_range(email_date: str, start_date: str, end_date: str) -> bool:
192
+ """Check if email date is within the specified range"""
193
+ try:
194
+ email_dt = datetime.strptime(email_date, "%d-%b-%Y")
195
+ start_dt = datetime.strptime(start_date, "%d-%b-%Y")
196
+ end_dt = datetime.strptime(end_date, "%d-%b-%Y")
197
+ return start_dt <= email_dt <= end_dt
198
+ except ValueError:
199
+ return False
200
+
201
+ def scrape_emails_from_sender(sender_email: str, start_date: str, end_date: str) -> List[Dict]:
202
+ """
203
+ Scrape emails from specific sender within date range
204
+ Uses intelligent caching to avoid re-scraping
205
+ """
206
+ print(f"Scraping emails from {sender_email} between {start_date} and {end_date}")
207
+
208
+ # Load existing database
209
+ db = _load_email_db()
210
+ sender_email = sender_email.lower().strip()
211
+
212
+ # Check if we have cached emails for this sender
213
+ if sender_email in db:
214
+ cached_emails = db[sender_email].get("emails", [])
215
+
216
+ # Filter cached emails by date range
217
+ filtered_emails = [
218
+ email for email in cached_emails
219
+ if _is_date_in_range(email["date"], start_date, end_date)
220
+ ]
221
+
222
+ # Check if we need to scrape more recent emails
223
+ last_scraped = db[sender_email].get("last_scraped", "01-Jan-2020")
224
+ today = datetime.today().strftime("%d-%b-%Y")
225
+
226
+ if last_scraped == today and filtered_emails:
227
+ print(f"Using cached emails (last scraped: {last_scraped})")
228
+ return filtered_emails
229
+
230
+ # Need to scrape emails
231
+ try:
232
+ mail = _imap_connect()
233
+
234
+ # Prepare IMAP search criteria
235
+ start_imap = _date_to_imap_format(start_date)
236
+ # Add one day to end_date for BEFORE criteria (IMAP BEFORE is exclusive)
237
+ end_dt = datetime.strptime(end_date, "%d-%b-%Y") + timedelta(days=1)
238
+ end_imap = end_dt.strftime("%d-%b-%Y")
239
+
240
+ search_criteria = f'(FROM "{sender_email}") SINCE "{start_imap}" BEFORE "{end_imap}"'
241
+ print(f"IMAP search: {search_criteria}")
242
+
243
+ # Search for emails
244
+ status, data = mail.search(None, search_criteria)
245
+ if status != 'OK':
246
+ raise Exception(f"IMAP search failed: {status}")
247
+
248
+ email_ids = data[0].split()
249
+ print(f"Found {len(email_ids)} emails")
250
+
251
+ scraped_emails = []
252
+
253
+ # Process each email
254
+ for i, email_id in enumerate(email_ids):
255
+ try:
256
+ print(f"Processing email {i+1}/{len(email_ids)}")
257
+
258
+ # Fetch email
259
+ status, msg_data = mail.fetch(email_id, "(RFC822)")
260
+ if status != 'OK':
261
+ continue
262
+
263
+ # Parse email
264
+ msg = message_from_bytes(msg_data[0][1])
265
+
266
+ # Extract information
267
+ subject = msg.get("Subject", "No Subject")
268
+ content = _email_to_clean_text(msg)
269
+
270
+ # Parse date
271
+ date_header = msg.get("Date", "")
272
+ if date_header:
273
+ try:
274
+ dt_obj = parsedate_to_datetime(date_header)
275
+ # Convert to IST
276
+ ist_dt = dt_obj.astimezone(ZoneInfo("Asia/Kolkata"))
277
+ email_date = ist_dt.strftime("%d-%b-%Y")
278
+ email_time = ist_dt.strftime("%H:%M:%S")
279
+ except Exception:
280
+ email_date = datetime.today().strftime("%d-%b-%Y")
281
+ email_time = "00:00:00"
282
+ else:
283
+ email_date = datetime.today().strftime("%d-%b-%Y")
284
+ email_time = "00:00:00"
285
+
286
+ # Get message ID for deduplication
287
+ message_id = msg.get("Message-ID", f"missing-{email_id.decode()}")
288
+
289
+ scraped_emails.append({
290
+ "date": email_date,
291
+ "time": email_time,
292
+ "subject": subject,
293
+ "content": content[:2000], # Limit content length
294
+ "message_id": message_id
295
+ })
296
+
297
+ except Exception as e:
298
+ print(f"Error processing email {email_id}: {e}")
299
+ continue
300
+
301
+ mail.logout()
302
+
303
+ # Update database
304
+ if sender_email not in db:
305
+ db[sender_email] = {"emails": [], "last_scraped": ""}
306
+
307
+ # Merge with existing emails (avoid duplicates)
308
+ existing_emails = db[sender_email].get("emails", [])
309
+ existing_ids = {email.get("message_id") for email in existing_emails}
310
+
311
+ new_emails = [
312
+ email for email in scraped_emails
313
+ if email["message_id"] not in existing_ids
314
+ ]
315
+
316
+ # Update database
317
+ db[sender_email]["emails"] = existing_emails + new_emails
318
+ db[sender_email]["last_scraped"] = datetime.today().strftime("%d-%b-%Y")
319
+
320
+ # Save database
321
+ _save_email_db(db)
322
+
323
+ # Return filtered results
324
+ all_emails = db[sender_email]["emails"]
325
+ filtered_emails = [
326
+ email for email in all_emails
327
+ if _is_date_in_range(email["date"], start_date, end_date)
328
+ ]
329
+
330
+ print(f"Scraped {len(new_emails)} new emails, returning {len(filtered_emails)} in date range")
331
+ return filtered_emails
332
+
333
+ except Exception as e:
334
+ print(f"Email scraping failed: {e}")
335
+ raise
336
+
337
+ def scrape_emails_by_text_search(keyword: str, start_date: str, end_date: str) -> List[Dict]:
338
+ """
339
+ Scrape emails containing a specific keyword (like company name) within date range.
340
+ Uses IMAP text search to find emails from senders containing the keyword.
341
+ """
342
+ print(f"Searching emails containing '{keyword}' between {start_date} and {end_date}")
343
+
344
+ # Validate setup first
345
+ if not validate_email_setup():
346
+ raise Exception("Email setup validation failed. Please check your .env file and credentials.")
347
+
348
+ try:
349
+ mail = _imap_connect()
350
+
351
+ # Prepare IMAP search criteria with text search
352
+ start_imap = _date_to_imap_format(start_date)
353
+ # Add one day to end_date for BEFORE criteria (IMAP BEFORE is exclusive)
354
+ end_dt = datetime.strptime(end_date, "%d-%b-%Y") + timedelta(days=1)
355
+ end_imap = end_dt.strftime("%d-%b-%Y")
356
+
357
+ # Search for emails containing the keyword in FROM field or SUBJECT or BODY
358
+ # We'll search multiple criteria and combine results
359
+ search_criteria_list = [
360
+ f'FROM "{keyword}" SINCE "{start_imap}" BEFORE "{end_imap}"',
361
+ f'SUBJECT "{keyword}" SINCE "{start_imap}" BEFORE "{end_imap}"',
362
+ f'BODY "{keyword}" SINCE "{start_imap}" BEFORE "{end_imap}"'
363
+ ]
364
+
365
+ all_email_ids = set()
366
+
367
+ # Search with multiple criteria to catch emails containing the keyword
368
+ for search_criteria in search_criteria_list:
369
+ try:
370
+ print(f"IMAP search: {search_criteria}")
371
+ status, data = mail.search(None, search_criteria)
372
+ if status == 'OK' and data[0]:
373
+ email_ids = data[0].split()
374
+ all_email_ids.update(email_ids)
375
+ print(f"Found {len(email_ids)} emails with this criteria")
376
+ except Exception as e:
377
+ print(f"Search criteria failed: {search_criteria}, error: {e}")
378
+ continue
379
+
380
+ print(f"Total unique emails found: {len(all_email_ids)}")
381
+ scraped_emails = []
382
+
383
+ # Process each email
384
+ for i, email_id in enumerate(all_email_ids):
385
+ try:
386
+ print(f"Processing email {i+1}/{len(all_email_ids)}")
387
+
388
+ # Fetch email
389
+ status, msg_data = mail.fetch(email_id, "(RFC822)")
390
+ if status != 'OK':
391
+ continue
392
+
393
+ # Parse email
394
+ msg = message_from_bytes(msg_data[0][1])
395
+
396
+ # Extract information
397
+ subject = msg.get("Subject", "No Subject")
398
+ from_header = msg.get("From", "Unknown Sender")
399
+ content = _email_to_clean_text(msg)
400
+
401
+ # Check if the keyword is actually present (case-insensitive)
402
+ keyword_lower = keyword.lower()
403
+ if not any(keyword_lower in text.lower() for text in [subject, from_header, content]):
404
+ continue
405
+
406
+ # Parse date
407
+ date_header = msg.get("Date", "")
408
+ if date_header:
409
+ try:
410
+ dt_obj = parsedate_to_datetime(date_header)
411
+ # Convert to IST
412
+ ist_dt = dt_obj.astimezone(ZoneInfo("Asia/Kolkata"))
413
+ email_date = ist_dt.strftime("%d-%b-%Y")
414
+ email_time = ist_dt.strftime("%H:%M:%S")
415
+ except Exception:
416
+ email_date = datetime.today().strftime("%d-%b-%Y")
417
+ email_time = "00:00:00"
418
+ else:
419
+ email_date = datetime.today().strftime("%d-%b-%Y")
420
+ email_time = "00:00:00"
421
+
422
+ # Double-check date range
423
+ if not _is_date_in_range(email_date, start_date, end_date):
424
+ continue
425
+
426
+ # Get message ID for deduplication
427
+ message_id = msg.get("Message-ID", f"missing-{email_id.decode()}")
428
+
429
+ scraped_emails.append({
430
+ "date": email_date,
431
+ "time": email_time,
432
+ "subject": subject,
433
+ "from": from_header,
434
+ "content": content[:2000], # Limit content length
435
+ "message_id": message_id
436
+ })
437
+
438
+ except Exception as e:
439
+ print(f"Error processing email {email_id}: {e}")
440
+ continue
441
+
442
+ mail.logout()
443
+
444
+ # Sort by date (newest first)
445
+ scraped_emails.sort(key=lambda x: datetime.strptime(f"{x['date']} {x['time']}", "%d-%b-%Y %H:%M:%S"), reverse=True)
446
+
447
+ print(f"Successfully processed {len(scraped_emails)} emails containing '{keyword}'")
448
+ return scraped_emails
449
+
450
+ except Exception as e:
451
+ print(f"Email text search failed: {e}")
452
+ raise
453
+
454
+ # Test the scraper
455
+ if __name__ == "__main__":
456
+ # Test scraping
457
+ try:
458
+ emails = scrape_emails_from_sender(
459
+ "noreply@example.com",
460
+ "01-Jun-2025",
461
+ "07-Jun-2025"
462
+ )
463
+
464
+ print(f"\nFound {len(emails)} emails:")
465
+ for email in emails[:3]: # Show first 3
466
+ print(f"- {email['date']} {email['time']}: {email['subject']}")
467
+
468
+ except Exception as e:
469
+ print(f"Test failed: {e}")
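Note on the date handling in `scrape_emails_from_sender`: IMAP `SINCE` includes the given day while `BEFORE` excludes it, which is why the added code pushes the end date forward by one day. Below is a minimal sketch of that construction in isolation. The helper name `build_search_criteria` is hypothetical and not part of the added file, and `%b` month names are assumed to be English, which depends on the process locale.

```python
from datetime import datetime, timedelta

DATE_FMT = "%d-%b-%Y"  # same DD-MMM-YYYY format the scraper uses


def build_search_criteria(sender: str, start_date: str, end_date: str) -> str:
    """Return an IMAP SEARCH string covering start_date..end_date inclusive."""
    start_dt = datetime.strptime(start_date, DATE_FMT)
    # BEFORE is exclusive, so add one day to keep end_date itself in range.
    end_dt = datetime.strptime(end_date, DATE_FMT) + timedelta(days=1)
    return (
        f'(FROM "{sender}") '
        f'SINCE "{start_dt.strftime(DATE_FMT)}" '
        f'BEFORE "{end_dt.strftime(DATE_FMT)}"'
    )
```

Because the shift uses `timedelta`, an end date on the last day of a month rolls over to the first of the next month without special handling.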
agentic_implementation/gmail_api_scraper.py DELETED
@@ -1,301 +0,0 @@
1
- #!/usr/bin/env python3
2
- """
3
- Gmail API-based Email Scraper with OAuth Authentication
4
- """
5
-
6
- import base64
7
- import re
8
- from datetime import datetime, timedelta
9
- from typing import List, Dict, Optional
10
- from email.mime.text import MIMEText
11
- import googleapiclient.errors
12
- from oauth_manager import oauth_manager
13
- from logger import logger
14
-
15
- class GmailAPIScraper:
16
- """Gmail API-based email scraper using OAuth authentication"""
17
-
18
- def __init__(self):
19
- """Initialize the Gmail API scraper"""
20
- self.oauth_manager = oauth_manager
21
-
22
- def _parse_date_string(self, date_str: str) -> datetime:
23
- """Parse date string in DD-MMM-YYYY format to datetime object"""
24
- try:
25
- return datetime.strptime(date_str, "%d-%b-%Y")
26
- except ValueError:
27
- raise ValueError(f"Invalid date format: {date_str}. Expected DD-MMM-YYYY")
28
-
29
- def _format_date_for_query(self, date_obj: datetime) -> str:
30
- """Format datetime object for Gmail API query"""
31
- return date_obj.strftime("%Y/%m/%d")
32
-
33
- def _decode_message_part(self, part: Dict) -> str:
34
- """Decode message part content"""
35
- data = part.get('body', {}).get('data', '')
36
- if data:
37
- # Decode base64url
38
- data += '=' * (4 - len(data) % 4) # Add padding if needed
39
- decoded_bytes = base64.urlsafe_b64decode(data)
40
- try:
41
- return decoded_bytes.decode('utf-8')
42
- except UnicodeDecodeError:
43
- return decoded_bytes.decode('utf-8', errors='ignore')
44
- return ''
45
-
46
- def _extract_email_content(self, message: Dict) -> str:
47
- """Extract readable content from Gmail API message"""
48
- content = ""
49
-
50
- if 'payload' not in message:
51
- return content
52
-
53
- payload = message['payload']
54
-
55
- # Handle multipart messages
56
- if 'parts' in payload:
57
- for part in payload['parts']:
58
- mime_type = part.get('mimeType', '')
59
-
60
- if mime_type == 'text/plain':
61
- content += self._decode_message_part(part)
62
- elif mime_type == 'text/html':
63
- html_content = self._decode_message_part(part)
64
- # Simple HTML tag removal
65
- clean_text = re.sub(r'<[^>]+>', '', html_content)
66
- content += clean_text
67
- elif mime_type.startswith('multipart/'):
68
- # Handle nested multipart
69
- if 'parts' in part:
70
- for nested_part in part['parts']:
71
- nested_mime = nested_part.get('mimeType', '')
72
- if nested_mime == 'text/plain':
73
- content += self._decode_message_part(nested_part)
74
- else:
75
- # Handle single part messages
76
- mime_type = payload.get('mimeType', '')
77
- if mime_type in ['text/plain', 'text/html']:
78
- raw_content = self._decode_message_part(payload)
79
- if mime_type == 'text/html':
80
- # Simple HTML tag removal
81
- content = re.sub(r'<[^>]+>', '', raw_content)
82
- else:
83
- content = raw_content
84
-
85
- return content.strip()
86
-
87
- def _get_header_value(self, headers: List[Dict], name: str) -> str:
88
- """Get header value by name"""
89
- for header in headers:
90
- if header.get('name', '').lower() == name.lower():
91
- return header.get('value', '')
92
- return ''
93
-
94
- def _parse_email_message(self, message: Dict) -> Dict:
95
- """Parse Gmail API message into structured format"""
96
- headers = message.get('payload', {}).get('headers', [])
97
-
98
- # Extract headers
99
- subject = self._get_header_value(headers, 'Subject') or 'No Subject'
100
- from_header = self._get_header_value(headers, 'From') or 'Unknown Sender'
101
- date_header = self._get_header_value(headers, 'Date')
102
- message_id = self._get_header_value(headers, 'Message-ID') or message.get('id', '')
103
-
104
- # Parse date
105
- email_date = datetime.now().strftime("%d-%b-%Y")
106
- email_time = "00:00:00"
107
-
108
- if date_header:
109
- try:
110
- # Parse RFC 2822 date format
111
- from email.utils import parsedate_to_datetime
112
- dt_obj = parsedate_to_datetime(date_header)
113
- # Convert to IST (Indian Standard Time)
114
- from zoneinfo import ZoneInfo
115
- ist_dt = dt_obj.astimezone(ZoneInfo("Asia/Kolkata"))
116
- email_date = ist_dt.strftime("%d-%b-%Y")
117
- email_time = ist_dt.strftime("%H:%M:%S")
118
- except Exception as e:
119
- logger.warning(f"Failed to parse date {date_header}: {e}")
120
-
121
- # Extract content
122
- content = self._extract_email_content(message)
123
-
124
- return {
125
- "date": email_date,
126
- "time": email_time,
127
- "subject": subject,
128
- "from": from_header,
129
- "content": content[:2000], # Limit content length
130
- "message_id": message_id,
131
- "gmail_id": message.get('id', '')
132
- }
133
-
134
- def search_emails(self, keyword: str, start_date: str, end_date: str) -> List[Dict]:
135
- """Search emails containing keyword within date range using Gmail API
136
-
137
- Args:
138
- keyword: Keyword to search for in emails
139
- start_date: Start date in DD-MMM-YYYY format
140
- end_date: End date in DD-MMM-YYYY format
141
-
142
- Returns:
143
- List of email dictionaries
144
- """
145
- logger.info(f"Searching emails containing '{keyword}' between {start_date} and {end_date}")
146
-
147
- # Get Gmail service
148
- service = self.oauth_manager.get_gmail_service()
149
- if not service:
150
- raise Exception("Not authenticated. Please authenticate first using the setup tool.")
151
-
152
- try:
153
- # Parse dates
154
- start_dt = self._parse_date_string(start_date)
155
- end_dt = self._parse_date_string(end_date)
156
-
157
- # Format dates for Gmail API query
158
- after_date = self._format_date_for_query(start_dt)
159
- before_date = self._format_date_for_query(end_dt + timedelta(days=1)) # Add 1 day for inclusive end
160
-
161
- # Build search query
162
- # Gmail API search syntax: https://developers.google.com/gmail/api/guides/filtering
163
- query_parts = [
164
- f'after:{after_date}',
165
- f'before:{before_date}',
166
- f'({keyword})' # Search in all fields
167
- ]
168
- query = ' '.join(query_parts)
169
-
170
- logger.info(f"Gmail API query: {query}")
171
-
172
- # Search for messages
173
- results = service.users().messages().list(
174
- userId='me',
175
- q=query,
176
- maxResults=500 # Limit to 500 results
177
- ).execute()
178
-
179
- messages = results.get('messages', [])
180
- logger.info(f"Found {len(messages)} messages")
181
-
182
- if not messages:
183
- return []
184
-
185
- # Fetch full message details
186
- scraped_emails = []
187
-
188
- for i, msg_ref in enumerate(messages):
189
- try:
190
- logger.info(f"Processing email {i+1}/{len(messages)}")
191
-
192
- # Get full message
193
- message = service.users().messages().get(
194
- userId='me',
195
- id=msg_ref['id'],
196
- format='full'
197
- ).execute()
198
-
199
- # Parse message
200
- parsed_email = self._parse_email_message(message)
201
-
202
- # Verify date range (double-check since Gmail search might be inclusive)
203
- email_dt = self._parse_date_string(parsed_email['date'])
204
- if start_dt <= email_dt <= end_dt:
205
- # Verify keyword presence (case-insensitive)
206
- keyword_lower = keyword.lower()
207
- if any(keyword_lower in text.lower() for text in [
208
- parsed_email['subject'],
209
- parsed_email['from'],
210
- parsed_email['content']
211
- ]):
212
- scraped_emails.append(parsed_email)
213
-
214
- except googleapiclient.errors.HttpError as e:
215
- logger.error(f"Error fetching message {msg_ref['id']}: {e}")
216
- continue
217
- except Exception as e:
218
- logger.error(f"Error processing message {msg_ref['id']}: {e}")
219
- continue
220
-
221
- # Sort by date (newest first)
222
- scraped_emails.sort(
223
- key=lambda x: datetime.strptime(f"{x['date']} {x['time']}", "%d-%b-%Y %H:%M:%S"),
224
- reverse=True
225
- )
226
-
227
- logger.info(f"Successfully processed {len(scraped_emails)} emails containing '{keyword}'")
228
- return scraped_emails
229
-
230
- except googleapiclient.errors.HttpError as e:
231
- logger.error(f"Gmail API error: {e}")
232
- raise Exception(f"Gmail API error: {e}")
233
- except Exception as e:
234
- logger.error(f"Email search failed: {e}")
235
- raise
236
-
237
- def get_email_by_id(self, message_id: str) -> Optional[Dict]:
238
- """Get email details by message ID or Gmail ID
239
-
240
- Args:
241
- message_id: Either the Message-ID header or Gmail message ID
242
-
243
- Returns:
244
- Email dictionary or None if not found
245
- """
246
- service = self.oauth_manager.get_gmail_service()
247
- if not service:
248
- raise Exception("Not authenticated. Please authenticate first using the setup tool.")
249
-
250
- try:
251
- # Try to get message directly by Gmail ID first
252
- try:
253
- message = service.users().messages().get(
254
- userId='me',
255
- id=message_id,
256
- format='full'
257
- ).execute()
258
- return self._parse_email_message(message)
259
- except googleapiclient.errors.HttpError:
260
- # If direct ID lookup fails, search by Message-ID header
261
- pass
262
-
263
- # Search by Message-ID header
264
- query = f'rfc822msgid:{message_id}'
265
- results = service.users().messages().list(
266
- userId='me',
267
- q=query,
268
- maxResults=1
269
- ).execute()
270
-
271
- messages = results.get('messages', [])
272
- if not messages:
273
- return None
274
-
275
- # Get the message
276
- message = service.users().messages().get(
277
- userId='me',
278
- id=messages[0]['id'],
279
- format='full'
280
- ).execute()
281
-
282
- return self._parse_email_message(message)
283
-
284
- except Exception as e:
285
- logger.error(f"Failed to get email {message_id}: {e}")
286
- return None
287
-
288
- def is_authenticated(self) -> bool:
289
- """Check if user is authenticated"""
290
- return self.oauth_manager.is_authenticated()
291
-
292
- def get_user_email(self) -> Optional[str]:
293
- """Get authenticated user's email address"""
294
- return self.oauth_manager.get_user_email()
295
-
296
- def authenticate(self) -> bool:
297
- """Trigger interactive authentication"""
298
- return self.oauth_manager.authenticate_interactive()
299
-
300
- # Global scraper instance
301
- gmail_scraper = GmailAPIScraper()
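Note on `_decode_message_part` in the removed file: the body data is base64url text whose trailing `=` padding may be missing. Below is a minimal, standalone sketch of that decoding step. The helper name `decode_base64url` is hypothetical and does not appear in the repository; it assumes UTF-8 text and silently drops undecodable bytes, as the removed code does in its fallback branch.

```python
import base64


def decode_base64url(data: str) -> str:
    """Decode a base64url string (padded or not) into text."""
    if not data:
        return ""
    # -len(data) % 4 is 0..3, so no padding is appended when the
    # input length is already a multiple of 4.
    padded = data + "=" * (-len(data) % 4)
    raw = base64.urlsafe_b64decode(padded)
    return raw.decode("utf-8", errors="ignore")


# Round trip with the padding stripped, the case the padding step guards against.
encoded = base64.urlsafe_b64encode("héllo wörld".encode("utf-8")).decode("ascii").rstrip("=")
print(decode_base64url(encoded))
```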
agentic_implementation/name_mapping.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "dev agarwal": "agarwal.27@iitj.ac.in",
3
+ "axis bank": "alerts@axisbank.com"
4
+ }
agentic_implementation/oauth_manager.py DELETED
@@ -1,631 +0,0 @@
1
- import os
2
- import json
3
- import pickle
4
- import base64
5
- from pathlib import Path
6
- from typing import Optional, Dict, Any
7
- from cryptography.fernet import Fernet
8
- import google.auth.transport.requests
9
- import google_auth_oauthlib.flow
10
- import googleapiclient.discovery
11
- from google.oauth2.credentials import Credentials
12
- from google.auth.transport.requests import Request
13
- import webbrowser
14
- import threading
15
- import time
16
- from http.server import HTTPServer, BaseHTTPRequestHandler
17
- from urllib.parse import urlparse,parse_qs
18
- from logger import logger
19
- from dotenv import load_dotenv
20
- load_dotenv()
21
-
22
-
23
- redirect_uri=os.getenv("GOOGLE_REDIRECT_URI")
24
-
25
- class OAuthCallbackHandler(BaseHTTPRequestHandler):
26
- """HTTP request handler for OAuth callback"""
27
-
28
- def do_GET(self):
29
- """Handle GET request (OAuth callback)"""
30
- # Parse the callback URL to extract authorization code
31
- parsed_path = urlparse(self.path)
32
- query_params = parse_qs(parsed_path.query)
33
-
34
- if 'code' in query_params:
35
- # Store the authorization code
36
- self.server.auth_code = query_params['code'][0]
37
-
38
- # Send success response
39
- self.send_response(200)
40
- self.send_header('Content-type', 'text/html')
41
- self.end_headers()
42
-
43
- success_html = """
44
- <html>
45
- <head><title>Authentication Successful</title></head>
46
- <body style="font-family: Arial, sans-serif; text-align: center; padding: 50px;">
47
- <h1 style="color: #4CAF50;">✅ Authentication Successful!</h1>
48
- <p>You have successfully authenticated with Gmail.</p>
49
- <p>You can now close this window and return to Claude Desktop.</p>
50
- <script>
51
- setTimeout(function() {
52
- window.close();
53
- }, 3000);
54
- </script>
55
- </body>
56
- </html>
57
- """
58
- self.wfile.write(success_html.encode())
59
- else:
60
- # Send error response
61
- self.send_response(400)
62
- self.send_header('Content-type', 'text/html')
63
- self.end_headers()
64
-
65
- error_html = """
66
- <html>
67
- <head><title>Authentication Error</title></head>
68
- <body style="font-family: Arial, sans-serif; text-align: center; padding: 50px;">
69
- <h1 style="color: #f44336;">❌ Authentication Failed</h1>
70
- <p>There was an error during authentication.</p>
71
- <p>Please try again.</p>
72
- </body>
73
- </html>
74
- """
75
- self.wfile.write(error_html.encode())
76
-
77
- def log_message(self, format, *args):
78
- """Suppress server log messages"""
79
- pass
80
-
81
- class GmailOAuthManager:
82
- """Manages Gmail OAuth 2.0 authentication and token storage for multiple accounts"""
83
-
84
- # Gmail API scopes
85
- SCOPES = [
86
- 'https://www.googleapis.com/auth/gmail.readonly',
87
- 'https://www.googleapis.com/auth/gmail.modify'
88
- ]
89
-
90
- def __init__(self, credentials_dir: str = None):
91
- """Initialize OAuth manager
92
-
93
- Args:
94
- credentials_dir: Directory to store credentials (defaults to ~/.mailquery_oauth)
95
- """
96
- if credentials_dir is None:
97
- credentials_dir = os.path.expanduser("~/.mailquery_oauth")
98
-
99
- self.credentials_dir = Path(credentials_dir)
100
- self.credentials_dir.mkdir(exist_ok=True)
101
-
102
- # File paths
103
- self.client_secrets_file = self.credentials_dir / "client_secret.json"
104
- self.accounts_file = self.credentials_dir / "accounts.json"
105
- self.encryption_key_file = self.credentials_dir / "key.key"
106
- self.current_account_file = self.credentials_dir / "current_account.txt"
107
-
108
- # Initialize encryption
109
- self._init_encryption()
110
-
111
- # OAuth flow settings
112
- self.redirect_uri = redirect_uri
113
-
114
- # Current account
115
- self.current_account_email = self._load_current_account()
116
-
117
- def _init_encryption(self):
118
- """Initialize encryption for secure credential storage"""
119
- if self.encryption_key_file.exists():
120
- with open(self.encryption_key_file, 'rb') as key_file:
121
- self.encryption_key = key_file.read()
122
- else:
123
- self.encryption_key = Fernet.generate_key()
124
- with open(self.encryption_key_file, 'wb') as key_file:
125
- key_file.write(self.encryption_key)
126
- # Make key file readable only by owner
127
- os.chmod(self.encryption_key_file, 0o600)
128
-
129
- self.cipher_suite = Fernet(self.encryption_key)
130
-
131
- def _load_current_account(self) -> Optional[str]:
132
- """Load the currently selected account"""
133
- if self.current_account_file.exists():
134
- try:
135
- with open(self.current_account_file, 'r') as f:
136
- return f.read().strip()
137
- except Exception as e:
138
- logger.error(f"Failed to load current account: {e}")
139
- return None
140
-
141
- def _save_current_account(self, email: str):
142
- """Save the currently selected account"""
143
- try:
144
- with open(self.current_account_file, 'w') as f:
145
- f.write(email)
146
- self.current_account_email = email
147
- logger.info(f"Set current account to: {email}")
148
- except Exception as e:
149
- logger.error(f"Failed to save current account: {e}")
150
-
151
- def setup_client_secrets(self, client_id: str, client_secret: str):
152
- """Setup OAuth client secrets
153
-
154
- Args:
155
- client_id: Google OAuth 2.0 client ID
156
- client_secret: Google OAuth 2.0 client secret
157
- """
158
- client_config = {
159
- "web": {
160
- "client_id": client_id,
161
- "client_secret": client_secret,
162
- "auth_uri": "https://accounts.google.com/o/oauth2/auth",
163
- "token_uri": "https://oauth2.googleapis.com/token",
164
- "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
165
- "redirect_uris": [self.redirect_uri]
166
- }
167
- }
168
-
169
- with open(self.client_secrets_file, 'w') as f:
170
- json.dump(client_config, f, indent=2)
171
-
172
- logger.info("Client secrets saved successfully")
173
-
174
- def _encrypt_data(self, data: Any) -> bytes:
175
- """Encrypt data using Fernet encryption"""
176
- serialized_data = pickle.dumps(data)
177
- return self.cipher_suite.encrypt(serialized_data)
178
-
179
- def _decrypt_data(self, encrypted_data: bytes) -> Any:
180
- """Decrypt data using Fernet encryption"""
181
- decrypted_data = self.cipher_suite.decrypt(encrypted_data)
182
- return pickle.loads(decrypted_data)
183
-
184
- def get_authorization_url(self) -> str:
185
- """Get the authorization URL for OAuth flow
186
-
187
- Returns:
188
- Authorization URL that user should visit
189
- """
190
- if not self.client_secrets_file.exists():
191
- raise ValueError("Client secrets not found. Please run setup first.")
192
-
193
- flow = google_auth_oauthlib.flow.Flow.from_client_secrets_file(
194
- str(self.client_secrets_file),
195
- scopes=self.SCOPES
196
- )
197
- flow.redirect_uri = self.redirect_uri
198
- print("👉 redirect_uri being sent to Google:", self.redirect_uri, flush=True)
199
- auth_url, _ = flow.authorization_url(
200
- access_type='offline',
201
- include_granted_scopes='true',
202
- prompt='consent' # Force consent to get refresh token
203
- )
204
-
205
- return auth_url
206
-
207
- def authenticate_interactive(self) -> bool:
208
- """Interactive authentication flow for Hugging Face Spaces
209
-
210
- Returns:
211
- True if authentication successful, False otherwise
212
- """
213
- try:
214
- # Check if already authenticated
215
- if self.is_authenticated():
216
- logger.info("Already authenticated")
217
- return True
218
-
219
-
220
- # Get authorization URL
221
- auth_url = self.get_authorization_url()
222
-
223
- logger.info("Running on Hugging Face Spaces")
224
- logger.info(f"Authentication URL generated: {auth_url}")
225
- logger.info("User must visit the URL manually to complete authentication")
226
-
227
- # Store the auth URL for the Gradio interface to use
228
- self._pending_auth_url = auth_url
229
- self._auth_completed = False
230
-
231
- # For setup_oauth.py and testing contexts, we'll print the URL
232
- # and wait briefly to see if authentication completes
233
- print(f"\n🌐 Please visit this URL to authenticate:")
234
- print(f" {auth_url}")
235
- print("\n⏳ Waiting for authentication completion...")
236
-
237
- # Wait for a reasonable time to see if auth completes
238
- # This allows the callback to potentially complete the auth
239
- timeout = 10 # seconds to wait for manual completion
240
- start_time = time.time()
241
-
242
- while (time.time() - start_time) < timeout:
243
- # Check if authentication was completed via callback
244
- if getattr(self, '_auth_completed', False):
245
- logger.info("Authentication completed successfully!")
246
- return True
247
-
248
- # Check if user is now authenticated (credentials were saved)
249
- if self.is_authenticated():
250
- self._auth_completed = True
251
- logger.info("Authentication verified successful!")
252
- return True
253
-
254
- time.sleep(2) # Check every 2 seconds
255
-
256
- # Timeout reached - authentication not completed
257
- logger.info("Authentication timeout. Please complete authentication via the provided URL.")
258
- return False
259
-
260
- except Exception as e:
261
- logger.error(f"Authentication failed: {e}")
262
- return False
263
-
264
- def complete_hf_spaces_auth(self, auth_code: str) -> bool:
265
- """Complete authentication for HF Spaces with received auth code
266
-
267
- Args:
268
- auth_code: Authorization code received from OAuth callback
269
-
270
- Returns:
271
- True if authentication successful, False otherwise
272
- """
273
- try:
274
- success = self._exchange_code_for_credentials(auth_code)
275
-
276
- if success:
277
- # Mark authentication as completed
278
- self._auth_completed = True
279
- logger.info("HF Spaces authentication marked as completed")
280
-
281
- return success
282
-
283
- except Exception as e:
284
- logger.error(f"Failed to complete HF Spaces authentication: {e}")
285
- return False
286
-
287
- def _exchange_code_for_credentials(self, auth_code: str) -> bool:
288
- """Exchange authorization code for credentials
289
-
290
- Args:
291
- auth_code: Authorization code from OAuth flow
292
-
293
- Returns:
294
- True if successful, False otherwise
295
- """
296
- try:
297
- # Exchange authorization code for credentials
298
- flow = google_auth_oauthlib.flow.Flow.from_client_secrets_file(
299
- str(self.client_secrets_file),
300
- scopes=self.SCOPES
301
- )
302
- flow.redirect_uri = self.redirect_uri
303
-
304
- flow.fetch_token(code=auth_code)
305
- credentials = flow.credentials
306
-
307
- # Get user email from credentials
308
- user_email = self._get_email_from_credentials(credentials)
309
- if not user_email:
310
- logger.error("Failed to get user email from credentials")
311
- return False
312
-
313
- # Save encrypted credentials for this account
314
- self._save_credentials(user_email, credentials)
315
-
316
- # Set as current account
317
- self._save_current_account(user_email)
318
-
319
- logger.info("Authentication successful!")
320
- return True
321
-
322
- except Exception as e:
323
- logger.error(f"Failed to exchange code for credentials: {e}")
324
- return False
325
-
326
- def get_pending_auth_url(self) -> str:
327
- """Get the pending authentication URL for manual completion
328
-
329
- Returns:
330
- Authentication URL string or None if not available
331
- """
332
- return getattr(self, '_pending_auth_url', None)
333
-
334
- def get_hf_redirect_uri(self) -> str:
335
- """Get the Hugging Face Spaces redirect URI
336
-
337
- Returns:
338
- Redirect URI string
339
- """
340
- # space_id = os.getenv('SPACE_ID')
341
- # space_author = os.getenv('SPACE_AUTHOR', 'username')
342
- return redirect_uri
343
-
344
- # For running in a local machine, use this method instead
345
-
346
- # def authenticate_interactive(self) -> bool:
347
- # """Interactive authentication flow that opens browser
348
-
349
- # Returns:
350
- # True if authentication successful, False otherwise
351
- # """
352
- # try:
353
- # # Start local HTTP server for OAuth callback
354
- # server = HTTPServer(('localhost', 8080), OAuthCallbackHandler)
355
- # server.auth_code = None
356
-
357
- # # Get authorization URL
358
- # auth_url = self.get_authorization_url()
359
-
360
- # logger.info("Opening browser for authentication...")
361
- # logger.info(f"If browser doesn't open, visit: {auth_url}")
362
-
363
- # # Open browser
364
- # webbrowser.open(auth_url)
365
-
366
- # # Start server in background thread
367
- # server_thread = threading.Thread(target=server.handle_request)
368
- # server_thread.daemon = True
369
- # server_thread.start()
370
-
371
- # # Wait for callback (max 5 minutes)
372
- # timeout = 300 # 5 minutes
373
- # start_time = time.time()
374
-
375
- # while server.auth_code is None and (time.time() - start_time) < timeout:
376
- # time.sleep(1)
377
-
378
- # if server.auth_code is None:
379
- # logger.error("Authentication timed out")
380
- # return False
381
-
382
- # # Exchange authorization code for credentials
383
- # flow = google_auth_oauthlib.flow.Flow.from_client_secrets_file(
384
- # str(self.client_secrets_file),
385
- # scopes=self.SCOPES
386
- # )
387
- # flow.redirect_uri = self.redirect_uri
388
-
389
- # flow.fetch_token(code=server.auth_code)
390
- # credentials = flow.credentials
391
-
392
- # # Get user email from credentials
393
- # user_email = self._get_email_from_credentials(credentials)
394
- # if not user_email:
395
- # logger.error("Failed to get user email from credentials")
396
- # return False
397
-
398
- # # Save encrypted credentials for this account
399
- # self._save_credentials(user_email, credentials)
400
-
401
- # # Set as current account
402
- # self._save_current_account(user_email)
403
-
404
- # logger.info("Authentication successful!")
405
- # return True
406
-
407
- # except Exception as e:
408
- # logger.error(f"Authentication failed: {e}")
409
- # return False
410
-
411
- def _get_email_from_credentials(self, credentials: Credentials) -> Optional[str]:
412
- """Get email address from credentials"""
413
- try:
414
- service = googleapiclient.discovery.build(
415
- 'gmail', 'v1', credentials=credentials
416
- )
417
- profile = service.users().getProfile(userId='me').execute()
418
- return profile.get('emailAddress')
419
- except Exception as e:
420
- logger.error(f"Failed to get email from credentials: {e}")
421
- return None
422
-
423
- def _save_credentials(self, email: str, credentials: Credentials):
424
- """Save encrypted credentials for a specific account"""
425
- try:
426
- # Load existing accounts
427
- accounts = self._load_accounts()
428
-
429
- # Encrypt and store credentials
430
- encrypted_credentials = self._encrypt_data(credentials)
431
- accounts[email] = base64.b64encode(encrypted_credentials).decode('utf-8')
432
-
433
- # Save accounts file
434
- with open(self.accounts_file, 'w') as f:
435
- json.dump(accounts, f, indent=2)
436
-
437
- # Make accounts file readable only by owner
438
- os.chmod(self.accounts_file, 0o600)
439
-
440
- logger.info(f"Credentials saved for account: {email}")
441
- except Exception as e:
442
- logger.error(f"Failed to save credentials for {email}: {e}")
443
- raise
444
-
445
- def _load_accounts(self) -> Dict[str, str]:
446
- """Load accounts data"""
447
- if not self.accounts_file.exists():
448
- return {}
449
-
450
- try:
451
- with open(self.accounts_file, 'r') as f:
452
- return json.load(f)
453
- except Exception as e:
454
- logger.error(f"Failed to load accounts: {e}")
455
- return {}
456
-
457
- def _load_credentials(self, email: str) -> Optional[Credentials]:
458
- """Load and decrypt credentials for a specific account"""
459
- accounts = self._load_accounts()
460
-
461
- if email not in accounts:
462
- return None
463
-
464
- try:
465
- encrypted_credentials = base64.b64decode(accounts[email])
466
- credentials = self._decrypt_data(encrypted_credentials)
467
- return credentials
468
- except Exception as e:
469
- logger.error(f"Failed to load credentials for {email}: {e}")
470
- return None
471
-
472
- def get_valid_credentials(self, email: str = None) -> Optional[Credentials]:
473
- """Get valid credentials for an account, refreshing if necessary
474
-
475
- Args:
476
- email: Email address of account (uses current account if None)
477
-
478
- Returns:
479
- Valid Credentials object or None if authentication required
480
- """
481
- if email is None:
482
- email = self.current_account_email
483
-
484
- if not email:
485
- logger.warning("No current account set")
486
- return None
487
-
488
- credentials = self._load_credentials(email)
489
-
490
- if not credentials:
491
- logger.warning(f"No stored credentials found for {email}")
492
- return None
493
-
494
- # Refresh if expired
495
- if credentials.expired and credentials.refresh_token:
496
- try:
497
- logger.info(f"Refreshing expired credentials for {email}...")
498
- credentials.refresh(Request())
499
- self._save_credentials(email, credentials)
500
- logger.info("Credentials refreshed successfully")
501
- except Exception as e:
502
- logger.error(f"Failed to refresh credentials for {email}: {e}")
503
- return None
504
-
505
- if not credentials.valid:
506
- logger.warning(f"Credentials are not valid for {email}")
507
- return None
508
-
509
- return credentials
510
-
511
- def is_authenticated(self, email: str = None) -> bool:
512
- """Check if user is authenticated
513
-
514
- Args:
515
- email: Email address to check (uses current account if None)
516
-
517
- Returns:
518
- True if valid credentials exist, False otherwise
519
- """
520
- return self.get_valid_credentials(email) is not None
521
-
522
- def switch_account(self, email: str) -> bool:
523
- """Switch to a different authenticated account
524
-
525
- Args:
526
- email: Email address to switch to
527
-
528
- Returns:
529
- True if switch successful, False if account not found or not authenticated
530
- """
531
- if self.is_authenticated(email):
532
- self._save_current_account(email)
533
- logger.info(f"Switched to account: {email}")
534
- return True
535
- else:
536
- logger.error(f"Account {email} is not authenticated")
537
- return False
538
-
539
- def list_accounts(self) -> Dict[str, bool]:
540
- """List all stored accounts and their authentication status
541
-
542
- Returns:
543
- Dictionary mapping email addresses to authentication status
544
- """
545
- accounts = self._load_accounts()
546
- result = {}
547
-
548
- for email in accounts.keys():
549
- result[email] = self.is_authenticated(email)
550
-
551
- return result
552
-
553
- def remove_account(self, email: str):
554
- """Remove an account and its credentials
555
-
556
- Args:
557
- email: Email address to remove
558
- """
559
- accounts = self._load_accounts()
560
-
561
- if email in accounts:
562
- del accounts[email]
563
-
564
- # Save updated accounts
565
- with open(self.accounts_file, 'w') as f:
566
- json.dump(accounts, f, indent=2)
567
-
568
- # If this was the current account, clear it
569
- if self.current_account_email == email:
570
- if self.current_account_file.exists():
571
- self.current_account_file.unlink()
572
- self.current_account_email = None
573
-
574
- logger.info(f"Removed account: {email}")
575
- else:
576
- logger.warning(f"Account {email} not found")
577
-
578
- def clear_credentials(self):
579
- """Clear all stored credentials"""
580
- if self.accounts_file.exists():
581
- self.accounts_file.unlink()
582
- if self.current_account_file.exists():
583
- self.current_account_file.unlink()
584
- self.current_account_email = None
585
- logger.info("All credentials cleared")
586
-
587
- def get_gmail_service(self, email: str = None):
588
- """Get authenticated Gmail service object
589
-
590
- Args:
591
- email: Email address (uses current account if None)
592
-
593
- Returns:
594
- Gmail service object or None if not authenticated
595
- """
596
- credentials = self.get_valid_credentials(email)
597
- if not credentials:
598
- return None
599
-
600
- try:
601
- service = googleapiclient.discovery.build(
602
- 'gmail', 'v1', credentials=credentials
603
- )
604
- return service
605
- except Exception as e:
606
- logger.error(f"Failed to build Gmail service: {e}")
607
- return None
608
-
609
- def get_user_email(self, email: str = None) -> Optional[str]:
610
- """Get the authenticated user's email address
611
-
612
- Args:
613
- email: Email address (uses current account if None)
614
-
615
- Returns:
616
- User's email address or None if not authenticated
617
- """
618
- if email is None:
619
- return self.current_account_email
620
- return email if self.is_authenticated(email) else None
621
-
622
- def get_current_account(self) -> Optional[str]:
623
- """Get the currently selected account
624
-
625
- Returns:
626
- Current account email or None if no account selected
627
- """
628
- return self.current_account_email
629
-
630
- # Global OAuth manager instance
631
- oauth_manager = GmailOAuthManager(credentials_dir="secure_data")
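Note on `OAuthCallbackHandler.do_GET` in the removed file: the handler reads the authorization code from the callback URL's query string. A minimal sketch of that extraction using only the standard library follows. The function name `extract_auth_code` and the `/oauth2callback` path are assumptions for illustration, not names taken from the repository.

```python
from typing import Optional
from urllib.parse import urlparse, parse_qs


def extract_auth_code(request_path: str) -> Optional[str]:
    """Return the first `code` query parameter, or None if it is absent."""
    parsed = urlparse(request_path)
    # parse_qs maps each parameter name to a list of values.
    params = parse_qs(parsed.query)
    codes = params.get("code")
    return codes[0] if codes else None


# Hypothetical callback paths: one successful, one denied.
print(extract_auth_code("/oauth2callback?code=abc123&state=xyz"))
print(extract_auth_code("/oauth2callback?error=access_denied"))
```

Returning `None` rather than raising keeps the caller free to choose between the success and error responses, which is how the removed handler branches.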
agentic_implementation/re_act.py ADDED
@@ -0,0 +1,229 @@
+ # orchestrator.py
+
+ import os
+ import json
+ import re
+ from typing import Any, Dict, Tuple, Optional
+ from datetime import datetime
+
+ from dotenv import load_dotenv
+ from openai import OpenAI
+
+ from schemas import Plan, PlanStep, FetchEmailsParams
+ from tools import TOOL_MAPPING
+
+ # Load .env and initialize OpenAI client
+ load_dotenv()
+ api_key = os.getenv("OPENAI_API_KEY")
+ if not api_key:
+     raise RuntimeError("Missing OPENAI_API_KEY in environment")
+ client = OpenAI(api_key=api_key)
+
+ # File paths for name mapping
+ NAME_MAPPING_FILE = "name_mapping.json"
+
+ # === Hard-coded list of available actions ===
+ SYSTEM_PLAN_PROMPT = """
+ You are an email assistant agent. You have access to the following actions:
+
+ • fetch_emails - fetch emails using text search with sender keywords and date extraction (e.g., "swiggy emails last week")
+ • show_email - display specific email content
+ • analyze_emails - analyze email patterns or content
+ • draft_reply - create a reply to an email
+ • send_reply - send a drafted reply
+ • done - complete the task
+
+ When the user gives you a query, output _only_ valid JSON of this form:
+
+ {
+   "plan": [
+     "fetch_emails",
+     ...,
+     "done"
+   ]
+ }
+
+ Rules:
+ - Use "fetch_emails" for text-based email search (automatically extracts sender keywords and dates)
+ - The final entry _must_ be "done"
+ - If no tool is needed, return `{"plan":["done"]}`
+
+ Example: For "show me emails from swiggy today" → ["fetch_emails", "done"]
+ """
+
+ SYSTEM_VALIDATOR_TEMPLATE = """
+ You are a plan validator.
+ Context (results so far):
+ {context}
+
+ Next action:
+ {action}
+
+ Reply _only_ with JSON:
+ {{
+   "should_execute": <true|false>,
+   "parameters": <null or a JSON object with this action's parameters>
+ }}
+ """
+
+
+ def _load_name_mapping() -> Dict[str, str]:
+     """Load name to email mapping from JSON file"""
+     if not os.path.exists(NAME_MAPPING_FILE):
+         return {}
+     try:
+         with open(NAME_MAPPING_FILE, "r") as f:
+             return json.load(f)
+     except (json.JSONDecodeError, IOError):
+         return {}
+
+
+ def _save_name_mapping(mapping: Dict[str, str]):
+     """Save name to email mapping to JSON file"""
+     with open(NAME_MAPPING_FILE, "w") as f:
+         json.dump(mapping, f, indent=2)
+
+
+ def store_name_email_mapping(name: str, email: str):
+     """Store new name to email mapping"""
+     name_mapping = _load_name_mapping()
+     name_mapping[name.lower().strip()] = email.lower().strip()
+     _save_name_mapping(name_mapping)
+
+
+ def extract_sender_info(query: str) -> Dict:
+     """
+     Extract sender information from user query using LLM
+     """
+     system_prompt = """
+     You are an email query parser that extracts sender information.
+
+     Given a user query, extract the sender intent - the person/entity they want emails from.
+     This could be:
+     - A person's name (e.g., "dev", "john smith", "dev agarwal")
+     - A company/service (e.g., "amazon", "google", "linkedin")
+     - An email address (e.g., "john@company.com")
+
+     Examples:
+     - "emails from dev agarwal last week" → "dev agarwal"
+     - "show amazon emails from last month" → "amazon"
+     - "emails from john@company.com yesterday" → "john@company.com"
+     - "get messages from sarah" → "sarah"
+
+     Return ONLY valid JSON:
+     {
+       "sender_intent": "extracted name, company, or email"
+     }
+     """
+
+     response = client.chat.completions.create(
+         model="gpt-4o-mini",
+         temperature=0.0,
+         messages=[
+             {"role": "system", "content": system_prompt},
+             {"role": "user", "content": query}
+         ],
+     )
+
+     result = json.loads(response.choices[0].message.content)
+     return result
+
+
+ def resolve_sender_email(sender_intent: str) -> Tuple[Optional[str], bool]:
+     """
+     Resolve sender intent to actual email address
+     Returns: (email_address, needs_user_input)
+     """
+     # Check if it's already an email address
+     if "@" in sender_intent:
+         return sender_intent.lower(), False
+
+     # Load name mapping
+     name_mapping = _load_name_mapping()
+
+     # Normalize the intent (lowercase for comparison)
+     normalized_intent = sender_intent.lower().strip()
+
+     # Check direct match
+     if normalized_intent in name_mapping:
+         return name_mapping[normalized_intent], False
+
+     # Check partial matches (fuzzy matching)
+     for name, email in name_mapping.items():
+         if normalized_intent in name.lower() or name.lower() in normalized_intent:
+             return email, False
+
+     # No match found
+     return None, True
+
+
+ def get_plan_from_llm(user_query: str) -> Plan:
+     """
+     Ask the LLM which actions to run, in order. No parameters here.
+     """
+     response = client.chat.completions.create(
+         model="gpt-4o-mini",
+         temperature=0.0,
+         messages=[
+             {"role": "system", "content": SYSTEM_PLAN_PROMPT},
+             {"role": "user", "content": user_query},
+         ],
+     )
+
+     plan_json = json.loads(response.choices[0].message.content)
+     steps = [PlanStep(action=a) for a in plan_json["plan"]]
+     return Plan(plan=steps)
+
+
+ def think(
+     step: PlanStep,
+     context: Dict[str, Any],
+     user_query: str
+ ) -> Tuple[bool, Optional[PlanStep], Optional[str]]:
+     """
+     Fill in parameters or skip based on the action:
+     - fetch_emails: pass the raw query for text-based search and date extraction
+     - others: ask the LLM validator for params
+
+     Returns: (should_execute, updated_step, user_prompt_if_needed)
+     """
+     # 1) fetch_emails → pass the full query for text-based search and date extraction
+     if step.action == "fetch_emails":
+         params = FetchEmailsParams(
+             query=user_query  # Pass the full query for keyword and date extraction
+         )
+         return True, PlanStep(action="fetch_emails", parameters=params), None
+
+     # 2) everything else → validate & supply params via LLM
+     prompt = SYSTEM_VALIDATOR_TEMPLATE.format(
+         context=json.dumps(context, indent=2),
+         action=step.action,
+     )
+     response = client.chat.completions.create(
+         model="gpt-4o-mini",
+         temperature=0.0,
+         messages=[
+             {"role": "system", "content": "Validate or supply parameters for this action."},
+             {"role": "user", "content": prompt},
+         ],
+     )
+     verdict = json.loads(response.choices[0].message.content)
+     if not verdict.get("should_execute", False):
+         return False, None, None
+
+     return True, PlanStep(
+         action=step.action,
+         parameters=verdict.get("parameters")
+     ), None
+
+
+ def act(step: PlanStep) -> Any:
+     """
+     Dispatch to the actual implementation in tools.py.
+     """
+     fn = TOOL_MAPPING.get(step.action)
+     if fn is None:
+         raise ValueError(f"Unknown action '{step.action}'")
+
+     kwargs = step.parameters.model_dump() if step.parameters else {}
+     return fn(**kwargs)
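Review note: the lookup order in `resolve_sender_email` (literal address, then exact name, then substring match, then "ask the user") is easier to reason about without the JSON file in the way. The following is a minimal sketch under that assumption; `resolve_sender` and the in-memory `mapping` argument are hypothetical stand-ins, not part of this commit.

```python
from typing import Dict, Optional, Tuple


def resolve_sender(intent: str, mapping: Dict[str, str]) -> Tuple[Optional[str], bool]:
    """Return (email, needs_user_input), mirroring the lookup order in re_act.py."""
    # 1) Anything containing "@" is treated as an address already.
    if "@" in intent:
        return intent.lower(), False

    key = intent.lower().strip()

    # 2) Exact match on the normalized name.
    if key in mapping:
        return mapping[key], False

    # 3) Substring match in either direction; first hit wins.
    for name, email in mapping.items():
        if key in name.lower() or name.lower() in key:
            return email, False

    # 4) Nothing matched: the caller has to ask the user for an address.
    return None, True


# Hypothetical data for illustration only.
mapping = {"dev agarwal": "dev@example.com"}
print(resolve_sender("Dev", mapping))
print(resolve_sender("sarah", mapping))
print(resolve_sender("al", mapping))
```

The last call shows a side effect of the substring step: a short fragment such as "al" resolves to "dev agarwal" because it occurs inside the stored name. If that is too permissive, the substring check could be restricted to whole-word matches.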
agentic_implementation/schemas.py ADDED
@@ -0,0 +1,47 @@
+ # schemas.py
+
+ from pydantic import BaseModel, EmailStr
+ from typing import List, Literal, Optional, Union
+
+
+
+ class FetchEmailsParams(BaseModel):
+     query: str  # Natural language query with sender and date info (e.g., "show me mails for last week from swiggy")
+
+
+ class ShowEmailParams(BaseModel):
+     message_id: str
+
+ class AnalyzeEmailsParams(BaseModel):
+     emails: List[dict]
+
+ class DraftReplyParams(BaseModel):
+     email: dict
+     tone: Optional[Literal["formal", "informal"]] = "formal"
+
+ class SendReplyParams(BaseModel):
+     message_id: str
+     reply_body: str
+
+
+ ToolParams = Union[
+     FetchEmailsParams,
+     ShowEmailParams,
+     AnalyzeEmailsParams,
+     DraftReplyParams,
+     SendReplyParams
+ ]
+
+ class PlanStep(BaseModel):
+     action: Literal[
+         "fetch_emails",
+         "show_email",
+         "analyze_emails",
+         "draft_reply",
+         "send_reply",
+         "done",
+     ]
+     parameters: Optional[ToolParams] = None
+
+ class Plan(BaseModel):
+     plan: List[PlanStep]
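Review note: `PlanStep` restricts `action` to the six known names, but nothing in `Plan` enforces the planner prompt's rule that the final entry must be "done". A possible pre-check on the raw LLM reply is sketched below using only the standard library; `parse_plan` and `ALLOWED_ACTIONS` are hypothetical names introduced here, not part of this commit.

```python
import json
from typing import List

# Same action names as the Literal in PlanStep.
ALLOWED_ACTIONS = {
    "fetch_emails", "show_email", "analyze_emails",
    "draft_reply", "send_reply", "done",
}


def parse_plan(raw: str) -> List[str]:
    """Parse a planner reply and enforce the rules stated in the planner prompt."""
    data = json.loads(raw)
    plan = data.get("plan") if isinstance(data, dict) else None
    if not isinstance(plan, list) or not plan:
        raise ValueError("plan must be a non-empty list")

    # Reject anything that is not one of the known action names.
    unknown = [a for a in plan if not isinstance(a, str) or a not in ALLOWED_ACTIONS]
    if unknown:
        raise ValueError(f"unknown actions: {unknown}")

    # The prompt requires the plan to terminate with "done".
    if plan[-1] != "done":
        raise ValueError('final plan entry must be "done"')
    return plan


print(parse_plan('{"plan": ["fetch_emails", "done"]}'))
```

The returned list could then be turned into `PlanStep` objects as `get_plan_from_llm` already does.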
agentic_implementation/setup_oauth.py DELETED
@@ -1,263 +0,0 @@
- #!/usr/bin/env python3
- """
- OAuth Setup Utility for Gmail MCP Server
-
- This script helps users set up OAuth authentication for the Gmail MCP server.
- """
-
- import sys
- import os
- import json
- from pathlib import Path
- from oauth_manager import oauth_manager
- from logger import logger
- from dotenv import load_dotenv
- load_dotenv()
- import os
-
-
- def print_banner():
-     """Print setup banner"""
-     print("=" * 60)
-     print("📧 Gmail MCP Server - OAuth Setup")
-     print("=" * 60)
-     print()
-
- def print_step(step_num: int, title: str):
-     """Print step header"""
-     print(f"\n🔹 Step {step_num}: {title}")
-     print("-" * 50)
-
- def check_dependencies():
-     """Check if required dependencies are installed"""
-     try:
-         import google.auth
-         import google_auth_oauthlib
-         import googleapiclient
-         import cryptography
-         print("✅ All required dependencies are installed")
-         return True
-     except ImportError as e:
-         print(f"❌ Missing dependency: {e}")
-         print("\nPlease install the required dependencies:")
-         print("pip install google-auth google-auth-oauthlib google-api-python-client cryptography")
-         return False
-
- def setup_google_cloud_project():
-     """Guide user through Google Cloud project setup"""
-     print_step(1, "Google Cloud Project Setup")
-
-     print("You need to create a Google Cloud project and enable the Gmail API.")
-     print("\n📋 Follow these steps:")
-     print("1. Go to: https://console.cloud.google.com/")
-     print("2. Create a new project or select an existing one")
-     print("3. Enable the Gmail API:")
-     print(" - Go to 'APIs & Services' > 'Library'")
-     print(" - Search for 'Gmail API'")
-     print(" - Click 'Enable'")
-
-     input("\n✅ Press Enter when you've completed these steps...")
-
- def setup_oauth_consent():
-     """Guide user through OAuth consent screen setup"""
-     print_step(2, "OAuth Consent Screen Setup")
-
-     print("Now you need to configure the OAuth consent screen.")
-     print("\n📋 Follow these steps:")
-     print("1. Go to: https://console.cloud.google.com/apis/credentials/consent")
-     print("2. Choose 'External' user type (unless using Google Workspace)")
-     print("3. Fill in the app information:")
-     print(" - App name: 'Gmail MCP Server' (or your preferred name)")
-     print(" - User support email: Your email address")
-     print(" - Developer contact: Your email address")
-     print("4. Add these scopes:")
-     print(" - https://www.googleapis.com/auth/gmail.readonly")
-     print(" - https://www.googleapis.com/auth/gmail.modify")
-     print("5. Add your email as a test user")
-     print("6. Complete the setup")
-
-     input("\n✅ Press Enter when you've completed these steps...")
-
- def setup_oauth_credentials():
-     """Guide user through OAuth credentials setup"""
-     print_step(3, "OAuth Client Credentials Setup")
-
-     client_id = os.getenv("GOOGLE_CLIENT_ID")
-     client_secret = os.getenv("GOOGLE_CLIENT_SECRET")
-
-     if not client_id or not client_secret:
-         print("❌ Missing GOOGLE_CLIENT_ID or GOOGLE_CLIENT_SECRET in your .env")
-         print(" Please add:")
-         print(" GOOGLE_CLIENT_ID=your-client-id")
-         print(" GOOGLE_CLIENT_SECRET=your-client-secret")
-         return False
-
-     try:
-         oauth_manager.setup_client_secrets(client_id, client_secret)
-         print("✅ OAuth credentials saved successfully")
-         return True
-     except Exception as e:
-         print(f"❌ Failed to save credentials: {e}")
-         return False
-
- def test_authentication():
-     """Test the OAuth authentication flow"""
-     print_step(4, "Authentication Test")
-
-     print("Now let's test the authentication flow.")
-     print("This will open your web browser for authentication.")
-
-     confirm = input("\n🌐 Ready to open browser for authentication? (y/n): ").strip().lower()
-     if confirm != 'y':
-         print("Authentication test skipped.")
-         return False
-
-     try:
-         print("\n🔄 Starting authentication flow...")
-         success = oauth_manager.authenticate_interactive()
-
-         if success:
-             print("✅ Authentication successful!")
-
-             # Test getting user info
-             user_email = oauth_manager.get_user_email()
-             if user_email:
-                 print(f"✅ Authenticated as: {user_email}")
-
-             return True
-         else:
-             print("❌ Authentication failed")
-             return False
-
-     except Exception as e:
-         print(f"❌ Authentication error: {e}")
-         return False
-
- def show_completion_info():
-     """Show completion information and next steps"""
-     print("\n" + "=" * 60)
-     print("🎉 Setup Complete!")
-     print("=" * 60)
-
-     print("\n✅ Your Gmail MCP server is now configured with OAuth authentication!")
-     print("\n📝 Next steps:")
-     print("1. Start the MCP server:")
-     print(" python email_mcp_server_oauth.py")
-     print("\n2. Configure Claude Desktop:")
-     print(' Add this to your MCP configuration:')
-     print(' {')
-     print(' "mcpServers": {')
-     print(' "gmail-oauth": {')
-     print(' "command": "npx",')
-     print(' "args": ["mcp-remote", "http://localhost:7860/gradio_api/mcp/sse"]')
-     print(' }')
-     print(' }')
-     print(' }')
-
-     print("\n🔐 Security notes:")
-     print("- Your credentials are encrypted and stored locally")
-     print("- Tokens are automatically refreshed when needed")
-     print("- You can revoke access anytime from Google Account settings")
-
-     credentials_dir = oauth_manager.credentials_dir
-     print(f"\n📁 Credentials stored in: {credentials_dir}")
-
- def show_help():
-     """Show help information"""
-     print("Gmail MCP Server OAuth Setup")
-     print("\nUsage:")
-     print(" python setup_oauth.py # Full interactive setup")
-     print(" python setup_oauth.py --help # Show this help")
-     print(" python setup_oauth.py --auth # Re-authenticate only")
-     print(" python setup_oauth.py --status # Check authentication status")
-     print(" python setup_oauth.py --clear # Clear stored credentials")
-
- def check_status():
-     """Check authentication status"""
-     print("🔍 Checking authentication status...")
-
-     if oauth_manager.is_authenticated():
-         user_email = oauth_manager.get_user_email()
-         print(f"✅ Authenticated as: {user_email}")
-         return True
-     else:
-         print("❌ Not authenticated")
-         return False
-
- def clear_credentials():
-     """Clear stored credentials"""
-     confirm = input("⚠️ This will clear all stored credentials. Continue? (y/n): ").strip().lower()
-     if confirm == 'y':
-         oauth_manager.clear_credentials()
-         print("✅ Credentials cleared")
-     else:
-         print("Operation cancelled")
-
- def main():
-     """Main setup function"""
-     if len(sys.argv) > 1:
-         arg = sys.argv[1].lower()
-
-         if arg in ['--help', '-h', 'help']:
-             show_help()
-             return
-         elif arg == '--status':
-             check_status()
-             return
-         elif arg == '--auth':
-             print("🔄 Starting re-authentication...")
-             if test_authentication():
-                 print("✅ Re-authentication successful")
-             else:
-                 print("❌ Re-authentication failed")
-             return
-         elif arg == '--clear':
-             clear_credentials()
-             return
-         else:
-             print(f"Unknown argument: {arg}")
-             show_help()
-             return
-
-     # Full interactive setup
-     print_banner()
-
-     # Check if already authenticated
-     if oauth_manager.is_authenticated():
-         user_email = oauth_manager.get_user_email()
-         print(f"✅ Already authenticated as: {user_email}")
-
-         choice = input("\n🔄 Do you want to re-authenticate? (y/n): ").strip().lower()
-         if choice == 'y':
-             if test_authentication():
-                 show_completion_info()
-         else:
-             print("Setup complete - you're already authenticated!")
-         return
-
-     # Check dependencies
-     if not check_dependencies():
-         return
-
-     # Full setup flow
-     try:
-         setup_google_cloud_project()
-         setup_oauth_consent()
-
-         if not setup_oauth_credentials():
-             print("❌ Setup failed at credentials step")
-             return
-
-         if test_authentication():
-             show_completion_info()
-         else:
-             print("❌ Setup completed but authentication test failed")
-             print("You can try authentication later with: python setup_oauth.py --auth")
-
-     except KeyboardInterrupt:
-         print("\n\n⚠️ Setup interrupted by user")
-     except Exception as e:
-         print(f"\n❌ Setup failed: {e}")
-
- if __name__ == "__main__":
-     main()
agentic_implementation/tools.py ADDED
@@ -0,0 +1,244 @@
+ from schemas import (
+     FetchEmailsParams,
+     ShowEmailParams,
+     AnalyzeEmailsParams,
+     DraftReplyParams,
+     SendReplyParams,
+ )
+ from typing import Any, Dict
+ from email_scraper import scrape_emails_from_sender, scrape_emails_by_text_search, _load_email_db, _save_email_db, _is_date_in_range
+ from datetime import datetime, timedelta
+ from typing import List
+ from openai import OpenAI
+ import json
+ from dotenv import load_dotenv
+ import os
+
+ # Load environment variables from .env file
+ load_dotenv()
+
+ # Initialize OpenAI client
+ OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
+ client = OpenAI(api_key=OPENAI_API_KEY)
+
+
+ def extract_query_info(query: str) -> Dict[str, str]:
+     """
+     Use an LLM to extract sender information and date range from a user query.
+     Returns {"sender_keyword": "company/sender name", "start_date":"DD-MMM-YYYY","end_date":"DD-MMM-YYYY"}.
+     """
+     today_str = datetime.today().strftime("%d-%b-%Y")
+     five_days_ago = (datetime.today() - timedelta(days=5)).strftime("%d-%b-%Y")
+     yesterday_str, week_ago_str = [(datetime.today() - timedelta(days=n)).strftime("%d-%b-%Y") for n in (1, 7)]
+     system_prompt = f"""
+     You are a query parser for email search. Today is {today_str}.
+
+     Given a user query, extract the sender/company keyword and date range. Return _only_ valid JSON with:
+     {{
+       "sender_keyword": "keyword or company name to search for",
+       "start_date": "DD-MMM-YYYY",
+       "end_date": "DD-MMM-YYYY"
+     }}
+
+     Rules:
+     1. Extract sender keywords from phrases like "from swiggy", "swiggy emails", "mails from amazon", etc.
+     2. If no time is mentioned, use last 5 days: {five_days_ago} to {today_str}
+     3. Interpret relative dates as:
+     - "today" → {today_str} to {today_str}
+     - "yesterday" → 1 day ago to 1 day ago
+     - "last week" → 7 days ago to {today_str}
+     - "last month" → 30 days ago to {today_str}
+     - "last N days" → N days ago to {today_str}
+
+     Examples:
+     - "show me mails for last week from swiggy"
+     → {{"sender_keyword": "swiggy", "start_date": "{week_ago_str}", "end_date": "{today_str}"}}
+     - "emails from amazon yesterday"
+     → {{"sender_keyword": "amazon", "start_date": "{yesterday_str}", "end_date": "{yesterday_str}"}}
+     - "show flipkart emails"
+     → {{"sender_keyword": "flipkart", "start_date": "{five_days_ago}", "end_date": "{today_str}"}}
+
+     Return _only_ the JSON object, with no extra text.
+     """
+
+     messages = [
+         {"role": "system", "content": system_prompt},
+         {"role": "user", "content": query}
+     ]
+     resp = client.chat.completions.create(
+         model="gpt-4o-mini",
+         temperature=0.0,
+         messages=messages
+     )
+     content = resp.choices[0].message.content.strip()
+
+     # Try direct parse; if the model added fluff, strip to the JSON block.
+     try:
+         return json.loads(content)
+     except json.JSONDecodeError:
+         start = content.find("{")
+         end = content.rfind("}") + 1
+         return json.loads(content[start:end])
+
+
+ def fetch_emails(query: str) -> Dict:
+     """
+     Fetch emails based on a natural language query that contains sender information and date range.
+     Now uses text-based search and returns only summary information, not full content.
+
+     Args:
+         query: The natural language query (e.g., "show me mails for last week from swiggy")
+
+     Returns:
+         Dict with query_info, email_summary, analysis, and email_count
+     """
+     # Extract sender keyword and date range from query
+     query_info = extract_query_info(query)
+     sender_keyword = query_info.get("sender_keyword", "")
+     start_date = query_info.get("start_date")
+     end_date = query_info.get("end_date")
+
+     print(f"Searching for emails with keyword '{sender_keyword}' between {start_date} and {end_date}")
+
+     # Use the new text-based search function
+     full_emails = scrape_emails_by_text_search(sender_keyword, start_date, end_date)
+
+     if not full_emails:
+         return {
+             "query_info": query_info,
+             "email_summary": [],
+             "analysis": {"summary": f"No emails found for '{sender_keyword}' in the specified date range.", "insights": []},
+             "email_count": 0
+         }
+
+     # Create summary version without full content
+     email_summary = []
+     for email in full_emails:
+         summary_email = {
+             "date": email.get("date"),
+             "time": email.get("time"),
+             "subject": email.get("subject"),
+             "from": email.get("from", "Unknown Sender"),
+             "message_id": email.get("message_id")
+             # Note: Removed 'content' to keep response clean
+         }
+         email_summary.append(summary_email)
+
+     # Auto-analyze the emails for insights
+     analysis = analyze_emails(full_emails)  # Use full emails for analysis but don't return them
+
+     # Return summary info with analysis
+     return {
+         "query_info": query_info,
+         "email_summary": email_summary,
+         "analysis": analysis,
+         "email_count": len(full_emails)
+     }
+
+
+ def show_email(message_id: str) -> Dict:
+     """
+     Retrieve the full email record (date, time, subject, content, etc.)
+     from the local cache by message_id.
+     """
+     db = _load_email_db()  # returns { sender_email: { "emails": [...], "last_scraped": ... }, ... }
+
+     # Search each sender's email list
+     for sender_data in db.values():
+         for email in sender_data.get("emails", []):
+             if email.get("message_id") == message_id:
+                 return email
+
+     # If we didn't find it, raise or return an error structure
+     raise ValueError(f"No email found with message_id '{message_id}'")
+
+
+ def draft_reply(email: Dict, tone: str = "formal") -> str:
+     # call LLM to generate reply
+     # return a dummy reply for now
+     message_id = email.get("message_id", "<unknown>")  # cached emails are keyed by "message_id", not "id"
+     print(f"Drafting reply for email {message_id} with tone: {tone}")
+     return f"Drafted reply for email {message_id} with tone {tone}."
+
+
+ def send_reply(message_id: str, reply_body: str) -> Dict:
+     # SMTP / Gmail API send
+     print(f"Sending reply to message {message_id} with body: {reply_body}")
+     ...
+
+
+ def analyze_emails(emails: List[Dict]) -> Dict:
+     """
+     Summarize and extract insights from a list of emails.
+     Returns a dict with this schema:
+     {
+         "summary": str,         # a concise overview of all emails
+         "insights": [str, ...]  # list of key observations or stats
+     }
+     """
+     if not emails:
+         return {"summary": "No emails to analyze.", "insights": []}
+
+     # 1) Create a simplified email summary for analysis (without full content)
+     simplified_emails = []
+     for email in emails:
+         simplified_email = {
+             "date": email.get("date"),
+             "time": email.get("time"),
+             "subject": email.get("subject"),
+             "from": email.get("from", "Unknown Sender"),
+             "content_preview": email.get("content", "")[:200] + "..." if email.get("content") else ""
+         }
+         simplified_emails.append(simplified_email)
+
+     emails_payload = json.dumps(simplified_emails, ensure_ascii=False)
+
+     # 2) Build the LLM prompt
+     system_prompt = """
+     You are an expert email analyst. You will be given a JSON array of email objects,
+     each with keys: date, time, subject, from, content_preview.
+
+     Your job is to produce _only_ valid JSON with two fields:
+     1. summary: a 1–2 sentence high-level overview of these emails.
+     2. insights: a list of 3–5 bullet-style observations or statistics
+     (e.g. "5 emails from Swiggy", "mostly promotional content", "received over 3 days").
+
+     Focus on metadata like senders, subjects, dates, and patterns rather than detailed content analysis.
+
+     Output exactly:
+
+     {
+       "summary": "...",
+       "insights": ["...", "...", ...]
+     }
+     """
+     messages = [
+         {"role": "system", "content": system_prompt},
+         {"role": "user", "content": f"Here are the emails:\n{emails_payload}"}
+     ]
+
+     # 3) Call the LLM
+     response = client.chat.completions.create(
+         model="gpt-4o-mini",
+         temperature=0.0,
+         messages=messages
+     )
+
+     # 4) Parse and return
+     content = response.choices[0].message.content.strip()
+     try:
+         return json.loads(content)
+     except json.JSONDecodeError:
+         # In case the model outputs extra text, extract the JSON block
+         start = content.find('{')
+         end = content.rfind('}') + 1
+         return json.loads(content[start:end])
+
+
+ TOOL_MAPPING = {
+     "fetch_emails": fetch_emails,
+     "show_email": show_email,
+     "analyze_emails": analyze_emails,
+     "draft_reply": draft_reply,
+     "send_reply": send_reply,
+ }
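Review note: `extract_query_info` and `analyze_emails` both repeat the same "parse, else slice from the first `{` to the last `}`" fallback. One way to share it is sketched below; `parse_json_reply` is a hypothetical helper name, not something this commit defines, and the explicit "no braces found" error is an addition over the original behavior.

```python
import json
from typing import Any


def parse_json_reply(content: str) -> Any:
    """Parse an LLM reply as JSON, tolerating extra text around a single object."""
    content = content.strip()
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span, as tools.py does inline.
        start = content.find("{")
        end = content.rfind("}") + 1
        if start == -1 or end <= start:
            raise ValueError("no JSON object found in model reply")
        return json.loads(content[start:end])


print(parse_json_reply('Sure! {"sender_keyword": "swiggy"} Hope that helps.'))
```

Because `json.JSONDecodeError` subclasses `ValueError`, callers can catch `ValueError` for both the "no braces" case and a malformed object.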
app.py ADDED
@@ -0,0 +1,64 @@
+ import gradio as gr
+ from huggingface_hub import InferenceClient
+
+ """
+ For more information on `huggingface_hub` Inference API support, please check the docs: https://huggingface.co/docs/huggingface_hub/v0.22.2/en/guides/inference
+ """
+ client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")
+
+
+ def respond(
+     message,
+     history: list[tuple[str, str]],
+     system_message,
+     max_tokens,
+     temperature,
+     top_p,
+ ):
+     messages = [{"role": "system", "content": system_message}]
+
+     for val in history:
+         if val[0]:
+             messages.append({"role": "user", "content": val[0]})
+         if val[1]:
+             messages.append({"role": "assistant", "content": val[1]})
+
+     messages.append({"role": "user", "content": message})
+
+     response = ""
+
+     for chunk in client.chat_completion(
+         messages,
+         max_tokens=max_tokens,
+         stream=True,
+         temperature=temperature,
+         top_p=top_p,
+     ):
+         token = chunk.choices[0].delta.content
+         if token:  # skip chunks that carry no text (delta.content can be None)
+             response += token
+             yield response
+
+
+ """
+ For information on how to customize the ChatInterface, peruse the gradio docs: https://www.gradio.app/docs/chatinterface
+ """
+ demo = gr.ChatInterface(
+     respond,
+     additional_inputs=[
+         gr.Textbox(value="You are a friendly Chatbot.", label="System message"),
+         gr.Slider(minimum=1, maximum=2048, value=512, step=1, label="Max new tokens"),
+         gr.Slider(minimum=0.1, maximum=4.0, value=0.7, step=0.1, label="Temperature"),
+         gr.Slider(
+             minimum=0.1,
+             maximum=1.0,
+             value=0.95,
+             step=0.05,
+             label="Top-p (nucleus sampling)",
+         ),
+     ],
+ )
+
+
+ if __name__ == "__main__":
+     demo.launch()
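Review note: the message-list construction at the top of `respond` can be checked without Gradio or a model by pulling it into a pure function. This is a minimal sketch; `build_messages` is a hypothetical name, and it assumes the tuple-style history (`(user, assistant)` pairs) that `respond` is annotated with.

```python
from typing import Dict, List, Tuple


def build_messages(
    system_message: str,
    history: List[Tuple[str, str]],
    message: str,
) -> List[Dict[str, str]]:
    """Flatten (user, assistant) history pairs into chat-completion messages."""
    messages = [{"role": "system", "content": system_message}]
    for user_turn, assistant_turn in history:
        # Empty turns are skipped, as in respond().
        if user_turn:
            messages.append({"role": "user", "content": user_turn})
        if assistant_turn:
            messages.append({"role": "assistant", "content": assistant_turn})
    messages.append({"role": "user", "content": message})
    return messages


for m in build_messages("You are a friendly Chatbot.", [("hi", "hello")], "how are you?"):
    print(m["role"], "->", m["content"])
```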
client/main.py ADDED
@@ -0,0 +1,183 @@
1
+ #!/usr/bin/env python3
2
+
3
+ import requests
4
+ import sys
5
+ from typing import Dict, Any
6
+
7
+ API_BASE = "http://127.0.0.1:8000/api/v1"
8
+
9
+ class EmailQueryCLI:
10
+ def __init__(self):
11
+ self.session = requests.Session()
12
+
13
+ def check_connection(self) -> bool:
14
+ """Check if API server is running"""
15
+ try:
16
+ response = self.session.get(f"{API_BASE}/health")
17
+ response.raise_for_status()
18
+ return True
19
+ except:
20
+ return False
21
+
22
+ def pretty_print_email(self, email: Dict) -> str:
23
+ """Format email for display"""
24
+ return f"""
25
+ 📧 {email['subject']}
26
+ 📅 {email['date']} {email['time']}
27
+ 💬 {email['content'][:200]}...
28
+ 🆔 {email['message_id'][:20]}...
29
+ {"─" * 60}"""
30
+
31
+ def handle_query(self, query: str):
32
+ """Handle a natural language query"""
33
+ print(f"\n🔍 Processing: '{query}'")
34
+
35
+ try:
36
+ # Try to get emails directly
37
+ response = self.session.post(
38
+ f"{API_BASE}/get_emails",
39
+ json={"query": query}
40
+ )
41
+
42
+ if response.status_code == 200:
43
+ data = response.json()
44
+ self.display_email_results(data)
45
+ return True
46
+
47
+ elif response.status_code == 400:
48
+ error_detail = response.json()["detail"]
49
+
50
+ # Check if we need email mapping
51
+ if isinstance(error_detail, dict) and error_detail.get("type") == "need_email_input":
52
+ mapping_success = self.handle_missing_mapping(error_detail)
53
+ if mapping_success and hasattr(self, '_retry_query'):
54
+ # Retry the query after successful mapping
55
+ print(f"🔄 Retrying query...")
56
+ delattr(self, '_retry_query')
57
+ return self.handle_query(query) # Recursive retry
58
+ return mapping_success
59
+ else:
60
+ print(f"❌ Error: {error_detail}")
61
+ return False
62
+ else:
63
+ print(f"❌ API Error: {response.status_code}")
64
+ return False
65
+
66
+ except Exception as e:
67
+ print(f"❌ Connection Error: {e}")
68
+ return False
69
+
70
+ def handle_missing_mapping(self, error_detail: Dict) -> bool:
71
+ """Handle case where email mapping is needed"""
72
+ sender_intent = error_detail["sender_intent"]
73
+ print(f"\n❓ {error_detail['message']}")
74
+
75
+ try:
76
+ email = input(f"📧 Enter email for '{sender_intent}': ").strip()
77
+ if not email or "@" not in email:
78
+ print("❌ Invalid email address")
79
+ return False
80
+
81
+ # Add the mapping
82
+ mapping_response = self.session.post(
83
+ f"{API_BASE}/add_email_mapping",
84
+ json={"name": sender_intent, "email": email}
85
+ )
86
+
87
+ if mapping_response.status_code == 200:
88
+ print(f"✅ Mapping saved: '{sender_intent}' → '{email}'")
89
+ self._retry_query = True # Flag to retry the original query
90
+ return True
91
+ else:
92
+ print(f"❌ Failed to save mapping: {mapping_response.text}")
93
+ return False
94
+
95
+ except KeyboardInterrupt:
96
+ print("\n❌ Cancelled")
97
+ return False
98
+
+     def display_email_results(self, data: Dict):
+         """Display email search results"""
+         print(f"\n✅ Found {data['total_emails']} emails")
+         print(f"📤 From: {data['resolved_email']}")
+         print(f"📅 Period: {data['start_date']} to {data['end_date']}")
+
+         if data['emails']:
+             print(f"\n📧 Emails:")
+             for email in data['emails'][:10]:  # Show first 10
+                 print(self.pretty_print_email(email))
+
+             if len(data['emails']) > 10:
+                 print(f"\n... and {len(data['emails']) - 10} more emails")
+         else:
+             print("\n📭 No emails found in this date range")
+
+     def show_mappings(self):
+         """Display all stored name-to-email mappings"""
+         try:
+             response = self.session.get(f"{API_BASE}/view_mappings")
+             if response.status_code == 200:
+                 data = response.json()
+                 mappings = data["mappings"]
+
+                 print(f"\n📇 Stored Mappings ({len(mappings)}):")
+                 if mappings:
+                     for name, email in mappings.items():
+                         print(f" 👤 {name} → 📧 {email}")
+                 else:
+                     print(" (No mappings stored)")
+             else:
+                 print(f"❌ Failed to load mappings: {response.text}")
+         except Exception as e:
+             print(f"❌ Error: {e}")
+
+     def run(self):
+         """Main CLI loop"""
+         if not self.check_connection():
+             print("❌ Cannot connect to API server at http://127.0.0.1:8000")
+             print(" Make sure to run: uvicorn main:app --reload")
+             sys.exit(1)
+
+         print("✅ Connected to Email Query System")
+         print("💡 Try queries like:")
+         print(" • 'emails from john last week'")
+         print(" • 'show amazon emails from last month'")
+         print(" • 'get dev@iitj.ac.in emails yesterday'")
+         print("\n📋 Commands:")
+         print(" • 'mappings' - View stored name-to-email mappings")
+         print(" • 'quit' or Ctrl+C - Exit")
+         print("=" * 60)
+
+         while True:
+             try:
+                 query = input("\n🗨️ You: ").strip()
+
+                 if not query:
+                     continue
+
+                 if query.lower() in ['quit', 'exit', 'q']:
+                     break
+                 elif query.lower() in ['mappings', 'map', 'm']:
+                     self.show_mappings()
+                 elif query.lower() in ['help', 'h']:
+                     print("\n💡 Examples:")
+                     print(" • emails from amazon last 5 days")
+                     print(" • show john smith emails this week")
+                     print(" • get notifications from google yesterday")
+                 else:
+                     self.handle_query(query)
+
+             except KeyboardInterrupt:
+                 break
+             except Exception as e:
+                 print(f"❌ Unexpected error: {e}")
+
+         print("\n👋 Goodbye!")
+
+ def main():
+     """Entry point for CLI"""
+     cli = EmailQueryCLI()
+     cli.run()
+
+ if __name__ == "__main__":
+     main()
requirements.txt CHANGED
@@ -1,24 +1,9 @@
1
- # Core OAuth Gmail MCP Server Dependencies
2
- gradio[mcp]
3
- google-auth
4
- google-auth-oauthlib
5
- google-auth-httplib2
6
- google-api-python-client
7
- cryptography
8
  requests
9
- loguru
10
  python-dateutil
11
-
12
- uvicorn
13
-
14
- # MCP server support
15
- mcp
16
-
17
- # Email processing
18
- email-validator
19
  beautifulsoup4
20
- html2text
21
-
22
- # Development (optional)
23
- pytest
24
- black
 
1
+ huggingface_hub==0.25.2
2
+ uvicorn
3
+ fastapi
4
+ openai
 
 
 
5
  requests
 
6
  python-dateutil
 
 
 
 
 
 
 
 
7
  beautifulsoup4
8
+ python-dotenv
9
+ pydantic[email]
 
 
 
server/email_db.json ADDED
@@ -0,0 +1,135 @@
1
+ {
2
+ "director@iitj.ac.in": {
3
+ "emails": [
4
+ {
5
+ "date": "01-Jan-2025",
6
+ "time": "11:00:26",
7
+ "subject": "2025: A Year to Unite, Innovate, and Lead with Integrity: Let's make\r\n IIT Jodhpur a beacon of excellence and sustainability!",
8
+ "content": "Dear Members of the IIT Jodhpur Fraternity, Students, Staff Members, and Faculty Colleagues, As we stand on the brink of a new year, I take this opportunity to reflect on\u00a0the remarkable journey of IIT Jodhpur so far and the collective efforts that\u00a0have brought us to where we are today over the last\u00a016\u00a0years or so. Each one of you - our faculty,\u00a0students, staff, alumni, and partners, has played a vital role in shaping\u00a0this institute and its evolution as a hub of learning, innovation, and excellence. It is\u00a0your commitment, resilience, and unwavering dedication that inspires all of us to aim higher and envision a future of much greater achievements,\u00a0breaking the\u00a0moulds of conventional thinking and defining excellence at a new\u00a0level. The year 2025 holds immense promise. It is a year when our shared vision of growth, innovation, and sustainability will take centre stage. Our institution stands at the confluence of tradition and modernity, of\u00a0ambition and responsibility. Let this be the year where we align our\u00a0efforts to not only push the boundaries of research and education but\u00a0also embrace a balanced approach toward sustainability in every aspect. Together, we can make IIT Jodhpur a beacon of innovation,\u00a0collaboration, and societal impact.\u00a0Our strength lies in our unity. As a diverse and dynamic community, we\u00a0can bring together varied perspectives, talents, and\u00a0disciplines to create something extraordinary. It is through this synergy that we will achieve the breakthroughs needed to address global\u00a0challenges and contribute meaningfully to the world. Let us continue to\u00a0foster an environment of mutual respect, inclusivity, and shared\u00a0purpose, ensuring that every member of our community feels\u00a0empowered to contribute their best. The journey ahead requires us to focus on strengthening collaborations\u00a0that amplify our impact. 
By working closely with industry, academia, and\u00a0governmental organizations, we can create pathways to meaningful\u00a0innovation and transformative research. O",
9
+ "message_id": "<CADCv5Wjqd09OCsR0fT2YXv5zzyJT=+xsuvsu4rhLTKbDBznYNw@mail.gmail.com>"
10
+ },
11
+ {
12
+ "date": "14-Jan-2025",
13
+ "time": "09:52:47",
14
+ "subject": "Makar Sankranti, Lohri, Bihu, Pongal: Festivals of Progress and New Aspirations",
15
+ "content": "Dear Members of the IIT Jodhpur Fraternity, \u0906\u092a\u0915\u094b \u0932\u094b\u0939\u093f\u095c\u0940, \u092c\u093f\u0939\u0942, \u092a\u094b\u0902\u0917\u0932 \u0914\u0930 \u092e\u0915\u0930 \u0938\u0902\u0915\u094d\u0930\u093e\u0902\u0924\u093f \u0915\u0940 \u0939\u093e\u0930\u094d\u0926\u093f\u0915 \u0936\u0941\u092d\u0915\u093e\u092e\u0928\u093e\u090f\u0901! \u092d\u093e\u0938\u094d\u0915\u0930\u0938\u094d\u092f \u092f\u0925\u093e \u0924\u0947\u091c\u094b \u092e\u0915\u0930\u0938\u094d\u0925\u0938\u094d\u092f \u0935\u0930\u094d\u0927\u0924\u0947\u0964 \u0924\u0925\u0948\u0935 \u092d\u0935\u0924\u093e\u0902 \u0924\u0947\u091c\u094b \u0935\u0930\u094d\u0927\u0924\u093e\u092e\u093f\u0924\u093f \u0915\u093e\u092e\u092f\u0947\u0964\u0964 \u092d\u0917\u0935\u093e\u0928 \u0938\u0942\u0930\u094d\u092f \u0915\u0947 \u092e\u0915\u0930 \u0930\u093e\u0936\u093f \u092e\u0947\u0902 \u092a\u094d\u0930\u0935\u0947\u0936 \u0924\u0925\u093e \u0909\u0924\u094d\u0924\u0930\u093e\u092f\u0923 \u0939\u094b\u0928\u0947 \u092a\u0930 \u0938\u0902\u092a\u0942\u0930\u094d\u0923 \u092d\u093e\u0930\u0924\u0935\u0930\u094d\u0937 \u092e\u0947\u0902 \u090a\u0930\u094d\u091c\u093e \u0935 \u0909\u0937\u094d\u092e\u093e \u092e\u0947\u0902 \u0935\u0943\u0926\u094d\u0927\u093f \u0915\u0947 \u092a\u094d\u0930\u0924\u0940\u0915-\u092a\u0930\u094d\u0935 \u00a0\"\u092e\u0915\u0930 \u0938\u0902\u0915\u094d\u0930\u093e\u0902\u0924\u093f\" \u092a\u0930 \u0906\u092a\u0915\u093e \u090f\u0935\u0902 \u0906\u092a\u0915\u0947 \u092a\u0930\u093f\u0935\u093e\u0930 \u092e\u0947\u0902 \u0938\u092d\u0940 \u0915\u093e \u091c\u0940\u0935\u0928 \u0905\u0924\u094d\u092f\u0902\u0924 \u092a\u094d\u0930\u0915\u093e\u0936\u092e\u093e\u0928 \u0939\u094b! 
\u0906\u092a \u0938\u092d\u0940 \u0938\u094d\u0935\u0938\u094d\u0925 \u0930\u0939\u0947\u0902, \u092a\u094d\u0930\u0938\u0928\u094d\u0928 \u0930\u0939\u0947\u0902 \u0914\u0930 \u0938\u0942\u0930\u094d\u092f \u0915\u0940 \u092d\u093e\u0901\u0924\u093f \u0905\u092a\u0928\u0947 \u092a\u094d\u0930\u0915\u093e\u0936 \u0938\u0947 \u0935\u093f\u0936\u094d\u0935 \u0915\u094b \u0906\u0932\u094b\u0915\u093f\u0924 \u0915\u0930\u0947\u0902! \u0906\u0907\u090f, \u0905\u092a\u0928\u0940 \u0906\u0932\u094b\u0915\u0927\u0930\u094d\u092e\u0940 \u0938\u0902\u0938\u094d\u0915\u0943\u0924\u093f \u0915\u0940 \u0935\u093f\u0930\u093e\u0938\u0924 \u0915\u0947 \u0935\u093e\u0939\u0915 \u092c\u0928\u0947\u0902\u0964 \u0909\u0938\u0915\u0947 \u092e\u0939\u0924\u094d\u0924\u094d\u0935 \u090f\u0935\u0902 \u0935\u0948\u091c\u094d\u091e\u093e\u0928\u093f\u0915\u0924\u093e \u0915\u094b \u092a\u0939\u0932\u0947 \u0938\u094d\u0935\u092f\u0902 \u0938\u092e\u091d\u0947\u0902, \u092b\u093f\u0930 \u0905\u092a\u0928\u0940 \u0938\u0902\u0924\u0924\u093f\u092f\u094b\u0902 \u0915\u094b \u092d\u0940 \u0938\u092e\u091d\u093e\u090f\u0901\u0964 \u0939\u092e\u093e\u0930\u0947 \u0924\u094d\u092f\u094b\u0939\u093e\u0930, \u0939\u092e\u093e\u0930\u0940 \u092d\u093e\u0937\u093e, \u0939\u092e\u093e\u0930\u0940 \u092a\u0930\u0902\u092a\u0930\u093e, \u0939\u092e\u093e\u0930\u0940 \u0938\u0902\u0938\u094d\u0915\u0943\u0924\u093f - \u0939\u0940\u0928\u0924\u093e \u0928\u0939\u0940\u0902, \u0917\u0930\u094d\u0935 \u0915\u0940 \u0935\u093f\u0937\u092f\u0935\u0938\u094d\u0924\u0941 \u0939\u0948\u0902\u0964 As we celebrate the auspicious occasion of Makar Sankranti, when the sun begins its northward journey (Uttarayan), let us draw inspiration from this symbol of progress, renewal, and growth. This festival reminds us to embrace change, rise above challenges, and strive for new aspirations. 
Much like the sun\u2019s steady path, our commitment to advancing knowledge, innovation, and societal impact continues to illuminate the way forward. At IIT Jodhpur, we are driven to explore transformative solutions, foster excellence, and shape a sustainable and inclusive future for ourselves, and the nation. Let us make use of this occasion to reflect on our achievements and renew our dedication to the goals that lie ahead. Together, as a community, we can reach greater heights and leave an enduring legacy for generations to come. May this festive season bring joy, prosperity, and inspiration to you and your families. Let us soar higher, united in our purpose and vision. -- Affectionately Yours..... With warm regards..... Prof. Avinash Kumar Agarwal, FTWAS, FAAAS, FCI, FSAE, FASME, FRSC, FNAE, FNASc, FISEES Director, IIT Jodhpur & Sir J C Bose National Fellow Tel: +91 291 2801011 (Off) Wikipedia: tinyurl.com/bdhe89ew | Scopus: https://tinyurl.com/mwccdcc4 | Google Scholar: https://tinyurl.com/mtbyv7w4 | FUE",
16
+ "message_id": "<CADCv5WgfRV2jFQf2=gfVHw1xfyE87tdqHSVtGQq3S4dhNmwEdA@mail.gmail.com>"
17
+ },
18
+ {
19
+ "date": "25-Jan-2025",
20
+ "time": "19:48:33",
21
+ "subject": "Happy Republic Day-2025",
22
+ "content": "My Dear Students, Faculty and Staff members, \ud83c\uddee\ud83c\uddf3 Greetings on the occasion of the 76th Republic Day! \ud83c\uddee\ud83c\uddf3 As we approach the 26th of January, we unite to celebrate not only the adoption of our Constitution but also the enduring principles of democracy, justice, and equality that define us as individuals and as an institution. This momentous day inspires us to reaffirm our collective commitment to shaping the future of our nation. At IIT Jodhpur, we hold a pivotal role in this journey of progress and innovation. As proud members of this esteemed institution, we bear the responsibility of fostering a culture rooted in innovation, academic excellence, and ethical leadership. Republic Day serves as a powerful reminder of our individual and collective contributions to IIT Jodhpur and the nation\u2014 not only through our collective professional accomplishments but also through the values we instil in our students and the spirit of collaboration and excellence we cultivate among ourselves. On this Republic Day, let us focus on: \u2705 Strengthening research and teaching excellence in our Institute, \u2705 Enhancing our infrastructure, and \u2705 Building a more inclusive and supportive environment for all members of our community. It is through these efforts that we will continue to push the frontiers of knowledge, innovation and excellence, contributing meaningfully to our nation and beyond. I encourage everyone to actively participate in the Republic Day celebrations tomorrow morning and reflect on how we can collectively elevate IIT Jodhpur\u2019s legacy. Together, let us uphold the values of integrity, diversity, and excellence\u2014the core pillars of our nation and our Institute. Wishing you and yours a thoughtful, inspiring, and joyous Republic Day 2025. Jai Hind. Jai Bharat. With warm regards and affection, Prof. 
Avinash Kumar Agarwal, FTWAS, FAAAS, FCI, FSAE, FASME, FRSC, FNAE, FNASc, FISEES Director, IIT Jodhpur & Sir J C Bose National Fellow Tel: +91 291 2801011 (Off) Wikipedia: tinyurl.com/bd",
23
+ "message_id": "<CADCv5WiULZvioxbVYrmR7mdmZHtB4jQ2P2cG_+S-Sdas1hNyew@mail.gmail.com>"
24
+ },
25
+ {
26
+ "date": "13-Feb-2025",
27
+ "time": "11:37:07",
28
+ "subject": "=?UTF-8?Q?Re=3A_=5Bfaculty=5D_Invitation_to_Hamira_Manganiyar_Group=27?=\r\n\t=?UTF-8?Q?s_Rajasthani_Folk_Music_Performance_Today_=E2=80=93_VIRASAT_2025?=",
29
+ "content": "Dear All, The Institute is organising Virasat 2025, and renowned artists will descend on our campus over the next five days. We must take this opportunity to learn about our cultural heritage and musical\u00a0performances during this period. I will strongly encourage all constituents of our campus community, including students, faculty and staff members and their families, and project staff members, to attend all these programs in the evening over the next five days and enjoy the cultural performances. Best wishes Avinash Kumar Agarwal On Thu, Feb 13, 2025 at 11:26\u202fAM Sherin Sabu < sherinsabu@iitj.ac.in > wrote: Dear all, We are delighted to invite you to an enchanting evening of Rajasthani folk music as part of VIRASAT 2025 , organized by IIT Jodhpur in collaboration with SPIC MACAY. \ud83c\udfb6 Performance Details: \ud83d\udccd Venue: Jodhpur Club, IIT Jodhpur \ud83c\udfa4 Artist: Hamira Manganiyar Group (Rajasthani Folk Music) \ud83d\udcc5 Date: Today: 13th February 2025 \u23f0 Time: 7:30 PM Immerse yourself in the vibrant and soulful rhythms of Rajasthan as the Hamira Manganiyar Group brings to life the rich musical traditions of the desert. This performance is a rare opportunity to experience the deep-rooted heritage of folk music passed down through generations. We warmly invite you to join us for this unforgettable musical evening.\u00a0\r\n\r\nPlease bring your family along to share in this cultural celebration! \ud83d\udccc Find attached the official event poster for more details. Looking forward to your presence! Warm Regards, Team Virasat 2025 IIT Jodhpur -- Dr Sherin Sabu Assistant Professor (Sociology), School of Liberal Arts (SoLA) Affiliate Faculty, Center for Emerging Technologies for Sustainable Development (CETSD) IIT Jodhpur",
30
+ "message_id": "<CADCv5WiZNk6BBrYPXhaaN0K_9N-L27ufUfg5JyWTrrJdbnwM=w@mail.gmail.com>"
31
+ },
32
+ {
33
+ "date": "26-Feb-2025",
34
+ "time": "19:54:24",
35
+ "subject": "Greetings on Mahashivratri!",
36
+ "content": "Dear all, Wishing you all a blessed and joyous Mahashivratri! I extend my warmest greetings to all of you and your family members. Mahashivratri is a time of deep spiritual reflection, inner growth, and devotion. This sacred festival symbolizes the triumph of wisdom, devotion, and inner strength, inspiring us to pursue knowledge and morality in all our endeavors. As we celebrate this day with devotion and reflection, let us also reaffirm our commitment to excellence, innovation, and the collective growth of IIT Jodhpur. Together, through dedication and hard work, we should continue to make meaningful contributions to knowledge, technology, and society. With warm regards, Prof. Avinash Kumar Agarwal ..",
37
+ "message_id": "<CADCv5WhSG91tOjiv_+XUUrxvqeOQv4xMocQNsgAC_EuUTQ87jw@mail.gmail.com>"
38
+ },
39
+ {
40
+ "date": "28-Feb-2025",
41
+ "time": "12:05:48",
42
+ "subject": "Re: [faculty] Invitation to celebrate \"National Science Day\" on 28th\r\n February 2025 at IIT Jodhpur",
43
+ "content": "Dear All, Hearty Congratulations to all of you on the occasion of National Science Day 2025. I urge all of you to attend this celebration of National Science Day. Sh Sharad Sarraf, BoG Chairman of IIT Mumbai and Jammu, is the Speaker and the chief guest. He is a strong well-wisher of IIT Jodhpur and we will enrich ourselves by listening to his words, full of wisdom. Best regards Avinash Kumar Agarwal On Wed, Feb 26, 2025 at 6:02\u202fPM Committee for Celebration of Commemorative Days < cccd@iitj.ac.in > wrote: Dear All, \u092e\u0939\u093e\u0936\u093f\u0935\u0930\u093e\u0924\u094d\u0930\u093f\u00a0 \u0915\u0940 \u0939\u093e\u0930\u094d\u0926\u093f\u0915 \u0936\u0941\u092d\u0915\u093e\u092e\u0928\u093e\u090f\u0902 / Happy MahaShivratri....! \u0938\u094d\u092e\u093e\u0930\u0915 \u0926\u093f\u0935\u0938 \u0938\u092e\u093e\u0930\u094b\u0939 \u0938\u092e\u093f\u0924\u093f (\u0938\u0940\u0938\u0940\u0938\u0940\u0921\u0940) \u0915\u0940 \u0913\u0930 \u0938\u0947, \u092d\u093e\u0930\u0924\u0940\u092f \u092a\u094d\u0930\u094c\u0926\u094d\u092f\u094b\u0917\u093f\u0915\u0940 \u0938\u0902\u0938\u094d\u0925\u093e\u0928 \u091c\u094b\u0927\u092a\u0941\u0930 \u092e\u0947\u0902 28 \u092b\u0930\u0935\u0930\u0940, 2025 (\u0936\u0941\u0915\u094d\u0930\u0935\u093e\u0930) \u0915\u094b \u0930\u093e\u0937\u094d\u091f\u094d\u0930\u0940\u092f \u0935\u093f\u091c\u094d\u091e\u093e\u0928 \u0926\u093f\u0935\u0938\u00a0 2025 \u0915\u0947 \u0905\u0935\u0938\u0930 \u092a\u0930 \u0939\u092e \u0906\u092a\u0915\u094b \u00a0\u0939\u093e\u0930\u094d\u0926\u093f\u0915 \u0928\u093f\u092e\u0902\u0924\u094d\u0930\u0923 \u0926\u0947\u0924\u0947\u00a0 \u0939\u0948\u0902 \u0964 \u0939\u092e\u00a0\u00a0\u0907\u0938 \u092e\u0939\u0924\u094d\u0935\u092a\u0942\u0930\u094d\u0923 \u0915\u093e\u0930\u094d\u092f\u0915\u094d\u0930\u092e \u092e\u0947\u0902 \u0906\u092a\u0915\u0947 \u0936\u093e\u092e\u093f\u0932 \u0939\u094b\u0928\u0947 \u0915\u0947 \u0938\u092e\u094d\u092e\u093e\u0928 \u0915\u0940 
\u0909\u0924\u094d\u0938\u0941\u0915\u0924\u093e \u0938\u0947 \u092a\u094d\u0930\u0924\u0940\u0915\u094d\u0937\u093e \u0915\u0930\u0947\u0902\u0917\u0947\u0964 On behalf of the Committee for Celebration of Commemorative Days (CCCD) at IIT Jodhpur, we cordially invite you to join us in commemorating the National Science Day 2025 on February 28, 2025 (Friday) . We eagerly anticipate the honour of having you at this momentous event. Program: National Science Day 2025 Date: 28th February 2025 (Friday) Venue: Jodhpur Club Time: 5:45 PM onwards Program details: Time Event 05:45 \u2013 06:05 PM Scientific Demonstration & Tea -Refreshments 06:05 \u2013 06:10 PM Lamp Lighting & felicitation 06:10 \u2013 06:20 PM Welcome address by the Director 06:20 \u2013 06:45 PM Talk and interaction by the Chief Guest 06:45 \u2013 06:50 PM Felicitation of Guests 06:50 \u2013 07:00 PM Library App Release 07:00 \u2013 07:05 PM Quiz Session 07:05 \u2013 07:15 PM Felicitation to ACAC students 07:15 \u2013 07:20 PM Vote of Thanks 07:20 PM National Anthem Your presence and active participation will contribute significantly to the success of this celebration. With warm regards, Himmat Singh Assistant Registrar ___________________________________________________ \u0938\u094d\u092e\u093e\u0930\u0915 \u0926\u093f\u0935\u0938 \u0938\u092e\u093e\u0930\u094b\u0939 \u0938\u092e\u093f\u0924\u093f / Committee for Celebration of Commemorative Days (CCCD) \u0906\u0908\u0906\u0908",
44
+ "message_id": "<CADCv5WjLQTBkWxUB7XuOmQEKmNJftW5Orw8rnnjX-cAQ4bPFuw@mail.gmail.com>"
45
+ },
46
+ {
47
+ "date": "01-Apr-2025",
48
+ "time": "10:53:52",
49
+ "subject": "Fwd: FY Closure and Updates",
50
+ "content": "Dear Students, Today, April 1, marks an important day because we are taking two important steps in the direction of our evolution as a mature institute of higher learning. 1. Today, our Health Center starts working in autonomous mode, managed and operated by OUR OWN team. 2. Today, our transport services also start operating in autonomous mode, managed and operated by OUR OWN team. These two are definitely two big steps in the evolution of our institute. I would also like to put on record my deep appreciation of the teams of the Health Center, led by Prof. Anil Tiwari and Dr Neha Sharma, and the transport team, led by Prof. Shree Prakash Tiwari and Sandeep Chandel. Please join me in congratulating them for a good start. While these are big transitions, it is\u00a0possible that there might be some perturbations in services in the initial period. Please give your feedback to the process owners, and actions will be taken to minimise the inconveniences and meet all genuine expectations. Best regards -- With warm regards..... Prof. 
Avinash Kumar Agarwal, FTWAS, FAAAS, FCI, FSAE, FASME, FRSC, FNAE, FNASc, FISEES Director, IIT Jodhpur & Sir J C Bose National Fellow Tel: +91 291 2801011 (Off) Wikipedia: tinyurl.com/bdhe89ew | Scopus: https://tinyurl.com/mwccdcc4 | Google Scholar: https://tinyurl.com/mtbyv7w4 | FUEL: https://tinyurl.com/bdzn4r28 | Orcid: https://tinyurl.com/537m3tad ------------------------------ ------------------------------ ---------------- \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of The World Academy of Science \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Combustion Institute, USA \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of American Association for the Advancement of Science \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of American Society of Mechanical Engineers \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Society of Automotive Engineers International, USA \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of World Society for Sustainable Energy Technologies, UK \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Royal Society of Chemistry, UK \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of National Academy of Sciences India \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Indian National Academy of Engineering \u2022\u00a0 \u00a0 \u00a0 \u00a0F",
51
+ "message_id": "<CADCv5WhGgVj0LCjvkTKda_mNxQ6WqQdGmP1afV=sj2v=WSBwow@mail.gmail.com>"
52
+ },
53
+ {
54
+ "date": "23-Apr-2025",
55
+ "time": "20:27:50",
56
+ "subject": "Directorate Shifted to Chankya Complex",
57
+ "content": "Dear All, This is to inform you that all the offices of the Deans, Registrar, DD and D have moved back to Chanamkya\u00a0Complex. -- With warm regards..... Prof. Avinash Kumar Agarwal, FTWAS, FAAAS, FCI, FSAE, FASME, FRSC, FNAE, FNASc, FISEES Director, IIT Jodhpur & Sir J C Bose National Fellow Tel: +91 291 2801011 (Off) Wikipedia: tinyurl.com/bdhe89ew | Scopus: https://tinyurl.com/mwccdcc4 | Google Scholar: https://tinyurl.com/mtbyv7w4 | FUEL: https://tinyurl.com/bdzn4r28 | Orcid: https://tinyurl.com/537m3tad ------------------------------ ------------------------------ ---------------- \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of The World Academy of Science \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Combustion Institute, USA \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of American Association for the Advancement of Science \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of American Society of Mechanical Engineers \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Society of Automotive Engineers International, USA \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of World Society for Sustainable Energy Technologies, UK \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Royal Society of Chemistry, UK \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of National Academy of Sciences India \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Indian National Academy of Engineering \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of International Society of Energy, Environment, and Sustainability ------------------------------ ------------------------------ ------------- \u2022\u00a0 \u00a0 \u00a0 \u00a0Shanti Swarup Bhatnagar Award-2016 \u2022\u00a0 \u00a0 \u00a0 \u00a0Editor of FUEL \u2022\u00a0 \u00a0 \u00a0 \u00a0Associate Editor of ASME Open Journal of Engineering \u2022\u00a0 \u00a0 \u00a0 \u00a0Associate Editor of SAE International Journal of Engines ------------------------------ ------------------------------ --------------",
58
+ "message_id": "<CADCv5WhB=aoNjykLwPj9wY-ZNTCxnBp5EFsPriDs+mpH9Fi-WA@mail.gmail.com>"
59
+ },
60
+ {
61
+ "date": "01-May-2025",
62
+ "time": "12:20:20",
63
+ "subject": "Thank You",
64
+ "content": "Dear\r\nColleagues I want to thank all the stakeholders for their kind cooperation,\r\nenabling me to complete one year as Director of IIT Jodhpur. I joined the\r\nInstitute on 1 st May 2024. I realised this Institute has great potential and can break into the top echelons of\r\nranking among engineering institutions in the country and the world. However,\r\nto achieve this, all of us must work as a unified team. From my\r\nside, I assure you that I will make all possible efforts to ensure that fair\r\nand transparent governance processes are in place and we, as a team, make all\r\nthe efforts in the right direction. In the last\r\nyear, extra-mural research grants to IIT jodhpur have doubled, and project endorsements\r\nand publications have significantly increased; however, there are miles to go. I hope we\r\nall continue to work relentlessly to pursue excellence in our activities, be\r\nloyal to the Institute, and do all our duties with dedication, sincerity and\r\nhonesty. This Institute cannot have any room for corruption, nepotism and\r\nregionalism. As IIT Jodhpur stakeholders, we must commit to having excellent\r\nconduct and setting an example for others to follow. 
Wishing you\r\nall the very best Affectionately\r\nyours Avinash Kumar Agarwal, FTWAS, FAAAS, FCI, FSAE, FASME, FRSC, FNAE, FNASc, FISEES Director, IIT Jodhpur & Sir J C Bose National Fellow Tel: +91 291 2801011 (Off) Wikipedia: tinyurl.com/bdhe89ew | Scopus: https://tinyurl.com/mwccdcc4 | Google Scholar: https://tinyurl.com/mtbyv7w4 | FUEL: https://tinyurl.com/bdzn4r28 | Orcid: https://tinyurl.com/537m3tad ------------------------------ ------------------------------ ---------------- \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of The World Academy of Science \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Combustion Institute, USA \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of American Association for the Advancement of Science \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of American Society of Mechanical Engineers \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Society of Automotive Engineers International, USA \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of World Society for Sustainable Energy Technol",
65
+ "message_id": "<CADCv5Wjxn7Pj2XXBf4n2S0PyJQapU1aOOvfHfjjtr4xfSWbeKw@mail.gmail.com>"
66
+ },
67
+ {
68
+ "date": "07-May-2025",
69
+ "time": "20:56:02",
70
+ "subject": "Instruction for Next few days: Urgent attention",
71
+ "content": "Dear Faculty Members, Staff Members, Students and Other Constituents of our campus community You are aware that we are passing through a tough time, as far as national security is concerned. Today, we have done a drill for the evacuation of the campus community in the event of an air strike by our nemesis. We will have less than 5 minutes for blackout and evacuation into the tunnels. The next few days and nights are very critical, and we may be one of the soft targets. Hence, adequate care and precaution are important to all of us. Please note: 1. There will be no more drills. In case you hear a siren (Oscillating), it means an impending attack, and we will have less than 5 minutes to get into the tunnel hideouts.\u00a0The flat siren will mean that the danger has passed, and now it is safe to venture out. Enemy fighters\u00a0and missiles will reach us in 5-10 minutes after crossing the border, and that's all we would have to ensure our safety. 2. Today evening, we will do black out\u00a0drill between 10-10.15 PM. There will be no siren; hence, all of you are to follow the procedure voluntarily on your own. The power will be cut (If it is resumed by JVVNL). Every possible light source must be turned off. The lights can be put on again after 15 minutes, at 10.15 PM tonight. 3. In the event of an impending attack, the lights will be cut off centrally after the siren goes off, in the next 2-3 minutes.\u00a0That's all the time, you will have to come down the roads and move into the tunnels. It is\u00a0advised to carry your own water bottle in such an event. 4. There should be no live streaming, photographs posted on social media for these or sharing of this email on any platform. This will put all of us in danger. On each tunnel entry point, we will post security guards to\u00a0guide\u00a0you safely. Please ensure that you do not panic and move in a\u00a0disciplined manner into the tunnels,\u00a0when and if required. 
If you have already posted the photos and videos of tunnels on your social media accounts, please d",
72
+ "message_id": "<CADCv5WiK10w-aj0Vn2vq+bT1qViromAsfpwd+DPWGeYx2zspXA@mail.gmail.com>"
73
+ },
74
+ {
75
+ "date": "07-May-2025",
76
+ "time": "22:28:02",
77
+ "subject": "Re: Instruction for Next few days: Urgent attention",
78
+ "content": "Dear All, Thanks. Black out drill was an outstanding success and we figured out some lapses, which have been fixed. Be alert and all of us know the steps, in case required, to keep us safe. Be calm and hope that our forces will keep our nemesis at bay and we are not required to be in the hideout. In any case, now we all know the next steps and hopefully we will sleep peacefully. Best regards Avinash Kumar Agarwal On Wed, 7 May, 2025, 20:56 Director, IIT Jodhpur, < director@iitj.ac.in > wrote: Dear Faculty Members, Staff Members, Students and Other Constituents of our campus community You are aware that we are passing through a tough time, as far as national security is concerned. Today, we have done a drill for the evacuation of the campus community in the event of an air strike by our nemesis. We will have less than 5 minutes for blackout and evacuation into the tunnels. The next few days and nights are very critical, and we may be one of the soft targets. Hence, adequate care and precaution are important to all of us. Please note: 1. There will be no more drills. In case you hear a siren (Oscillating), it means an impending attack, and we will have less than 5 minutes to get into the tunnel hideouts.\u00a0The flat siren will mean that the danger has passed, and now it is safe to venture out. Enemy fighters\u00a0and missiles will reach us in 5-10 minutes after crossing the border, and that's all we would have to ensure our safety. 2. Today evening, we will do black out\u00a0drill between 10-10.15 PM. There will be no siren; hence, all of you are to follow the procedure voluntarily on your own. The power will be cut (If it is resumed by JVVNL). Every possible light source must be turned off. The lights can be put on again after 15 minutes, at 10.15 PM tonight. 3. 
In the event of an impending attack, the lights will be cut off centrally after the siren goes off, in the next 2-3 minutes.\u00a0That's all the time, you will have to come down the roads and move into the tunnels. It is\u00a0advis",
79
+ "message_id": "<CADCv5Wh8x5L+Z=5bNYG_f=tbX=Lo0HH=1cT3yTw9Huv4Kr+iWQ@mail.gmail.com>"
80
+ },
81
+ {
82
+ "date": "08-May-2025",
83
+ "time": "11:06:17",
84
+ "subject": "Re: [faculty] Re: Instruction for Next few days: Urgent attention",
85
+ "content": "Dear All, There was a complete blackout in the entire city last night from 12-4 AM as all feeders were shut down by the district administration, and there was no power supply anywhere in the city. This might have led to some inconveniences\u00a0for all of us in these difficult times. These directions of complete blackout\u00a0are likely to be given again by the district administration over the next few days, depending on threat perception and intel inputs. I am trying to get our electricity supplies uninterrupted by discussing with the district admin so that the campus community stays indoors during these long and declared blackout periods. It is likely that we will keep all our street lights and public lights off starting the evenings over the next few days. The campus community is advised to ensure that they have all lights off during the declared blackout periods,\u00a0without any defaults. Any defaults may lead to our staying without electricity,\u00a0at par with the rest of the city. In addition, in the event of a siren going off, everyone needs to rush to the hideouts, as per our previous drill. Siren will indicate an upcoming aerial raid. I would also like to reiterate that there is no specific additional threat to the IITJ community. The threat to us is similar to that of any other part of the country, and there is no specific need for any panic or concern. We are all in this situation, as a united Bharat, and we must all face it bravely. There is no need for any anxiety or nervousness by seeing our emails about the safety protocols. These are just to ensure that in the event of any adverse action by our nemesis, our campus community stays safe, and all these measures taken by IITJ and drills were part of precautionary measures taken on the directions of the district administration. You may please contact Prof. Bhabani Satapathi, Prof. S R Vadera or Col Virendra Singh Rathore in case of any genuine concerns. 
Best wishes Avinash Kumar Agarwal On Wed, May 7, 2025 at 11:32\u202fPM Avin",
86
+ "message_id": "<CADCv5WixMeCadxAfjoOoNrBR2WotkVyw-678FsLfEODX5KRisA@mail.gmail.com>"
87
+ },
88
+ {
89
+ "date": "08-May-2025",
90
+ "time": "21:42:20",
91
+ "subject": "Re: Important Notice: Citywide Blackout and Campus Power Supply Instructions",
92
+ "content": "Evacuation to tunnels immediately On Thu, 8 May, 2025, 21:32 Deputy Director IIT Jodhpur, < dydir@iitj.ac.in > wrote: Dear All, As per instructions from the District Administration, there will be a blackout and no power supply tonight across the entire city of Jodhpur. However, following discussions between our Director and the city administration, a special provision has been made to allow limited power supply within our campus\u2014only to Type C, Type B, and Hostel areas\u2014provided that the campus community strictly adheres to the following: \u2022\tRemain indoors throughout the blackout period. \u2022\tKeep all lights switched off; only fans may be used. \u2022\tNo lights should be visible from outside under any circumstance. Please note, no power supply will be provided to any other areas of the campus apart from the three mentioned above. Your cooperation is essential in ensuring compliance with this directive and maintaining safety for all. Warm regards, Prof. Bhabani Kumar Satapathy",
93
+ "message_id": "<CADCv5WiFEi5Qbim6PqHk7E-fu=qv4XP5ZDbg3uQgBAHou3Tzmw@mail.gmail.com>"
94
+ },
95
+ {
96
+ "date": "10-May-2025",
97
+ "time": "18:33:45",
98
+ "subject": "Updates",
99
+ "content": "Dear All, We should be aware that the threat is now over, and we can resume our \"Business as usual\". Those who are planning to go should not, and those who have already left the campus can make their plans to return, as per their convenience. Congratulations to all for showing an absolute resolve to tackle this national threat and showing that we are Bharat of the 21st century, a \"Naya Bharat\". This also calls for all IITJ constituents to work actively towards the national defence and offence capabilities. Jai Hind and Jai Bharat. Best regards Avinash Kumar AGarwal",
100
+ "message_id": "<CADCv5Wg8YogC8kG2DH7w=GmpZiKb80_nXMhBFRJjNTUZBVGQVQ@mail.gmail.com>"
101
+ },
102
+ {
103
+ "date": "24-May-2025",
104
+ "time": "21:46:02",
105
+ "subject": "Great News",
106
+ "content": "Dear All, I am delighted to share with you some fantastic news. Our Jaipur campus has come one step closer to realisation with the Government of Rajasthan agreeing \"in principle\" to allocate us land and buildings. Now we have secured a letter of intent from the GoR, which now needs to be taken up with the Ministry of Education and the Ministry of Finance, Government of India. Once these approvals are secured, we will realise our dream of having a Jaipur campus, apart from our main campus in Jodhpur. We are also beginning to work on our small footprint\u00a0campus in Jaisalmer to\u00a0complete our dream of IITJ3. This is a big feat for us as an institute to get the GoR to agree to our proposal. Hopefully, more good things will follow. -- With warm regards..... Prof. Avinash Kumar Agarwal, FTWAS, FAAAS, FCI, FSAE, FASME, FRSC, FNAE, FNASc, FISEES Director, IIT Jodhpur & Sir J C Bose National Fellow Tel: +91 291 2801011 (Off) Wikipedia: tinyurl.com/bdhe89ew | Scopus: https://tinyurl.com/mwccdcc4 | Google Scholar: https://tinyurl.com/mtbyv7w4 | FUEL: https://tinyurl.com/bdzn4r28 | Orcid: https://tinyurl.com/537m3tad ------------------------------ ------------------------------ ---------------- \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of The World Academy of Science \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Combustion Institute, USA \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of American Association for the Advancement of Science \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of American Society of Mechanical Engineers \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Society of Automotive Engineers International, USA \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of World Society for Sustainable Energy Technologies, UK \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Royal Society of Chemistry, UK \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of National Academy of Sciences India \u2022\u00a0 \u00a0 \u00a0 \u00a0Fellow of Indian National Academy of Engineering \u2022\u00a0 \u00a0 \u00a0 
\u00a0Fellow of International Society of Energy, Environment, and Sustainability ------------------------------ ------------------------------ ------------- \u2022\u00a0 \u00a0 \u00a0 \u00a0Shanti Swarup Bhatnagar Award-2016 \u2022\u00a0 \u00a0 \u00a0 \u00a0Editor of FUEL \u2022\u00a0 \u00a0 \u00a0 \u00a0Associate Editor of ASME Open Journal of Enginee",
107
+ "message_id": "<CADCv5Wik-=UY6XFVqbtHBPNYXGe3gWkYE82qV286AySDp1qL_w@mail.gmail.com>"
108
+ },
109
+ {
110
+ "date": "07-Jun-2025",
111
+ "time": "12:44:03",
112
+ "subject": "Greetings on the occasion of Eid al-Adha!",
113
+ "content": "Dear All, On the joyous occasion of Eid al-Adha, I extend my warmest greetings to all members of IITJ. This festival, rooted in the values of personal sacrifices, compassion, empathy and unity, inspires us to strengthen our bonds and work together for the greater good. At IIT Jodhpur, we are committed to nurturing an environment of compassion, empathy, honesty, collaboration, innovation, and integrity. As we celebrate this auspicious day, let us reaffirm our dedication to positive growth, unite in our pursuit of excellence, and resolve to uphold transparency. May this festival bring peace, prosperity, and harmony to our vibrant campus community and its constituents. Best wishes, Affectionately Yours Avinash Kumar Agarwal Director",
114
+ "message_id": "<CADCv5Wj4J-FCNitA2r_m9uT5pFZNz-OQFXTwQM1em+ki69=9jQ@mail.gmail.com>"
115
+ }
116
+ ],
117
+ "last_scraped": "07-Jun-2025"
118
+ },
119
+ "agarwal.27@gmail.com": {
120
+ "emails": [],
121
+ "last_scraped": "07-Jun-2025"
122
+ },
123
+ "agarwal.27@iitj.ac.in": {
124
+ "emails": [
125
+ {
126
+ "date": "07-Jun-2025",
127
+ "time": "16:42:51",
128
+ "subject": "testing",
129
+ "content": "hi bro",
130
+ "message_id": "<CAPziNCaSuVqpqNNfsRjhVbx22jN_vos3EGK_Odt-8WiD0HRKKQ@mail.gmail.com>"
131
+ }
132
+ ],
133
+ "last_scraped": "07-Jun-2025"
134
+ }
135
+ }
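The `last_scraped` field in the cache above records only the day of the most recent scrape, not the date range that scrape covered, so a cache hit cannot tell whether a newly requested range was ever fetched. One possible way to close that gap is sketched below. It assumes each sender entry also keeps a hypothetical `covered_ranges` list, which is not present in the file above; `is_range_covered` is likewise an illustrative name.

```python
from datetime import datetime
from typing import List, Tuple

DATE_FMT = "%d-%b-%Y"  # DD-MMM-YYYY, the format used in the cache file


def _parse(day: str) -> datetime:
    """Parse a DD-MMM-YYYY string into a datetime."""
    return datetime.strptime(day, DATE_FMT)


def is_range_covered(covered: List[Tuple[str, str]], start: str, end: str) -> bool:
    """True if [start, end] lies inside a single previously scraped range."""
    start_dt, end_dt = _parse(start), _parse(end)
    return any(
        _parse(lo) <= start_dt and end_dt <= _parse(hi)
        for lo, hi in covered
    )


# Hypothetical record of what an earlier scrape fetched for one sender.
covered_ranges = [("01-Jun-2025", "07-Jun-2025")]
print(is_range_covered(covered_ranges, "03-Jun-2025", "05-Jun-2025"))
print(is_range_covered(covered_ranges, "01-May-2025", "05-Jun-2025"))
```

This sketch does not merge adjacent or overlapping ranges; a request spanning two stored ranges would be treated as uncovered and simply trigger a fresh scrape.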
server/email_scraper.py ADDED
@@ -0,0 +1,267 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Enhanced Email Scraper with Intelligent Caching
4
+ """
5
+
6
+ import os
7
+ import imaplib
8
+ import json
9
+ from email import message_from_bytes
10
+ from bs4 import BeautifulSoup
11
+ from datetime import datetime, timedelta
12
+ from dotenv import load_dotenv
13
+ from zoneinfo import ZoneInfo
14
+ from email.utils import parsedate_to_datetime
15
+ from typing import List, Dict
16
+
17
+ load_dotenv()
18
+
19
+ # Email credentials
20
+ APP_PASSWORD = os.getenv("APP_PASSWORD")
21
+ EMAIL_ID = os.getenv("EMAIL_ID")
22
+ EMAIL_DB_FILE = "email_db.json"
23
+
24
+ def _imap_connect():
25
+ """Connect to Gmail IMAP server"""
26
+ try:
27
+ mail = imaplib.IMAP4_SSL("imap.gmail.com")
28
+ mail.login(EMAIL_ID, APP_PASSWORD)
29
+ mail.select('"[Gmail]/All Mail"')
30
+ return mail
31
+ except Exception as e:
32
+ print(f"IMAP connection failed: {e}")
33
+ raise
34
+
35
+ def _email_to_clean_text(msg):
36
+ """Extract clean text from email message"""
37
+ # Collect HTML and plain-text parts; HTML is preferred when both are present
38
+ html_content = None
39
+ text_content = None
40
+
41
+ if msg.is_multipart():
42
+ for part in msg.walk():
43
+ content_type = part.get_content_type()
44
+ if content_type == "text/html":
45
+ try:
46
+ html_content = part.get_payload(decode=True).decode(errors="ignore")
47
+ except Exception:
48
+ continue
49
+ elif content_type == "text/plain":
50
+ try:
51
+ text_content = part.get_payload(decode=True).decode(errors="ignore")
52
+ except Exception:
53
+ continue
54
+ else:
55
+ # Non-multipart message
56
+ content_type = msg.get_content_type()
57
+ try:
58
+ content = msg.get_payload(decode=True).decode(errors="ignore")
59
+ if content_type == "text/html":
60
+ html_content = content
61
+ else:
62
+ text_content = content
63
+ except Exception:
64
+ pass
65
+
66
+ # Clean HTML content
67
+ if html_content:
68
+ soup = BeautifulSoup(html_content, "html.parser")
69
+ # Remove script and style elements
70
+ for script in soup(["script", "style"]):
71
+ script.decompose()
72
+ return soup.get_text(separator=' ', strip=True)
73
+ elif text_content:
74
+ return text_content.strip()
75
+ else:
76
+ return ""
77
+
78
+ def _load_email_db() -> Dict:
79
+ """Load email database from file"""
80
+ if not os.path.exists(EMAIL_DB_FILE):
81
+ return {}
82
+ try:
83
+ with open(EMAIL_DB_FILE, "r") as f:
84
+ return json.load(f)
85
+ except (json.JSONDecodeError, IOError):
86
+ print(f"Warning: Could not load {EMAIL_DB_FILE}, starting with empty database")
87
+ return {}
88
+
89
+ def _save_email_db(db: Dict):
90
+ """Save email database to file"""
91
+ try:
92
+ with open(EMAIL_DB_FILE, "w") as f:
93
+ json.dump(db, f, indent=2)
94
+ except IOError as e:
95
+ print(f"Error saving database: {e}")
96
+ raise
97
+
98
+ def _date_to_imap_format(date_str: str) -> str:
99
+ """Convert DD-MMM-YYYY to IMAP date format"""
100
+ try:
101
+ dt = datetime.strptime(date_str, "%d-%b-%Y")
102
+ return dt.strftime("%d-%b-%Y")
103
+ except ValueError:
104
+ raise ValueError(f"Invalid date format: {date_str}. Expected DD-MMM-YYYY")
105
+
106
+ def _is_date_in_range(email_date: str, start_date: str, end_date: str) -> bool:
107
+ """Check if email date is within the specified range"""
108
+ try:
109
+ email_dt = datetime.strptime(email_date, "%d-%b-%Y")
110
+ start_dt = datetime.strptime(start_date, "%d-%b-%Y")
111
+ end_dt = datetime.strptime(end_date, "%d-%b-%Y")
112
+ return start_dt <= email_dt <= end_dt
113
+ except ValueError:
114
+ return False
115
+
116
+ def scrape_emails_from_sender(sender_email: str, start_date: str, end_date: str) -> List[Dict]:
117
+ """
118
+ Scrape emails from specific sender within date range
119
+ Uses intelligent caching to avoid re-scraping
120
+ """
121
+ print(f"Scraping emails from {sender_email} between {start_date} and {end_date}")
122
+
123
+ # Load existing database
124
+ db = _load_email_db()
125
+ sender_email = sender_email.lower().strip()
126
+
127
+ # Check if we have cached emails for this sender
128
+ if sender_email in db:
129
+ cached_emails = db[sender_email].get("emails", [])
130
+
131
+ # Filter cached emails by date range
132
+ filtered_emails = [
133
+ email for email in cached_emails
134
+ if _is_date_in_range(email["date"], start_date, end_date)
135
+ ]
136
+
137
+ # Check if we need to scrape more recent emails
138
+ last_scraped = db[sender_email].get("last_scraped", "01-Jan-2020")
139
+ today = datetime.today().strftime("%d-%b-%Y")
140
+
141
+ if last_scraped == today and filtered_emails:
142
+ print(f"Using cached emails (last scraped: {last_scraped})")
143
+ return filtered_emails
144
+
145
+ # Need to scrape emails
146
+ try:
147
+ mail = _imap_connect()
148
+
149
+ # Prepare IMAP search criteria
150
+ start_imap = _date_to_imap_format(start_date)
151
+ # Add one day to end_date for BEFORE criteria (IMAP BEFORE is exclusive)
152
+ end_dt = datetime.strptime(end_date, "%d-%b-%Y") + timedelta(days=1)
153
+ end_imap = end_dt.strftime("%d-%b-%Y")
154
+
155
+ search_criteria = f'(FROM "{sender_email}") SINCE "{start_imap}" BEFORE "{end_imap}"'
156
+ print(f"IMAP search: {search_criteria}")
157
+
158
+ # Search for emails
159
+ status, data = mail.search(None, search_criteria)
160
+ if status != 'OK':
161
+ raise Exception(f"IMAP search failed: {status}")
162
+
163
+ email_ids = data[0].split()
164
+ print(f"Found {len(email_ids)} emails")
165
+
166
+ scraped_emails = []
167
+
168
+ # Process each email
169
+ for i, email_id in enumerate(email_ids):
170
+ try:
171
+ print(f"Processing email {i+1}/{len(email_ids)}")
172
+
173
+ # Fetch email
174
+ status, msg_data = mail.fetch(email_id, "(RFC822)")
175
+ if status != 'OK':
176
+ continue
177
+
178
+ # Parse email
179
+ msg = message_from_bytes(msg_data[0][1])
180
+
181
+ # Extract information
182
+ subject = msg.get("Subject", "No Subject")
183
+ content = _email_to_clean_text(msg)
184
+
185
+ # Parse date
186
+ date_header = msg.get("Date", "")
187
+ if date_header:
188
+ try:
189
+ dt_obj = parsedate_to_datetime(date_header)
190
+ # Convert to IST
191
+ ist_dt = dt_obj.astimezone(ZoneInfo("Asia/Kolkata"))
192
+ email_date = ist_dt.strftime("%d-%b-%Y")
193
+ email_time = ist_dt.strftime("%H:%M:%S")
194
+ except Exception:
195
+ email_date = datetime.today().strftime("%d-%b-%Y")
196
+ email_time = "00:00:00"
197
+ else:
198
+ email_date = datetime.today().strftime("%d-%b-%Y")
199
+ email_time = "00:00:00"
200
+
201
+ # Get message ID for deduplication
202
+ message_id = msg.get("Message-ID", f"missing-{email_id.decode()}")
203
+
204
+ scraped_emails.append({
205
+ "date": email_date,
206
+ "time": email_time,
207
+ "subject": subject,
208
+ "content": content[:2000], # Limit content length
209
+ "message_id": message_id
210
+ })
211
+
212
+ except Exception as e:
213
+ print(f"Error processing email {email_id}: {e}")
214
+ continue
215
+
216
+ mail.logout()
217
+
218
+ # Update database
219
+ if sender_email not in db:
220
+ db[sender_email] = {"emails": [], "last_scraped": ""}
221
+
222
+ # Merge with existing emails (avoid duplicates)
223
+ existing_emails = db[sender_email].get("emails", [])
224
+ existing_ids = {email.get("message_id") for email in existing_emails}
225
+
226
+ new_emails = [
227
+ email for email in scraped_emails
228
+ if email["message_id"] not in existing_ids
229
+ ]
230
+
231
+ # Update database
232
+ db[sender_email]["emails"] = existing_emails + new_emails
233
+ db[sender_email]["last_scraped"] = datetime.today().strftime("%d-%b-%Y")
234
+
235
+ # Save database
236
+ _save_email_db(db)
237
+
238
+ # Return filtered results
239
+ all_emails = db[sender_email]["emails"]
240
+ filtered_emails = [
241
+ email for email in all_emails
242
+ if _is_date_in_range(email["date"], start_date, end_date)
243
+ ]
244
+
245
+ print(f"Scraped {len(new_emails)} new emails, returning {len(filtered_emails)} in date range")
246
+ return filtered_emails
247
+
248
+ except Exception as e:
249
+ print(f"Email scraping failed: {e}")
250
+ raise
251
+
252
+ # Test the scraper
253
+ if __name__ == "__main__":
254
+ # Test scraping
255
+ try:
256
+ emails = scrape_emails_from_sender(
257
+ "noreply@example.com",
258
+ "01-Jun-2025",
259
+ "07-Jun-2025"
260
+ )
261
+
262
+ print(f"\nFound {len(emails)} emails:")
263
+ for email in emails[:3]: # Show first 3
264
+ print(f"- {email['date']} {email['time']}: {email['subject']}")
265
+
266
+ except Exception as e:
267
+ print(f"Test failed: {e}")
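`msg.get("Subject")` in the scraper above returns the raw header value, which may still be an RFC 2047 encoded word (for example `=?UTF-8?B?...?=`) when the subject contains non-ASCII text. A minimal sketch of decoding it with the standard library follows; the helper name `decode_subject` is hypothetical and not part of this commit.

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject: str) -> str:
    """Decode RFC 2047 encoded words; fall back to the raw value on failure."""
    try:
        return str(make_header(decode_header(raw_subject)))
    except (LookupError, UnicodeDecodeError, ValueError):
        # Unknown charset or malformed bytes: keep the header unmodified.
        return raw_subject


print(decode_subject("=?UTF-8?B?SGVsbG8=?="))
print(decode_subject("Plain subject"))
```

In the scraper this could wrap the existing lookup, for example `decode_subject(msg.get("Subject", "No Subject"))`.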
server/main.py ADDED
@@ -0,0 +1,29 @@
1
+ from fastapi import FastAPI
2
+ from fastapi.middleware.cors import CORSMiddleware
3
+ from routes import router
4
+
5
+ app = FastAPI(
6
+ title="Email Query System",
7
+ description="Natural language email querying with intent classification",
8
+ version="1.0.0"
9
+ )
10
+
11
+ # Add CORS middleware
12
+ app.add_middleware(
13
+ CORSMiddleware,
14
+ allow_origins=["*"],
15
+ allow_credentials=True,
16
+ allow_methods=["*"],
17
+ allow_headers=["*"],
18
+ )
19
+
20
+ # Include routes
21
+ app.include_router(router, prefix="/api/v1")
22
+
23
+ @app.get("/")
24
+ def root():
25
+ return {
26
+ "message": "Email Query System API",
27
+ "docs": "/docs",
28
+ "health": "/api/v1/health"
29
+ }
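The CORS configuration above allows any origin while also enabling credentials, which is convenient for local testing but broad for a deployed API. A hedged sketch of deriving an explicit origin list from the environment is shown below. The variable name `ALLOWED_ORIGINS`, the default origin, and the helper `parse_allowed_origins` are all assumptions, not part of this commit.

```python
import os
from typing import List, Optional

DEFAULT_ORIGINS = ["http://localhost:3000"]  # hypothetical development front end


def parse_allowed_origins(raw: Optional[str]) -> List[str]:
    """Split a comma-separated origin list, dropping blanks and trailing slashes."""
    if not raw:
        return list(DEFAULT_ORIGINS)
    origins = [item.strip().rstrip("/") for item in raw.split(",")]
    return [origin for origin in origins if origin]


# ALLOWED_ORIGINS is an assumed name, e.g. "https://a.example,https://b.example"
allowed_origins = parse_allowed_origins(os.getenv("ALLOWED_ORIGINS"))
print(allowed_origins)
```

The resulting list could then be passed as `allow_origins=allowed_origins` in the existing `app.add_middleware(CORSMiddleware, ...)` call.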
server/name_mapping.json ADDED
@@ -0,0 +1,3 @@
1
+ {
2
+ "dev": "agarwal.27@iitj.ac.in"
3
+ }
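Name resolution against the mapping above (`resolve_sender_email` in `server/query_parser.py`) relies on substring containment, so a short key such as `dev` would also match an unrelated intent like "developer newsletter". A stricter alternative, sketched here with `difflib.get_close_matches` from the standard library, is one option. The function name `resolve_name`, the `0.8` cutoff, and the sample entry are illustrative assumptions.

```python
from difflib import get_close_matches
from typing import Dict, Optional


def resolve_name(intent: str, mapping: Dict[str, str], cutoff: float = 0.8) -> Optional[str]:
    """Return the mapped email for the closest key, or None if nothing is close."""
    normalized = intent.lower().strip()
    if normalized in mapping:
        return mapping[normalized]
    # Similarity-based lookup: tolerates typos, rejects loosely related names.
    matches = get_close_matches(normalized, list(mapping), n=1, cutoff=cutoff)
    return mapping[matches[0]] if matches else None


mapping = {"dev agarwal": "dev@example.com"}  # hypothetical entry
print(resolve_name("dev agarwl", mapping))
print(resolve_name("developer newsletter", mapping))
```

Because this compares whole strings, it works best when keys are full names; a key as short as `dev` would no longer match "dev agarwal" at this cutoff, which is a behaviour change from the substring approach.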
server/query_parser.py ADDED
@@ -0,0 +1,189 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Query Parser with Intent Classification and Name-to-Email Resolution
4
+ """
5
+
6
+ import json
7
+ import os
8
+ from datetime import datetime, timedelta
9
+ from openai import OpenAI
10
+ from typing import Dict, Optional, Tuple
11
+ from dotenv import load_dotenv
12
+
13
+ # Load environment variables from .env file
14
+ load_dotenv()
15
+
16
+ # Initialize OpenAI client
17
+ client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
18
+ # File paths
19
+ NAME_MAPPING_FILE = "name_mapping.json"
20
+ EMAIL_DB_FILE = "email_db.json"
21
+
22
+ def _llm(messages, model="gpt-4o-mini", temperature=0):
23
+ """Helper function to call OpenAI API"""
24
+ rsp = client.chat.completions.create(
25
+ model=model,
26
+ temperature=temperature,
27
+ messages=messages,
28
+ )
29
+ return rsp.choices[0].message.content.strip()
30
+
31
+ def _load_name_mapping() -> Dict[str, str]:
32
+ """Load name to email mapping from JSON file"""
33
+ if not os.path.exists(NAME_MAPPING_FILE):
34
+ return {}
35
+ try:
36
+ with open(NAME_MAPPING_FILE, "r") as f:
37
+ return json.load(f)
38
+ except (json.JSONDecodeError, IOError):
39
+ return {}
40
+
41
+ def _save_name_mapping(mapping: Dict[str, str]):
42
+ """Save name to email mapping to JSON file"""
43
+ with open(NAME_MAPPING_FILE, "w") as f:
44
+ json.dump(mapping, f, indent=2)
45
+
46
+ def _load_email_db() -> Dict:
47
+ """Load email database"""
48
+ if not os.path.exists(EMAIL_DB_FILE):
49
+ return {}
50
+ try:
51
+ with open(EMAIL_DB_FILE, "r") as f:
52
+ return json.load(f)
53
+ except (json.JSONDecodeError, IOError):
54
+ return {}
55
+
56
+ def _save_email_db(db: Dict):
57
+ """Save email database"""
58
+ with open(EMAIL_DB_FILE, "w") as f:
59
+ json.dump(db, f, indent=2)
60
+
61
+ def extract_query_info(query: str) -> Dict:
62
+ """
63
+ Extract intent and date range from user query using LLM
64
+ """
65
+ today_str = datetime.today().strftime("%d-%b-%Y")
66
+
67
+ system_prompt = f"""
68
+ You are an email query parser. Today is {today_str}.
69
+
70
+ Given a user query, extract:
71
+ 1. sender_intent: The person/entity they want emails from (could be name or email)
72
+ 2. start_date and end_date: Date range in DD-MMM-YYYY format
73
+
74
+ For relative dates:
75
+ - "last week" = 7 days ago to today
76
+ - "yesterday" = yesterday only
77
+ - "last month" = 30 days ago to today
78
+ - "last 3 days" = 3 days ago to today
79
+
80
+ Examples:
81
+ - "emails from dev agarwal last week" → sender_intent: "dev agarwal"
82
+ - "show amazon emails from last month" → sender_intent: "amazon"
83
+ - "emails from john@company.com yesterday" → sender_intent: "john@company.com"
84
+
85
+ Return ONLY valid JSON:
86
+ {{
87
+ "sender_intent": "extracted name or email",
88
+ "start_date": "DD-MMM-YYYY",
89
+ "end_date": "DD-MMM-YYYY"
90
+ }}
91
+ """
92
+
93
+ messages = [
94
+ {"role": "system", "content": system_prompt},
95
+ {"role": "user", "content": query}
96
+ ]
97
+
98
+ result = _llm(messages)
99
+ return json.loads(result)
100
+
101
+ def resolve_sender_email(sender_intent: str) -> Tuple[Optional[str], bool]:
102
+ """
103
+ Resolve sender intent to actual email address
104
+ Returns: (email_address, needs_user_input)
105
+ """
106
+ # Check if it's already an email address
107
+ if "@" in sender_intent:
108
+ return sender_intent.lower(), False
109
+
110
+ # Load name mapping
111
+ name_mapping = _load_name_mapping()
112
+
113
+ # Normalize the intent (lowercase for comparison)
114
+ normalized_intent = sender_intent.lower().strip()
115
+
116
+ # Check direct match
117
+ if normalized_intent in name_mapping:
118
+ return name_mapping[normalized_intent], False
119
+
120
+ # Check partial matches (substring containment in either direction)
121
+ for name, email in name_mapping.items():
122
+ if normalized_intent in name.lower() or name.lower() in normalized_intent:
123
+ return email, False
124
+
125
+ # No match found
126
+ return None, True
127
+
128
+ def store_name_email_mapping(name: str, email: str):
129
+ """Store new name to email mapping"""
130
+ name_mapping = _load_name_mapping()
131
+ name_mapping[name.lower().strip()] = email.lower().strip()
132
+ _save_name_mapping(name_mapping)
133
+
134
+ def parse_email_query(query: str) -> Dict:
135
+ """
136
+ Main function to parse email query
137
+ Returns structured response with next steps
138
+ """
139
+ try:
140
+ # Step 1: Extract intent and dates
141
+ query_info = extract_query_info(query)
142
+ sender_intent = query_info["sender_intent"]
143
+ start_date = query_info["start_date"]
144
+ end_date = query_info["end_date"]
145
+
146
+ # Step 2: Resolve sender email
147
+ email_address, needs_input = resolve_sender_email(sender_intent)
148
+
149
+ if needs_input:
150
+ # Need to ask user for email address
151
+ return {
152
+ "status": "need_email_input",
153
+ "sender_intent": sender_intent,
154
+ "start_date": start_date,
155
+ "end_date": end_date,
156
+ "message": f"I don't have an email address for '{sender_intent}'. Please provide the email address."
157
+ }
158
+ else:
159
+ # Ready to proceed with email scraping
160
+ return {
161
+ "status": "ready_to_scrape",
162
+ "sender_intent": sender_intent,
163
+ "resolved_email": email_address,
164
+ "start_date": start_date,
165
+ "end_date": end_date,
166
+ "message": f"Found email: {email_address} for '{sender_intent}'"
167
+ }
168
+
169
+ except Exception as e:
170
+ return {
171
+ "status": "error",
172
+ "error": str(e),
173
+ "message": "Failed to parse query"
174
+ }
175
+
176
+ # Test the parser
177
+ if __name__ == "__main__":
178
+ # Test cases
179
+ test_queries = [
180
+ "Show me emails from dev agarwal last week",
181
+ "emails from amazon in the last month",
182
+ "get john@company.com emails yesterday",
183
+ "emails from new person last 3 days"
184
+ ]
185
+
186
+ for query in test_queries:
187
+ print(f"\nQuery: {query}")
188
+ result = parse_email_query(query)
189
+ print(f"Result: {json.dumps(result, indent=2)}")
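`extract_query_info` above passes the model reply straight to `json.loads`, which raises if the reply is wrapped in a Markdown code fence, and it does not check that the returned dates are in DD-MMM-YYYY form. A minimal sketch of a more defensive parse follows; `parse_llm_json` and `FENCE` are hypothetical names, and the validation rules are assumptions drawn from the prompt in that function.

```python
import json
import re
from datetime import datetime
from typing import Dict

FENCE = "`" * 3  # Markdown code-fence marker, built at runtime
_FENCE_RE = re.compile(rf"^{FENCE}(?:json)?\s*|\s*{FENCE}$", re.IGNORECASE)


def parse_llm_json(reply: str) -> Dict:
    """Strip an optional Markdown fence, parse JSON, and validate the fields."""
    cleaned = _FENCE_RE.sub("", reply.strip())
    data = json.loads(cleaned)
    for key in ("start_date", "end_date"):
        # strptime raises ValueError for anything other than DD-MMM-YYYY.
        datetime.strptime(data[key], "%d-%b-%Y")
    if not data.get("sender_intent"):
        raise ValueError("sender_intent is missing or empty")
    return data


reply = (
    FENCE + "json\n"
    '{"sender_intent": "amazon", "start_date": "01-Jun-2025", "end_date": "07-Jun-2025"}\n'
    + FENCE
)
print(parse_llm_json(reply))
```

Any failure surfaces as an exception (`ValueError` or `KeyError`), which the existing `try`/`except` in `parse_email_query` would turn into an `"error"` status.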
server/routes.py ADDED
@@ -0,0 +1,206 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ FastAPI Routes for Email Query System
4
+ """
5
+
6
+ from fastapi import APIRouter, HTTPException
7
+ from pydantic import BaseModel, EmailStr
8
+ from typing import List, Dict, Optional
9
+ import json
10
+
11
+ # Import our modules
12
+ from query_parser import parse_email_query, store_name_email_mapping
13
+ from email_scraper import scrape_emails_from_sender
14
+
15
+ router = APIRouter()
16
+
17
+ # Pydantic models
18
+ class NaturalQuery(BaseModel):
19
+ query: str
20
+
21
+ class EmailMappingInput(BaseModel):
22
+ name: str
23
+ email: EmailStr
24
+
25
+ class EmailResponse(BaseModel):
26
+ date: str
27
+ time: str
28
+ subject: str
29
+ content: str
30
+ message_id: str
31
+
32
+ class QueryParseResponse(BaseModel):
33
+ status: str
34
+ sender_intent: Optional[str] = None
35
+ resolved_email: Optional[str] = None
36
+ start_date: Optional[str] = None
37
+ end_date: Optional[str] = None
38
+ message: str
39
+ error: Optional[str] = None
40
+
41
+ class EmailsResponse(BaseModel):
42
+ status: str
43
+ sender_intent: str
44
+ resolved_email: str
45
+ start_date: str
46
+ end_date: str
47
+ total_emails: int
48
+ emails: List[EmailResponse]
49
+ message: str
50
+
51
+ @router.post("/parse_query", response_model=QueryParseResponse)
52
+ def parse_email_query_endpoint(input_data: NaturalQuery):
53
+ """
54
+ Parse natural language query to extract intent and dates
55
+ """
56
+ try:
57
+ result = parse_email_query(input_data.query)
58
+ return QueryParseResponse(**result)
59
+ except Exception as e:
60
+ raise HTTPException(status_code=400, detail=f"Query parsing failed: {str(e)}")
61
+
62
+ @router.post("/add_email_mapping")
63
+ def add_email_mapping(mapping: EmailMappingInput):
64
+ """
65
+ Add new name to email mapping
66
+ """
67
+ try:
68
+ store_name_email_mapping(mapping.name, mapping.email)
69
+ return {
70
+ "status": "success",
71
+ "message": f"Mapping added: '{mapping.name}' → '{mapping.email}'"
72
+ }
73
+ except Exception as e:
74
+ raise HTTPException(status_code=400, detail=f"Failed to add mapping: {str(e)}")
75
+
76
+ @router.post("/get_emails", response_model=EmailsResponse)
77
+ def get_emails_from_query(input_data: NaturalQuery):
78
+ """
79
+ Complete flow: Parse query → Resolve email → Scrape emails
80
+ """
81
+ try:
82
+ # Step 1: Parse the query
83
+ parsed_result = parse_email_query(input_data.query)
84
+
85
+ if parsed_result["status"] == "need_email_input":
86
+ raise HTTPException(
87
+ status_code=400,
88
+ detail={
89
+ "type": "need_email_input",
90
+ "sender_intent": parsed_result["sender_intent"],
91
+ "message": parsed_result["message"]
92
+ }
93
+ )
94
+ elif parsed_result["status"] == "error":
95
+ raise HTTPException(status_code=400, detail=parsed_result["message"])
96
+
97
+ # Step 2: Scrape emails
98
+ emails = scrape_emails_from_sender(
99
+ parsed_result["resolved_email"],
100
+ parsed_result["start_date"],
101
+ parsed_result["end_date"]
102
+ )
103
+
104
+ # Step 3: Format response
105
+ email_responses = [
106
+ EmailResponse(
107
+ date=email["date"],
108
+ time=email["time"],
109
+ subject=email["subject"],
110
+ content=email["content"],
111
+ message_id=email["message_id"]
112
+ )
113
+ for email in emails
114
+ ]
115
+
116
+ return EmailsResponse(
117
+ status="success",
118
+ sender_intent=parsed_result["sender_intent"],
119
+ resolved_email=parsed_result["resolved_email"],
120
+ start_date=parsed_result["start_date"],
121
+ end_date=parsed_result["end_date"],
122
+ total_emails=len(emails),
123
+ emails=email_responses,
124
+ message=f"Found {len(emails)} emails from {parsed_result['resolved_email']}"
125
+ )
126
+
127
+ except HTTPException:
128
+ raise
129
+ except Exception as e:
130
+ raise HTTPException(status_code=500, detail=f"Email retrieval failed: {str(e)}")
131
+
132
+ @router.get("/view_mappings")
133
+ def view_name_mappings():
134
+ """
135
+ View all stored name to email mappings
136
+ """
137
+ try:
138
+ from query_parser import _load_name_mapping
139
+ mappings = _load_name_mapping()
140
+ return {
141
+ "status": "success",
142
+ "total_mappings": len(mappings),
143
+ "mappings": mappings
144
+ }
145
+ except Exception as e:
146
+ raise HTTPException(status_code=500, detail=f"Failed to load mappings: {str(e)}")
147
+
148
+ @router.get("/health")
149
+ def health_check():
150
+ """
151
+ Health check endpoint
152
+ """
153
+ return {
154
+ "status": "healthy",
155
+ "message": "Email query system is running"
156
+ }
157
+
158
+ # For testing - manual endpoint to add mapping and then query
159
+ @router.post("/complete_flow")
160
+ def complete_email_flow(input_data: dict):
161
+ """
162
+ Test endpoint for complete flow with optional mapping
163
+ Expected input:
164
+ {
165
+ "query": "emails from john last week",
166
+ "mapping": {"name": "john", "email": "john@example.com"} # optional
167
+ }
168
+ """
169
+ try:
170
+ query = input_data.get("query")
171
+ mapping = input_data.get("mapping")
172
+
173
+ if not query:
174
+ raise HTTPException(status_code=400, detail="Query is required")
175
+
176
+ # Add mapping if provided
177
+ if mapping:
178
+ store_name_email_mapping(mapping["name"], mapping["email"])
179
+
180
+ # Parse and get emails
181
+ parsed_result = parse_email_query(query)
182
+
183
+ if parsed_result["status"] == "need_email_input":
184
+ return {
185
+ "status": "need_mapping",
186
+ "message": parsed_result["message"],
187
+ "sender_intent": parsed_result["sender_intent"]
188
+ }
189
+
190
+ # Get emails
191
+ emails = scrape_emails_from_sender(
192
+ parsed_result["resolved_email"],
193
+ parsed_result["start_date"],
194
+ parsed_result["end_date"]
195
+ )
196
+
197
+ return {
198
+ "status": "success",
199
+ "query": query,
200
+ "parsed": parsed_result,
201
+ "total_emails": len(emails),
202
+ "emails": emails[:5] # Return first 5 emails
203
+ }
204
+
205
+ except Exception as e:
206
+ raise HTTPException(status_code=500, detail=str(e))
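In `complete_email_flow` above, the `HTTPException` raised for a missing query sits inside a `try` whose only handler is `except Exception`, so the intended 400 is re-wrapped as a 500. `get_emails_from_query` avoids this by re-raising `HTTPException` first. The pattern is sketched below with a stand-in exception class so that it runs without FastAPI installed; `ApiError` and `handle` are hypothetical names.

```python
class ApiError(Exception):
    """Stand-in for fastapi.HTTPException so the sketch runs without FastAPI."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def handle(payload: dict) -> dict:
    """Validate the payload, preserving deliberate client errors."""
    try:
        query = payload.get("query")
        if not query:
            raise ApiError(400, "Query is required")
        return {"status": "success", "query": query}
    except ApiError:
        # Deliberate client errors pass through with their own status code.
        raise
    except Exception as exc:
        # Anything unexpected is reported as a server error.
        raise ApiError(500, str(exc)) from exc


try:
    handle({})
except ApiError as err:
    print(err.status_code, err.detail)
```

Applied to the endpoint, this would amount to adding an `except HTTPException: raise` clause ahead of the generic handler.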