Noo88ear committed on
Commit
7498f2c
·
0 Parent(s):

🚀 Initial deployment of Multi-Agent Job Application Assistant


✅ Features:
- Gemini 2.5 Flash AI generation across all agents
- A2A Protocol for agent communication
- Advanced AI capabilities: Parallel processing, Temporal tracking, Observability
- Resume & Cover Letter generation with ATS optimization
- Job matching and aggregation from multiple sources
- Document export: Word, PowerPoint, Excel
- MCP server integration for tool interoperability
- Comprehensive agent ecosystem with 15+ specialized agents

πŸ› οΈ Technical Stack:
- Gradio with MCP support for web interface
- Google GenerativeAI for LLM operations
- Multi-source job integration (Adzuna, JobSpy, etc.)
- Document generation libraries (python-docx, python-pptx, openpyxl)
- Advanced memory and context management

🎯 Production Ready:
- Environment-based configuration
- Robust error handling and fallbacks
- Comprehensive testing and validation
- Enterprise-grade architecture

This view is limited to 50 files because it contains too many changes. See raw diff
Files changed (50)
  1. .env.example +12 -0
  2. README.md +439 -0
  3. agents/__init__.py +1 -0
  4. agents/__pycache__/__init__.cpython-313.pyc +0 -0
  5. agents/__pycache__/context_engineer.cpython-313.pyc +0 -0
  6. agents/__pycache__/context_scaler.cpython-313.pyc +0 -0
  7. agents/__pycache__/cover_letter_agent.cpython-313.pyc +0 -0
  8. agents/__pycache__/cv_owner.cpython-313.pyc +0 -0
  9. agents/__pycache__/guidelines.cpython-313.pyc +0 -0
  10. agents/__pycache__/job_agent.cpython-313.pyc +0 -0
  11. agents/__pycache__/linkedin_manager.cpython-313.pyc +0 -0
  12. agents/__pycache__/observability.cpython-313.pyc +0 -0
  13. agents/__pycache__/orchestrator.cpython-313.pyc +0 -0
  14. agents/__pycache__/parallel_executor.cpython-313.pyc +0 -0
  15. agents/__pycache__/pipeline.cpython-313.pyc +0 -0
  16. agents/__pycache__/profile_agent.cpython-313.pyc +0 -0
  17. agents/__pycache__/router_agent.cpython-313.pyc +0 -0
  18. agents/__pycache__/temporal_tracker.cpython-313.pyc +0 -0
  19. agents/a2a_cv_owner.py +356 -0
  20. agents/context_engineer.py +540 -0
  21. agents/context_scaler.py +504 -0
  22. agents/cover_letter_agent.py +143 -0
  23. agents/cv_owner.py +441 -0
  24. agents/guidelines.py +257 -0
  25. agents/job_agent.py +29 -0
  26. agents/linkedin_manager.py +120 -0
  27. agents/observability.py +431 -0
  28. agents/orchestrator.py +232 -0
  29. agents/parallel_executor.py +425 -0
  30. agents/pipeline.py +205 -0
  31. agents/profile_agent.py +39 -0
  32. agents/router_agent.py +18 -0
  33. agents/temporal_tracker.py +464 -0
  34. app.py +65 -0
  35. hf_app.py +1613 -0
  36. mcp/__init__.py +1 -0
  37. mcp/__pycache__/__init__.cpython-313.pyc +0 -0
  38. mcp/__pycache__/cover_letter_server.cpython-313.pyc +0 -0
  39. mcp/__pycache__/cv_owner_server.cpython-313.pyc +0 -0
  40. mcp/__pycache__/orchestrator_server.cpython-313.pyc +0 -0
  41. mcp/__pycache__/server_common.cpython-313.pyc +0 -0
  42. mcp/cover_letter_server.py +27 -0
  43. mcp/cv_owner_server.py +27 -0
  44. mcp/orchestrator_server.py +31 -0
  45. mcp/server_common.py +25 -0
  46. memory/__init__.py +1 -0
  47. memory/__pycache__/__init__.cpython-313.pyc +0 -0
  48. memory/__pycache__/store.cpython-313.pyc +0 -0
  49. memory/data/anthony_test__capco_lead_ai_2024__cover_letter.json +9 -0
  50. memory/data/anthony_test__capco_lead_ai_2024__cv_owner.json +45 -0
.env.example ADDED
@@ -0,0 +1,12 @@
# API Keys (required to enable the respective provider)
ANTHROPIC_API_KEY="your_anthropic_api_key_here"    # Required. Format: sk-ant-api03-...
PERPLEXITY_API_KEY="your_perplexity_api_key_here"  # Optional. Format: pplx-...
OPENAI_API_KEY="your_openai_api_key_here"          # Optional, for OpenAI models. Format: sk-proj-...
GOOGLE_API_KEY="your_google_api_key_here"          # Optional, for Google Gemini models.
MISTRAL_API_KEY="your_mistral_key_here"            # Optional, for Mistral AI models.
XAI_API_KEY="your_xai_key_here"                    # Optional, for xAI models.
GROQ_API_KEY="your_groq_key_here"                  # Optional, for Groq models.
OPENROUTER_API_KEY="your_openrouter_key_here"      # Optional, for OpenRouter models.
AZURE_OPENAI_API_KEY="your_azure_key_here"         # Optional, for Azure OpenAI models (requires endpoint in .taskmaster/config.json).
OLLAMA_API_KEY="your_ollama_api_key_here"          # Optional, for remote Ollama servers that require authentication.
GITHUB_API_KEY="your_github_api_key_here"          # Optional, for GitHub import/export features. Format: ghp_... or github_pat_...
README.md ADDED
@@ -0,0 +1,439 @@
## Multi‑Agent Job Application Assistant (Streamlit + Gradio/Hugging Face)

A production‑ready system to discover jobs, generate ATS‑optimized resumes and cover letters, and export documents to Word/PowerPoint/Excel. Includes secure LinkedIn OAuth (optional), multi‑source job aggregation, Gemini‑powered generation, and advanced agent capabilities (parallelism, temporal tracking, observability, context engineering).

---

### What you get
- **Two UIs**: Streamlit (`app.py`) and Gradio/HF (`hf_app.py`)
- **LinkedIn OAuth 2.0** (optional; CSRF‑safe state validation)
- **Job aggregation**: Adzuna (5k/month) plus resilient fallbacks
- **ATS‑optimized drafting**: resumes + cover letters (Gemini)
- **Office exports**:
  - Word resumes and cover letters (5 templates)
  - PowerPoint CV (4 templates)
  - Excel application tracker (5 analytical sheets)
- **Advanced agents**: parallel execution, temporal memory, observability/tracing, and context engineering/flywheel
- **LangExtract integration**: structured extraction with a Gemini key; robust regex fallback in constrained environments
- **New**: Router pipeline, Temporal KG integration, parallel‑agents demo, HF minimal Space branch
- **New (Aug 2025)**: UK resume rules, action‑verb upgrades, anti‑buzzword scrub, skills proficiency, remote readiness, Muse/Reed/Novorésumé/StandOut CV checklists, and interactive output controls (exact length, cycles, layout presets)

---

## Quickstart

### 1) Environment (.env)
Create a UTF‑8 `.env` (values are optional if you want mock mode). See `.env.example` for the full list of variables:
```ini
# Behavior
MOCK_MODE=true
PORT=7860

# LLM / Research
LLM_PROVIDER=gemini
LLM_MODEL=gemini-2.5-flash
GEMINI_API_KEY=
# Optional per-agent Gemini keys
GEMINI_API_KEY_CV=
GEMINI_API_KEY_COVER=
GEMINI_API_KEY_CHAT=
GEMINI_API_KEY_PARSER=
GEMINI_API_KEY_MATCH=
GEMINI_API_KEY_TAILOR=
OPENAI_API_KEY=
ANTHROPIC_API_KEY=

TAVILY_API_KEY=

# Job APIs
ADZUNA_APP_ID=
ADZUNA_APP_KEY=

# Office MCP (optional)
POWERPOINT_MCP_URL=http://localhost:3000
WORD_MCP_URL=http://localhost:3001
EXCEL_MCP_URL=http://localhost:3002

# LangExtract uses the GEMINI key by default
LANGEXTRACT_API_KEY=
```

Hardcoded keys have been removed from utility scripts. Use `switch_api_key.py` to write keys into `.env` safely, without embedding them in code.

### 2) Install
- Windows PowerShell
```powershell
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
```
- Linux/macOS
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

### 3) Run the apps
- Streamlit (PATH‑safe)
```powershell
python -m streamlit run app.py --server.port 8501
```
- Gradio / Hugging Face (avoid port conflicts)
```powershell
$env:PORT=7861; python hf_app.py
```
```bash
PORT=7861 python hf_app.py
```
The HF app binds to `0.0.0.0:$PORT`.
---

## 📊 System Architecture Overview

This is a **production-ready, multi-agent job application system** with sophisticated AI capabilities and enterprise-grade features:

### 🏗️ Core Architecture

#### **Dual Interface Design**
- **Streamlit Interface** (`app.py`) - Traditional web application for desktop use
- **Gradio/HF Interface** (`hf_app.py`) - Modern, mobile-friendly, deployable to Hugging Face Spaces

#### **Multi-Agent System** (15 Specialized Agents)

**Core Processing Agents:**
- **`OrchestratorAgent`** - Central coordinator managing workflow and job orchestration
- **`CVOwnerAgent`** - ATS-optimized resume generation with UK-specific formatting rules
- **`CoverLetterAgent`** - Personalized cover letter generation with keyword optimization
- **`ProfileAgent`** - Intelligent CV parsing and structured profile extraction
- **`JobAgent`** - Job posting analysis and requirement extraction
- **`RouterAgent`** - Dynamic routing based on payload state and workflow stage

**Advanced AI Agents:**
- **`ParallelExecutor`** - Concurrent processing for 3-5x faster multi-job handling
- **`TemporalTracker`** - Time-stamped application history and pattern analysis
- **`ObservabilityAgent`** - Real-time tracing, metrics collection, and monitoring
- **`ContextEngineer`** - Flywheel learning and context optimization
- **`ContextScaler`** - L1/L2/L3 memory management for scalable context handling
- **`LinkedInManager`** - OAuth 2.0 integration and profile synchronization
- **`MetaAgent`** - Combines outputs from multiple specialized analysis agents
- **`TriageAgent`** - Intelligent task prioritization and routing

#### **Guidelines Enforcement System** (`agents/guidelines.py`)
A comprehensive rule engine ensuring document quality:
- **UK Compliance**: British English, UK date formats (MMM YYYY), £ currency normalization
- **ATS Optimization**: Plain text formatting, keyword density, section structure
- **Content Quality**: Anti-buzzword filtering, action verb strengthening, first-person removal
- **Layout Rules**: Exact length enforcement, heading validation, bullet point formatting

### 🔌 Integration Ecosystem

#### **LLM Integration** (`services/llm.py`)
- **Multi-Provider Support**: OpenAI, Anthropic Claude, Google Gemini
- **Per-Agent API Keys**: Cost optimization through agent-specific key allocation
- **Intelligent Fallbacks**: Graceful degradation when providers are unavailable
- **Configurable Models**: Per-agent model selection for optimal performance/cost

#### **Job Aggregation** (`services/job_aggregator.py`, `services/jobspy_client.py`)
- **Primary Sources**: Adzuna API (5,000 jobs/month free tier)
- **JobSpy Integration**: Indeed, LinkedIn, Glassdoor aggregation
- **Additional APIs**: Remotive, The Muse, GitHub Jobs
- **Smart Deduplication**: Title + company matching with fuzzy logic
- **SSL Bypass**: Automatic retry for corporate environments
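The deduplication step can be sketched roughly as below. This is an illustrative stand-in, not the actual code in `services/job_aggregator.py`; the `job_key` shape and the 0.85 similarity threshold are assumptions.

```python
from difflib import SequenceMatcher

def job_key(job: dict) -> str:
    # Normalize title + company into one comparable string
    return f"{job['title'].lower().strip()}|{job['company'].lower().strip()}"

def is_duplicate(a: dict, b: dict, threshold: float = 0.85) -> bool:
    # Fuzzy match: near-identical keys count as the same posting
    return SequenceMatcher(None, job_key(a), job_key(b)).ratio() >= threshold

def dedupe(jobs: list[dict]) -> list[dict]:
    kept: list[dict] = []
    for job in jobs:
        if not any(is_duplicate(job, seen) for seen in kept):
            kept.append(job)
    return kept

jobs = [
    {"title": "Lead AI Engineer", "company": "Capco"},
    {"title": "Lead AI  Engineer", "company": "Capco"},  # near-duplicate (extra space)
    {"title": "Data Engineer", "company": "Acme Corp"},
]
unique = dedupe(jobs)
```

The same idea extends to fuzzier signals (location, normalized salary bands) if title + company alone over-merges.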

#### **Document Generation** (`services/`)
- **Word Documents** (`word_cv.py`): 5 professional templates, MCP server integration
- **PowerPoint CVs** (`powerpoint_cv.py`): 4 visual templates for presentations
- **Excel Trackers** (`excel_tracker.py`): 5 analytical sheets with metrics
- **PDF Export**: Cross-platform compatibility with formatting preservation

### 📈 Advanced Features

#### **Pipeline Architecture** (`agents/pipeline.py`)
```
User Input → Router → Profile Analysis → Job Analysis → Resume Generation → Cover Letter → Review → Memory Storage
               ↓            ↓                 ↓                ↓                 ↓           ↓
           Event Log   Profile Cache     Job Cache      Document Cache     Metrics Log  Temporal KG
```

#### **Memory & Persistence**
- **File-backed Storage** (`memory/store.py`): Atomic writes, thread-safe operations
- **Temporal Knowledge Graph**: Application tracking with time-stamped relationships
- **Event Sourcing** (`events.jsonl`): Complete audit trail of all agent actions
- **Caching System** (`utils/cache.py`): TTL-based caching with automatic eviction
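A TTL cache of the kind described above can be sketched in a few lines. The class name and API here are illustrative, not the actual interface of `utils/cache.py`; expired entries are evicted lazily on access.

```python
import time

class TTLCache:
    """Minimal sketch: entries expire ttl seconds after being set."""

    def __init__(self, ttl: float = 60.0):
        self.ttl = ttl
        self._store: dict = {}  # key -> (expires_at, value)

    def set(self, key, value) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # automatic eviction on access
            return default
        return value

cache = TTLCache(ttl=0.05)
cache.set("profile:anthony", {"name": "Anthony"})
hit = cache.get("profile:anthony")   # fresh entry
time.sleep(0.06)
miss = cache.get("profile:anthony")  # expired by now
```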

#### **LangExtract Integration** (`services/langextract_service.py`)
- **Structured Extraction**: Job requirements, skills, company culture
- **ATS Optimization**: Keyword extraction and scoring
- **Fallback Mechanisms**: Regex-based extraction when the API is unavailable
- **Result Caching**: Performance optimization for repeated analyses

### 🛡️ Security & Configuration

#### **Authentication & Security**
- **OAuth 2.0**: LinkedIn integration with CSRF protection
- **Input Sanitization**: Path traversal and injection prevention
- **Environment Isolation**: Secrets management via `.env`
- **Rate Limiting**: API throttling and abuse prevention

#### **Configuration Management**
- **Environment Variables**: All sensitive data in `.env`
- **Agent Configuration** (`utils/config.py`): Centralized settings
- **Template System**: Customizable document templates
- **Feature Flags**: Progressive enhancement based on available services

### 📁 Project Structure

```
2096955/
├── agents/                    # Multi-agent system components
│   ├── orchestrator.py        # Main orchestration logic
│   ├── cv_owner.py            # Resume generation with guidelines
│   ├── guidelines.py          # UK rules and ATS optimization
│   ├── pipeline.py            # Application pipeline flow
│   └── ...                    # Additional specialized agents
├── services/                  # External integrations and services
│   ├── llm.py                 # Multi-provider LLM client
│   ├── job_aggregator.py      # Job source aggregation
│   ├── word_cv.py             # Word document generation
│   └── ...                    # Document and API services
├── utils/                     # Utility functions and helpers
│   ├── ats.py                 # ATS scoring and optimization
│   ├── cache.py               # TTL caching system
│   ├── consistency.py         # Contradiction detection
│   └── ...                    # Text processing and helpers
├── models/                    # Data models and schemas
│   └── schemas.py             # Pydantic models for type safety
├── mcp/                       # Model Context Protocol servers
│   ├── cv_owner_server.py
│   ├── cover_letter_server.py
│   └── orchestrator_server.py
├── memory/                    # Persistent storage
│   ├── store.py               # File-backed memory store
│   └── data/                  # Application state and history
├── app.py                     # Streamlit interface
├── hf_app.py                  # Gradio/HF interface
└── api_llm_integration.py     # REST API endpoints
```

### 🚀 Performance Optimizations

- **Parallel Processing**: Async job handling with `asyncio` and `nest_asyncio`
- **Lazy Loading**: Dependencies loaded only when needed
- **Smart Caching**: Multi-level caching (memory, file, API responses)
- **Batch Operations**: Efficient multi-job processing
- **Event-Driven**: Asynchronous event handling for responsiveness

### 🧪 Testing & Quality

- **Test Suites**: Comprehensive tests in the `tests/` directory
- **Integration Tests**: API and service integration validation
- **Mock Mode**: Development without API keys
- **Smoke Tests**: Quick validation scripts for deployment
- **Observability**: Built-in tracing and metrics collection

---

## Router pipeline (User → Router → Profile → Job → Resume → Cover → Review)
- Implemented in `agents/pipeline.py` and exposed via API in `api_llm_integration.py` (`/api/llm/pipeline_run`).
- Agents:
  - `RouterAgent`: routes based on payload state
  - `ProfileAgent`: parses the CV into a structured profile (LLM with fallback)
  - `JobAgent`: analyzes the job posting (LLM with fallback)
  - `CVOwnerAgent` and `CoverLetterAgent`: draft documents (Gemini, per-agent keys)
  - Review: contradiction checks and memory persistence
- Temporal tracking: on review, a `drafted` status is recorded in the temporal KG with issues metadata.

**Flow diagram**
```mermaid
flowchart TD
    U["User"] --> R["RouterAgent"]
    R -->|cv_text present| P["ProfileAgent (LLM)"]
    R -->|job_posting present| J["JobAgent (LLM)"]
    P --> RESUME["CVOwnerAgent"]
    J --> RESUME
    RESUME --> COVER["CoverLetterAgent"]
    COVER --> REVIEW["Orchestrator Review"]
    REVIEW --> M["MemoryStore (file-backed)"]
    REVIEW --> TKG["Temporal KG (triplets)"]
    subgraph LLM["LLM Client (Gemini 2.5 Flash, per-agent keys)"]
        P
        J
        RESUME
        COVER
    end
    subgraph UI["Gradio (HF)"]
        U
    end
    subgraph API["Flask API"]
        PR["/api/llm/pipeline_run"]
    end
    U -. optional .-> PR
```
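The routing decision at the top of this flow is essentially a check on which payload fields are already filled. The function below is a minimal sketch of that idea, not the actual `RouterAgent` implementation; the field names are assumptions based on the pipeline described here.

```python
def route(payload: dict) -> str:
    """Pick the next pipeline stage from the payload state (illustrative)."""
    if payload.get("cv_text") and "profile" not in payload:
        return "profile"        # CV text present but not yet parsed
    if payload.get("job_posting") and "job" not in payload:
        return "job"            # job posting present but not yet analyzed
    if "resume" not in payload:
        return "resume"         # draft the resume next
    if "cover_letter" not in payload:
        return "cover"          # then the cover letter
    return "review"             # everything drafted: run review + persist

stage = route({"cv_text": "Anthony Lui, Lead AI Engineer..."})
```

Driving the loop is then just repeated calls to `route` with the payload each agent enriches.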

---

## Hugging Face / Gradio (interactive controls)
- In the CV Analysis tab, you can now set:
  - **Refinement cycles** (1–5)
  - **Exact target length** (characters) to enforce resume and cover‑letter length deterministically
  - **Layout preset**: `classic`, `modern`, `minimalist`, `executive`
    - classic: Summary → Skills → Experience → Education (Summary/Skills above the fold)
    - modern: Summary → Experience → Skills → Projects/Certifications → Education
    - minimalist: concise Summary → Skills → Experience → Education
    - executive: Summary → Selected Achievements (3–5) → Experience → Skills → Education → Certifications

---

## UK resume/cover rules (built-in)
- UK English and dates (MMM YYYY)
- Current role in present tense; previous roles in past tense
- Digits for numbers; £ and % normalization
- Remove first‑person pronouns in resume bullets; maintain active voice
- Hard skills first (max ~10), then soft skills; verbatim critical JD keywords in bullets
- Strip DOB/photo lines; compress older roles (>15 years) to title/company/dates

These rules are applied by `agents/cv_owner.py` and validated by checklists.
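To make the rules concrete, here is a tiny sketch of three of them (£ normalization, MMM YYYY dates, first-person scrub) as regex passes. This is illustrative only; the real implementation lives in `agents/cv_owner.py` and `agents/guidelines.py` and covers far more cases.

```python
import re

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def normalize_uk(text: str) -> str:
    """Apply a few illustrative UK resume rules to a bullet line."""
    # "GBP 2m" -> "£2m"
    text = re.sub(r"\bGBP\s?", "£", text)
    # "03/2021" -> "Mar 2021"
    text = re.sub(
        r"\b(0?[1-9]|1[0-2])/(\d{4})\b",
        lambda m: f"{MONTHS[int(m.group(1)) - 1]} {m.group(2)}",
        text,
    )
    # Drop a leading first-person pronoun in bullet lines
    text = re.sub(r"(?m)^(\s*[-*]\s*)I\s+", r"\1", text)
    return text

sample = "- I delivered savings of GBP 2m between 03/2021 and 11/2023"
out = normalize_uk(sample)
```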

---

## Checklists and observability
- Checklists integrate guidance from:
  - Reed: CV layout and mistakes
  - The Muse: action verbs and layout basics
  - Novorésumé: one‑page bias, clean sections, links
  - StandOut CV: quantification, bullet density, recent‑role focus
- The Observability tab aggregates per‑agent events and displays checklist outcomes. Events are stored in `memory/data/events.jsonl`.

---

## Scripts (headless runs)
- Capco (Anthony Lui → Capco):
```powershell
python .\scripts\run_with_env.py .\scripts\run_anthony_capco.py
```
- Anthropic (Anthony Lui → Anthropic):
```powershell
python .\scripts\run_with_env.py .\scripts\run_anthropic_job.py
```
- Pipeline (Router + Agents + Review + Events):
```powershell
python .\scripts\run_with_env.py .\scripts\pipeline_anthony_capco.py
```

These scripts print document lengths, agent diagnostics, and whether Gemini is enabled. Set `.env` with `LLM_PROVIDER=gemini`, `LLM_MODEL=gemini-2.5-flash`, and `GEMINI_API_KEY`.

---

## Temporal knowledge graph (micro‑memory)
- `agents/temporal_tracker.py` stores time‑stamped triplets with non‑destructive invalidation.
- Integrated into the pipeline review step to track job application states and history.
- Utilities for timelines, active applications, and pattern analysis are included.
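The core idea, time-stamped triplets where a new fact invalidates but does not delete its predecessor, can be sketched as follows. The class and field names are illustrative, not the actual API of `agents/temporal_tracker.py`.

```python
from datetime import datetime, timezone

class TemporalKG:
    """Sketch: time-stamped triplets with non-destructive invalidation."""

    def __init__(self):
        self.triplets: list[dict] = []

    def assert_fact(self, subject: str, predicate: str, obj: str) -> None:
        now = datetime.now(timezone.utc).isoformat()
        # Mark any live fact for this subject/predicate as invalidated,
        # but keep it so the history remains queryable.
        for t in self.triplets:
            if (t["subject"] == subject and t["predicate"] == predicate
                    and t["invalid_from"] is None):
                t["invalid_from"] = now
        self.triplets.append({
            "subject": subject, "predicate": predicate, "object": obj,
            "valid_from": now, "invalid_from": None,
        })

    def current(self, subject: str, predicate: str):
        for t in reversed(self.triplets):
            if (t["subject"] == subject and t["predicate"] == predicate
                    and t["invalid_from"] is None):
                return t["object"]
        return None

kg = TemporalKG()
kg.assert_fact("capco_lead_ai_2024", "status", "drafted")
kg.assert_fact("capco_lead_ai_2024", "status", "submitted")
```

A timeline query is then just a scan of `triplets` ordered by `valid_from`, which is what makes pattern analysis over past applications possible.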

---

## Parallel agents + meta‑agent demo
- Notebook: `notebooks/agents_parallel_demo.ipynb`
- Runs 4 analysis agents in parallel and combines their outputs via a meta‑agent, with a timeline plot.
- Uses the central LLM client (`services/llm.py`) with `LLM_PROVIDER=gemini` and `LLM_MODEL=gemini-2.5-flash`.

Run (Jupyter/VS Code):
```python
%pip install nest_asyncio matplotlib
# Ensure GEMINI_API_KEY is set in your environment
```
Then open and run the notebook cells.
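The fan-out/fan-in pattern the notebook demonstrates can be sketched with plain `asyncio.gather`. The agent names and the stub `run_agent` are illustrative; the notebook calls the real LLM client in `services/llm.py` instead.

```python
import asyncio

async def run_agent(name: str, job_text: str) -> str:
    # Stand-in for an LLM call (the notebook uses services/llm.py here)
    await asyncio.sleep(0.01)
    return f"{name}: analysis of {len(job_text)} chars"

async def meta_analyze(job_text: str) -> str:
    agents = ["skills", "culture", "requirements", "ats"]
    # Fan out: all four analyses run concurrently
    results = await asyncio.gather(*(run_agent(a, job_text) for a in agents))
    # Fan in: the meta-agent combines the parallel outputs
    return "\n".join(results)

summary = asyncio.run(meta_analyze("Senior Python Engineer role at ..."))
```

Because `gather` preserves argument order, the meta-agent can rely on result position even though the calls complete concurrently.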

---

## LinkedIn OAuth (optional)
1) Create a LinkedIn Developer App, then add redirect URLs:
```
http://localhost:8501
http://localhost:8501/callback
```
2) Products: enable "Sign In with LinkedIn using OpenID Connect".
3) Update `.env` and set `MOCK_MODE=false`.
4) In the UI, use the "LinkedIn Authentication" section to kick off the flow.

Notes:
- The LinkedIn Jobs API is enterprise‑only. The system uses Adzuna and other sources for job data.

---

## Job sources
- **Adzuna**: global coverage, 5,000 free jobs/month
- **Resilient aggregator** and optional **JobSpy MCP** for broader search
- **Custom jobs**: add your own postings in the UI
- Corporate SSL environments: Adzuna calls automatically retry with a `verify=False` fallback

---

## LLMs and configuration
- The central client supports OpenAI, Anthropic, and Gemini with per‑agent Gemini keys (`services/llm.py`).
- Recommended defaults for this project:
  - `LLM_PROVIDER=gemini`
  - `LLM_MODEL=gemini-2.5-flash`
- Agents pass `agent="cv|cover|parser|match|tailor|chat"` to use per‑agent keys when provided.
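The per-agent key lookup amounts to "agent-specific variable first, shared `GEMINI_API_KEY` as fallback". A minimal sketch, assuming the variable names from `.env.example` (the function itself is illustrative, not the actual `services/llm.py` code):

```python
import os

PER_AGENT_KEYS = {
    "cv": "GEMINI_API_KEY_CV",
    "cover": "GEMINI_API_KEY_COVER",
    "chat": "GEMINI_API_KEY_CHAT",
    "parser": "GEMINI_API_KEY_PARSER",
    "match": "GEMINI_API_KEY_MATCH",
    "tailor": "GEMINI_API_KEY_TAILOR",
}

def gemini_key_for(agent: str):
    """Return the agent-specific key if set, else the shared key, else None."""
    specific = os.getenv(PER_AGENT_KEYS.get(agent, ""), "")
    return specific or os.getenv("GEMINI_API_KEY") or None

# Demo values only; real keys come from .env
os.environ["GEMINI_API_KEY"] = "shared-key"
os.environ["GEMINI_API_KEY_CV"] = "cv-key"
```

This is what lets you route, say, high-volume parser calls to a cheaper key while keeping drafting on a separate quota.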

---

## Advanced agents (built‑in)
- **Parallel processing**: 3–5× faster multi‑job drafting
- **Temporal tracking**: time‑stamped history and pattern analysis
- **Observability**: tracing, metrics, timeline visualization
- **Context engineering**: flywheel learning, L1/L2/L3 memory, scalable context

Toggle these in the HF app under "🚀 Advanced AI Features".

---

## LangExtract + Gemini
- Uses the same `GEMINI_API_KEY` (auto‑applied to `LANGEXTRACT_API_KEY` when empty)
- The official `langextract.extract(...)` requires examples; the UI also exposes a robust regex‑based fallback (`services/langextract_service.py`) so features work even when cloud extraction is constrained
- In the HF app ("🔍 Enhanced Job Analysis"), you can:
  - Analyze job postings (structured fields + skills)
  - Optimize a resume for ATS (score + missing keywords)
  - Bulk analyze multiple jobs
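The regex fallback and ATS scoring can be sketched together: extract known skill keywords from the posting, then score the resume by keyword coverage. The pattern list and scoring formula are assumptions for illustration, not the actual logic in `services/langextract_service.py` or `utils/ats.py`.

```python
import re

SKILL_PATTERNS = [r"\bPython\b", r"\bSQL\b", r"\bAWS\b", r"\bKubernetes\b"]

def extract_skills(job_posting: str) -> list[str]:
    """Regex fallback: return skill keywords found in the posting."""
    found = []
    for pattern in SKILL_PATTERNS:
        m = re.search(pattern, job_posting, flags=re.IGNORECASE)
        if m and m.group(0) not in found:
            found.append(m.group(0))
    return found

def ats_score(resume: str, required: list[str]) -> float:
    """Fraction of required keywords present in the resume."""
    if not required:
        return 1.0
    hits = sum(1 for kw in required
               if re.search(rf"\b{re.escape(kw)}\b", resume, re.IGNORECASE))
    return hits / len(required)

posting = "We need Python and SQL; AWS a plus."
skills = extract_skills(posting)
score = ats_score("Python developer with AWS experience", skills)
```

The "missing keywords" report is then just the required keywords that did not hit.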

---

## Office exports
- **Word** (`services/word_cv.py`): resumes + cover letters (5 templates; `python-docx` fallback)
- **PowerPoint** (`services/powerpoint_cv.py`): visual CV (4 templates; `python-pptx` fallback)
- **Excel** (`services/excel_tracker.py`): tracker with 5 analytical sheets (`openpyxl` fallback)
- MCP servers are used when available; local libraries otherwise

In the HF app, after generation, expand:
- "📊 Export to PowerPoint CV"
- "📝 Export to Word Documents"
- "📈 Export Excel Tracker"

---

## Hugging Face minimal Space branch
- A clean branch containing only `app.py` and `requirements.txt` for Spaces.
- Branch name: `hf-space-min` (push from a clean worktree).
- `.gitignore` includes `.env` and `.env.*` to avoid leaking secrets.

---

## Tests & scripts
- Run the test suites in `tests/`
- Useful scripts: `test_*` files in the project root (integration checks)

---

## Security
- OAuth state validation; input/path/URL sanitization
- Sensitive data via environment variables; avoid committing secrets
- Atomic writes in the memory store

---

## Run summary
- Streamlit: `python -m streamlit run app.py --server.port 8501`
- Gradio/HF: `PORT=7861 python hf_app.py`

The system is fully documented here in one place and ready for local or HF deployment.
agents/__init__.py ADDED
@@ -0,0 +1 @@
# agents package
agents/__pycache__/__init__.cpython-313.pyc ADDED
Binary file (186 Bytes).
agents/__pycache__/context_engineer.cpython-313.pyc ADDED
Binary file (24.9 kB).
agents/__pycache__/context_scaler.cpython-313.pyc ADDED
Binary file (21.9 kB).
agents/__pycache__/cover_letter_agent.cpython-313.pyc ADDED
Binary file (8.97 kB).
agents/__pycache__/cv_owner.cpython-313.pyc ADDED
Binary file (27.1 kB).
agents/__pycache__/guidelines.cpython-313.pyc ADDED
Binary file (15 kB).
agents/__pycache__/job_agent.cpython-313.pyc ADDED
Binary file (1.46 kB).
agents/__pycache__/linkedin_manager.cpython-313.pyc ADDED
Binary file (6.72 kB).
agents/__pycache__/observability.cpython-313.pyc ADDED
Binary file (18.7 kB).
agents/__pycache__/orchestrator.cpython-313.pyc ADDED
Binary file (11.2 kB).
agents/__pycache__/parallel_executor.cpython-313.pyc ADDED
Binary file (15.7 kB).
agents/__pycache__/pipeline.cpython-313.pyc ADDED
Binary file (12.6 kB).
agents/__pycache__/profile_agent.cpython-313.pyc ADDED
Binary file (2.06 kB).
agents/__pycache__/router_agent.cpython-313.pyc ADDED
Binary file (1.51 kB).
agents/__pycache__/temporal_tracker.cpython-313.pyc ADDED
Binary file (20.6 kB).
agents/a2a_cv_owner.py ADDED
@@ -0,0 +1,356 @@
"""
A2A Protocol Implementation for CV Owner Agent
Proof of Concept showing how agents can communicate via A2A protocol
"""

import json
import asyncio
from typing import Dict, Any, List, Optional
from datetime import datetime
from dataclasses import dataclass, asdict
import aiohttp
from aiohttp import web
import logging

# Import existing CV Owner logic
from agents.cv_owner import CVOwnerAgent as OriginalCVOwner
from models.schemas import JobPosting, ResumeDraft

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@dataclass
class AgentCard:
    """Agent discovery card following A2A specification"""
    name: str
    description: str
    version: str
    endpoint: str
    capabilities: List[str]
    interaction_modes: List[str]
    auth_required: bool = False

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)


class A2ACVOwnerAgent:
    """CV Owner Agent implementing A2A Protocol"""

    def __init__(self, port: int = 8001):
        self.port = port
        self.name = "cv_owner_service"
        self.version = "1.0.0"
        self.original_agent = OriginalCVOwner()
        self.app = web.Application()
        self.setup_routes()

        # Agent Card for discovery
        self.card = AgentCard(
            name=self.name,
            description="ATS-optimized resume generation with UK formatting rules",
            version=self.version,
            endpoint=f"http://localhost:{self.port}",
            capabilities=[
                "resume.generate",
                "resume.refine",
                "resume.optimize_ats",
                "resume.validate_uk_format"
            ],
            interaction_modes=["sync", "async", "stream"],
            auth_required=False
        )

    def setup_routes(self):
        """Setup A2A JSON-RPC 2.0 routes"""
        self.app.router.add_post('/rpc', self.handle_rpc)
        self.app.router.add_get('/agent-card', self.get_agent_card)
        self.app.router.add_get('/health', self.health_check)

    async def get_agent_card(self, request: web.Request) -> web.Response:
        """Return agent discovery card"""
        return web.json_response(self.card.to_dict())

    async def health_check(self, request: web.Request) -> web.Response:
        """Health check endpoint"""
        return web.json_response({
            "status": "healthy",
            "agent": self.name,
            "version": self.version,
            "timestamp": datetime.now().isoformat()
        })

    async def handle_rpc(self, request: web.Request) -> web.Response:
        """Handle JSON-RPC 2.0 requests"""
        try:
            data = await request.json()

            # Validate JSON-RPC request
            if "jsonrpc" not in data or data["jsonrpc"] != "2.0":
                return self.error_response(
                    -32600, "Invalid Request", data.get("id")
                )

            method = data.get("method")
            params = data.get("params", {})
            request_id = data.get("id")

            # Route to appropriate method
            if method == "resume.generate":
                result = await self.generate_resume(params)
            elif method == "resume.refine":
                result = await self.refine_resume(params)
            elif method == "resume.optimize_ats":
                result = await self.optimize_ats(params)
            elif method == "resume.validate_uk_format":
                result = await self.validate_uk_format(params)
            elif method == "_capabilities":
                result = self.get_capabilities()
            else:
                return self.error_response(
                    -32601, f"Method not found: {method}", request_id
                )

            # Return success response
            return web.json_response({
                "jsonrpc": "2.0",
                "result": result,
                "id": request_id
            })

        except Exception as e:
            logger.error(f"RPC error: {str(e)}")
            return self.error_response(
                -32603, f"Internal error: {str(e)}",
                data.get("id") if "data" in locals() else None
            )

    def error_response(self, code: int, message: str, request_id: Any) -> web.Response:
        """Create JSON-RPC error response"""
        return web.json_response({
            "jsonrpc": "2.0",
            "error": {
                "code": code,
                "message": message
            },
            "id": request_id
        })

    async def generate_resume(self, params: Dict[str, Any]) -> Dict[str, Any]:
        """Generate resume via A2A protocol"""
        try:
            # Extract parameters
            job_data = params.get("job", {})
            cv_text = params.get("cv_text", "")
            target_length = params.get("target_length", 4000)

            # Convert to JobPosting object
            job = JobPosting(
                id=job_data.get("id", "unknown"),
                title=job_data.get("title", ""),
                company=job_data.get("company", ""),
                description=job_data.get("description", ""),
                location=job_data.get("location", ""),
                salary_min=job_data.get("salary_min"),
                salary_max=job_data.get("salary_max")
            )

            # Generate using original agent
            result = self.original_agent.generate_resume(
                job, cv_text, target_length=target_length
            )

            # Return A2A-formatted response
            return {
                "resume_text": result.text,
                "metadata": result.metadata,
                "ats_score": getattr(result, "ats_score", 0.85),
                "keywords": getattr(result, "keywords", []),
                "generation_time": datetime.now().isoformat(),
                "agent": self.name
            }

        except Exception as e:
            logger.error(f"Resume generation error: {str(e)}")
            raise

    async def refine_resume(self, params: Dict[str, Any]) -> Dict[str, Any]:
        """Refine existing resume"""
        try:
            resume_text = params.get("resume_text", "")
            feedback = params.get("feedback", {})

            # Use original agent's refinement logic
            refined = self.original_agent.refine_resume(
                resume_text, feedback
            )

            return {
                "refined_text": refined.text,
                "changes_made": refined.metadata.get("changes", []),
                "refinement_time": datetime.now().isoformat()
            }

        except Exception as e:
            logger.error(f"Resume refinement error: {str(e)}")
            raise

    async def optimize_ats(self, params: Dict[str, Any]) -> Dict[str, Any]:
        """Optimize resume for ATS"""
        resume_text = params.get("resume_text", "")
        job_description = params.get("job_description", "")

        # Perform ATS optimization
        optimized = self.original_agent.optimize_for_ats(
            resume_text, job_description
        )

        return {
            "optimized_text": optimized["text"],
            "ats_score": optimized["score"],
            "keywords_added": optimized["keywords"],
            "optimization_time": datetime.now().isoformat()
        }

    async def validate_uk_format(self, params: Dict[str, Any]) -> Dict[str, Any]:
217
+ """Validate UK formatting rules"""
218
+ resume_text = params.get("resume_text", "")
219
+
220
+ # Check UK formatting
221
+ issues = []
222
+
223
+ # Check for US date formats
224
+ if "January 2024" not in resume_text and "/2024" in resume_text:
225
+ issues.append("Use UK date format (MMM YYYY)")
226
+
227
+ # Check for US spelling
228
+ us_words = ["optimize", "analyze", "organization"]
229
+ for word in us_words:
230
+ if word in resume_text.lower():
231
+ issues.append(f"Use UK spelling for '{word}'")
232
+
233
+ return {
234
+ "is_valid": len(issues) == 0,
235
+ "issues": issues,
236
+ "validation_time": datetime.now().isoformat()
237
+ }
238
+
239
+ def get_capabilities(self) -> Dict[str, Any]:
240
+ """Return agent capabilities"""
241
+ return {
242
+ "capabilities": self.card.capabilities,
243
+ "version": self.version,
244
+ "interaction_modes": self.card.interaction_modes,
245
+ "max_resume_length": 5000,
246
+ "supported_formats": ["text", "markdown"],
247
+ "uk_formatting": True,
248
+ "ats_optimization": True
249
+ }
250
+
251
+ async def register_with_registry(self, registry_url: str):
252
+ """Register this agent with A2A registry"""
253
+ async with aiohttp.ClientSession() as session:
254
+ try:
255
+ async with session.post(
256
+ f"{registry_url}/register",
257
+ json=self.card.to_dict()
258
+ ) as response:
259
+ if response.status == 200:
260
+ logger.info(f"Registered {self.name} with registry")
261
+ else:
262
+ logger.error(f"Registration failed: {await response.text()}")
263
+ except Exception as e:
264
+ logger.error(f"Could not register with registry: {e}")
265
+
266
+ def run(self):
267
+ """Start the A2A agent server"""
268
+ logger.info(f"Starting {self.name} on port {self.port}")
269
+ logger.info(f"Agent Card available at http://localhost:{self.port}/agent-card")
270
+ logger.info(f"RPC endpoint at http://localhost:{self.port}/rpc")
271
+
272
+ web.run_app(self.app, host='0.0.0.0', port=self.port)
273
+
274
+
275
+ class A2AClient:
276
+ """Client for communicating with A2A agents"""
277
+
278
+ def __init__(self, agent_endpoint: str):
279
+ self.endpoint = agent_endpoint
280
+ self.session = None
281
+
282
+ async def __aenter__(self):
283
+ self.session = aiohttp.ClientSession()
284
+ return self
285
+
286
+ async def __aexit__(self, exc_type, exc_val, exc_tb):
287
+ if self.session:
288
+ await self.session.close()
289
+
290
+ async def call(self, method: str, params: Dict[str, Any] = None) -> Any:
291
+ """Call an A2A agent method"""
292
+ if not self.session:
293
+ self.session = aiohttp.ClientSession()
294
+
295
+ request = {
296
+ "jsonrpc": "2.0",
297
+ "method": method,
298
+ "params": params or {},
299
+ "id": datetime.now().timestamp()
300
+ }
301
+
302
+ async with self.session.post(
303
+ f"{self.endpoint}/rpc",
304
+ json=request
305
+ ) as response:
306
+ data = await response.json()
307
+
308
+ if "error" in data:
309
+ raise Exception(f"RPC Error: {data['error']}")
310
+
311
+ return data.get("result")
312
+
313
+ async def get_agent_card(self) -> Dict[str, Any]:
314
+ """Get agent's discovery card"""
315
+ if not self.session:
316
+ self.session = aiohttp.ClientSession()
317
+
318
+ async with self.session.get(
319
+ f"{self.endpoint}/agent-card"
320
+ ) as response:
321
+ return await response.json()
322
+
323
+
324
+ async def test_a2a_agent():
325
+ """Test the A2A CV Owner Agent"""
326
+ # Start agent in background
327
+ agent = A2ACVOwnerAgent()
328
+
329
+ # In production, this would run in separate process
330
+ # For testing, we'll use the client
331
+
332
+ async with A2AClient("http://localhost:8001") as client:
333
+ # Get agent card
334
+ card = await client.get_agent_card()
335
+ print(f"Agent: {card['name']}")
336
+ print(f"Capabilities: {card['capabilities']}")
337
+
338
+ # Generate resume
339
+ result = await client.call("resume.generate", {
340
+ "job": {
341
+ "id": "test_job",
342
+ "title": "Senior AI Engineer",
343
+ "company": "TechCorp",
344
+ "description": "Looking for AI expert with LLM experience..."
345
+ },
346
+ "cv_text": "John Doe, AI Engineer with 5 years experience..."
347
+ })
348
+
349
+ print(f"Generated resume: {result['resume_text'][:200]}...")
350
+ print(f"ATS Score: {result['ats_score']}")
351
+
352
+
353
+ if __name__ == "__main__":
354
+ # Run the A2A agent
355
+ agent = A2ACVOwnerAgent()
356
+ agent.run()
agents/context_engineer.py ADDED
@@ -0,0 +1,540 @@
+"""
+Context Engineering System
+Implements the complete context engineering framework for optimal LLM performance
+Based on the three-step evolution: Retrieval/Generation → Processing → Management
+"""
+
+import json
+import logging
+from typing import Dict, List, Any, Optional, Tuple
+from datetime import datetime, timedelta
+from dataclasses import dataclass, field
+import hashlib
+from collections import deque
+import numpy as np
+from pathlib import Path
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass
+class ContextChunk:
+    """A unit of context with metadata"""
+    content: str
+    source: str
+    timestamp: datetime
+    relevance_score: float = 0.0
+    token_count: int = 0
+    embedding: Optional[np.ndarray] = None
+    metadata: Dict = field(default_factory=dict)
+    compression_ratio: float = 1.0
+    access_count: int = 0
+    last_accessed: Optional[datetime] = None
+
+    def update_access(self):
+        """Update access statistics"""
+        self.access_count += 1
+        self.last_accessed = datetime.now()
+
+
+class DataFlywheel:
+    """
+    NVIDIA's concept: continuous improvement through input/output pairing.
+    Learns from successful context usage to optimize future retrievals.
+    """
+
+    def __init__(self, storage_path: str = "flywheel_data.json"):
+        self.storage_path = Path(storage_path)
+        self.successful_contexts: List[Dict] = []
+        self.feedback_pairs: List[Tuple[str, str, float]] = []  # (input, output, score)
+        self.pattern_cache: Dict[str, List[str]] = {}
+        self.load()
+
+    def record_success(
+        self,
+        input_context: str,
+        output: str,
+        success_score: float,
+        context_chunks: List[ContextChunk]
+    ):
+        """Record successful context usage for learning"""
+        self.successful_contexts.append({
+            'timestamp': datetime.now().isoformat(),
+            'input': input_context[:500],  # Truncate for storage
+            'output': output[:500],
+            'score': success_score,
+            'chunks_used': [c.source for c in context_chunks],
+            'avg_relevance': np.mean([c.relevance_score for c in context_chunks])
+        })
+
+        # Update pattern cache
+        key = self._generate_pattern_key(input_context)
+        if key not in self.pattern_cache:
+            self.pattern_cache[key] = []
+        self.pattern_cache[key].extend([c.source for c in context_chunks])
+
+        self.save()
+
+    def get_recommended_sources(self, query: str) -> List[str]:
+        """Get recommended context sources based on past successes"""
+        key = self._generate_pattern_key(query)
+
+        if key in self.pattern_cache:
+            # Return the most frequently used sources for similar queries
+            sources = self.pattern_cache[key]
+            from collections import Counter
+            return [s for s, _ in Counter(sources).most_common(5)]
+
+        return []
+
+    def _generate_pattern_key(self, text: str) -> str:
+        """Generate a pattern key for caching"""
+        # Simple keyword extraction for pattern matching
+        keywords = sorted(set(text.lower().split()[:10]))
+        return hashlib.md5('_'.join(keywords).encode()).hexdigest()[:8]
+
+    def save(self):
+        """Persist flywheel data"""
+        data = {
+            'successful_contexts': self.successful_contexts[-100:],  # Keep last 100
+            'pattern_cache': {k: v[-20:] for k, v in self.pattern_cache.items()}  # Keep last 20 per pattern
+        }
+        with open(self.storage_path, 'w') as f:
+            json.dump(data, f, indent=2)
+
+    def load(self):
+        """Load flywheel data"""
+        if self.storage_path.exists():
+            try:
+                with open(self.storage_path, 'r') as f:
+                    data = json.load(f)
+                self.successful_contexts = data.get('successful_contexts', [])
+                self.pattern_cache = data.get('pattern_cache', {})
+            except Exception as e:
+                logger.error(f"Error loading flywheel data: {e}")
+
+
+class ContextProcessor:
+    """
+    Step 2: Process and refine raw context.
+    Handles chunking, embedding, relevance scoring, and compression.
+    """
+
+    def __init__(self, max_chunk_size: int = 500, overlap: int = 50):
+        self.max_chunk_size = max_chunk_size
+        self.overlap = overlap
+
+    def process_context(
+        self,
+        raw_context: str,
+        query: str,
+        source: str = "unknown"
+    ) -> List[ContextChunk]:
+        """Process raw context into optimized chunks"""
+
+        # 1. Chunk the context
+        chunks = self._chunk_text(raw_context)
+
+        # 2. Create ContextChunk objects
+        context_chunks = []
+        for chunk_text in chunks:
+            chunk = ContextChunk(
+                content=chunk_text,
+                source=source,
+                timestamp=datetime.now(),
+                token_count=len(chunk_text.split()),
+                relevance_score=self._calculate_relevance(chunk_text, query)
+            )
+
+            # 3. Apply compression if needed
+            if chunk.token_count > 100:
+                chunk.content, chunk.compression_ratio = self._compress_text(chunk_text)
+
+            context_chunks.append(chunk)
+
+        # 4. Sort by relevance
+        context_chunks.sort(key=lambda c: c.relevance_score, reverse=True)
+
+        return context_chunks
+
+    def _chunk_text(self, text: str) -> List[str]:
+        """Smart chunking with overlap"""
+        words = text.split()
+        chunks = []
+
+        for i in range(0, len(words), self.max_chunk_size - self.overlap):
+            chunk = ' '.join(words[i:i + self.max_chunk_size])
+            chunks.append(chunk)
+
+        return chunks
+
+    def _calculate_relevance(self, chunk: str, query: str) -> float:
+        """Calculate the relevance score between a chunk and a query"""
+        # Simple keyword-overlap scoring (would use embeddings in production)
+        query_words = set(query.lower().split())
+        chunk_words = set(chunk.lower().split())
+
+        if not query_words:
+            return 0.0
+
+        overlap = len(query_words & chunk_words)
+        return overlap / len(query_words)
+
+    def _compress_text(self, text: str) -> Tuple[str, float]:
+        """Compress text by removing redundancy"""
+        # Simple compression: remove duplicate sentences
+        sentences = text.split('.')
+        unique_sentences = []
+        seen = set()
+
+        for sent in sentences:
+            sent_clean = sent.strip().lower()
+            if sent_clean and sent_clean not in seen:
+                unique_sentences.append(sent.strip())
+                seen.add(sent_clean)
+
+        compressed = '. '.join(unique_sentences)
+        if unique_sentences and not compressed.endswith('.'):
+            compressed += '.'
+
+        compression_ratio = len(compressed) / len(text) if text else 1.0
+        return compressed, compression_ratio
+
+
+class MemoryHierarchy:
+    """
+    Hierarchical memory system with different levels:
+    L1: hot cache (immediate access)
+    L2: working memory (recent contexts)
+    L3: long-term storage (compressed historical)
+    """
+
+    def __init__(
+        self,
+        l1_size: int = 10,
+        l2_size: int = 100,
+        l3_path: str = "long_term_memory.json"
+    ):
+        self.l1_cache: deque = deque(maxlen=l1_size)  # Most recent/relevant
+        self.l2_memory: deque = deque(maxlen=l2_size)  # Working memory
+        self.l3_storage_path = Path(l3_path)
+        self.l3_index: Dict[str, Dict] = {}  # Index for long-term storage
+        self.load_l3()
+
+    def add_context(self, chunk: ContextChunk):
+        """Add context to the appropriate memory level"""
+        # High relevance goes to L1
+        if chunk.relevance_score > 0.8:
+            self.l1_cache.append(chunk)
+        # Medium relevance to L2
+        elif chunk.relevance_score > 0.5:
+            self.l2_memory.append(chunk)
+        # Everything gets indexed in L3
+        self._add_to_l3(chunk)
+
+    def retrieve(
+        self,
+        query: str,
+        max_chunks: int = 10,
+        recency_weight: float = 0.3
+    ) -> List[ContextChunk]:
+        """Retrieve relevant context from all memory levels"""
+        all_chunks = []
+
+        # Get from all levels
+        all_chunks.extend(list(self.l1_cache))
+        all_chunks.extend(list(self.l2_memory))
+
+        # Score chunks considering both relevance and recency
+        now = datetime.now()
+        for chunk in all_chunks:
+            # Calculate recency score (0-1, where 1 is most recent)
+            age_hours = (now - chunk.timestamp).total_seconds() / 3600
+            recency_score = max(0, 1 - (age_hours / 168))  # Decay over a week
+
+            # Combine relevance and recency
+            chunk.metadata['combined_score'] = (
+                chunk.relevance_score * (1 - recency_weight) +
+                recency_score * recency_weight
+            )
+
+        # Sort by combined score
+        all_chunks.sort(
+            key=lambda c: c.metadata.get('combined_score', 0),
+            reverse=True
+        )
+
+        # Update access statistics
+        for chunk in all_chunks[:max_chunks]:
+            chunk.update_access()
+
+        return all_chunks[:max_chunks]
+
+    def _add_to_l3(self, chunk: ContextChunk):
+        """Add to the long-term storage index"""
+        key = hashlib.md5(chunk.content.encode()).hexdigest()[:16]
+
+        self.l3_index[key] = {
+            'source': chunk.source,
+            'timestamp': chunk.timestamp.isoformat(),
+            'relevance': chunk.relevance_score,
+            'summary': chunk.content[:100],  # Store summary only
+            'access_count': chunk.access_count
+        }
+
+        # Periodically save
+        if len(self.l3_index) % 10 == 0:
+            self.save_l3()
+
+    def save_l3(self):
+        """Save long-term memory to disk"""
+        with open(self.l3_storage_path, 'w') as f:
+            json.dump(self.l3_index, f, indent=2)
+
+    def load_l3(self):
+        """Load long-term memory from disk"""
+        if self.l3_storage_path.exists():
+            try:
+                with open(self.l3_storage_path, 'r') as f:
+                    self.l3_index = json.load(f)
+            except Exception as e:
+                logger.error(f"Error loading L3 memory: {e}")
+
+
+class MultiModalContext:
+    """
+    Handle types of context beyond text:
+    temporal, spatial, participant states, intentional, cultural.
+    """
+
+    def __init__(self):
+        self.temporal_context: List[Dict] = []  # Time-based relationships
+        self.spatial_context: Dict = {}  # Location/geometry
+        self.participant_states: Dict[str, Dict] = {}  # Entity tracking
+        self.intentional_context: Dict = {}  # Goals and motivations
+        self.cultural_context: Dict = {}  # Social/cultural nuances
+
+    def add_temporal_context(
+        self,
+        event: str,
+        timestamp: datetime,
+        duration: Optional[timedelta] = None,
+        related_events: Optional[List[str]] = None
+    ):
+        """Add time-based context"""
+        self.temporal_context.append({
+            'event': event,
+            'timestamp': timestamp,
+            'duration': duration,
+            'related': related_events or []
+        })
+
+        # Sort by timestamp
+        self.temporal_context.sort(key=lambda x: x['timestamp'])
+
+    def add_participant_state(
+        self,
+        participant_id: str,
+        state: Dict,
+        timestamp: Optional[datetime] = None
+    ):
+        """Track participant/entity states over time"""
+        if participant_id not in self.participant_states:
+            self.participant_states[participant_id] = {
+                'current': state,
+                'history': []
+            }
+        else:
+            # Archive current state
+            self.participant_states[participant_id]['history'].append({
+                'state': self.participant_states[participant_id]['current'],
+                'timestamp': timestamp or datetime.now()
+            })
+            self.participant_states[participant_id]['current'] = state
+
+    def add_intentional_context(
+        self,
+        goal: str,
+        motivation: str,
+        constraints: Optional[List[str]] = None,
+        priority: float = 0.5
+    ):
+        """Add goals and motivations"""
+        self.intentional_context[goal] = {
+            'motivation': motivation,
+            'constraints': constraints or [],
+            'priority': priority,
+            'added': datetime.now()
+        }
+
+    def get_multimodal_summary(self) -> Dict:
+        """Get a summary of all context types"""
+        return {
+            'temporal_events': len(self.temporal_context),
+            'tracked_participants': len(self.participant_states),
+            'active_goals': len(self.intentional_context),
+            'has_spatial': bool(self.spatial_context),
+            'has_cultural': bool(self.cultural_context)
+        }
+
+
+class ContextEngineer:
+    """
+    Main context engineering orchestrator.
+    Implements the complete three-step framework.
+    """
+
+    def __init__(self):
+        self.flywheel = DataFlywheel()
+        self.processor = ContextProcessor()
+        self.memory = MemoryHierarchy()
+        self.multimodal = MultiModalContext()
+
+    def engineer_context(
+        self,
+        query: str,
+        raw_sources: List[Tuple[str, str]],  # (source_name, content)
+        multimodal_data: Optional[Dict] = None
+    ) -> Dict[str, Any]:
+        """
+        Complete context engineering pipeline:
+        Step 1: Retrieval & Generation
+        Step 2: Processing
+        Step 3: Management
+        """
+
+        # Step 1: Retrieval & Generation
+        # Get recommended sources from the flywheel
+        recommended = self.flywheel.get_recommended_sources(query)
+
+        # Prioritize recommended sources
+        prioritized_sources = []
+        for source_name, content in raw_sources:
+            priority = 2.0 if source_name in recommended else 1.0
+            prioritized_sources.append((source_name, content, priority))
+
+        # Step 2: Processing
+        all_chunks = []
+        for source_name, content, priority in prioritized_sources:
+            chunks = self.processor.process_context(content, query, source_name)
+
+            # Apply priority boost
+            for chunk in chunks:
+                chunk.relevance_score *= priority
+
+            all_chunks.extend(chunks)
+
+        # Add to the memory hierarchy
+        for chunk in all_chunks:
+            self.memory.add_context(chunk)
+
+        # Step 3: Management
+        # Retrieve optimized context
+        final_chunks = self.memory.retrieve(query, max_chunks=10)
+
+        # Add multimodal context if provided
+        if multimodal_data:
+            for key, value in multimodal_data.items():
+                if key == 'temporal':
+                    for event in value:
+                        self.multimodal.add_temporal_context(**event)
+                elif key == 'participants':
+                    for pid, state in value.items():
+                        self.multimodal.add_participant_state(pid, state)
+                elif key == 'goals':
+                    for goal, details in value.items():
+                        self.multimodal.add_intentional_context(goal, **details)
+
+        # Build final context
+        context = {
+            'primary_context': '\n\n'.join([c.content for c in final_chunks[:5]]),
+            'supporting_context': '\n'.join([c.content for c in final_chunks[5:10]]),
+            'metadata': {
+                'total_chunks': len(all_chunks),
+                'selected_chunks': len(final_chunks),
+                'avg_relevance': np.mean([c.relevance_score for c in final_chunks]) if final_chunks else 0,
+                'compression_ratio': np.mean([c.compression_ratio for c in final_chunks]) if final_chunks else 1,
+                'sources_used': list(set(c.source for c in final_chunks)),
+                'multimodal': self.multimodal.get_multimodal_summary()
+            },
+            'chunks': final_chunks  # For the feedback loop
+        }
+
+        return context
+
+    def record_feedback(
+        self,
+        context: Dict,
+        output: str,
+        success_score: float
+    ):
+        """Record feedback for continuous improvement"""
+        self.flywheel.record_success(
+            context['primary_context'],
+            output,
+            success_score,
+            context['chunks']
+        )
+
+    def optimize_memory(self):
+        """Optimize memory by removing low-value chunks"""
+        # This would implement memory pruning based on:
+        # - access frequency
+        # - age
+        # - relevance scores
+        # - compression potential
+        pass
+
+
+# Demo usage
+def demo_context_engineering():
+    """Demonstrate context engineering"""
+
+    engineer = ContextEngineer()
+
+    # Sample sources
+    sources = [
+        ("resume", "10 years experience in Python, AI, Machine Learning..."),
+        ("job_description", "Looking for senior AI engineer with Python skills..."),
+        ("company_research", "TechCorp is a leading AI company focused on NLP...")
+    ]
+
+    # Multimodal context
+    multimodal = {
+        'temporal': [
+            {
+                'event': 'Application deadline',
+                'timestamp': datetime.now() + timedelta(days=7)
+            }
+        ],
+        'participants': {
+            'applicant': {'status': 'preparing', 'confidence': 0.8}
+        },
+        'goals': {
+            'get_interview': {
+                'motivation': 'Career advancement',
+                'constraints': ['Remote only'],
+                'priority': 0.9
+            }
+        }
+    }
+
+    # Engineer context
+    context = engineer.engineer_context(
+        query="Write a cover letter for AI engineer position",
+        raw_sources=sources,
+        multimodal_data=multimodal
+    )
+
+    print("Engineered Context:")
+    print(f"Primary: {context['primary_context'][:200]}...")
+    print(f"Metadata: {context['metadata']}")
+
+    # Simulate success and record feedback
+    engineer.record_feedback(context, "Generated cover letter...", 0.9)
+
+    print("\nFlywheel learned patterns for future use!")
+
+
+if __name__ == "__main__":
+    demo_context_engineering()
agents/context_scaler.py ADDED
@@ -0,0 +1,504 @@
+"""
+Context Scaling System
+Handles length scaling (millions of tokens) and multi-modal/structural scaling
+Implements advanced attention methods and memory techniques from the article
+"""
+
+import logging
+from typing import Dict, List, Any, Optional, Tuple
+from dataclasses import dataclass
+import numpy as np
+from datetime import datetime
+import heapq
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass
+class ScaledContext:
+    """Context that can scale to millions of tokens"""
+    segments: List[str]  # Segmented content
+    attention_map: np.ndarray  # Attention weights for segments
+    token_count: int
+    compression_level: int  # 0=none, 1=light, 2=medium, 3=heavy
+    modalities: Dict[str, Any]  # Different context modalities
+
+
+class AttentionOptimizer:
+    """
+    Advanced attention methods for handling extremely long contexts.
+    Implements sliding-window, sparse, and hierarchical attention.
+    """
+
+    def __init__(self, window_size: int = 512, stride: int = 256):
+        self.window_size = window_size
+        self.stride = stride
+
+    def sliding_window_attention(
+        self,
+        context: str,
+        query: str,
+        max_windows: int = 10
+    ) -> List[Tuple[str, float]]:
+        """
+        Process context using sliding-window attention.
+        Returns relevant windows with attention scores.
+        """
+        tokens = context.split()
+        windows = []
+
+        # Create sliding windows
+        for i in range(0, len(tokens) - self.window_size + 1, self.stride):
+            window = ' '.join(tokens[i:i + self.window_size])
+            score = self._calculate_attention_score(window, query)
+            windows.append((window, score))
+
+        # Return top windows
+        windows.sort(key=lambda x: x[1], reverse=True)
+        return windows[:max_windows]
+
+    def hierarchical_attention(
+        self,
+        context: str,
+        query: str,
+        levels: int = 3
+    ) -> Dict[int, List[str]]:
+        """
+        Multi-level hierarchical attention.
+        Higher levels = more compressed/abstract.
+        """
+        hierarchy = {}
+        current_text = context
+
+        for level in range(levels):
+            if level == 0:
+                # Finest level - full detail
+                hierarchy[level] = self._segment_text(current_text, 500)
+            elif level == 1:
+                # Middle level - paragraphs/sections
+                hierarchy[level] = self._extract_key_sentences(current_text)
+            else:
+                # Highest level - summary
+                hierarchy[level] = [self._generate_summary(current_text)]
+
+            # Compress for the next level
+            current_text = ' '.join(hierarchy[level])
+
+        return hierarchy
+
+    def sparse_attention(
+        self,
+        context: str,
+        query: str,
+        sparsity: float = 0.1
+    ) -> List[str]:
+        """
+        Sparse attention - only attend to the most relevant tokens.
+        Reduces computation from O(n²) to O(n*k).
+        """
+        tokens = context.split()
+        query_tokens = set(query.lower().split())
+
+        # Calculate relevance for each token
+        token_scores = []
+        for i, token in enumerate(tokens):
+            score = 1.0 if token.lower() in query_tokens else np.random.random() * 0.5
+            token_scores.append((i, token, score))
+
+        # Keep only the top k% of tokens
+        k = int(len(tokens) * sparsity)
+        top_tokens = heapq.nlargest(k, token_scores, key=lambda x: x[2])
+
+        # Sort by original position to maintain order
+        top_tokens.sort(key=lambda x: x[0])
+
+        # Reconstruct the sparse context
+        sparse_context = []
+        last_idx = -1
+        for idx, token, score in top_tokens:
+            if idx > last_idx + 1:
+                sparse_context.append("...")
+            sparse_context.append(token)
+            last_idx = idx
+
+        return sparse_context
+
+    def _calculate_attention_score(self, window: str, query: str) -> float:
+        """Calculate the attention score between a window and a query"""
+        window_words = set(window.lower().split())
+        query_words = set(query.lower().split())
+
+        if not query_words:
+            return 0.0
+
+        overlap = len(window_words & query_words)
+        return overlap / len(query_words)
+
+    def _segment_text(self, text: str, segment_size: int) -> List[str]:
+        """Segment text into chunks"""
+        words = text.split()
+        segments = []
+        for i in range(0, len(words), segment_size):
+            segments.append(' '.join(words[i:i + segment_size]))
+        return segments
+
+    def _extract_key_sentences(self, text: str) -> List[str]:
+        """Extract key sentences (simplified)"""
+        sentences = text.split('.')
+        # Keep sentences with more than 10 words (likely more informative)
+        key_sentences = [s.strip() + '.' for s in sentences if len(s.split()) > 10]
+        return key_sentences[:10]  # Top 10 sentences
+
+    def _generate_summary(self, text: str) -> str:
+        """Generate a summary (simplified - would use an LLM in production)"""
+        sentences = text.split('.')[:3]  # First 3 sentences as summary
+        return '. '.join(sentences) + '.'
+
+
+class LengthScaler:
+    """
+    Handle context scaling from thousands to millions of tokens.
+    Maintains coherence across long documents.
+    """
+
+    def __init__(self, max_tokens: int = 1000000):
+        self.max_tokens = max_tokens
+        self.attention_optimizer = AttentionOptimizer()
+
+    def scale_context(
+        self,
+        context: str,
+        query: str,
+        target_tokens: int = 2000
+    ) -> ScaledContext:
+        """Scale context to a target token count while maintaining relevance"""
+
+        tokens = context.split()
+        current_tokens = len(tokens)
+
+        # Determine the compression level needed
+        compression_ratio = current_tokens / target_tokens
+
+        if compression_ratio <= 1:
+            # No compression needed
+            return ScaledContext(
+                segments=[context],
+                attention_map=np.array([1.0]),
+                token_count=current_tokens,
+                compression_level=0,
+                modalities={}
+            )
+
+        # Apply the appropriate scaling strategy
+        if compression_ratio < 5:
+            # Light compression - sliding window
+            segments = self._light_compression(context, query, target_tokens)
+            compression_level = 1
+        elif compression_ratio < 20:
+            # Medium compression - hierarchical
+            segments = self._medium_compression(context, query, target_tokens)
+            compression_level = 2
+        else:
+            # Heavy compression - sparse attention
+            segments = self._heavy_compression(context, query, target_tokens)
+            compression_level = 3
+
+        # Calculate the attention map
+        attention_map = self._calculate_attention_map(segments, query)
+
+        return ScaledContext(
+            segments=segments,
+            attention_map=attention_map,
+            token_count=sum(len(s.split()) for s in segments),
+            compression_level=compression_level,
+            modalities={}
+        )
+
+    def _light_compression(
+        self,
+        context: str,
+        query: str,
+        target_tokens: int
+    ) -> List[str]:
+        """Light compression using sliding windows"""
+        windows = self.attention_optimizer.sliding_window_attention(
+            context, query, max_windows=target_tokens // 100
+        )
+        return [w for w, _ in windows]
+
+    def _medium_compression(
+        self,
+        context: str,
+        query: str,
+        target_tokens: int
+    ) -> List[str]:
+        """Medium compression using hierarchical attention"""
+        hierarchy = self.attention_optimizer.hierarchical_attention(context, query)
+
+        segments = []
+        remaining_tokens = target_tokens
+
+        # Add from each level based on available tokens
+        for level in sorted(hierarchy.keys()):
+            level_segments = hierarchy[level]
+            for segment in level_segments:
+                segment_tokens = len(segment.split())
+                if segment_tokens <= remaining_tokens:
+                    segments.append(segment)
+                    remaining_tokens -= segment_tokens
+                if remaining_tokens <= 0:
+                    break
+
+        return segments
+
+    def _heavy_compression(
+        self,
+        context: str,
+        query: str,
+ target_tokens: int
259
+ ) -> List[str]:
260
+ """Heavy compression using sparse attention"""
261
+ sparsity = target_tokens / len(context.split())
262
+ sparse_tokens = self.attention_optimizer.sparse_attention(
263
+ context, query, sparsity=min(sparsity, 0.3)
264
+ )
265
+
266
+ # Group sparse tokens into segments
267
+ segments = []
268
+ current_segment = []
269
+ for token in sparse_tokens:
270
+ if token == "...":
271
+ if current_segment:
272
+ segments.append(' '.join(current_segment))
273
+ current_segment = []
274
+ segments.append("...")
275
+ else:
276
+ current_segment.append(token)
277
+
278
+ if current_segment:
279
+ segments.append(' '.join(current_segment))
280
+
281
+ return segments
282
+
283
+ def _calculate_attention_map(
284
+ self,
285
+ segments: List[str],
286
+ query: str
287
+ ) -> np.ndarray:
288
+ """Calculate attention weights for each segment"""
289
+ query_words = set(query.lower().split())
290
+ attention_scores = []
291
+
292
+ for segment in segments:
293
+ if segment == "...":
294
+ attention_scores.append(0.0)
295
+ else:
296
+ segment_words = set(segment.lower().split())
297
+ overlap = len(query_words & segment_words)
298
+ score = overlap / max(len(query_words), 1)
299
+ attention_scores.append(score)
300
+
301
+ # Normalize
302
+ scores = np.array(attention_scores)
303
+ if scores.sum() > 0:
304
+ scores = scores / scores.sum()
305
+
306
+ return scores
307
+
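The tier selection in `scale_context` reduces to a pure function of the token ratio. A standalone sketch (helper name is hypothetical; the thresholds mirror the branches above):

```python
def select_compression_level(current_tokens: int, target_tokens: int) -> int:
    """Mirror of scale_context's tier selection (illustrative helper)."""
    ratio = current_tokens / max(target_tokens, 1)
    if ratio <= 1:
        return 0  # fits already: no compression
    if ratio < 5:
        return 1  # light: sliding-window attention
    if ratio < 20:
        return 2  # medium: hierarchical attention
    return 3      # heavy: sparse attention
```

So a 100k-word context scaled to a 2k-word target (ratio 50) lands in the heavy tier, matching the demo output further down.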
308
+
309
+ class MultiModalScaler:
310
+ """
311
+ Handle multi-modal and structural context scaling
312
+ Temporal, spatial, participant states, intentional, cultural
313
+ """
314
+
315
+ def __init__(self):
316
+ self.modality_handlers = {
317
+ 'temporal': self._scale_temporal,
318
+ 'spatial': self._scale_spatial,
319
+ 'participant': self._scale_participant,
320
+ 'intentional': self._scale_intentional,
321
+ 'cultural': self._scale_cultural
322
+ }
323
+
324
+ def scale_multimodal(
325
+ self,
326
+ modalities: Dict[str, Any],
327
+ importance_weights: Optional[Dict[str, float]] = None
328
+ ) -> Dict[str, Any]:
329
+ """Scale multiple modalities based on importance"""
330
+
331
+ if importance_weights is None:
332
+ importance_weights = {
333
+ 'temporal': 0.3,
334
+ 'spatial': 0.1,
335
+ 'participant': 0.3,
336
+ 'intentional': 0.2,
337
+ 'cultural': 0.1
338
+ }
339
+
340
+ scaled = {}
341
+ for modality, data in modalities.items():
342
+ if modality in self.modality_handlers:
343
+ weight = importance_weights.get(modality, 0.1)
344
+ scaled[modality] = self.modality_handlers[modality](data, weight)
345
+
346
+ return scaled
347
+
348
+ def _scale_temporal(self, data: List[Dict], weight: float) -> List[Dict]:
349
+ """Scale temporal context - keep most recent and important events"""
350
+ # Sort by timestamp
351
+ sorted_data = sorted(data, key=lambda x: x.get('timestamp', datetime.min), reverse=True)
352
+
353
+ # Keep based on weight (more weight = more events kept)
354
+ keep_count = max(1, int(len(sorted_data) * weight))
355
+ return sorted_data[:keep_count]
356
+
357
+ def _scale_spatial(self, data: Dict, weight: float) -> Dict:
358
+ """Scale spatial context - simplify based on importance"""
359
+ if weight < 0.3:
360
+ # Low importance - just keep basic location
361
+ return {'location': data.get('primary_location', 'unknown')}
362
+ else:
363
+ # Higher importance - keep more detail
364
+ return data
365
+
366
+ def _scale_participant(self, data: Dict, weight: float) -> Dict:
367
+ """Scale participant states - keep most active participants"""
368
+ if not data:
369
+ return {}
370
+
371
+ # Sort by activity level (approximated by state changes)
372
+ participants = []
373
+ for pid, pdata in data.items():
374
+ activity = len(pdata.get('history', []))
375
+ participants.append((pid, pdata, activity))
376
+
377
+ participants.sort(key=lambda x: x[2], reverse=True)
378
+
379
+ # Keep based on weight
380
+ keep_count = max(1, int(len(participants) * weight))
381
+
382
+ return {pid: pdata for pid, pdata, _ in participants[:keep_count]}
383
+
384
+ def _scale_intentional(self, data: Dict, weight: float) -> Dict:
385
+ """Scale intentional context - keep high priority goals"""
386
+ if not data:
387
+ return {}
388
+
389
+ # Sort by priority
390
+ goals = [(k, v) for k, v in data.items()]
391
+ goals.sort(key=lambda x: x[1].get('priority', 0), reverse=True)
392
+
393
+ # Keep based on weight
394
+ keep_count = max(1, int(len(goals) * weight))
395
+
396
+ return {k: v for k, v in goals[:keep_count]}
397
+
398
+ def _scale_cultural(self, data: Dict, weight: float) -> Dict:
399
+ """Scale cultural context - keep if important"""
400
+ if weight < 0.2:
401
+ return {} # Skip if low importance
402
+ return data
403
+
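The temporal, participant, and intentional handlers above all share one pruning rule: keep a weight-proportional slice of the items, but never fewer than one. Extracted as a sketch (function name is hypothetical):

```python
def kept_count(total_items: int, weight: float) -> int:
    # Shared rule from the modality handlers above: keep a
    # weight-proportional slice, with a floor of 1 item.
    return max(1, int(total_items * weight))
```

The floor matters at low weights: even a near-zero modality weight leaves one representative item rather than silently dropping the modality.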
404
+
405
+ class ContextScalingOrchestrator:
406
+ """
407
+ Main orchestrator for context scaling
408
+ Combines length and multi-modal scaling
409
+ """
410
+
411
+ def __init__(self, max_context_tokens: int = 100000):
412
+ self.length_scaler = LengthScaler(max_context_tokens)
413
+ self.multimodal_scaler = MultiModalScaler()
414
+
415
+ def scale_complete_context(
416
+ self,
417
+ text_context: str,
418
+ multimodal_context: Dict[str, Any],
419
+ query: str,
420
+ target_tokens: int = 2000,
421
+ modality_weights: Optional[Dict[str, float]] = None
422
+ ) -> Dict[str, Any]:
423
+ """
424
+ Scale both text and multi-modal context
425
+ Returns optimally scaled context
426
+ """
427
+
428
+ # Scale text context
429
+ scaled_text = self.length_scaler.scale_context(
430
+ text_context, query, target_tokens
431
+ )
432
+
433
+ # Scale multi-modal context
434
+ scaled_multimodal = self.multimodal_scaler.scale_multimodal(
435
+ multimodal_context, modality_weights
436
+ )
437
+
438
+ # Combine
439
+ result = {
440
+ 'text': {
441
+ 'segments': scaled_text.segments,
442
+ 'attention_map': scaled_text.attention_map.tolist(),
443
+ 'token_count': scaled_text.token_count,
444
+ 'compression_level': scaled_text.compression_level
445
+ },
446
+ 'multimodal': scaled_multimodal,
447
+ 'metadata': {
448
+ 'original_tokens': len(text_context.split()),
449
+ 'scaled_tokens': scaled_text.token_count,
450
+ 'compression_ratio': len(text_context.split()) / max(scaled_text.token_count, 1),
451
+ 'modalities_preserved': list(scaled_multimodal.keys())
452
+ }
453
+ }
454
+
455
+ return result
456
+
457
+
458
+ # Demo usage
459
+ def demo_context_scaling():
460
+ """Demonstrate context scaling capabilities"""
461
+
462
+ # Create a very long context
463
+ long_context = " ".join([
464
+ f"Sentence {i} about various topics including AI, engineering, and software development."
465
+ for i in range(10000)
466
+ ]) # ~100k tokens
467
+
468
+ # Multi-modal context
469
+ multimodal = {
470
+ 'temporal': [
471
+ {'event': f'Event {i}', 'timestamp': datetime.now()}
472
+ for i in range(50)
473
+ ],
474
+ 'participant': {
475
+ f'person_{i}': {'state': 'active', 'history': []}
476
+ for i in range(20)
477
+ },
478
+ 'intentional': {
479
+ f'goal_{i}': {'priority': np.random.random()}
480
+ for i in range(10)
481
+ }
482
+ }
483
+
484
+ # Scale the context
485
+ orchestrator = ContextScalingOrchestrator()
486
+ scaled = orchestrator.scale_complete_context(
487
+ text_context=long_context,
488
+ multimodal_context=multimodal,
489
+ query="AI engineering position requirements",
490
+ target_tokens=2000
491
+ )
492
+
493
+ print("Scaling Results:")
494
+ print(f"Original tokens: {scaled['metadata']['original_tokens']}")
495
+ print(f"Scaled tokens: {scaled['metadata']['scaled_tokens']}")
496
+ print(f"Compression ratio: {scaled['metadata']['compression_ratio']:.2f}x")
497
+ print(f"Compression level: {scaled['text']['compression_level']}")
498
+ print(f"Modalities preserved: {scaled['metadata']['modalities_preserved']}")
499
+ print(f"Text segments: {len(scaled['text']['segments'])}")
500
+ print(f"Temporal events kept: {len(scaled['multimodal'].get('temporal', []))}")
501
+
502
+
503
+ if __name__ == "__main__":
504
+ demo_context_scaling()
agents/cover_letter_agent.py ADDED
@@ -0,0 +1,143 @@
1
+ from __future__ import annotations
2
+ from typing import List, Optional
3
+ import re
4
+
5
+ from models.schemas import UserProfile, JobPosting, CoverLetterDraft
6
+ from memory.store import memory_store
7
+ from utils.text import extract_keywords_from_text, clamp_to_char_limit
8
+ from utils.ats import basic_cover_letter_template, strengthen_action_verbs
9
+ from utils.consistency import allowed_keywords_from_profile, coverage_score, conciseness_score
10
+ from services.web_research import get_role_guidelines, cover_letter_inspiration_from_url
11
+ from services.llm import llm
12
+ from utils.langextractor import distill_text
13
+
14
+
15
+ class CoverLetterAgent:
16
+ def __init__(self) -> None:
17
+ self.name = "cover_letter"
18
+ self.max_chars = 4000
19
+
20
+ def create_cover_letter(self, profile: UserProfile, job: JobPosting, user_id: str = "default_user", user_chat: Optional[str] = None, seed_text: Optional[str] = None, agent2_notes: Optional[str] = None, inspiration_url: Optional[str] = None) -> CoverLetterDraft:
21
+ jd_keywords: List[str] = extract_keywords_from_text(job.description or "", top_k=25)
22
+ allowed = allowed_keywords_from_profile(profile.skills, profile.experiences)
23
+
24
+ greeting = "Hiring Manager,"
25
+ body = [
26
+ (
27
+ f"I am excited to apply for the {job.title} role at {job.company}. "
28
+ f"With experience across {', '.join(profile.skills[:8])}, I can quickly contribute to your team."
29
+ ),
30
+ (
31
+ "In my recent work, I delivered outcomes such as driving cost reductions, building scalable platforms, "
32
+ "and improving reliability. I have hands-on experience with the tools and practices highlighted "
33
+ f"in your description, including {', '.join(jd_keywords[:8])}."
34
+ ),
35
+ (
36
+ "I am particularly interested in this opportunity because it aligns with my background and career goals. "
37
+ "I value impact, ownership, and collaboration."
38
+ ),
39
+ ]
40
+ closing = "Thank you for your time and consideration."
41
+ signature = profile.full_name
42
+
43
+ base_text = seed_text.strip() if seed_text else None
44
+ draft = base_text or basic_cover_letter_template(greeting, body, closing, signature)
45
+ if base_text and len(base_text) > 1500:
46
+ bullets = distill_text(base_text, max_points=10)
47
+ draft = ("\n".join(f"- {b}" for b in bullets) + "\n\n") + draft[:3000]
48
+
49
+ guidance = get_role_guidelines(job.title, job.description)
50
+ humor_notes = cover_letter_inspiration_from_url(inspiration_url) if inspiration_url else ""
51
+ used_keywords: List[str] = []
52
+
53
+ # Detect low overlap between profile and JD keywords to hint a career pivot narrative
54
+ overlap_count = sum(1 for k in jd_keywords if k.lower() in allowed)
55
+ overlap_ratio = overlap_count / max(1, len(jd_keywords[:15]))
56
+ career_change_hint = overlap_ratio < 0.25
57
+
58
+ # Prepare transferable skills (top profile skills), and pull 1-2 achievements across experiences
59
+ transferable_skills = profile.skills[:6] if profile.skills else []
60
+ sample_achievements: List[str] = []
61
+ for e in profile.experiences:
62
+ if e.achievements:
63
+ for a in e.achievements:
64
+ if a and len(sample_achievements) < 2:
65
+ sample_achievements.append(a.strip())
66
+
67
+ for cycle in range(3):
68
+ new_mentions = []
69
+ for kw in jd_keywords[:12]:
70
+ if kw.lower() in allowed and kw.lower() not in draft.lower():
71
+ new_mentions.append(kw)
72
+ if new_mentions:
73
+ draft = draft.rstrip() + "\n\nRelevant focus: " + ", ".join(new_mentions[:8]) + "\n"
74
+ used_keywords = list({*used_keywords, *new_mentions[:8]})
75
+
76
+ if llm.enabled:
77
+ system = (
78
+ "You refine cover letters. Preserve factual accuracy. Be concise (<= 1 page). "
79
+ "Keep ATS-friendly text; avoid flowery language. "
80
+ f"Apply latest guidance: {guidance}. "
81
+ "Emphasize transferable skills and a positive pivot narrative when the candidate is changing careers. "
82
+ "Structure: concise hook; 1–2 quantified achievements (STAR compressed); alignment to role/company; clear close/CTA. "
83
+ "Use active voice and strong action verbs; avoid clichés/buzzwords. UK English. Use digits for numbers and £ for currency. "
84
+ )
85
+ humor = f"\nInspiration guideline (do not copy text): {humor_notes}" if humor_notes else ""
86
+ notes = (f"\nNotes from Agent 2: {agent2_notes}" if agent2_notes else "")
87
+ custom = f"\nUser instructions: {user_chat}" if user_chat else ""
88
+ pivot = "\nCareer change: true — highlight transferable skills and motivation for the pivot." if career_change_hint else ""
89
+ examples = ("\nAchievements to consider: " + "; ".join(sample_achievements)) if sample_achievements else ""
90
+ tskills = ("\nTransferable skills: " + ", ".join(transferable_skills)) if transferable_skills else ""
91
+ user = (
92
+ f"Role: {job.title}. Company: {job.company}.\n"
93
+ f"Job keywords: {', '.join(jd_keywords[:20])}.\n"
94
+ f"Allowed keywords (from user profile): {', '.join(sorted(list(allowed))[:40])}.\n"
95
+ f"Rewrite the following cover letter to strengthen alignment without inventing new skills.{custom}{notes}{humor}{pivot}{examples}{tskills}\n"
96
+ f"Keep within {self.max_chars} characters.\n\n"
97
+ f"Cover letter content:\n{draft}"
98
+ )
99
+ draft = llm.generate(system, user, max_tokens=800, agent="cover")
100
+
101
+ # Simple buzzword scrub
102
+ lower = draft.lower()
103
+ for bad in [
104
+ "results-driven", "team player", "works well alone", "people person",
105
+ "perfectionist", "multi-tasker", "multi tasker", "dynamic go-getter",
106
+ ]:
107
+ if bad in lower:
108
+ draft = re.sub(re.escape(bad), "", draft, flags=re.IGNORECASE)
109
+ lower = draft.lower()
110
+ # Strengthen weak openers
111
+ draft = strengthen_action_verbs(draft)
112
+ # Normalise £/% hints
113
+ draft = draft.replace("GBP", "£")
114
+ draft = re.sub(r"\bpercent\b", "%", draft, flags=re.IGNORECASE)
115
+
116
+ cov = coverage_score(draft, jd_keywords)
117
+ conc = conciseness_score(draft, self.max_chars)
118
+ if conc < 1.0:
119
+ draft = clamp_to_char_limit(draft, self.max_chars)
120
+
121
+ memory_store.save(user_id, self.name, {
122
+ "job_id": job.id,
123
+ "cycle": cycle + 1,
124
+ "coverage": cov,
125
+ "conciseness": conc,
126
+ "keywords_used": used_keywords,
127
+ "guidance": guidance[:500],
128
+ "user_chat": (user_chat or "")[:500],
129
+ "agent2_notes": (agent2_notes or "")[:500],
130
+ "inspiration_url": inspiration_url or "",
131
+ "draft": draft,
132
+ "career_change_hint": career_change_hint,
133
+ }, job_id=job.id)
134
+
135
+ draft = clamp_to_char_limit(draft, self.max_chars)
136
+ memory_store.save(user_id, self.name, {
137
+ "job_id": job.id,
138
+ "final": True,
139
+ "keywords_used": used_keywords,
140
+ "draft": draft,
141
+ }, job_id=job.id)
142
+
143
+ return CoverLetterDraft(job_id=job.id, text=draft, keywords_used=used_keywords[:12])
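The career-pivot detection in `create_cover_letter` boils down to a keyword-overlap ratio. A self-contained sketch (function name is hypothetical; the 0.25 threshold and the cap of the denominator at the first 15 keywords are copied from the agent above):

```python
from typing import List, Set

def is_career_change(jd_keywords: List[str], allowed: Set[str],
                     threshold: float = 0.25) -> bool:
    # Count JD keywords the profile can truthfully claim...
    overlap = sum(1 for k in jd_keywords if k.lower() in allowed)
    # ...then, as in the agent, normalise against at most 15 keywords.
    ratio = overlap / max(1, len(jd_keywords[:15]))
    return ratio < threshold
```

Note the asymmetry inherited from the original: the numerator counts matches across all JD keywords while the denominator is capped at 15, so very long keyword lists can push the ratio above 1.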
agents/cv_owner.py ADDED
@@ -0,0 +1,441 @@
1
+ from __future__ import annotations
2
+ from typing import List, Optional
3
+ import logging
4
+ import re
5
+ import textwrap
6
+ from datetime import datetime
7
+
8
+ from models.schemas import UserProfile, JobPosting, ResumeDraft
9
+ from memory.store import memory_store
10
+ from utils.text import extract_keywords_from_text, clamp_to_char_limit
11
+ from utils.ats import (
12
+ format_resume_header,
13
+ format_experience_section,
14
+ format_skills_section,
15
+ basic_resume_template,
16
+ ensure_keywords,
17
+ ACTION_VERBS,
18
+ strengthen_action_verbs,
19
+ )
20
+ from utils.consistency import allowed_keywords_from_profile, coverage_score, conciseness_score
21
+ from utils.config import AgentConfig, LLMConfig
22
+ from services.web_research import get_role_guidelines
23
+ from services.llm import llm
24
+ from utils.langextractor import distill_text
25
+ try:
26
+ from utils.langextractor_enhanced import extract_structured_info, extract_ats_keywords
27
+ ENHANCED_EXTRACTION = True
28
+ except ImportError:
29
+ ENHANCED_EXTRACTION = False
30
+
31
+ logger = logging.getLogger(__name__)
32
+
33
+
34
+ def _clamp_words(text: str, max_words: int) -> str:
35
+ if not text:
36
+ return ""
37
+ words = text.strip().split()
38
+ if len(words) <= max_words:
39
+ return text.strip()
40
+ return " ".join(words[:max_words]).strip()
41
+
42
+
43
+ def _extract_year(s: Optional[str]) -> Optional[int]:
44
+ if not s:
45
+ return None
46
+ m = re.search(r"(19|20)\d{2}", s)
47
+ return int(m.group(0)) if m else None
48
+
49
+
50
+ def _uk_month_name(m: int) -> str:
51
+ return ["", "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"][max(0, min(12, m))]
52
+
53
+
54
+ def _uk_date_str(s: Optional[str]) -> Optional[str]:
55
+ if not s:
56
+ return None
57
+ ss = s.strip()
58
+ if ss.lower() == "present":
59
+ return "Present"
60
+ # YYYY-MM or YYYY/M or YYYY/MM
61
+ m = re.match(r"^(\d{4})[-/](\d{1,2})$", ss)
62
+ if m:
63
+ y = int(m.group(1)); mo = int(m.group(2))
64
+ return f"{_uk_month_name(mo)} {y}"
65
+ # MM/YYYY
66
+ m = re.match(r"^(\d{1,2})/(\d{4})$", ss)
67
+ if m:
68
+ mo = int(m.group(1)); y = int(m.group(2))
69
+ return f"{_uk_month_name(mo)} {y}"
70
+ # YYYY only
71
+ m = re.match(r"^(\d{4})$", ss)
72
+ if m:
73
+ return m.group(1)
74
+ return ss
75
+
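The date normalisation above handles `YYYY-MM`, `MM/YYYY`, bare `YYYY`, and the literal `Present`. The numeric cases can be exercised in isolation; this sketch re-implements them so the snippet stands alone (it deliberately omits the `Present` special case handled in `_uk_date_str`):

```python
import re

# UK-style month abbreviations, 1-indexed as in _uk_month_name above
MONTHS = ["", "Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def uk_date(s: str) -> str:
    m = re.match(r"^(\d{4})[-/](\d{1,2})$", s)   # YYYY-MM or YYYY/M
    if m:
        return f"{MONTHS[int(m.group(2))]} {m.group(1)}"
    m = re.match(r"^(\d{1,2})/(\d{4})$", s)      # MM/YYYY
    if m:
        return f"{MONTHS[int(m.group(1))]} {m.group(2)}"
    return s                                      # YYYY or anything else: pass through
```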
76
+
77
+ def _postprocess_bullets(text: str) -> str:
78
+ if not text:
79
+ return text
80
+ lines = []
81
+ for line in text.splitlines():
82
+ newline = line
83
+ if newline.lstrip().startswith("-"):
84
+ # Remove first-person pronouns at bullet start
85
+ newline = re.sub(r"^(\s*-\s*)(?:I|We|My)\s+", r"\1", newline, flags=re.IGNORECASE)
86
+ # Remove trailing period
87
+ newline = re.sub(r"\.(\s*)$", r"\1", newline)
88
+ # Normalise percent and GBP
89
+ newline = re.sub(r"\bper\s*cent\b", "%", newline, flags=re.IGNORECASE)
90
+ newline = re.sub(r"\bpercent\b", "%", newline, flags=re.IGNORECASE)
91
+ newline = newline.replace("GBP", "£")
92
+ lines.append(newline)
93
+ return "\n".join(lines)
94
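The pronoun clean-up in `_postprocess_bullets` relies on a single anchored regex; shown standalone below (helper name is hypothetical, pattern copied verbatim from the function above):

```python
import re

def strip_bullet_pronoun(line: str) -> str:
    # Drop a leading first-person pronoun (I/We/My) right after the
    # bullet dash; words merely starting with those letters are untouched.
    return re.sub(r"^(\s*-\s*)(?:I|We|My)\s+", r"\1", line,
                  flags=re.IGNORECASE)
```

Because the pronoun must be followed by whitespace, bullets such as "- Improved uptime" are left alone even though "Improved" begins with "I".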
+
95
+ def _strip_personal_info(text: str) -> str:
96
+ if not text:
97
+ return text
98
+ # Remove DOB lines and photo references
99
+ text = re.sub(r"^.*\b(date of birth|dob)\b.*$", "", text, flags=re.IGNORECASE | re.MULTILINE)
100
+ text = re.sub(r"^.*\b(photo|headshot)\b.*$", "", text, flags=re.IGNORECASE | re.MULTILINE)
101
+ # Clean extra blank lines
102
+ text = re.sub(r"\n{3,}", "\n\n", text)
103
+ return text.strip() + "\n"
104
+
105
+
106
+ class CVOwnerAgent:
107
+ def __init__(self) -> None:
108
+ self.name = "cv_owner"
109
+ self.max_chars = AgentConfig.RESUME_MAX_CHARS
110
+
111
+ def create_resume(
112
+ self,
113
+ profile: UserProfile,
114
+ job: JobPosting,
115
+ user_id: str = "default_user",
116
+ user_chat: Optional[str] = None,
117
+ seed_text: Optional[str] = None,
118
+ agent2_notes: Optional[str] = None,
119
+ layout_preset: Optional[str] = None,
120
+ ) -> ResumeDraft:
121
+ """Create an optimized resume for a specific job posting."""
122
+ jd_keywords: List[str] = extract_keywords_from_text(
123
+ job.description or "",
124
+ top_k=AgentConfig.JOB_KEYWORDS_COUNT
125
+ )
126
+ allowed = allowed_keywords_from_profile(profile.skills, profile.experiences)
127
+
128
+ # Format resume sections
129
+ header = format_resume_header(
130
+ full_name=profile.full_name,
131
+ headline=profile.headline or job.title,
132
+ email=profile.email,
133
+ phone=profile.phone,
134
+ location=profile.location,
135
+ links=profile.links,
136
+ )
137
+
138
+ # Sort experiences reverse-chronologically (Reed/Indeed best practice)
139
+ def _date_key(s: Optional[str]) -> str:
140
+ val = (s or "").strip()
141
+ if not val or val.lower() == "present":
142
+ return "9999-12-31"
143
+ return val
144
+ experiences_sorted = sorted(
145
+ profile.experiences,
146
+ key=lambda e: (_date_key(e.end_date), _date_key(e.start_date)),
147
+ reverse=True,
148
+ )
149
+ # Compute simple gap signal based on years between adjacent roles
150
+ gap_years_flag = False
151
+ for i in range(len(experiences_sorted) - 1):
152
+ end_y = _extract_year(experiences_sorted[i].end_date or "Present") or 9999
153
+ start_next_y = _extract_year(experiences_sorted[i + 1].start_date)
154
+ if start_next_y and end_y != 9999 and (start_next_y - end_y) >= 2:
155
+ gap_years_flag = True
156
+ break
157
+ # Limit achievements depth: recent roles get more bullets, older roles compressed
158
+ current_year = datetime.now().year
159
+ experience_payload = []
160
+ for idx, e in enumerate(experiences_sorted):
161
+ ach = e.achievements or []
162
+ # Compress if older than 15 years
163
+ start_y = _extract_year(e.start_date or "")
164
+ older = bool(start_y and (current_year - start_y > 15))
165
+ if idx < 2 and not older:
166
+ limited = ach[:6]
167
+ else:
168
+ limited = [] if older else ach[:1]
169
+ experience_payload.append({
170
+ "title": e.title,
171
+ "company": e.company,
172
+ "start_date": _uk_date_str(e.start_date) or e.start_date,
173
+ "end_date": _uk_date_str(e.end_date) or ("Present" if (e.end_date or "").lower()=="present" else (e.end_date or "")),
174
+ "achievements": limited,
175
+ })
176
+ experience = format_experience_section(experience_payload)
177
+ skills = format_skills_section(profile.skills)
178
+
179
+ # Personal statement (Summary) refinement (~150 words), tailored to job
180
+ summary_text = profile.summary or ""
181
+ if summary_text:
182
+ if llm.enabled:
183
+ sys_ps = (
184
+ "You write CV personal statements (Summary) for UK job applications. Keep to ~150 words (100–180). "
185
+ "Use active voice and clear, specific language; avoid clichés/buzzwords; no personal info. "
186
+ "Structure: 1) who you are/pro background; 2) key skills + 1–2 quantified achievements relevant to the role; "
187
+ "3) concise career goal aligned to the target role/company. Tailor to the job's keywords."
188
+ )
189
+ usr_ps = (
190
+ f"Target role: {job.title} at {job.company}\n"
191
+ f"Job keywords: {', '.join(jd_keywords[:15])}\n\n"
192
+ f"Existing summary (edit and improve):\n{summary_text}"
193
+ )
194
+ summary_text = llm.generate(sys_ps, usr_ps, max_tokens=220, agent="cv")
195
+ summary_text = _clamp_words(summary_text, 180)
196
+ # Ensure critical JD keywords appear in summary (top 3)
197
+ try:
198
+ needed = []
199
+ low = (summary_text or "").lower()
200
+ for k in jd_keywords[:6]:
201
+ if k and (k.lower() not in low) and len(needed) < 3:
202
+ needed.append(k)
203
+ if needed:
204
+ summary_text = (summary_text or "").strip() + " " + ("Key strengths: " + ", ".join(needed) + ".")
205
+ except Exception:
206
+ pass
207
+ else:
208
+ # No summary provided: keep empty to avoid adding new sections implicitly
209
+ summary_text = ""
210
+
211
+ education_text = "\n".join(
212
+ [f"{ed.degree or ''} {ed.field_of_study or ''} — {ed.school} ({ed.end_date or ''})"
213
+ for ed in profile.education]
214
+ ).strip()
215
+
216
+ # Process seed text if provided
217
+ base_text = seed_text.strip() if seed_text else None
218
+ if base_text and len(base_text) > 2000:
219
+ # Distill dense seed into key points to guide the draft
220
+ bullets = distill_text(base_text, max_points=AgentConfig.DISTILL_MAX_POINTS)
221
+ base_text = ("\n".join(f"- {b}" for b in bullets) + "\n\n") + base_text[:4000]
222
+
223
+ # Compose initial draft by layout preset (ATS-friendly, single column)
224
+ preset = (layout_preset or "").strip().lower()
225
+ preset = {
226
+ "traditional": "classic",
227
+ "classic": "classic",
228
+ "modern": "modern",
229
+ "minimalist": "minimalist",
230
+ "executive": "executive",
231
+ }.get(preset, "")
232
+ def sec_summary(s: str) -> str:
233
+ return ("\nSummary\n" + textwrap.fill(s, width=100)) if s else ""
234
+ def sec_skills(sk: str) -> str:
235
+ return ("\n" + sk) if sk else ""
236
+ def sec_experience(ex: str) -> str:
237
+ return ("\n\nExperience\n" + ex) if ex else ""
238
+ def sec_education(ed: str) -> str:
239
+ return ("\n\nEducation\n" + ed) if ed else ""
240
+ def sec_languages() -> str:
241
+ langs = getattr(profile, "languages", []) or []
242
+ pairs = []
243
+ for it in langs[:8]:
244
+ if isinstance(it, dict):
245
+ name = it.get("language") or it.get("name") or ""
246
+ lvl = it.get("level") or ""
247
+ if name:
248
+ pairs.append(f"{name}{' ('+lvl+')' if lvl else ''}")
249
+ return ("\n\nLanguages\n- " + "\n- ".join(pairs)) if pairs else ""
250
+ def sec_certs() -> str:
251
+ certs = getattr(profile, "certifications", []) or []
252
+ lines = []
253
+ for c in certs[:6]:
254
+ if isinstance(c, dict):
255
+ name = c.get("name") or ""
256
+ issuer = c.get("issuer") or ""
257
+ year = c.get("year") or ""
258
+ if name:
259
+ parts = [name]
260
+ if issuer: parts.append(issuer)
261
+ if year: parts.append(str(year))
262
+ lines.append(" β€” ".join(parts))
263
+ return ("\n\nCertifications\n- " + "\n- ".join(lines)) if lines else ""
264
+ def sec_projects() -> str:
265
+ projs = getattr(profile, "projects", []) or []
266
+ lines = []
267
+ for p in projs[:4]:
268
+ if isinstance(p, dict):
269
+ title = p.get("title") or ""
270
+ link = p.get("link") or ""
271
+ impact = p.get("impact") or ""
272
+ if title or impact:
273
+ line = title
274
+ if link: line += f" — {link}"
275
+ if impact: line += f" — {impact}"
276
+ lines.append(line)
277
+ return ("\n\nSelected Projects\n- " + "\n- ".join(lines)) if lines else ""
278
+ def sec_achievements() -> str:
279
+ bul = []
280
+ for e in experiences_sorted[:2]:
281
+ for a in (e.achievements or []):
282
+ if a and len(bul) < 5:
283
+ bul.append(a)
284
+ return ("\n\nSelected Achievements\n- " + "\n- ".join(bul)) if bul else ""
285
+
286
+ if base_text:
287
+ draft = base_text
288
+ elif preset == "classic":
289
+ parts: List[str] = [header, sec_summary(summary_text), sec_skills(skills), sec_experience(experience), sec_education(education_text), sec_certs(), sec_languages()]
290
+ draft = "".join(parts).strip() + "\n"
291
+ elif preset == "modern":
292
+ parts = [header, sec_summary(summary_text), sec_experience(experience), sec_skills(skills), sec_projects(), sec_certs(), sec_education(education_text)]
293
+ draft = "".join(parts).strip() + "\n"
294
+ elif preset == "minimalist":
295
+ parts = [header, sec_summary(summary_text), sec_skills(skills), sec_experience(experience), sec_education(education_text)]
296
+ draft = "".join(parts).strip() + "\n"
297
+ elif preset == "executive":
298
+ parts = [header, sec_summary(summary_text), sec_achievements(), sec_experience(experience), sec_skills(skills), sec_education(education_text), sec_certs()]
299
+ draft = "".join(parts).strip() + "\n"
300
+ else:
301
+ # Default formatting
302
+ draft = basic_resume_template(
303
+ header=header,
304
+ summary=(summary_text or None),
305
+ skills=skills,
306
+ experience=experience,
307
+ education=education_text,
308
+ )
309
+ # If profile.skill_proficiency exists, append a simple proficiency hint line under Skills (ATS-safe)
310
+ try:
311
+ # naive inject: if "Skills:" line exists, add a second line with proficiencies
314
+ if getattr(profile, "skills", None):
315
+ prof_map = getattr(profile, "skill_proficiency", {}) or {}
316
+ if prof_map:
317
+ profs = ", ".join([f"{k}: {v}" for k, v in list(prof_map.items())[:8]])
318
+ if "\nSkills:" in draft:
319
+ parts = draft.split("\nSkills:")
320
+ draft = parts[0] + "\nSkills:" + parts[1].split("\n", 1)[0] + ("\n" + profs) + "\n" + (parts[1].split("\n", 1)[1] if "\n" in parts[1] else "")
321
+ except Exception:
322
+ pass
323
+
324
+ guidance = get_role_guidelines(job.title, job.description)
325
+ used_keywords: List[str] = []
326
+
327
+ # Optimization cycles
328
+ for cycle in range(AgentConfig.OPTIMIZATION_CYCLES):
329
+ draft, used_cycle = ensure_keywords(
330
+ draft,
331
+ jd_keywords,
332
+ max_new=AgentConfig.MAX_NEW_KEYWORDS,
333
+ allowed_keywords=allowed
334
+ )
335
+ used_keywords = list({*used_keywords, *used_cycle})
336
+
337
+ if llm.enabled:
338
+ system = (
339
+ "You refine resumes. Preserve factual accuracy. Keep ATS-friendly text-only formatting. "
340
+ "Follow UK best practices (Indeed/Reed/StandOut/Novorésumé): keep concise (prefer 1 page; <= 2 pages for senior roles), use clear section headings. "
341
+ "Present work experience in reverse chronological order, highlight recent quantified achievements, and keep older roles brief. "
342
+ "Use bullet points for skimmability, maintain consistent spacing and layout, avoid irrelevant info. Do not add images/tables or unusual symbols. "
343
+ "Tailor to the job's keywords. Prefer quantification where truthful (%, £, time, team size); never fabricate metrics. "
344
+ "AVOID vague buzzwords (e.g., 'results-driven', 'team player', 'people person', 'perfectionist', 'multi-tasker'). Replace with specific, measurable achievements. "
345
+ "Use active voice and strong action verbs (e.g., Achieved, Led, Implemented, Improved, Generated, Managed, Completed, Designed). "
346
+ "Skills: when possible, separate Hard skills vs Soft skills (hard skills first, max ~10), then soft skills. Keep Education concise (highest/most recent first). "
347
+ "Contact hygiene: prefer professional email; include relevant links (e.g., LinkedIn/portfolio) if provided; never include DOB or photos. "
348
+ "If a 'Summary'/'Personal Statement' section exists, keep it ~150 words with the intro–skills/achievements–goal structure; do not add new sections. "
349
+ "UK English, UK date style (MMM YYYY). Use present tense for the current role and past tense for previous roles. Remove first-person pronouns in bullets. "
350
+ "Use digits for numbers (e.g., 7, 12%, Β£1,200). Include critical JD keywords verbatim inside bullets (not only in Skills). "
351
+ f"Apply latest guidance: {guidance}."
352
+ )
353
+ notes = (f"\nNotes from Agent 2: {agent2_notes}" if agent2_notes else "")
354
+ custom = f"\nUser instructions: {user_chat}" if user_chat else ""
355
+ user = (
356
+ f"Role: {job.title}. Company: {job.company}.\n"
357
+ f"Job keywords: {', '.join(jd_keywords[:AgentConfig.RESUME_KEYWORDS_COUNT])}.\n"
358
+ f"Allowed keywords (from user profile): {', '.join(sorted(list(allowed))[:40])}.\n"
359
+ f"Rewrite the following resume content to strengthen alignment without inventing new skills.{custom}{notes}\n"
360
+ f"Enforce reverse chronological experience ordering, bullet points, and consistent headings. Keep within {self.max_chars} characters.\n\n"
361
+ f"Resume content:\n{draft}"
362
+ )
363
+ draft = llm.generate(system, user, max_tokens=LLMConfig.RESUME_MAX_TOKENS, agent="cv")
364
+
365
+ # Simple buzzword scrub per Reed guidance
366
+ lower = draft.lower()
367
+ for bad in [
368
+ "results-driven", "team player", "works well alone", "people person",
369
+ "perfectionist", "multi-tasker", "multi tasker", "dynamic go-getter",
370
+ ]:
371
+ if bad in lower:
372
+ # Replace phrase occurrences with an empty string; rely on achievements to convey value
373
+ draft = draft.replace(bad, "")
374
+ lower = draft.lower()
375
+ # Strengthen weak bullet openers to action verbs (The Muse)
376
+ draft = strengthen_action_verbs(draft)
377
+ # ATS plain-text scrub: remove tabs and unusual symbols
378
+ draft = draft.replace("\t", " ")
379
+ # Pronoun/punctuation/currency/percent normalisation
380
+ draft = _postprocess_bullets(draft)
381
+ # Strip DOB/photo lines if present
382
+ draft = _strip_personal_info(draft)
383
+
384
+ cov = coverage_score(draft, jd_keywords)
385
+ conc = conciseness_score(draft, self.max_chars)
386
+
387
+ if conc < 1.0:
388
+ draft = clamp_to_char_limit(draft, self.max_chars)
389
+
390
+ # Signals for orchestrator/observability (StandOut CV + NovorΓ©sumΓ©)
391
+ bullet_lines = sum(1 for l in (draft or "").splitlines() if l.strip().startswith("-"))
392
+ line_count = max(1, len((draft or "").splitlines()))
393
+ bullet_density = round(bullet_lines / line_count, 3)
394
+ quant_count = sum(1 for ch in (draft or "") if ch.isdigit()) + (draft or "").count('%') + (draft or "").count('Β£')
395
+ email_ok = bool(re.match(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$", profile.email or ""))
396
+ links_present = ("http://" in (draft or "").lower()) or ("https://" in (draft or "").lower()) or ("linkedin" in (draft or "").lower())
397
+ skills_split_hint = ("hard skills" in (draft or "").lower()) or ("soft skills" in (draft or "").lower())
398
+ languages_section = "\nlanguages" in (draft or "").lower()
399
+ action_verb_count = sum(1 for v in ACTION_VERBS if v.lower() in (draft or "").lower())
400
+ approx_pages = round(max(1, len(draft or "")) / 2400.0, 2)
401
+ approx_one_page = approx_pages <= 1.2
402
+
403
+ memory_store.save(user_id, self.name, {
404
+ "job_id": job.id,
405
+ "cycle": cycle + 1,
406
+ "coverage": cov,
407
+ "conciseness": conc,
408
+ "keywords_used": used_keywords,
409
+ "guidance": guidance[:500],
410
+ "user_chat": (user_chat or "")[:500],
411
+ "agent2_notes": (agent2_notes or "")[:500],
412
+ "draft": draft,
413
+ "signals": {
414
+ "bullet_density": bullet_density,
415
+ "quant_count": quant_count,
416
+ "email_ok": email_ok,
417
+ "gap_years_flag": gap_years_flag,
418
+ "skills_split_hint": skills_split_hint,
419
+ "languages_section": languages_section,
420
+ "links_present": links_present,
421
+ "action_verb_count": action_verb_count,
422
+ "approx_pages": approx_pages,
423
+ "approx_one_page": approx_one_page,
424
+ },
425
+ }, job_id=job.id)
426
+
427
+ logger.debug(f"Resume optimization cycle {cycle + 1}: coverage={cov:.2f}, conciseness={conc:.2f}")
428
+
429
+ # Final cleanup
430
+ draft = clamp_to_char_limit(draft, self.max_chars)
431
+
432
+ memory_store.save(user_id, self.name, {
433
+ "job_id": job.id,
434
+ "final": True,
435
+ "keywords_used": used_keywords,
436
+ "draft": draft,
437
+ }, job_id=job.id)
438
+
439
+ logger.info(f"Resume created for job {job.id} with {len(used_keywords)} keywords")
440
+
441
+ return ResumeDraft(job_id=job.id, text=draft, keywords_used=used_keywords)
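The per-cycle "signals" above are plain string statistics over the draft. A minimal self-contained sketch of that computation (the helper name `resume_signals` and the ~2400 chars/page heuristic are taken from the code above; this is not the repo's public API):

```python
def resume_signals(draft: str) -> dict:
    """Compute the bullet/quantification/page signals used for observability."""
    lines = draft.splitlines() or [""]
    bullet_lines = sum(1 for l in lines if l.strip().startswith("-"))
    # digits plus % and Β£ occurrences approximate "quantified achievement" density
    quant_count = sum(ch.isdigit() for ch in draft) + draft.count("%") + draft.count("Β£")
    return {
        "bullet_density": round(bullet_lines / max(1, len(lines)), 3),
        "quant_count": quant_count,
        "approx_pages": round(max(1, len(draft)) / 2400.0, 2),  # ~2400 chars per page
    }

sig = resume_signals("SUMMARY\n- Led team of 7\n- Cut costs 12%\n")
# e.g. bullet_density 0.667 (2 of 3 lines), quant_count 4 (digits 7, 1, 2 plus one %)
```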
agents/guidelines.py ADDED
@@ -0,0 +1,257 @@
+ from __future__ import annotations
+ from dataclasses import dataclass
+ from typing import Callable, Dict, Any, List, Tuple
+ import re
+
+
+ @dataclass
+ class Guideline:
+     id: str
+     description: str
+     condition: Callable[[Dict[str, Any]], bool]
+     validate: Callable[[str, Dict[str, Any]], Tuple[bool, str]]
+     enforce: Callable[[str, Dict[str, Any]], str]
+
+
+ class GuidelineEngine:
+     def __init__(self, rules: List[Guideline]) -> None:
+         self.rules = rules
+
+     def check_and_enforce(self, text: str, ctx: Dict[str, Any]) -> Tuple[str, List[str], List[str]]:
+         matched: List[str] = []
+         fixed: List[str] = []
+         out = text or ""
+         for g in self.rules:
+             try:
+                 if not g.condition(ctx):
+                     continue
+                 matched.append(g.id)
+                 ok, _ = g.validate(out, ctx)
+                 if not ok:
+                     out = g.enforce(out, ctx)
+                     fixed.append(g.id)
+             except Exception:
+                 # fail-safe: a broken rule must not block generation
+                 continue
+         return out, matched, fixed
+
+
+ # ---------- Helpers ----------
+
+ _BUZZWORDS = [
+     "results-driven", "team player", "people person", "perfectionist",
+     "multi-tasker", "multi tasker", "dynamic go-getter", "rockstar",
+     "guru", "ninja",
+ ]
+
+ _WEAK_OPENERS = [
+     (re.compile(r"^\s*[-β€’]\s*responsible for\s+", re.I), "- Led "),
+     (re.compile(r"^\s*[-β€’]\s*tasked with\s+", re.I), "- Executed "),
+     (re.compile(r"^\s*[-β€’]\s*worked on\s+", re.I), "- Delivered "),
+     (re.compile(r"^\s*[-β€’]\s*helped\s+", re.I), "- Supported "),
+     (re.compile(r"^\s*[-β€’]\s*assisted with\s+", re.I), "- Supported "),
+     (re.compile(r"^\s*[-β€’]\s*handled\s+", re.I), "- Managed "),
+ ]
+
+
+ def _enforce_exact_length(text: str, target_len: int) -> str:
+     if target_len <= 0:
+         return text or ""
+     txt = (text or "")
+     if len(txt) == target_len:
+         return txt
+     if len(txt) > target_len:
+         return txt[:target_len]
+     return txt + (" " * (target_len - len(txt)))
+
+
+ def _ensure_headings(text: str) -> str:
+     """Ensure key headings exist: SUMMARY, SKILLS, EXPERIENCE, EDUCATION."""
+     out = text or ""
+     low = out.lower()
+     for h in ["SUMMARY", "SKILLS", "EXPERIENCE", "EDUCATION"]:
+         if h.lower() not in low:
+             out = (out + f"\n\n{h}\n").strip()
+     return out
+
+
+ def _strip_tabs(text: str) -> str:
+     return (text or "").replace("\t", " ")
+
+
+ def _scrub_buzzwords(text: str) -> str:
+     out = text or ""
+     low = out.lower()
+     for bw in _BUZZWORDS:
+         if bw in low:
+             out = re.sub(re.escape(bw), "", out, flags=re.I)
+     return out
+
+
+ def _strengthen_action_verbs(text: str) -> str:
+     lines = (text or "").splitlines()
+     fixed: List[str] = []
+     for ln in lines:
+         new_ln = ln
+         for pat, repl in _WEAK_OPENERS:
+             if pat.search(new_ln):
+                 new_ln = pat.sub(repl, new_ln)
+                 break
+         fixed.append(new_ln)
+     return "\n".join(fixed)
+
+
+ def _remove_first_person(text: str) -> str:
+     # Remove leading "I " / "My " / "We " in bullets only
+     lines = (text or "").splitlines()
+     out: List[str] = []
+     for ln in lines:
+         m = re.match(r"^\s*[-β€’]\s*(i|my|we)\b", ln, flags=re.I)
+         if m:
+             ln = re.sub(r"^\s*([-β€’]\s*)(i|my|we)\b\s*", r"\1", ln, flags=re.I)
+         out.append(ln)
+     return "\n".join(out)
+
+
+ def _ats_plain_text(text: str) -> str:
+     # normalize bullets and strip odd symbols
+     out = _strip_tabs(text)
+     out = out.replace("β€’\t", "- ").replace("β€’ ", "- ")
+     out = re.sub(r"[β– β–ͺβ—¦β—β—‹βœ”βœ¦β™¦]", "-", out)
+     return out
+
+
+ def _enforce_uk_habits(text: str) -> str:
+     # normalize currency symbol spacing and percentages
+     out = re.sub(r"\s*Β£\s*", " Β£", text or "")
+     out = re.sub(r"\s*%\s*", "%", out)
+     return out
+
+
+ def _allowed_skills_from_profile(ctx: Dict[str, Any]) -> List[str]:
+     p = (ctx.get("profile_text") or "").lower()
+     # naive split of alphanumeric skill-like tokens, order-preserving dedup
+     cands = re.findall(r"[a-zA-Z][a-zA-Z0-9+_.#-]{2,}", p)
+     return list(dict.fromkeys(c.lower() for c in cands))
+
+
+ def _no_invented_skills(text: str, ctx: Dict[str, Any]) -> Tuple[bool, str]:
+     allowed = set(_allowed_skills_from_profile(ctx))
+     if not allowed:
+         return True, "no baseline"
+     skills_block = re.search(r"(?is)\n\s*(skills|core skills)[\s:]*\n(.+?)(\n\n|$)", text or "")
+     if not skills_block:
+         return True, "no skills block"
+     block = skills_block.group(0)
+     found = re.findall(r"[A-Za-z][A-Za-z0-9+_.#-]{2,}", block)
+     for f in found:
+         if f.lower() not in allowed:
+             return False, f
+     return True, "ok"
+
+
+ # ---------- Rule sets ----------
+
+ def build_resume_rules() -> List[Guideline]:
+     return [
+         Guideline(
+             id="exact_length",
+             description="Enforce exact target length when provided",
+             condition=lambda ctx: bool(ctx.get("target_len")),
+             validate=lambda txt, ctx: (len(txt or "") == int(ctx.get("target_len", 0)), "len"),
+             enforce=lambda txt, ctx: _enforce_exact_length(txt, int(ctx.get("target_len", 0))),
+         ),
+         Guideline(
+             id="headings_present",
+             description="Ensure key headings exist",
+             condition=lambda ctx: True,
+             validate=lambda txt, ctx: (all(h.lower() in (txt or "").lower() for h in ["summary", "experience", "education", "skills"]), "headings"),
+             enforce=lambda txt, ctx: _ensure_headings(txt),
+         ),
+         Guideline(
+             id="ats_plain_text",
+             description="Normalize bullets/tabs for ATS",
+             condition=lambda ctx: True,
+             validate=lambda txt, ctx: ("\t" not in (txt or ""), "tabs"),
+             enforce=lambda txt, ctx: _ats_plain_text(txt),
+         ),
+         Guideline(
+             id="buzzword_scrub",
+             description="Remove common buzzwords",
+             condition=lambda ctx: True,
+             validate=lambda txt, ctx: (not any(bw in (txt or "").lower() for bw in _BUZZWORDS), "buzz"),
+             enforce=lambda txt, ctx: _scrub_buzzwords(txt),
+         ),
+         Guideline(
+             id="verb_strengthen",
+             description="Strengthen weak bullet openers",
+             condition=lambda ctx: True,
+             validate=lambda txt, ctx: (True, "noop"),
+             enforce=lambda txt, ctx: _strengthen_action_verbs(txt),
+         ),
+         Guideline(
+             id="remove_first_person",
+             description="Remove first-person pronouns in bullets",
+             condition=lambda ctx: True,
+             validate=lambda txt, ctx: (not re.search(r"^\s*[-β€’]\s*(i|my|we)\b", txt or "", re.I | re.M), "pronouns"),
+             enforce=lambda txt, ctx: _remove_first_person(txt),
+         ),
+         Guideline(
+             id="uk_normalization",
+             description="Normalize UK currency/percent spacing",
+             condition=lambda ctx: True,
+             validate=lambda txt, ctx: (True, "noop"),
+             enforce=lambda txt, ctx: _enforce_uk_habits(txt),
+         ),
+         Guideline(
+             id="no_invented_skills",
+             description="Prevent skills not evidenced in profile",
+             condition=lambda ctx: True,
+             validate=_no_invented_skills,
+             enforce=lambda txt, ctx: txt,  # log-only to avoid false positives
+         ),
+     ]
+
+
+ def build_cover_rules() -> List[Guideline]:
+     return [
+         Guideline(
+             id="exact_length",
+             description="Enforce exact target length when provided",
+             condition=lambda ctx: bool(ctx.get("target_len")),
+             validate=lambda txt, ctx: (len(txt or "") == int(ctx.get("target_len", 0)), "len"),
+             enforce=lambda txt, ctx: _enforce_exact_length(txt, int(ctx.get("target_len", 0))),
+         ),
+         Guideline(
+             id="ats_plain_text",
+             description="Normalize bullets/tabs for ATS",
+             condition=lambda ctx: True,
+             validate=lambda txt, ctx: ("\t" not in (txt or ""), "tabs"),
+             enforce=lambda txt, ctx: _ats_plain_text(txt),
+         ),
+         Guideline(
+             id="buzzword_scrub",
+             description="Remove common buzzwords",
+             condition=lambda ctx: True,
+             validate=lambda txt, ctx: (not any(bw in (txt or "").lower() for bw in _BUZZWORDS), "buzz"),
+             enforce=lambda txt, ctx: _scrub_buzzwords(txt),
+         ),
+     ]
+
+
+ def apply_resume_guidelines(text: str, ctx: Dict[str, Any]) -> Tuple[str, List[str], List[str]]:
+     engine = GuidelineEngine(build_resume_rules())
+     return engine.check_and_enforce(text, ctx)
+
+
+ def apply_cover_guidelines(text: str, ctx: Dict[str, Any]) -> Tuple[str, List[str], List[str]]:
+     engine = GuidelineEngine(build_cover_rules())
+     return engine.check_and_enforce(text, ctx)
agents/job_agent.py ADDED
@@ -0,0 +1,29 @@
+ from __future__ import annotations
+ from typing import Dict, Any
+ from services.llm import llm
+ import json
+
+
+ class JobAgent:
+     """Analyzes a job posting to extract structured requirements."""
+
+     def analyze(self, job_posting_text: str) -> Dict[str, Any]:
+         if not job_posting_text:
+             return {}
+         if not llm.enabled:
+             return {
+                 "company": "",
+                 "role": "",
+                 "key_requirements": [],
+                 "nice_to_have": [],
+             }
+         system = (
+             "Analyze this job posting and output JSON with fields: company, role, key_requirements (list), "
+             "nice_to_have (list), industry, employment_type, location, ats_keywords (list of top 15 keywords), "
+             "top_skills_summary (short string)."
+         )
+         resp = llm.generate(system, job_posting_text, max_tokens=700, agent="match")
+         try:
+             return json.loads(resp)
+         except Exception:
+             # LLM output was not valid JSON; surface it raw for downstream handling
+             return {"raw": resp}
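The parse-or-wrap fallback at the end of `JobAgent.analyze` is the load-bearing detail: callers always get a dict, never an exception. A self-contained sketch (the helper name `parse_llm_json` is illustrative, not the repo's API):

```python
import json

def parse_llm_json(resp: str) -> dict:
    """Try strict JSON; on failure, wrap the raw text so callers still get a dict."""
    try:
        return json.loads(resp)
    except Exception:
        return {"raw": resp}

parse_llm_json('{"company": "Acme", "role": "Engineer"}')  # structured result
parse_llm_json("Sure! Here is the analysis...")            # {"raw": "..."} fallback
```

A common failure mode this guards against is the model wrapping its JSON in markdown fences or prose, which `json.loads` rejects.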
agents/linkedin_manager.py ADDED
@@ -0,0 +1,120 @@
+ from __future__ import annotations
+ from typing import List, Optional, Dict
+ import logging
+
+ from models.schemas import JobPosting, UserProfile
+ from services.linkedin_client import LinkedInClient
+ from services.mcp_linkedin_client import mcp_linkedin_client
+ from utils.salary import estimate_salary_range
+
+ logger = logging.getLogger(__name__)
+
+
+ class LinkedInManagerAgent:
+     def __init__(self) -> None:
+         self.client = LinkedInClient()
+         self.user_profile: Optional[UserProfile] = None
+
+     def get_login_url(self) -> str:
+         return self.client.get_authorize_url()
+
+     def handle_oauth_callback(self, code: str, state: Optional[str] = None) -> bool:
+         """Handle OAuth callback with state validation."""
+         ok = self.client.exchange_code_for_token(code, state)
+         if ok:
+             self.user_profile = self.client.get_profile()
+         return ok
+
+     def get_profile(self) -> UserProfile:
+         if not self.user_profile:
+             # Try MCP first if available
+             if mcp_linkedin_client.enabled:
+                 try:
+                     import asyncio
+                     prof = asyncio.run(mcp_linkedin_client.get_profile())
+                     if prof:
+                         self.user_profile = prof
+                 except Exception:
+                     self.user_profile = None
+             if not self.user_profile:
+                 self.user_profile = self.client.get_profile()
+         return self.user_profile
+
+     def set_profile(self, profile: UserProfile) -> None:
+         """Update the stored profile with new data."""
+         self.user_profile = profile
+         logger.info(f"Profile updated: {profile.full_name}")
+
+     def update_profile_fields(self, **kwargs) -> None:
+         """Update specific profile fields."""
+         if not self.user_profile:
+             self.user_profile = UserProfile()
+
+         for key, value in kwargs.items():
+             if hasattr(self.user_profile, key):
+                 setattr(self.user_profile, key, value)
+                 logger.debug(f"Updated profile.{key}")
+
+     def get_saved_jobs(self) -> List[JobPosting]:
+         all_jobs = []
+
+         # Try MCP client first
+         if mcp_linkedin_client.enabled:
+             try:
+                 import asyncio
+                 jobs = asyncio.run(mcp_linkedin_client.get_saved_jobs())
+                 if jobs:
+                     all_jobs.extend(jobs)
+             except Exception:
+                 pass
+
+         # Try the LinkedIn API
+         linkedin_jobs = self.client.get_saved_jobs()
+         all_jobs.extend(linkedin_jobs)
+
+         # If in mock mode or few real LinkedIn jobs, supplement with job aggregators
+         if self.client.mock_mode or len(all_jobs) < 5:
+             # Try the JobSpy MCP server first (most comprehensive)
+             try:
+                 from services.jobspy_client import JobSpyClient
+                 jobspy = JobSpyClient()
+                 jobspy_jobs = jobspy.search_jobs_sync(
+                     search_term="software engineer",
+                     location="Remote",
+                     site_names="indeed,linkedin,glassdoor",
+                     results_wanted=15,
+                 )
+                 all_jobs.extend(jobspy_jobs)
+             except Exception as e:
+                 logger.info(f"JobSpy not available: {e}")
+
+             # Fall back to the basic job aggregator
+             if len(all_jobs) < 5:
+                 try:
+                     from services.job_aggregator import JobAggregator
+                     aggregator = JobAggregator()
+                     aggregated_jobs = aggregator.search_all("software engineer", "Remote")
+                     all_jobs.extend(aggregated_jobs[:10])
+                 except Exception as e:
+                     logger.info(f"Job aggregator not available: {e}")
+
+         # Deduplicate jobs by (title, company)
+         seen = set()
+         unique_jobs = []
+         for job in all_jobs:
+             key = (job.title.lower(), job.company.lower())
+             if key not in seen:
+                 seen.add(key)
+                 unique_jobs.append(job)
+
+         return unique_jobs
+
+     def get_job(self, job_id: str) -> Optional[JobPosting]:
+         return self.client.get_job_details(job_id)
+
+     def estimate_salary(self, job: JobPosting) -> Dict[str, Dict[str, int]]:
+         profile = self.get_profile()
+         industry = None
+         return estimate_salary_range(job.title, job.location, industry, profile.skills)
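The multi-source aggregation above ends with a first-wins dedup keyed on case-folded `(title, company)`. A standalone sketch with tuples standing in for `JobPosting` objects:

```python
def dedupe_jobs(jobs):
    """Keep the first occurrence of each (title, company) pair, case-insensitively."""
    seen, unique = set(), []
    for title, company in jobs:
        key = (title.lower(), company.lower())
        if key not in seen:
            seen.add(key)
            unique.append((title, company))
    return unique

dedupe_jobs([("Engineer", "Acme"), ("engineer", "ACME"), ("Analyst", "Acme")])
# the second entry collapses into the first; order of first occurrences is preserved
```

First-wins ordering matters here: MCP and LinkedIn API results are appended before aggregator results, so they survive collisions with lower-priority sources.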
agents/observability.py ADDED
@@ -0,0 +1,431 @@
1
+ """
2
+ Agent Observability and Debugging
3
+ Provides transparency into agent interactions and decision-making
4
+ Based on the OpenAI Deep Research observability pattern
5
+ """
6
+
7
+ import json
8
+ import logging
9
+ import time
10
+ from typing import Dict, List, Any, Optional
11
+ from datetime import datetime
12
+ from dataclasses import dataclass, field
13
+ from pathlib import Path
14
+ import traceback
15
+
16
+ logger = logging.getLogger(__name__)
17
+
18
+
19
+ @dataclass
20
+ class AgentEvent:
21
+ """Single event in agent execution"""
22
+ timestamp: datetime
23
+ agent_name: str
24
+ event_type: str # 'start', 'tool_call', 'reasoning', 'output', 'error', 'handoff'
25
+ data: Dict[str, Any]
26
+ duration_ms: Optional[float] = None
27
+ parent_event: Optional[str] = None
28
+
29
+ def to_dict(self) -> Dict:
30
+ return {
31
+ 'timestamp': self.timestamp.isoformat(),
32
+ 'agent_name': self.agent_name,
33
+ 'event_type': self.event_type,
34
+ 'data': self.data,
35
+ 'duration_ms': self.duration_ms,
36
+ 'parent_event': self.parent_event
37
+ }
38
+
39
+
40
+ class AgentTracer:
41
+ """
42
+ Trace and log agent interactions for debugging and monitoring
43
+ Similar to OpenAI's print_agent_interaction function
44
+ """
45
+
46
+ def __init__(self, trace_file: Optional[str] = "agent_traces.jsonl"):
47
+ self.events: List[AgentEvent] = []
48
+ self.trace_file = Path(trace_file) if trace_file else None
49
+ self.active_agents: Dict[str, float] = {} # Track active agent start times
50
+
51
+ def start_agent(self, agent_name: str, input_data: Any) -> str:
52
+ """Log agent start"""
53
+ event_id = f"{agent_name}_{int(time.time() * 1000)}"
54
+ self.active_agents[agent_name] = time.time()
55
+
56
+ event = AgentEvent(
57
+ timestamp=datetime.now(),
58
+ agent_name=agent_name,
59
+ event_type='start',
60
+ data={
61
+ 'event_id': event_id,
62
+ 'input': str(input_data)[:500] # Truncate for readability
63
+ }
64
+ )
65
+
66
+ self._log_event(event)
67
+ return event_id
68
+
69
+ def tool_call(
70
+ self,
71
+ agent_name: str,
72
+ tool_name: str,
73
+ tool_args: Dict,
74
+ result: Any = None
75
+ ):
76
+ """Log tool call"""
77
+ event = AgentEvent(
78
+ timestamp=datetime.now(),
79
+ agent_name=agent_name,
80
+ event_type='tool_call',
81
+ data={
82
+ 'tool': tool_name,
83
+ 'args': tool_args,
84
+ 'result': str(result)[:500] if result else None
85
+ }
86
+ )
87
+
88
+ self._log_event(event)
89
+
90
+ def reasoning_step(self, agent_name: str, reasoning: str):
91
+ """Log reasoning or thought process"""
92
+ event = AgentEvent(
93
+ timestamp=datetime.now(),
94
+ agent_name=agent_name,
95
+ event_type='reasoning',
96
+ data={'reasoning': reasoning}
97
+ )
98
+
99
+ self._log_event(event)
100
+
101
+ def agent_output(self, agent_name: str, output: Any):
102
+ """Log agent output"""
103
+ duration = None
104
+ if agent_name in self.active_agents:
105
+ duration = (time.time() - self.active_agents[agent_name]) * 1000
106
+ del self.active_agents[agent_name]
107
+
108
+ event = AgentEvent(
109
+ timestamp=datetime.now(),
110
+ agent_name=agent_name,
111
+ event_type='output',
112
+ data={'output': str(output)[:1000]},
113
+ duration_ms=duration
114
+ )
115
+
116
+ self._log_event(event)
117
+
118
+ def agent_handoff(
119
+ self,
120
+ from_agent: str,
121
+ to_agent: str,
122
+ handoff_data: Any
123
+ ):
124
+ """Log handoff between agents"""
125
+ event = AgentEvent(
126
+ timestamp=datetime.now(),
127
+ agent_name=from_agent,
128
+ event_type='handoff',
129
+ data={
130
+ 'to_agent': to_agent,
131
+ 'handoff_data': str(handoff_data)[:500]
132
+ }
133
+ )
134
+
135
+ self._log_event(event)
136
+
137
+ def error(self, agent_name: str, error: Exception):
138
+ """Log error"""
139
+ event = AgentEvent(
140
+ timestamp=datetime.now(),
141
+ agent_name=agent_name,
142
+ event_type='error',
143
+ data={
144
+ 'error_type': type(error).__name__,
145
+ 'error_message': str(error),
146
+ 'traceback': traceback.format_exc()
147
+ }
148
+ )
149
+
150
+ self._log_event(event)
151
+
152
+ def _log_event(self, event: AgentEvent):
153
+ """Log event to memory and file"""
154
+ self.events.append(event)
155
+
156
+ # Log to file if configured
157
+ if self.trace_file:
158
+ with open(self.trace_file, 'a') as f:
159
+ f.write(json.dumps(event.to_dict()) + '\n')
160
+
161
+ # Also log to standard logger
162
+ logger.info(f"[{event.agent_name}] {event.event_type}: {event.data}")
163
+
164
+ def print_interaction_flow(self, start_time: Optional[datetime] = None):
165
+ """
166
+ Print human-readable interaction flow
167
+ Similar to OpenAI's print_agent_interaction
168
+ """
169
+ print("\n" + "="*60)
170
+ print("AGENT INTERACTION FLOW")
171
+ print("="*60 + "\n")
172
+
173
+ filtered_events = self.events
174
+ if start_time:
175
+ filtered_events = [e for e in self.events if e.timestamp >= start_time]
176
+
177
+ for i, event in enumerate(filtered_events, 1):
178
+ prefix = f"{i:3}. [{event.timestamp.strftime('%H:%M:%S')}] {event.agent_name}"
179
+
180
+ if event.event_type == 'start':
181
+ print(f"{prefix} β†’ STARTED")
182
+ print(f" Input: {event.data.get('input', '')[:100]}...")
183
+
184
+ elif event.event_type == 'tool_call':
185
+ tool = event.data.get('tool', 'unknown')
186
+ print(f"{prefix} β†’ TOOL: {tool}")
187
+ if event.data.get('args'):
188
+ print(f" Args: {event.data['args']}")
189
+
190
+ elif event.event_type == 'reasoning':
191
+ print(f"{prefix} β†’ THINKING:")
192
+ print(f" {event.data.get('reasoning', '')[:200]}...")
193
+
194
+ elif event.event_type == 'handoff':
195
+ to_agent = event.data.get('to_agent', 'unknown')
196
+ print(f"{prefix} β†’ HANDOFF to {to_agent}")
197
+
198
+ elif event.event_type == 'output':
199
+ print(f"{prefix} β†’ OUTPUT:")
200
+ print(f" {event.data.get('output', '')[:200]}...")
201
+ if event.duration_ms:
202
+ print(f" Duration: {event.duration_ms:.0f}ms")
203
+
204
+ elif event.event_type == 'error':
205
+ print(f"{prefix} β†’ ERROR: {event.data.get('error_type', 'unknown')}")
206
+ print(f" {event.data.get('error_message', '')}")
207
+
208
+ print()
209
+
210
+ print("="*60 + "\n")
211
+
212
+ def get_metrics(self) -> Dict[str, Any]:
213
+ """Get execution metrics"""
214
+ metrics = {
215
+ 'total_events': len(self.events),
216
+ 'agents_involved': len(set(e.agent_name for e in self.events)),
217
+ 'tool_calls': len([e for e in self.events if e.event_type == 'tool_call']),
218
+ 'errors': len([e for e in self.events if e.event_type == 'error']),
219
+ 'handoffs': len([e for e in self.events if e.event_type == 'handoff']),
220
+ 'avg_duration_ms': 0
221
+ }
222
+
223
+ durations = [e.duration_ms for e in self.events if e.duration_ms]
224
+ if durations:
225
+ metrics['avg_duration_ms'] = sum(durations) / len(durations)
226
+
227
+ return metrics
228
+
229
+
230
+ class TriageAgent:
231
+ """
232
+ Triage agent that routes requests to appropriate specialized agents
233
+ Based on OpenAI's Deep Research triage pattern
234
+ """
235
+
236
+ def __init__(self, tracer: Optional[AgentTracer] = None):
237
+ self.tracer = tracer or AgentTracer()
238
+
239
+ def triage_request(self, request: str) -> Dict[str, Any]:
240
+ """
241
+ Analyze request and determine routing
242
+ """
243
+ self.tracer.start_agent("TriageAgent", request)
244
+
245
+ # Analyze request type
246
+ request_lower = request.lower()
247
+
248
+ routing = {
249
+ 'needs_clarification': False,
250
+ 'route_to': None,
251
+ 'confidence': 0.0,
252
+ 'reasoning': '',
253
+ 'suggested_agents': []
254
+ }
255
+
256
+ # Check if clarification needed
257
+ if len(request.split()) < 5 or '?' in request:
258
+ routing['needs_clarification'] = True
259
+ routing['reasoning'] = "Request is too brief or unclear"
260
+ self.tracer.reasoning_step("TriageAgent", routing['reasoning'])
261
+
262
+ # Determine routing based on keywords
263
+ if 'research' in request_lower or 'analyze' in request_lower:
264
+ routing['route_to'] = 'ResearchAgent'
265
+ routing['suggested_agents'] = ['ResearchAgent', 'WebSearchAgent']
266
+ routing['confidence'] = 0.9
267
+
268
+ elif 'resume' in request_lower or 'cv' in request_lower:
269
+ routing['route_to'] = 'CVAgent'
270
+ routing['suggested_agents'] = ['CVAgent', 'ATSOptimizer']
271
+ routing['confidence'] = 0.95
272
+
273
+ elif 'cover' in request_lower or 'letter' in request_lower:
274
+ routing['route_to'] = 'CoverLetterAgent'
275
+ routing['suggested_agents'] = ['CoverLetterAgent']
276
+ routing['confidence'] = 0.95
277
+
278
+ elif 'job' in request_lower or 'application' in request_lower:
279
+ routing['route_to'] = 'OrchestratorAgent'
280
+ routing['suggested_agents'] = ['OrchestratorAgent', 'CVAgent', 'CoverLetterAgent']
281
+ routing['confidence'] = 0.85
282
+
283
+ else:
284
+ routing['route_to'] = 'GeneralAgent'
285
+ routing['confidence'] = 0.5
286
+
287
+ self.tracer.agent_output("TriageAgent", routing)
288
+
289
+ return routing
290
+
291
+
292
+ class AgentMonitor:
293
+ """
294
+ Monitor agent performance and health
295
+ """
296
+
297
+ def __init__(self):
298
+ self.performance_stats: Dict[str, Dict] = {}
299
+ self.error_counts: Dict[str, int] = {}
300
+ self.last_errors: Dict[str, str] = {}
301
+
302
+ def record_execution(
303
+ self,
304
+ agent_name: str,
+        duration_ms: float,
+        success: bool,
+        error: Optional[str] = None
+    ):
+        """Record agent execution stats."""
+        if agent_name not in self.performance_stats:
+            self.performance_stats[agent_name] = {
+                'total_runs': 0,
+                'successful_runs': 0,
+                'failed_runs': 0,
+                'total_duration_ms': 0,
+                'avg_duration_ms': 0,
+                'min_duration_ms': float('inf'),
+                'max_duration_ms': 0
+            }
+
+        stats = self.performance_stats[agent_name]
+        stats['total_runs'] += 1
+
+        if success:
+            stats['successful_runs'] += 1
+        else:
+            stats['failed_runs'] += 1
+            self.error_counts[agent_name] = self.error_counts.get(agent_name, 0) + 1
+            if error:
+                self.last_errors[agent_name] = error
+
+        stats['total_duration_ms'] += duration_ms
+        stats['avg_duration_ms'] = stats['total_duration_ms'] / stats['total_runs']
+        stats['min_duration_ms'] = min(stats['min_duration_ms'], duration_ms)
+        stats['max_duration_ms'] = max(stats['max_duration_ms'], duration_ms)
+
+    def get_health_status(self) -> Dict[str, Any]:
+        """Get overall system health."""
+        total_errors = sum(self.error_counts.values())
+        total_runs = sum(s['total_runs'] for s in self.performance_stats.values())
+
+        if total_runs == 0:
+            error_rate = 0
+        else:
+            error_rate = (total_errors / total_runs) * 100
+
+        # Map the error rate onto a coarse health status
+        if error_rate < 5:
+            status = "healthy"
+        elif error_rate < 15:
+            status = "degraded"
+        else:
+            status = "unhealthy"
+
+        return {
+            'status': status,
+            'error_rate': f"{error_rate:.1f}%",
+            'total_runs': total_runs,
+            'total_errors': total_errors,
+            'agent_stats': self.performance_stats,
+            'recent_errors': self.last_errors
+        }
+
+    def reset_stats(self):
+        """Reset all statistics."""
+        self.performance_stats.clear()
+        self.error_counts.clear()
+        self.last_errors.clear()
+
+
+# Global instances for easy access
+global_tracer = AgentTracer()
+global_monitor = AgentMonitor()
+
+
+# Decorator for automatic tracing
+def trace_agent(agent_name: str):
+    """Decorator to automatically trace agent execution."""
+    def decorator(func):
+        def wrapper(*args, **kwargs):
+            event_id = global_tracer.start_agent(agent_name, args)
+            start_time = time.time()
+
+            try:
+                result = func(*args, **kwargs)
+                duration = (time.time() - start_time) * 1000
+
+                global_tracer.agent_output(agent_name, result)
+                global_monitor.record_execution(agent_name, duration, True)
+
+                return result
+
+            except Exception as e:
+                duration = (time.time() - start_time) * 1000
+
+                global_tracer.error(agent_name, e)
+                global_monitor.record_execution(agent_name, duration, False, str(e))
+
+                raise
+
+        return wrapper
+    return decorator
+
+
+# Demo usage
+def demo_observability():
+    """Demonstrate observability features."""
+    tracer = AgentTracer()
+    monitor = AgentMonitor()
+    triage = TriageAgent(tracer)
+
+    # Simulate agent interactions
+    routing = triage.triage_request("Help me write a resume for a software engineering position")
+
+    # Simulate tool calls
+    tracer.tool_call("CVAgent", "extract_keywords", {"text": "software engineering"})
+    tracer.tool_call("CVAgent", "optimize_ats", {"resume": "..."})
+
+    # Simulate a handoff between agents
+    tracer.agent_handoff("CVAgent", "ATSOptimizer", {"resume_draft": "..."})
+
+    # Print interaction flow
+    tracer.print_interaction_flow()
+
+    # Show metrics
+    print("Metrics:", tracer.get_metrics())
+
+
+if __name__ == "__main__":
+    demo_observability()
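The `trace_agent` decorator above can be reduced to a self-contained sketch. `MiniMonitor` here is a hypothetical, simplified stand-in for the file's `AgentTracer`/`AgentMonitor` pair; only the timing-and-record pattern is taken from the source:

```python
import time

class MiniMonitor:
    """Simplified stand-in for AgentMonitor: counts runs and failures."""
    def __init__(self):
        self.stats = {}

    def record(self, name, duration_ms, success):
        s = self.stats.setdefault(name, {'runs': 0, 'failures': 0, 'total_ms': 0.0})
        s['runs'] += 1
        s['total_ms'] += duration_ms
        if not success:
            s['failures'] += 1

monitor = MiniMonitor()

def trace(name):
    """Mirrors trace_agent: time the call, record success or failure, re-raise errors."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            start = time.time()
            try:
                result = func(*args, **kwargs)
                monitor.record(name, (time.time() - start) * 1000, True)
                return result
            except Exception:
                monitor.record(name, (time.time() - start) * 1000, False)
                raise
        return wrapper
    return decorator

@trace("CVAgent")
def extract_keywords(text):
    return text.split()

keywords = extract_keywords("software engineering")
```

The decorated function behaves exactly as before; the monitor accumulates per-agent run counts and durations on the side.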
agents/orchestrator.py ADDED
@@ -0,0 +1,232 @@
+from __future__ import annotations
+from typing import List, Tuple, Optional
+import logging
+import re
+
+from models.schemas import OrchestrationResult, JobPosting, UserProfile
+from utils.text import extract_keywords_from_text
+from utils.consistency import detect_contradictions, allowed_keywords_from_profile
+from utils.probability import resume_probability, cover_letter_probability
+from utils.config import AgentConfig, UIConfig
+from memory.store import memory_store
+from .linkedin_manager import LinkedInManagerAgent
+from .cv_owner import CVOwnerAgent
+from .cover_letter_agent import CoverLetterAgent
+
+logger = logging.getLogger(__name__)
+
+
+class OrchestratorAgent:
+    def __init__(self) -> None:
+        self.linkedin = LinkedInManagerAgent()
+        self.cv_owner = CVOwnerAgent()
+        self.cover_letter = CoverLetterAgent()
+        self.name = "orchestrator"
+
+    def login_url(self) -> str:
+        return self.linkedin.get_login_url()
+
+    def handle_login_code(self, code: str, state: Optional[str] = None) -> bool:
+        """Handle OAuth callback with state validation for CSRF protection."""
+        return self.linkedin.handle_oauth_callback(code, state)
+
+    def get_profile(self) -> UserProfile:
+        return self.linkedin.get_profile()
+
+    def get_saved_jobs(self) -> List[JobPosting]:
+        return self.linkedin.get_saved_jobs()
+
+    def get_tailored_jobs(self, limit: int = UIConfig.MAX_SUGGESTED_JOBS) -> List[Tuple[JobPosting, float]]:
+        """Get jobs tailored to the user's profile, scored by skill overlap."""
+        profile = self.get_profile()
+        jobs = self.get_saved_jobs()
+        scored: List[Tuple[JobPosting, float]] = []
+        profile_keywords = {s.lower() for s in profile.skills}
+
+        if not profile_keywords:
+            logger.warning("No profile keywords found for job matching")
+            return [(j, 0.0) for j in jobs[:limit]]
+
+        for j in jobs:
+            jd_keywords = {k.lower() for k in extract_keywords_from_text(
+                j.description or "",
+                top_k=AgentConfig.JOB_KEYWORDS_COUNT
+            )}
+            overlap = profile_keywords.intersection(jd_keywords)
+            score = len(overlap) / max(1, len(profile_keywords))
+            scored.append((j, score))
+
+        scored.sort(key=lambda t: t[1], reverse=True)
+        return scored[:limit]
+
+    def _smart_remove_keyword(self, text: str, keyword: str) -> str:
+        """Intelligently remove a keyword from text without breaking sentences."""
+        # Try progressively narrower patterns, from full sentence down to the bare word
+        patterns = [
+            rf'\b[^.]*\b{re.escape(keyword)}\b[^.]*\.',  # Full sentence
+            rf',\s*[^,]*\b{re.escape(keyword)}\b[^,]*(?=,|\.|$)',  # Clause
+            rf'\b{re.escape(keyword)}\b\s*(?:and|or|,)\s*',  # List item
+            rf'(?:and|or|,)\s*\b{re.escape(keyword)}\b',  # List item
+            rf'\b{re.escape(keyword)}\b',  # Just the word
+        ]
+
+        for pattern in patterns:
+            new_text = re.sub(pattern, '', text, flags=re.IGNORECASE)
+            # Clean up any double spaces or punctuation
+            new_text = re.sub(r'\s+', ' ', new_text)
+            new_text = re.sub(r',\s*,', ',', new_text)
+            new_text = re.sub(r'\.\s*\.', '.', new_text)
+
+            if new_text != text:
+                logger.debug(f"Removed keyword '{keyword}' using pattern: {pattern[:30]}...")
+                return new_text.strip()
+
+        return text
+
+    def run_for_jobs(
+        self,
+        jobs: List[JobPosting],
+        user_id: str = "default_user",
+        cv_chat: Optional[str] = None,
+        cover_chat: Optional[str] = None,
+        cv_seed: Optional[str] = None,
+        cover_seed: Optional[str] = None,
+        agent2_notes: Optional[str] = None,
+        inspiration_url: Optional[str] = None
+    ) -> List[OrchestrationResult]:
+        """Orchestrate resume and cover letter generation for multiple jobs."""
+        profile = self.get_profile()
+        results: List[OrchestrationResult] = []
+        allowed = allowed_keywords_from_profile(profile.skills, profile.experiences)
+
+        logger.info(f"Starting orchestration for {len(jobs)} jobs")
+
+        for job in jobs:
+            logger.info(f"Processing job: {job.title} at {job.company}")
+
+            # Initial generation
+            resume_draft = self.cv_owner.create_resume(
+                profile, job,
+                user_id=user_id,
+                user_chat=cv_chat,
+                seed_text=cv_seed,
+                agent2_notes=agent2_notes
+            )
+
+            cover_draft = self.cover_letter.create_cover_letter(
+                profile, job,
+                user_id=user_id,
+                user_chat=cover_chat,
+                seed_text=cover_seed,
+                agent2_notes=agent2_notes,
+                inspiration_url=inspiration_url
+            )
+
+            # Consistency checking and refinement
+            for cycle in range(AgentConfig.OPTIMIZATION_CYCLES):
+                issues = detect_contradictions(resume_draft.text, cover_draft.text, allowed)
+
+                memory_store.save(user_id, self.name, {
+                    "job_id": job.id,
+                    "cycle": cycle + 1,
+                    "issues": issues,
+                    "issues_count": len(issues),
+                }, job_id=job.id)
+
+                if not issues:
+                    logger.info(f"No consistency issues found in cycle {cycle + 1}")
+                    break
+
+                logger.warning(f"Found {len(issues)} consistency issues in cycle {cycle + 1}")
+
+                # Smart removal of contradictory keywords
+                issues_to_fix = issues[:AgentConfig.MAX_CONTRADICTION_FIXES]
+                for keyword in issues_to_fix:
+                    if keyword.lower() not in allowed:
+                        # Use smart removal instead of simple replace
+                        cover_draft.text = self._smart_remove_keyword(cover_draft.text, keyword)
+                        logger.debug(f"Removed unauthorized keyword: {keyword}")
+
+                # Regenerate cover letter with fixes
+                cover_draft = self.cover_letter.create_cover_letter(
+                    profile, job,
+                    user_id=user_id,
+                    user_chat=cover_chat,
+                    seed_text=cover_draft.text,  # Use modified text as seed
+                    agent2_notes=agent2_notes,
+                    inspiration_url=inspiration_url
+                )
+
+            # Calculate metrics
+            salary = self.linkedin.estimate_salary(job)
+            p_resume = resume_probability(resume_draft.text, job.description)
+            p_cover = cover_letter_probability(cover_draft.text, job.description)
+            overall_p = max(0.0, min(1.0, p_resume * p_cover))
+
+            # Validate salary estimates
+            reasoning_ok = (
+                overall_p >= 0.0 and
+                salary.get("GBP", {}).get("low", 0) < salary.get("GBP", {}).get("high", 999999)
+            )
+
+            # Save final metrics
+            memory_store.save(user_id, self.name, {
+                "job_id": job.id,
+                "final": True,
+                "resume_keywords": resume_draft.keywords_used,
+                "cover_keywords": cover_draft.keywords_used,
+                "metrics": {
+                    "salary": salary,
+                    "p_resume": p_resume,
+                    "p_cover": p_cover,
+                    "overall_p": overall_p,
+                    "reasoning_ok": reasoning_ok,
+                }
+            }, job_id=job.id)
+
+            result = OrchestrationResult(
+                job=job,
+                resume=resume_draft,
+                cover_letter=cover_draft,
+                metrics={
+                    "salary": salary,
+                    "p_resume": p_resume,
+                    "p_cover": p_cover,
+                    "overall_p": overall_p,
+                    "reasoning_ok": reasoning_ok,
+                }
+            )
+            results.append(result)
+
+            logger.info(
+                f"Completed job {job.id}: resume_p={p_resume:.2f}, "
+                f"cover_p={p_cover:.2f}, overall_p={overall_p:.2f}"
+            )
+
+        logger.info(f"Orchestration complete for {len(results)} jobs")
+        return results
+
+    def regenerate_for_job(
+        self,
+        job: JobPosting,
+        user_id: str,
+        cv_chat: Optional[str] = None,
+        cover_chat: Optional[str] = None,
+        cv_seed: Optional[str] = None,
+        cover_seed: Optional[str] = None,
+        agent2_notes: Optional[str] = None,
+        inspiration_url: Optional[str] = None
+    ) -> OrchestrationResult:
+        """Regenerate documents for a single job."""
+        logger.info(f"Regenerating documents for job: {job.title} at {job.company}")
+        results = self.run_for_jobs(
+            [job],
+            user_id=user_id,
+            cv_chat=cv_chat,
+            cover_chat=cover_chat,
+            cv_seed=cv_seed,
+            cover_seed=cover_seed,
+            agent2_notes=agent2_notes,
+            inspiration_url=inspiration_url
+        )
+        return results[0]
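The keyword-removal mechanics in `_smart_remove_keyword` can be shown standalone. This sketch uses only the list-item and bare-word patterns from the source (omitting the full-sentence and clause patterns, which would delete the whole surrounding sentence), so it demonstrates the narrower end of the cascade:

```python
import re

def smart_remove_keyword(text: str, keyword: str) -> str:
    # A subset of the orchestrator's pattern cascade: list-item removals, then the bare word
    patterns = [
        rf'\b{re.escape(keyword)}\b\s*(?:and|or|,)\s*',  # "X and ..." / "X, ..."
        rf'(?:and|or|,)\s*\b{re.escape(keyword)}\b',     # "... and X"
        rf'\b{re.escape(keyword)}\b',                    # just the word
    ]
    for pattern in patterns:
        new_text = re.sub(pattern, '', text, flags=re.IGNORECASE)
        new_text = re.sub(r'\s+', ' ', new_text).strip()  # collapse leftover whitespace
        if new_text != text:
            return new_text
    return text

cleaned = smart_remove_keyword("Skilled in Python, Kubernetes and Go.", "Kubernetes")
# β†’ "Skilled in Python, Go."
```

The first pattern that changes the text wins, so the grammatically cleanest removal is preferred over simply blanking the word.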
agents/parallel_executor.py ADDED
@@ -0,0 +1,425 @@
+"""
+Parallel Agent Executor
+Implements async parallel execution of agents for faster processing
+Based on the parallel agent pattern for improved performance
+"""
+
+import asyncio
+import functools
+import time
+import logging
+from typing import List, Dict, Any, Callable, Tuple, Optional
+from dataclasses import dataclass
+from datetime import datetime
+from concurrent.futures import ThreadPoolExecutor
+
+import nest_asyncio
+import matplotlib.pyplot as plt
+
+from models.schemas import JobPosting, ResumeDraft, CoverLetterDraft, OrchestrationResult
+
+# Apply nest_asyncio to allow nested event loops (useful in Jupyter/Gradio)
+try:
+    nest_asyncio.apply()
+except Exception:
+    pass
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass
+class AgentResult:
+    """Result from an agent execution"""
+    agent_name: str
+    output: Any
+    start_time: float
+    end_time: float
+    duration: float
+    success: bool
+    error: Optional[str] = None
+
+
+class ParallelAgentExecutor:
+    """Execute multiple agents in parallel for faster processing"""
+
+    def __init__(self, max_workers: int = 4):
+        self.max_workers = max_workers
+        self.executor = ThreadPoolExecutor(max_workers=max_workers)
+        self.execution_history: List[Tuple[str, float, float]] = []
+
+    async def run_agent_async(
+        self,
+        agent_func: Callable,
+        agent_name: str,
+        *args,
+        **kwargs
+    ) -> AgentResult:
+        """Run a single agent asynchronously"""
+        start_time = time.time()
+
+        try:
+            logger.info(f"Starting {agent_name} at {datetime.now()}")
+
+            # Run the agent function
+            if asyncio.iscoroutinefunction(agent_func):
+                result = await agent_func(*args, **kwargs)
+            else:
+                # Run sync function in the thread pool; functools.partial preserves
+                # both args and kwargs (run_in_executor accepts no keyword arguments)
+                loop = asyncio.get_event_loop()
+                result = await loop.run_in_executor(
+                    self.executor,
+                    functools.partial(agent_func, *args, **kwargs)
+                )
+
+            end_time = time.time()
+            duration = end_time - start_time
+
+            # Track execution
+            self.execution_history.append((agent_name, start_time, end_time))
+
+            logger.info(f"Completed {agent_name} in {duration:.2f}s")
+
+            return AgentResult(
+                agent_name=agent_name,
+                output=result,
+                start_time=start_time,
+                end_time=end_time,
+                duration=duration,
+                success=True
+            )
+
+        except Exception as e:
+            end_time = time.time()
+            duration = end_time - start_time
+
+            logger.error(f"Error in {agent_name}: {e}")
+
+            return AgentResult(
+                agent_name=agent_name,
+                output=None,
+                start_time=start_time,
+                end_time=end_time,
+                duration=duration,
+                success=False,
+                error=str(e)
+            )
+
+    async def run_parallel_agents(
+        self,
+        agents: List[Dict[str, Any]]
+    ) -> Dict[str, AgentResult]:
+        """
+        Run multiple agents in parallel.
+
+        Args:
+            agents: List of dicts with 'name', 'func', 'args', 'kwargs'
+
+        Returns:
+            Dict mapping agent names to results
+        """
+        tasks = []
+
+        for agent in agents:
+            task = self.run_agent_async(
+                agent['func'],
+                agent['name'],
+                *agent.get('args', []),
+                **agent.get('kwargs', {})
+            )
+            tasks.append(task)
+
+        # Run all agents in parallel
+        results = await asyncio.gather(*tasks, return_exceptions=True)
+
+        # Map results by name
+        result_map = {}
+        for i, agent in enumerate(agents):
+            if isinstance(results[i], Exception):
+                result_map[agent['name']] = AgentResult(
+                    agent_name=agent['name'],
+                    output=None,
+                    start_time=time.time(),
+                    end_time=time.time(),
+                    duration=0,
+                    success=False,
+                    error=str(results[i])
+                )
+            else:
+                result_map[agent['name']] = results[i]
+
+        return result_map
+
+    def plot_timeline(self, save_path: Optional[str] = None):
+        """Plot execution timeline of agents"""
+        if not self.execution_history:
+            logger.warning("No execution history to plot")
+            return
+
+        # Normalize times to zero
+        base = min(start for _, start, _ in self.execution_history)
+
+        # Prepare data
+        labels = []
+        start_offsets = []
+        durations = []
+
+        for name, start, end in self.execution_history:
+            labels.append(name)
+            start_offsets.append(start - base)
+            durations.append(end - start)
+
+        # Create plot
+        plt.figure(figsize=(10, 6))
+        plt.barh(labels, durations, left=start_offsets, height=0.5)
+        plt.xlabel("Seconds since start")
+        plt.title("Agent Execution Timeline")
+        plt.grid(True, alpha=0.3)
+
+        # Add duration labels
+        for i, (offset, duration) in enumerate(zip(start_offsets, durations)):
+            plt.text(offset + duration / 2, i, f'{duration:.2f}s',
+                     ha='center', va='center', color='white', fontsize=8)
+
+        plt.tight_layout()
+
+        if save_path:
+            plt.savefig(save_path)
+            logger.info(f"Timeline saved to {save_path}")
+        else:
+            plt.show()
+
+        return plt.gcf()
+
+
+class ParallelJobProcessor:
+    """Process multiple jobs in parallel using agent parallelization"""
+
+    def __init__(self):
+        self.executor = ParallelAgentExecutor(max_workers=4)
+
+    async def process_jobs_parallel(
+        self,
+        jobs: List[JobPosting],
+        cv_agent_func: Callable,
+        cover_agent_func: Callable,
+        research_func: Optional[Callable] = None,
+        **kwargs
+    ) -> List[OrchestrationResult]:
+        """
+        Process multiple jobs in parallel.
+
+        Each job gets:
+        1. Resume generation
+        2. Cover letter generation
+        3. Optional web research
+        All running in parallel per job.
+        """
+        all_results = []
+
+        for job in jobs:
+            # Define agents for this job
+            agents = [
+                {
+                    'name': f'Resume_{job.company}',
+                    'func': cv_agent_func,
+                    'args': [job],
+                    'kwargs': kwargs
+                },
+                {
+                    'name': f'CoverLetter_{job.company}',
+                    'func': cover_agent_func,
+                    'args': [job],
+                    'kwargs': kwargs
+                }
+            ]
+
+            # Add research if available
+            if research_func:
+                agents.append({
+                    'name': f'Research_{job.company}',
+                    'func': research_func,
+                    'args': [job.company],
+                    'kwargs': {}
+                })
+
+            # Run agents in parallel for this job
+            results = await self.executor.run_parallel_agents(agents)
+
+            # Combine results
+            research_result = results[f'Research_{job.company}'].output if research_func else None
+            orchestration_result = OrchestrationResult(
+                job=job,
+                resume=results[f'Resume_{job.company}'].output,
+                cover_letter=results[f'CoverLetter_{job.company}'].output,
+                keywords=[],  # Would be extracted
+                research=research_result
+            )
+
+            all_results.append(orchestration_result)
+
+        # Generate timeline
+        self.executor.plot_timeline(save_path="parallel_execution_timeline.png")
+
+        return all_results
+
+
+class MetaAgent:
+    """
+    Meta-agent that combines outputs from multiple specialized agents,
+    similar to the parallel pattern of combining summaries.
+    """
+
+    def __init__(self):
+        self.executor = ParallelAgentExecutor()
+
+    async def analyze_job_fit(
+        self,
+        job: JobPosting,
+        resume: ResumeDraft
+    ) -> Dict[str, Any]:
+        """Run multiple analysis agents in parallel and combine results"""
+        # Define specialized analysis agents
+        agents = [
+            {
+                'name': 'SkillsMatcher',
+                'func': self._match_skills,
+                'args': [job, resume]
+            },
+            {
+                'name': 'ExperienceAnalyzer',
+                'func': self._analyze_experience,
+                'args': [job, resume]
+            },
+            {
+                'name': 'CultureFit',
+                'func': self._assess_culture_fit,
+                'args': [job, resume]
+            },
+            {
+                'name': 'SalaryEstimator',
+                'func': self._estimate_salary_fit,
+                'args': [job, resume]
+            }
+        ]
+
+        # Run all agents in parallel
+        results = await self.executor.run_parallel_agents(agents)
+
+        # Combine into executive summary
+        summary = self._combine_analyses(results)
+
+        return summary
+
+    def _match_skills(self, job: JobPosting, resume: ResumeDraft) -> Dict:
+        """Match skills between job and resume"""
+        job_skills = set(job.description.lower().split())
+        resume_skills = set(resume.text.lower().split())
+
+        matched = job_skills & resume_skills
+        missing = job_skills - resume_skills
+
+        return {
+            'matched_skills': len(matched),
+            'missing_skills': len(missing),
+            'match_percentage': len(matched) / len(job_skills) * 100 if job_skills else 0,
+            'top_matches': list(matched)[:10]
+        }
+
+    def _analyze_experience(self, job: JobPosting, resume: ResumeDraft) -> Dict:
+        """Analyze experience relevance"""
+        # Simplified analysis
+        return {
+            'years_experience': 5,  # Would extract from resume
+            'relevant_roles': 3,
+            'industry_match': True
+        }
+
+    def _assess_culture_fit(self, job: JobPosting, resume: ResumeDraft) -> Dict:
+        """Assess cultural fit"""
+        return {
+            'remote_preference': 'remote' in job.location.lower() if job.location else False,
+            'company_size_fit': True,
+            'values_alignment': 0.8
+        }
+
+    def _estimate_salary_fit(self, job: JobPosting, resume: ResumeDraft) -> Dict:
+        """Estimate salary fit"""
+        return {
+            'estimated_range': '$100k-$150k',
+            'market_rate': True,
+            'negotiation_room': 'moderate'
+        }
+
+    def _combine_analyses(self, results: Dict[str, AgentResult]) -> Dict:
+        """Combine all analyses into an executive summary"""
+        summary = {
+            'overall_fit_score': 0,
+            'strengths': [],
+            'gaps': [],
+            'recommendations': [],
+            'detailed_analysis': {}
+        }
+
+        # Extract successful results
+        for name, result in results.items():
+            if result.success and result.output:
+                summary['detailed_analysis'][name] = result.output
+
+        # Calculate overall score
+        if 'SkillsMatcher' in summary['detailed_analysis']:
+            skills_score = summary['detailed_analysis']['SkillsMatcher'].get('match_percentage', 0)
+            summary['overall_fit_score'] = skills_score
+
+        # Generate recommendations
+        if summary['overall_fit_score'] > 70:
+            summary['recommendations'].append("Strong candidate - proceed with application")
+        elif summary['overall_fit_score'] > 50:
+            summary['recommendations'].append("Moderate fit - customize resume for better match")
+        else:
+            summary['recommendations'].append("Low fit - consider if this role aligns with goals")
+
+        return summary
+
+
+# Usage example
+async def demo_parallel_execution():
+    """Demonstrate parallel agent execution"""
+    executor = ParallelAgentExecutor()
+
+    # Define sample agents
+    async def agent1():
+        await asyncio.sleep(2)
+        return "Agent 1 result"
+
+    async def agent2():
+        await asyncio.sleep(1)
+        return "Agent 2 result"
+
+    async def agent3():
+        await asyncio.sleep(3)
+        return "Agent 3 result"
+
+    agents = [
+        {'name': 'FastAgent', 'func': agent2},
+        {'name': 'MediumAgent', 'func': agent1},
+        {'name': 'SlowAgent', 'func': agent3}
+    ]
+
+    # Run in parallel
+    results = await executor.run_parallel_agents(agents)
+
+    # Show results
+    for name, result in results.items():
+        print(f"{name}: {result.output} (took {result.duration:.2f}s)")
+
+    # Plot timeline
+    executor.plot_timeline()
+
+
+if __name__ == "__main__":
+    # Run demo
+    asyncio.run(demo_parallel_execution())
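The core of `run_parallel_agents` is `asyncio.gather` over independent coroutines. A minimal self-contained sketch of that pattern, without the repo's executor or result dataclass:

```python
import asyncio
import time

async def timed_agent(name: str, delay: float) -> str:
    # Stand-in for an agent call: sleep, then return a result
    await asyncio.sleep(delay)
    return name

async def main():
    start = time.time()
    # gather schedules both coroutines concurrently, so total wall time
    # tracks the slowest agent rather than the sum of all delays
    results = await asyncio.gather(
        timed_agent("fast", 0.05),
        timed_agent("slow", 0.15),
    )
    return results, time.time() - start

results, elapsed = asyncio.run(main())
```

Results come back in the order the awaitables were passed, regardless of which finished first; `return_exceptions=True` (used in the source) would additionally convert per-agent failures into returned exception objects instead of aborting the batch.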
agents/pipeline.py ADDED
@@ -0,0 +1,205 @@
1
+ from __future__ import annotations
2
+ from typing import Dict, Any
3
+ import os
4
+ import json
5
+ from datetime import datetime
6
+ from models.schemas import JobPosting, UserProfile, ResumeDraft, CoverLetterDraft, OrchestrationResult
7
+ from .router_agent import RouterAgent
8
+ from .profile_agent import ProfileAgent
9
+ from .job_agent import JobAgent
10
+ from .cv_owner import CVOwnerAgent
11
+ from .cover_letter_agent import CoverLetterAgent
12
+ from utils.consistency import detect_contradictions, allowed_keywords_from_profile, coverage_score
13
+ from memory.store import memory_store
14
+ from .temporal_tracker import TemporalApplicationTracker
15
+ from utils.text import extract_keywords_from_text
16
+
17
+
18
+ class ApplicationPipeline:
19
+ """User -> Router -> Profile -> Job -> Resume -> Cover -> Orchestrator Review -> User"""
20
+
21
+ def __init__(self) -> None:
22
+ self.router = RouterAgent()
23
+ self.profile_agent = ProfileAgent()
24
+ self.job_agent = JobAgent()
25
+ self.resume_agent = CVOwnerAgent()
26
+ self.cover_agent = CoverLetterAgent()
27
+ self.temporal_tracker = TemporalApplicationTracker()
28
+ self._events_path = os.path.join(str(memory_store.base_dir), "events.jsonl")
29
+
30
+ def _log_event(self, agent: str, event: str, payload: Dict[str, Any]) -> None:
31
+ try:
32
+ os.makedirs(os.path.dirname(self._events_path), exist_ok=True)
33
+ entry = {
34
+ "ts": datetime.now().isoformat(),
35
+ "agent": agent,
36
+ "event": event,
37
+ "payload": payload or {},
38
+ }
39
+ with open(self._events_path, "a", encoding="utf-8") as f:
40
+ f.write(json.dumps(entry, ensure_ascii=False) + "\n")
41
+ except Exception:
42
+ pass
43
+
44
+ def run(self, payload: Dict[str, Any], user_id: str = "default_user") -> Dict[str, Any]:
45
+ state = dict(payload)
46
+ while True:
47
+ next_step = self.router.route(state)
48
+ # Router decision summary (safe, no chain-of-thought)
49
+ try:
50
+ self._log_event(
51
+ "RouterAgent",
52
+ "route_decision",
53
+ {
54
+ "cv_present": bool(state.get("cv_text")),
55
+ "job_present": bool(state.get("job_posting")),
56
+ "profile_ready": bool(state.get("profile")),
57
+ "job_analyzed": bool(state.get("job_analysis")),
58
+ "resume_ready": bool(state.get("resume_draft")),
59
+ "cover_ready": bool(state.get("cover_letter_draft")),
60
+ "next": next_step,
61
+ },
62
+ )
63
+ except Exception:
64
+ pass
65
+
66
+ if next_step == "profile":
67
+ parsed = self.profile_agent.parse(state.get("cv_text", ""))
68
+ state["profile"] = parsed
69
+ # Profile summary
70
+ try:
71
+ prof = parsed or {}
72
+ self._log_event(
73
+ "ProfileAgent",
74
+ "parsed_profile",
75
+ {
76
+ "has_full_name": bool(prof.get("full_name")),
77
+ "skills_count": len(prof.get("skills", [])) if isinstance(prof, dict) else 0,
78
+ },
79
+ )
80
+ except Exception:
81
+ pass
82
+
83
+ elif next_step == "job":
84
+ analysis = self.job_agent.analyze(state.get("job_posting", ""))
85
+ state["job_analysis"] = analysis
86
+ # Job analysis summary
87
+ try:
88
+ ja = analysis or {}
89
+ self._log_event(
90
+ "JobAgent",
91
+ "job_analyzed",
92
+ {
93
+ "has_company": bool(ja.get("company")),
94
+ "has_role": bool(ja.get("role")),
95
+ "key_req_count": len(ja.get("key_requirements", [])) if isinstance(ja, dict) else 0,
96
+ },
97
+ )
98
+ except Exception:
99
+ pass
100
+
101
+ elif next_step == "resume":
102
+ profile_model = self._to_profile_model(state["profile"]) if not isinstance(state["profile"], UserProfile) else state["profile"]
103
+ job_model = self._to_job_model(state)
104
+ resume = self.resume_agent.create_resume(profile_model, job_model, user_id=user_id)
105
+ state["resume_draft"] = resume
106
+ # Optional summary
107
+ try:
108
+ job_k = extract_keywords_from_text(job_model.description or "", top_k=20)
109
+ cov = coverage_score(getattr(resume, "text", "") or "", job_k)
110
+ self._log_event("CVOwnerAgent", "resume_generated", {"job_id": job_model.id, "chars": len(getattr(resume, "text", "") or ""), "coverage": round(cov, 3)})
111
+ except Exception:
112
+ pass
113
+
114
+ elif next_step == "cover":
115
+ profile_model = self._to_profile_model(state["profile"]) if not isinstance(state["profile"], UserProfile) else state["profile"]
116
+ job_model = self._to_job_model(state)
117
+ cover = self.cover_agent.create_cover_letter(profile_model, job_model, user_id=user_id)
118
+ state["cover_letter_draft"] = cover
119
+ # Optional summary
120
+ try:
121
+ job_k = extract_keywords_from_text(job_model.description or "", top_k=20)
122
+ cov = coverage_score(getattr(cover, "text", "") or "", job_k)
123
+ self._log_event("CoverLetterAgent", "cover_generated", {"job_id": job_model.id, "chars": len(getattr(cover, "text", "") or ""), "coverage": round(cov, 3)})
124
+ except Exception:
125
+ pass
126
+
127
+ elif next_step == "review":
128
+ self._review(state, user_id)
129
+ break
130
+ return state
131
+
132
+ def _to_job_model(self, state: Dict[str, Any]) -> JobPosting:
133
+ return JobPosting(
134
+ id=state.get("job_id", "job_1"),
135
+ title=state.get("job_title") or state.get("job_analysis", {}).get("role", "Role"),
136
+ company=state.get("job_company") or state.get("job_analysis", {}).get("company", "Company"),
137
+ description=state.get("job_posting", ""),
138
+ location=state.get("job_analysis", {}).get("location"),
139
+ employment_type=state.get("job_analysis", {}).get("employment_type"),
140
+ )
141
+
142
+ def _to_profile_model(self, profile_dict: Dict[str, Any]) -> UserProfile:
143
+ # Best-effort mapping from parsed dict to model
144
+ return UserProfile(
145
+ full_name=profile_dict.get("full_name", ""),
146
+ headline=profile_dict.get("headline"),
147
+ summary=profile_dict.get("summary"),
148
+ email=profile_dict.get("email"),
149
+ phone=profile_dict.get("phone"),
150
+ location=profile_dict.get("location"),
151
+ skills=profile_dict.get("skills", []),
152
+ experiences=[
153
+ # Minimal mapping; agents rely on text and keywords anyway
154
+ ]
155
+ )
156
+
157
+ def _review(self, state: Dict[str, Any], user_id: str) -> None:
158
+ # Orchestrator-style review: detect contradictions and persist
159
+ resume_text = state.get("resume_draft").text if isinstance(state.get("resume_draft"), ResumeDraft) else ""
160
+        cover_text = state.get("cover_letter_draft").text if isinstance(state.get("cover_letter_draft"), CoverLetterDraft) else ""
+        profile = state.get("profile") or {}
+        job_desc = state.get("job_posting", "")
+        job_k = extract_keywords_from_text(job_desc or "", top_k=30)
+        base_allowed = allowed_keywords_from_profile(profile.get("skills", []), profile.get("experiences", [])) if isinstance(profile, dict) else allowed_keywords_from_profile(profile.skills, profile.experiences)
+        # Broaden allowed keywords with those present in the generated documents to reduce false positives
+        resume_k = set(k.lower() for k in extract_keywords_from_text(resume_text or "", top_k=150))
+        cover_k = set(k.lower() for k in extract_keywords_from_text(cover_text or "", top_k=150))
+        allowed = set(base_allowed) | resume_k | cover_k | set(k.lower() for k in job_k)
+        issues = detect_contradictions(resume_text, cover_text, allowed)
+        # Coverage metrics
+        resume_cov = coverage_score(resume_text or "", job_k)
+        cover_cov = coverage_score(cover_text or "", job_k)
+        # Simple recommendation score and decision
+        score = 0.45 * resume_cov + 0.45 * cover_cov - min(0.3, len(issues) / 100.0)
+        decision = "interview" if score >= 0.45 else "review"
+        memory_store.save(user_id, "orchestrator_review", {
+            "issues": issues,
+            "issues_count": len(issues),
+            "resume_coverage": round(resume_cov, 3),
+            "cover_coverage": round(cover_cov, 3),
+            "score": round(score, 3),
+            "decision": decision,
+        })
+        # Emit review event
+        try:
+            self._log_event(
+                "Orchestrator",
+                "review_summary",
+                {
+                    "issues_count": len(issues),
+                    "resume_cov": round(resume_cov, 3),
+                    "cover_cov": round(cover_cov, 3),
+                    "decision": decision,
+                },
+            )
+        except Exception:
+            pass
+
+        # Temporal tracking: record a drafted status with issues metadata
+        try:
+            job_model = self._to_job_model(state)
+            self.temporal_tracker.track_application(job_model, status="drafted", metadata={"issues_count": len(issues)})
+        except Exception:
+            # Non-fatal; continue even if temporal tracking fails
+            pass
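The review hunk above reduces to a small scoring rule: weight resume and cover-letter keyword coverage equally, subtract a contradiction penalty capped at 0.3, and recommend "interview" at or above 0.45. A standalone sketch of that rule (`review_score` and `review_decision` are illustrative names, not part of the diff):

```python
# Illustrative re-statement of the review scoring rule from the hunk above.
# These function names are hypothetical; the repo computes this inline.
def review_score(resume_cov: float, cover_cov: float, issues: int) -> float:
    """Equal-weight coverage blend minus a contradiction penalty capped at 0.3."""
    return 0.45 * resume_cov + 0.45 * cover_cov - min(0.3, issues / 100.0)


def review_decision(score: float, threshold: float = 0.45) -> str:
    """Mirror the orchestrator's interview/review cut-off."""
    return "interview" if score >= threshold else "review"
```

Note the penalty cap: with perfect coverage the score tops out at 0.9, so even 200 detected issues (penalty clamped at 0.3) cannot drag a perfect pair below the 0.45 threshold.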
agents/profile_agent.py ADDED
@@ -0,0 +1,39 @@
+ from __future__ import annotations
+ from typing import Dict, Any
+ from services.llm import llm
+ import json
+
+
+ class ProfileAgent:
+     """Parses raw CV text into a structured profile using LLM with fallback."""
+
+     def parse(self, cv_text: str) -> Dict[str, Any]:
+         if not cv_text:
+             return {}
+         if not llm.enabled:
+             return {
+                 "full_name": "Unknown",
+                 "email": "",
+                 "skills": [],
+                 "experiences": [],
+                 "links": {},
+                 "languages": [],
+                 "certifications": [],
+                 "projects": [],
+                 "work_mode": "",
+                 "skill_proficiency": {},
+             }
+         system = (
+             "You are a CV parser. Extract JSON with fields: full_name, email, phone, location, "
+             "skills (list), experiences (list of {title, company, start_date, end_date, achievements, technologies}), "
+             "education (list of {school, degree, field_of_study, start_date, end_date}), links (map with linkedin/portfolio/website if present). "
+             "Also extract optional: languages (list of {language, level}), certifications (list of {name, issuer, year}), "
+             "projects (list of {title, link, impact}), work_mode (remote/hybrid/on-site if evident), skill_proficiency (map skill->level). "
+             "Keep values concise; do not invent information."
+         )
+         user = f"Parse this CV into JSON with the schema above. Be strict JSON.\n\n{cv_text}"
+         resp = llm.generate(system, user, max_tokens=900, agent="parser")
+         try:
+             return json.loads(resp)
+         except Exception:
+             return {"raw": resp}
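`ProfileAgent.parse` drops to the `{"raw": resp}` fallback as soon as `json.loads` fails, but models frequently wrap otherwise-valid JSON in Markdown code fences. A tolerant pre-parse step like this hypothetical helper (not part of the diff) recovers those replies before giving up:

```python
import json
from typing import Any, Dict


def parse_llm_json(resp: str) -> Dict[str, Any]:
    """Parse LLM output as JSON, stripping a surrounding Markdown code fence
    (``` or ```json) before falling back to the raw-text wrapper."""
    text = resp.strip()
    if text.startswith("```"):
        # Drop the opening fence line (with its optional language tag)...
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # ...and the closing fence, if present.
        if text.rstrip().endswith("```"):
            text = text.rstrip()[:-3]
    try:
        return json.loads(text)
    except Exception:
        return {"raw": resp}  # same fallback shape as ProfileAgent.parse
```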
agents/router_agent.py ADDED
@@ -0,0 +1,18 @@
+ from __future__ import annotations
+ from typing import Literal, Optional, Dict, Any
+
+
+ class RouterAgent:
+     """Simple router that decides the next step in the pipeline."""
+
+     def route(self, payload: Dict[str, Any]) -> Literal["profile", "job", "resume", "cover", "review"]:
+         # Basic heuristics based on provided payload
+         if payload.get("cv_text") and not payload.get("profile"):
+             return "profile"
+         if payload.get("job_posting") and not payload.get("job_analysis"):
+             return "job"
+         if payload.get("profile") and payload.get("job_analysis") and not payload.get("resume_draft"):
+             return "resume"
+         if payload.get("resume_draft") and not payload.get("cover_letter_draft"):
+             return "cover"
+         return "review"
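Because the if-chain tests truthiness, an empty parse result (`profile == {}`) routes back to "profile" rather than forward, and earlier stages take strict precedence. A quick walk-through, with the router condensed from the hunk above for illustration:

```python
from typing import Any, Dict


class RouterAgent:
    """Condensed copy of the router above, for illustration only."""

    def route(self, payload: Dict[str, Any]) -> str:
        if payload.get("cv_text") and not payload.get("profile"):
            return "profile"
        if payload.get("job_posting") and not payload.get("job_analysis"):
            return "job"
        if payload.get("profile") and payload.get("job_analysis") and not payload.get("resume_draft"):
            return "resume"
        if payload.get("resume_draft") and not payload.get("cover_letter_draft"):
            return "cover"
        return "review"


router = RouterAgent()
# An empty profile dict is falsy, so parsing is retried:
assert router.route({"cv_text": "...", "profile": {}}) == "profile"
# Once each artifact exists, the router advances one stage at a time:
assert router.route({"profile": {"name": "A"}, "job_posting": "jd"}) == "job"
assert router.route({"profile": {"name": "A"}, "job_analysis": {"kw": 1}}) == "resume"
assert router.route({"resume_draft": "r"}) == "cover"
assert router.route({"resume_draft": "r", "cover_letter_draft": "c"}) == "review"
```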
agents/temporal_tracker.py ADDED
@@ -0,0 +1,464 @@
+ """
+ Temporal Application Tracker
+ Implements time-aware tracking of job applications with versioned history
+ Based on the Temporal AI Agents pattern for maintaining historical context
+ """
+
+ import json
+ import logging
+ from typing import Dict, List, Tuple, Optional, Any
+ from datetime import datetime, timedelta
+ from dataclasses import dataclass, field
+ from pathlib import Path
+ import hashlib
+
+ from models.schemas import JobPosting, OrchestrationResult
+
+ logger = logging.getLogger(__name__)
+
+
+ @dataclass
+ class Triplet:
+     """
+     A time-stamped fact in subject-predicate-object format
+     Example: (JobID123, status, applied, 2025-01-15)
+     """
+     subject: str
+     predicate: str
+     object: Any
+     valid_at: datetime
+     expired_at: Optional[datetime] = None
+     confidence: float = 1.0
+     source: str = "user"
+     metadata: Dict = field(default_factory=dict)
+
+     def to_dict(self) -> Dict:
+         return {
+             'subject': self.subject,
+             'predicate': self.predicate,
+             'object': str(self.object),
+             'valid_at': self.valid_at.isoformat(),
+             'expired_at': self.expired_at.isoformat() if self.expired_at else None,
+             'confidence': self.confidence,
+             'source': self.source,
+             'metadata': self.metadata
+         }
+
+     @classmethod
+     def from_dict(cls, data: Dict) -> 'Triplet':
+         return cls(
+             subject=data['subject'],
+             predicate=data['predicate'],
+             object=data['object'],
+             valid_at=datetime.fromisoformat(data['valid_at']),
+             expired_at=datetime.fromisoformat(data['expired_at']) if data.get('expired_at') else None,
+             confidence=data.get('confidence', 1.0),
+             source=data.get('source', 'user'),
+             metadata=data.get('metadata', {})
+         )
+
+
+ class TemporalKnowledgeGraph:
+     """
+     Knowledge graph that tracks changes over time
+     Maintains history of all application states and changes
+     """
+
+     def __init__(self, storage_path: str = "temporal_graph.json"):
+         self.storage_path = Path(storage_path)
+         self.triplets: List[Triplet] = []
+         self.load()
+
+     def add_triplet(self, triplet: Triplet) -> None:
+         """Add a new fact to the graph"""
+         # Check for contradictions
+         existing = self.find_current(triplet.subject, triplet.predicate)
+
+         if existing and existing.object != triplet.object:
+             # Invalidate old triplet
+             existing.expired_at = triplet.valid_at
+             logger.info(f"Invalidated old triplet: {existing.subject}-{existing.predicate}")
+
+         self.triplets.append(triplet)
+         self.save()
+
+     def find_current(
+         self,
+         subject: str,
+         predicate: str,
+         at_time: Optional[datetime] = None
+     ) -> Optional[Triplet]:
+         """Find the current valid triplet for a subject-predicate pair"""
+         at_time = at_time or datetime.now()
+
+         for triplet in reversed(self.triplets):  # Check most recent first
+             if (triplet.subject == subject and
+                     triplet.predicate == predicate and
+                     triplet.valid_at <= at_time and
+                     (triplet.expired_at is None or triplet.expired_at > at_time)):
+                 return triplet
+
+         return None
+
+     def get_history(
+         self,
+         subject: str,
+         predicate: Optional[str] = None
+     ) -> List[Triplet]:
+         """Get full history for a subject"""
+         history = []
+
+         for triplet in self.triplets:
+             if triplet.subject == subject:
+                 if predicate is None or triplet.predicate == predicate:
+                     history.append(triplet)
+
+         return sorted(history, key=lambda t: t.valid_at)
+
+     def query_timerange(
+         self,
+         start_date: datetime,
+         end_date: datetime,
+         predicate: Optional[str] = None
+     ) -> List[Triplet]:
+         """Query all triplets valid within a time range"""
+         results = []
+
+         for triplet in self.triplets:
+             if (triplet.valid_at >= start_date and
+                     triplet.valid_at <= end_date):
+                 if predicate is None or triplet.predicate == predicate:
+                     results.append(triplet)
+
+         return results
+
+     def save(self) -> None:
+         """Save graph to disk"""
+         data = {
+             'triplets': [t.to_dict() for t in self.triplets],
+             'last_updated': datetime.now().isoformat()
+         }
+
+         with open(self.storage_path, 'w') as f:
+             json.dump(data, f, indent=2)
+
+     def load(self) -> None:
+         """Load graph from disk"""
+         if not self.storage_path.exists():
+             return
+
+         try:
+             with open(self.storage_path, 'r') as f:
+                 data = json.load(f)
+
+             self.triplets = [
+                 Triplet.from_dict(t) for t in data.get('triplets', [])
+             ]
+
+             logger.info(f"Loaded {len(self.triplets)} triplets from storage")
+
+         except Exception as e:
+             logger.error(f"Error loading temporal graph: {e}")
+
+
+ class TemporalApplicationTracker:
+     """
+     Track job applications with full temporal history
+     Maintains versioned states and changes over time
+     """
+
+     def __init__(self):
+         self.graph = TemporalKnowledgeGraph("application_history.json")
+
+     def track_application(
+         self,
+         job: JobPosting,
+         status: str,
+         metadata: Optional[Dict] = None
+     ) -> None:
+         """Track a new application or status change"""
+         job_id = self._get_job_id(job)
+         now = datetime.now()
+
+         # Core application triplets
+         triplets = [
+             Triplet(job_id, "company", job.company, now),
+             Triplet(job_id, "position", job.title, now),
+             Triplet(job_id, "status", status, now),
+             Triplet(job_id, "applied_date", now.isoformat(), now),
+         ]
+
+         # Optional fields
+         if job.location:
+             triplets.append(Triplet(job_id, "location", job.location, now))
+
+         if job.salary:
+             triplets.append(Triplet(job_id, "salary", job.salary, now))
+
+         if job.url:
+             triplets.append(Triplet(job_id, "url", job.url, now))
+
+         # Add metadata as triplets
+         if metadata:
+             for key, value in metadata.items():
+                 triplets.append(
+                     Triplet(job_id, f"meta_{key}", value, now, metadata={'source': 'metadata'})
+                 )
+
+         # Add all triplets
+         for triplet in triplets:
+             self.graph.add_triplet(triplet)
+
+         logger.info(f"Tracked application for {job.company} - {job.title}")
+
+     def update_status(
+         self,
+         job_id: str,
+         new_status: str,
+         notes: Optional[str] = None
+     ) -> None:
+         """Update application status"""
+         now = datetime.now()
+
+         # Add new status triplet (old one auto-invalidated)
+         self.graph.add_triplet(
+             Triplet(job_id, "status", new_status, now)
+         )
+
+         # Add notes if provided
+         if notes:
+             self.graph.add_triplet(
+                 Triplet(job_id, "status_notes", notes, now, metadata={'type': 'note'})
+             )
+
+         # Track status change event
+         self.graph.add_triplet(
+             Triplet(
+                 job_id,
+                 "status_changed",
+                 f"Changed to {new_status}",
+                 now,
+                 metadata={'event_type': 'status_change'}
+             )
+         )
+
+     def add_interview(
+         self,
+         job_id: str,
+         interview_date: datetime,
+         interview_type: str,
+         notes: Optional[str] = None
+     ) -> None:
+         """Track interview scheduling"""
+         now = datetime.now()
+
+         self.graph.add_triplet(
+             Triplet(
+                 job_id,
+                 "interview_scheduled",
+                 interview_date.isoformat(),
+                 now,
+                 metadata={'type': interview_type}
+             )
+         )
+
+         if notes:
+             self.graph.add_triplet(
+                 Triplet(job_id, "interview_notes", notes, now)
+             )
+
+         # Auto-update status
+         self.update_status(job_id, "interview_scheduled")
+
+     def get_application_timeline(self, job_id: str) -> List[Dict]:
+         """Get complete timeline for an application"""
+         history = self.graph.get_history(job_id)
+
+         timeline = []
+         for triplet in history:
+             timeline.append({
+                 'date': triplet.valid_at.isoformat(),
+                 'event': f"{triplet.predicate}: {triplet.object}",
+                 'expired': triplet.expired_at is not None
+             })
+
+         return timeline
+
+     def get_active_applications(self) -> List[Dict]:
+         """Get all currently active applications"""
+         # Find all unique job IDs
+         job_ids = set()
+         for triplet in self.graph.triplets:
+             if triplet.subject.startswith('JOB_'):
+                 job_ids.add(triplet.subject)
+
+         active = []
+         for job_id in job_ids:
+             status = self.graph.find_current(job_id, "status")
+
+             if status and status.object not in ['rejected', 'withdrawn', 'archived']:
+                 company = self.graph.find_current(job_id, "company")
+                 position = self.graph.find_current(job_id, "position")
+
+                 active.append({
+                     'job_id': job_id,
+                     'company': company.object if company else 'Unknown',
+                     'position': position.object if position else 'Unknown',
+                     'status': status.object,
+                     'last_updated': status.valid_at.isoformat()
+                 })
+
+         return active
+
+     def analyze_patterns(self) -> Dict[str, Any]:
+         """Analyze application patterns over time"""
+         now = datetime.now()
+
+         # Applications per week
+         week_ago = now - timedelta(days=7)
+         month_ago = now - timedelta(days=30)
+
+         week_apps = self.graph.query_timerange(week_ago, now, "status")
+         month_apps = self.graph.query_timerange(month_ago, now, "status")
+
+         # Status distribution
+         status_counts = {}
+         for triplet in self.graph.triplets:
+             if triplet.predicate == "status" and triplet.expired_at is None:
+                 status = triplet.object
+                 status_counts[status] = status_counts.get(status, 0) + 1
+
+         # Response rate
+         total_apps = len([t for t in self.graph.triplets if t.predicate == "status" and t.object == "applied"])
+         responses = len([t for t in self.graph.triplets if t.predicate == "status" and t.object in ["interview_scheduled", "rejected", "offer"]])
+
+         response_rate = (responses / total_apps * 100) if total_apps > 0 else 0
+
+         return {
+             'applications_this_week': len(week_apps),
+             'applications_this_month': len(month_apps),
+             'status_distribution': status_counts,
+             'response_rate': f"{response_rate:.1f}%",
+             'total_applications': total_apps
+         }
+
+     def _get_job_id(self, job: JobPosting) -> str:
+         """Generate consistent job ID"""
+         if job.id:
+             return job.id
+
+         # Generate ID from company and title
+         key = f"{job.company}_{job.title}".lower().replace(' ', '_')
+         hash_val = hashlib.md5(key.encode()).hexdigest()[:8]
+         return f"JOB_{hash_val}"
+
+
+ class TemporalInvalidationAgent:
+     """
+     Agent that checks for and invalidates outdated information
+     Based on the invalidation pattern from the article
+     """
+
+     def __init__(self, graph: TemporalKnowledgeGraph):
+         self.graph = graph
+
+     def check_contradictions(
+         self,
+         new_triplet: Triplet,
+         threshold: float = 0.8
+     ) -> Optional[Triplet]:
+         """Check if new triplet contradicts existing ones"""
+
+         # Find existing triplets with same subject-predicate
+         existing = self.graph.find_current(
+             new_triplet.subject,
+             new_triplet.predicate
+         )
+
+         if not existing:
+             return None
+
+         # Check for contradiction
+         if existing.object != new_triplet.object:
+             # Calculate confidence in contradiction
+             time_diff = (new_triplet.valid_at - existing.valid_at).total_seconds()
+
+             # More recent info is more likely to be correct
+             if time_diff > 0:  # New triplet is more recent
+                 confidence = min(1.0, time_diff / (24 * 3600))  # Max confidence after 1 day
+
+                 if confidence > threshold:
+                     return existing  # Return triplet to invalidate
+
+         return None
+
+     def cleanup_expired(self, days_old: int = 90) -> int:
+         """Archive triplets older than specified days"""
+         cutoff = datetime.now() - timedelta(days=days_old)
+         archived = 0
+
+         for triplet in self.graph.triplets:
+             if triplet.expired_at and triplet.expired_at < cutoff:
+                 # Move to archive (in real implementation)
+                 triplet.metadata['archived'] = True
+                 archived += 1
+
+         if archived > 0:
+             self.graph.save()
+             logger.info(f"Archived {archived} expired triplets")
+
+         return archived
+
+
+ # Usage example
+ def demo_temporal_tracking():
+     """Demonstrate temporal tracking"""
+
+     tracker = TemporalApplicationTracker()
+
+     # Create sample job
+     job = JobPosting(
+         id="JOB_001",
+         title="Senior Software Engineer",
+         company="TechCorp",
+         location="San Francisco",
+         salary="$150k-$200k",
+         url="https://techcorp.com/jobs/123"
+     )
+
+     # Track initial application
+     tracker.track_application(job, "applied", {
+         'cover_letter_version': 'v1',
+         'resume_version': 'v2'
+     })
+
+     # Simulate status updates over time
+     import time
+     time.sleep(1)
+     tracker.update_status("JOB_001", "screening", "Passed initial ATS scan")
+
+     time.sleep(1)
+     tracker.add_interview(
+         "JOB_001",
+         datetime.now() + timedelta(days=7),
+         "phone_screen",
+         "30 min call with hiring manager"
+     )
+
+     # Get timeline
+     timeline = tracker.get_application_timeline("JOB_001")
+     print("Application Timeline:")
+     for event in timeline:
+         print(f"  {event['date']}: {event['event']}")
+
+     # Get active applications
+     active = tracker.get_active_applications()
+     print(f"\nActive Applications: {len(active)}")
+
+     # Analyze patterns
+     patterns = tracker.analyze_patterns()
+     print(f"\nPatterns: {patterns}")
+
+
+ if __name__ == "__main__":
+     demo_temporal_tracking()
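`TemporalKnowledgeGraph` calls `save()` on every `add_triplet`, so a convenient way to reason about its invalidation and point-in-time lookup semantics is a disk-free reduction (an illustrative sketch, not the shipped class):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, List, Optional


@dataclass
class Fact:
    # Same bitemporal shape as Triplet, minus the persistence fields.
    subject: str
    predicate: str
    object: Any
    valid_at: datetime
    expired_at: Optional[datetime] = None


class InMemoryTemporalGraph:
    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def add(self, fact: Fact) -> None:
        # A contradicting newer fact expires the current one (as in add_triplet).
        current = self.find_current(fact.subject, fact.predicate, fact.valid_at)
        if current and current.object != fact.object:
            current.expired_at = fact.valid_at
        self.facts.append(fact)

    def find_current(self, subject: str, predicate: str,
                     at_time: Optional[datetime] = None) -> Optional[Fact]:
        at_time = at_time or datetime.now()
        for fact in reversed(self.facts):  # most recent first
            if (fact.subject == subject and fact.predicate == predicate
                    and fact.valid_at <= at_time
                    and (fact.expired_at is None or fact.expired_at > at_time)):
                return fact
        return None


g = InMemoryTemporalGraph()
t0, t1 = datetime(2025, 1, 1), datetime(2025, 1, 10)
g.add(Fact("JOB_1", "status", "applied", t0))
g.add(Fact("JOB_1", "status", "screening", t1))
# Latest wins now, but the graph can still answer "what was true on Jan 2?"
assert g.find_current("JOB_1", "status").object == "screening"
assert g.find_current("JOB_1", "status", t0 + timedelta(days=1)).object == "applied"
```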
app.py ADDED
@@ -0,0 +1,65 @@
+ #!/usr/bin/env python3
+ """
+ Multi-Agent Job Application Assistant - HuggingFace Spaces Deployment
+ Production-ready system with Gemini 2.5 Flash, A2A Protocol, and MCP Integration
+ Features: Resume/Cover Letter Generation, Job Matching, Document Export, Advanced AI Agents
+ """
+
+ # Use the hf_app.py as the main app for HuggingFace Spaces
+ from hf_app import *
+
+ if __name__ == "__main__":
+     # Configure for HuggingFace Spaces deployment
+     import os
+
+     # Set up HF-specific configurations
+     os.environ.setdefault("GRADIO_SERVER_NAME", "0.0.0.0")
+     os.environ.setdefault("GRADIO_SERVER_PORT", str(os.getenv("PORT", "7860")))
+
+     print("🚀 Starting Multi-Agent Job Application Assistant on HuggingFace Spaces")
+     print("=" * 70)
+     print("Features:")
+     print("✅ Gemini 2.5 Flash AI Generation")
+     print("✅ Advanced Multi-Agent System (A2A Protocol)")
+     print("✅ Resume & Cover Letter Generation")
+     print("✅ Job Matching & Research")
+     print("✅ Document Export (Word/PowerPoint/Excel)")
+     print("✅ MCP Server Integration")
+     print("=" * 70)
+
+     try:
+         app = build_app()
+         app.launch(
+             server_name="0.0.0.0",
+             server_port=int(os.getenv("PORT", 7860)),
+             share=False,
+             show_error=True,
+             mcp_server=True  # Enable MCP server for HuggingFace Spaces
+         )
+     except Exception as e:
+         print(f"❌ Startup Error: {e}")
+         print("\n🔧 Troubleshooting:")
+         print("1. Check environment variables in Space settings")
+         print("2. Verify all dependencies in requirements.txt")
+         print("3. Check logs for detailed error information")
+
+         # Fallback: Simple demo interface
+         print("\n🔄 Starting simplified interface...")
+         import gradio as gr
+
+         def simple_demo(message: str = "") -> str:
+             # Accept the Textbox input so gr.Interface can call this fn
+             return "Multi-Agent Job Application Assistant is initializing. Please check back in a moment."
+
+         demo = gr.Interface(
+             fn=simple_demo,
+             inputs=gr.Textbox(label="Status Check"),
+             outputs=gr.Textbox(label="System Status"),
+             title="🚀 Job Application Assistant",
+             description="Production-ready multi-agent system for job applications"
+         )
+
+         demo.launch(
+             server_name="0.0.0.0",
+             server_port=int(os.getenv("PORT", 7860)),
+             share=False
+         )
hf_app.py ADDED
@@ -0,0 +1,1613 @@
+ #!/usr/bin/env python3
+ """
+ Multi-Agent Job Application Assistant - HuggingFace Spaces Deployment
+ Production-ready system with Gemini 2.5 Flash, A2A Protocol, and MCP Integration
+ Features: Resume/Cover Letter Generation, Job Matching, Document Export, Advanced AI Agents
+ """
+
+ import os
+ import uuid
+ import time
+ import logging
+ import asyncio
+ from typing import List, Optional, Dict, Any
+ from dataclasses import dataclass, field
+ import webbrowser
+ from datetime import datetime, timedelta
+ import json
+ from pathlib import Path
+
+ import gradio as gr
+ from dotenv import load_dotenv
+ import nest_asyncio
+
+ # Apply nest_asyncio for async support in Gradio
+ try:
+     nest_asyncio.apply()
+ except Exception:  # nest_asyncio is optional; continue without it
+     pass
+
+ # Load environment variables
+ load_dotenv(override=True)
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ # =======================
+ # Try to import from system, fall back to standalone mode if not available
+ # =======================
+
+ USE_SYSTEM_AGENTS = True
+ ADVANCED_FEATURES = False
+ LANGEXTRACT_AVAILABLE = False
+
+ try:
+     from agents.orchestrator import OrchestratorAgent
+     from models.schemas import JobPosting, OrchestrationResult
+     logger.info("System agents loaded - full functionality available")
+
+     # Try to import LangExtract service
+     try:
+         from services.langextract_service import (
+             extract_job_info,
+             extract_ats_keywords,
+             optimize_for_ats,
+             create_extraction_summary,
+             create_ats_report
+         )
+         LANGEXTRACT_AVAILABLE = True
+         logger.info("📊 LangExtract service loaded for enhanced extraction")
+     except ImportError:
+         LANGEXTRACT_AVAILABLE = False
+
+     # Try to import advanced AI agent features
+     try:
+         from agents.parallel_executor import ParallelAgentExecutor, ParallelJobProcessor, MetaAgent
+         from agents.temporal_tracker import TemporalApplicationTracker, TemporalKnowledgeGraph
+         from agents.observability import AgentTracer, AgentMonitor, TriageAgent, global_tracer
+         from agents.context_engineer import ContextEngineer, DataFlywheel
+         from agents.context_scaler import ContextScalingOrchestrator
+         ADVANCED_FEATURES = True
+         logger.info("✨ Advanced AI agent features loaded successfully!")
+     except ImportError as e:
+         logger.info(f"Advanced features not available: {e}")
+
+     # Try to import knowledge graph service
+     try:
+         from services.knowledge_graph_service import get_knowledge_graph_service
+         kg_service = get_knowledge_graph_service()
+         KG_AVAILABLE = kg_service.is_enabled()
+         if KG_AVAILABLE:
+             logger.info("📊 Knowledge Graph service initialized - tracking enabled")
+     except ImportError:
+         KG_AVAILABLE = False
+         kg_service = None
+         logger.info("Knowledge graph service not available")
+
+     USE_SYSTEM_AGENTS = True
+
+ except ImportError:
+     logger.info("Running in standalone mode - using simplified agents")
+     USE_SYSTEM_AGENTS = False
+
+     # Define minimal data structures for standalone operation
+     @dataclass
+     class JobPosting:
+         id: str
+         title: str
+         company: str
+         description: str
+         location: Optional[str] = None
+         url: Optional[str] = None
+         source: Optional[str] = None
+         saved_by_user: bool = False
+
+     @dataclass
+     class ResumeDraft:
+         job_id: str
+         text: str
+         keywords_used: List[str] = field(default_factory=list)
+
+     @dataclass
+     class CoverLetterDraft:
+         job_id: str
+         text: str
+         keywords_used: List[str] = field(default_factory=list)
+
+     @dataclass
+     class OrchestrationResult:
+         job: JobPosting
+         resume: ResumeDraft
+         cover_letter: CoverLetterDraft
+         metrics: Optional[Dict[str, Any]] = None
+
+     # Simplified orchestrator for standalone operation
+     class OrchestratorAgent:
+         def __init__(self):
+             self.mock_jobs = [
+                 JobPosting(
+                     id="example_1",
+                     title="Senior Software Engineer",
+                     company="Tech Corp",
+                     location="Remote",
+                     description="We need a Senior Software Engineer with Python, AWS, Docker experience.",
+                     saved_by_user=True
+                 )
+             ]
+
+         def get_saved_jobs(self):
+             return self.mock_jobs
+
+         def run_for_jobs(self, jobs, **kwargs):
+             results = []
+             for job in jobs:
+                 resume = ResumeDraft(
+                     job_id=job.id,
+                     text=f"Professional Resume for {job.title}\n\nExperienced professional with skills matching {job.company} requirements.",
+                     keywords_used=["Python", "AWS", "Docker"]
+                 )
+                 cover = CoverLetterDraft(
+                     job_id=job.id,
+                     text=f"Dear Hiring Manager,\n\nI am excited to apply for the {job.title} position at {job.company}.",
+                     keywords_used=["leadership", "innovation"]
+                 )
+                 results.append(OrchestrationResult(
+                     job=job,
+                     resume=resume,
+                     cover_letter=cover,
+                     metrics={
+                         "salary": {"USD": {"low": 100000, "high": 150000}},
+                         "p_resume": 0.75,
+                         "p_cover": 0.80,
+                         "overall_p": 0.60
+                     }
+                 ))
+             return results
+
+         def regenerate_for_job(self, job, **kwargs):
+             return self.run_for_jobs([job], **kwargs)[0]
+
+ # Initialize orchestrator and advanced features
+ try:
+     orch = OrchestratorAgent()
+     logger.info("Orchestrator initialized successfully")
+
+     # Initialize advanced features if available
+     if ADVANCED_FEATURES:
+         # Initialize parallel executor
+         parallel_executor = ParallelAgentExecutor(max_workers=4)
+         parallel_processor = ParallelJobProcessor()
+         meta_agent = MetaAgent()
+
+         # Initialize temporal tracker
+         temporal_tracker = TemporalApplicationTracker()
+
+         # Initialize observability
+         agent_tracer = AgentTracer()
+         agent_monitor = AgentMonitor()
+         triage_agent = TriageAgent(agent_tracer)
+
+         # Initialize context engineering
+         context_engineer = ContextEngineer()
+         context_scaler = ContextScalingOrchestrator()
+
+         logger.info("✅ All advanced AI agent features initialized")
+     else:
+         parallel_executor = None
+         temporal_tracker = None
+         agent_tracer = None
+         context_engineer = None
+
+ except Exception as e:
+     logger.error(f"Failed to initialize orchestrator: {e}")
+     raise
+
+ # Session state
+ STATE = {
+     "user_id": "default_user",
+     "cv_seed": None,
+     "cover_seed": None,
+     "agent2_notes": "",
+     "custom_jobs": [],
+     "cv_chat": "",
+     "cover_chat": "",
+     "results": [],
+     "inspiration_url": "https://www.careeraddict.com/7-funniest-cover-letters",
+     "use_inspiration": False,
+     "linkedin_authenticated": False,
+     "linkedin_profile": None,
+     "parallel_mode": False,
+     "track_applications": True,
+     "enable_observability": True,
+     "use_context_engineering": True,
+     "execution_timeline": None,
+     "application_history": [],
+ }
+
+ # Check LinkedIn OAuth configuration
+ LINKEDIN_CLIENT_ID = os.getenv("LINKEDIN_CLIENT_ID")
+ LINKEDIN_CLIENT_SECRET = os.getenv("LINKEDIN_CLIENT_SECRET")
+ MOCK_MODE = os.getenv("MOCK_MODE", "true").lower() == "true"
+
+ # Check Adzuna configuration
+ ADZUNA_APP_ID = os.getenv("ADZUNA_APP_ID")
+ ADZUNA_APP_KEY = os.getenv("ADZUNA_APP_KEY")
+
+
238
+ def add_custom_job(title: str, company: str, location: str, url: str, desc: str):
239
+ """Add a custom job with validation"""
240
+ try:
241
+ if not title or not company or not desc:
242
+ return gr.update(value="❌ Title, Company, and Description are required"), None
243
+
244
+ job = JobPosting(
245
+ id=f"custom_{uuid.uuid4().hex[:8]}",
246
+ title=title.strip(),
247
+ company=company.strip(),
248
+ location=location.strip() if location else None,
249
+ description=desc.strip(),
250
+ url=url.strip() if url else None,
251
+ source="custom",
252
+ saved_by_user=True,
253
+ )
254
+ STATE["custom_jobs"].append(job)
255
+ logger.info(f"Added custom job: {job.title} at {job.company}")
256
+ return gr.update(value=f"βœ… Added: {job.title} at {job.company}"), ""
257
+ except Exception as e:
258
+ logger.error(f"Error adding job: {e}")
259
+ return gr.update(value=f"❌ Error: {str(e)}"), None
260
+
261
+
262
+ def get_linkedin_auth_url():
+ """Get LinkedIn OAuth URL"""
+ if USE_SYSTEM_AGENTS and not MOCK_MODE and LINKEDIN_CLIENT_ID:
+ try:
+ from services.linkedin_client import LinkedInClient
+ client = LinkedInClient()
+ return client.get_authorize_url()
+ except Exception as e:
+ logger.error(f"LinkedIn OAuth error: {e}")
+ return None
+
+
+ def linkedin_login():
+ """Handle LinkedIn login"""
+ auth_url = get_linkedin_auth_url()
+ if auth_url:
+ webbrowser.open(auth_url)
+ return "✅ Opening LinkedIn login in browser...", True
+ else:
+ return "⚠️ LinkedIn OAuth not configured or in mock mode", False
+
+
+ def search_adzuna_jobs(query: str = "Software Engineer", location: str = "London"):
+ """Search jobs using the Adzuna API"""
+ if ADZUNA_APP_ID and ADZUNA_APP_KEY:
+ try:
+ from services.job_aggregator import JobAggregator
+ aggregator = JobAggregator()
+
+ # Work around SSL interception on corporate networks: skip
+ # certificate verification for Adzuna calls only, and restore
+ # the original requests.get afterwards so the patch does not
+ # leak into unrelated requests.
+ import requests
+ import urllib3
+ old_get = requests.get
+ def patched_get(*args, **kwargs):
+ if 'adzuna' in str(args[0]):
+ kwargs['verify'] = False
+ urllib3.disable_warnings()
+ return old_get(*args, **kwargs)
+ requests.get = patched_get
+
+ try:
+ jobs = aggregator.search_adzuna(query, location)
+ finally:
+ requests.get = old_get
+ return jobs, f"✅ Found {len(jobs)} jobs from Adzuna"
+ except Exception as e:
+ logger.error(f"Adzuna search error: {e}")
+ return [], f"❌ Adzuna search failed: {str(e)}"
+ return [], "⚠️ Adzuna API not configured"
+
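The SSL workaround above monkeypatches `requests.get` for the whole process; a context-manager sketch (a hypothetical helper, not part of the app) makes the same patch self-reverting:

```python
import contextlib
import requests

@contextlib.contextmanager
def unverified_requests(host_fragment: str):
    """Temporarily disable TLS verification for URLs containing host_fragment.

    Illustrative only: wraps the same requests.get patch used for Adzuna
    calls, but guarantees the original function is restored on exit.
    """
    original_get = requests.get

    def patched_get(url, *args, **kwargs):
        # Only relax verification for the targeted host
        if host_fragment in str(url):
            kwargs["verify"] = False
        return original_get(url, *args, **kwargs)

    requests.get = patched_get
    try:
        yield
    finally:
        # Always undo the patch, even if the request raised
        requests.get = original_get
```

Usage would be `with unverified_requests("adzuna"): aggregator.search_adzuna(...)`, leaving `requests.get` untouched for every other caller.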
+
+ def list_jobs_options():
+ """Get list of available jobs with enhanced sources"""
+ try:
+ all_jobs = []
+
+ # Get LinkedIn/mock jobs
+ saved_jobs = orch.get_saved_jobs()
+ all_jobs.extend(saved_jobs)
+
+ # Add custom jobs
+ custom_jobs = STATE.get("custom_jobs", [])
+ all_jobs.extend(custom_jobs)
+
+ # Try to add Adzuna jobs if configured
+ if ADZUNA_APP_ID and ADZUNA_APP_KEY:
+ adzuna_jobs, _ = search_adzuna_jobs("Software Engineer", "Remote")
+ all_jobs.extend(adzuna_jobs[:10])  # Add top 10 Adzuna jobs
+
+ labels = [f"{j.title} — {j.company} ({j.location or 'N/A'}) [{j.source or 'custom'}]" for j in all_jobs]
+ return labels
+ except Exception as e:
+ logger.error(f"Error listing jobs: {e}")
+ return []
+
+
+ def generate(selected_labels: List[str]):
+ """Generate documents with advanced AI features"""
+ try:
+ if not selected_labels:
+ return "⚠️ Please select at least one job to process", None, gr.update(visible=False), gr.update(visible=False), gr.update(visible=False)
+
+ # Triage the request if observability is enabled
+ if ADVANCED_FEATURES and STATE.get("enable_observability") and agent_tracer:
+ routing = triage_agent.triage_request(f"Generate documents for {len(selected_labels)} jobs")
+ logger.info(f"Triage routing: {routing}")
+
+ # Map labels to job objects
+ all_jobs = orch.get_saved_jobs() + STATE.get("custom_jobs", [])
+
+ # Map both the plain and the source-tagged label to each job
+ label_to_job = {}
+ for j in all_jobs:
+ label = f"{j.title} — {j.company} ({j.location or 'N/A'})"
+ label_with_source = f"{label} [{j.source or 'custom'}]"
+ # Map both versions
+ label_to_job[label] = j
+ label_to_job[label_with_source] = j
+
+ jobs = [label_to_job[l] for l in selected_labels if l in label_to_job]
+
+ if not jobs:
+ # Match the five-value signature used by the other return paths
+ return "❌ No valid jobs found", None, gr.update(visible=False), gr.update(visible=False), gr.update(visible=False)
+
+ logger.info(f"Generating documents for {len(jobs)} jobs")
+
+ # Use context engineering if enabled
+ if ADVANCED_FEATURES and STATE.get("use_context_engineering") and context_engineer:
+ for job in jobs:
+ # Engineer optimal context for each job
+ context = context_engineer.engineer_context(
+ query=f"Generate resume and cover letter for {job.title} at {job.company}",
+ raw_sources=[
+ ("job_description", job.description),
+ ("cv_seed", STATE.get("cv_seed") or ""),
+ ("notes", STATE.get("agent2_notes") or "")
+ ]
+ )
+ # Store engineered context
+ job.metadata = job.metadata or {}
+ job.metadata['engineered_context'] = context
+
+ # Run generation (parallel or sequential)
+ start = time.time()
+
+ if ADVANCED_FEATURES and STATE.get("parallel_mode") and parallel_executor:
+ # Use parallel processing (parallel_executor is the module-level helper)
+ logger.info("Using parallel processing for document generation")
+ results = asyncio.run(parallel_executor.process_jobs_parallel(
+ jobs=jobs,
+ cv_agent_func=lambda j: orch.cv_agent.get_draft(j, STATE.get("cv_seed")),
+ cover_agent_func=lambda j: orch.cover_letter_agent.get_draft(j, STATE.get("cover_seed"))
+ ))
+ else:
+ # Standard sequential processing
+ results = orch.run_for_jobs(
+ jobs,
+ user_id=STATE.get("user_id", "default_user"),
+ cv_chat=STATE.get("cv_chat"),
+ cover_chat=STATE.get("cover_chat"),
+ cv_seed=STATE.get("cv_seed"),
+ cover_seed=STATE.get("cover_seed"),
+ agent2_notes=STATE.get("agent2_notes"),
+ inspiration_url=(STATE.get("inspiration_url") if STATE.get("use_inspiration") else None),
+ )
+
+ total_time = time.time() - start
+ STATE["results"] = results
+
+ # Track applications temporally if enabled
+ if ADVANCED_FEATURES and STATE.get("track_applications") and temporal_tracker:
+ for result in results:
+ temporal_tracker.track_application(result.job, "generated", {
+ 'generation_time': total_time,
+ 'parallel_mode': STATE.get("parallel_mode", False)
+ })
+
+ # Track in knowledge graph if available
+ if 'kg_service' in globals() and kg_service and kg_service.is_enabled():
+ for result in results:
+ try:
+ # Extract skills from job description
+ skills = []
+ if hasattr(result, 'matched_keywords'):
+ skills = result.matched_keywords
+ elif hasattr(result.job, 'description'):
+ # Simple skill extraction from job description
+ common_skills = ['python', 'java', 'javascript', 'react', 'node',
+ 'aws', 'azure', 'docker', 'kubernetes', 'sql',
+ 'machine learning', 'ai', 'data science']
+ job_desc_lower = result.job.description.lower()
+ skills = [s for s in common_skills if s in job_desc_lower]
+
+ # Track the application
+ kg_service.track_application(
+ user_name=STATE.get("user_name", "User"),
+ company=result.job.company,
+ job_title=result.job.title,
+ job_description=result.job.description,
+ cv_text=result.resume.text,
+ cover_letter=result.cover_letter.text,
+ skills_matched=skills,
+ score=getattr(result, 'match_score', 0.0)
+ )
+ logger.info(f"Tracked application in knowledge graph: {result.job.title} @ {result.job.company}")
+ except Exception as e:
+ logger.warning(f"Failed to track in knowledge graph: {e}")
+
+ # Record to context engineering flywheel
+ if ADVANCED_FEATURES and context_engineer:
+ for result in results:
+ if hasattr(result.job, 'metadata') and 'engineered_context' in result.job.metadata:
+ context_engineer.record_feedback(
+ result.job.metadata['engineered_context'],
+ result.resume.text[:500],  # Sample output
+ 0.8  # Success score (could be calculated)
+ )
+
+ # Build preview
+ blocks = [f"✅ Generated {len(results)} documents in {total_time:.2f}s\n"]
+ pptx_buttons = []
+
+ for i, res in enumerate(results):
+ blocks.append(f"### 📄 {res.job.title} — {res.job.company}")
+ blocks.append("**Resume Preview:**")
+ blocks.append("```")
+ blocks.append(res.resume.text[:1500] + "...")
+ blocks.append("```")
+ blocks.append("\n**Cover Letter Preview:**")
+ blocks.append("```")
+ blocks.append(res.cover_letter.text[:1000] + "...")
+ blocks.append("```")
+
+ # Add PowerPoint export option
+ blocks.append(f"\n**[📊 Export as PowerPoint CV - Job #{i+1}]**")
+ pptx_buttons.append((res.resume, res.job))
+
+ STATE["pptx_candidates"] = pptx_buttons
+ return "\n".join(blocks), total_time, gr.update(visible=True), gr.update(visible=True), gr.update(visible=True)
+
+ except Exception as e:
+ logger.error(f"Error generating documents: {e}")
+ return f"❌ Error: {str(e)}", None, gr.update(visible=False), gr.update(visible=False), gr.update(visible=False)
+
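`generate` maps both the plain and the source-tagged label back to each job so selections made before or after a refresh still resolve; the mapping can be sketched in isolation (the `Job` dataclass below is a minimal stand-in for the app's `JobPosting`):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Job:
    """Minimal stand-in for JobPosting, for illustration only."""
    title: str
    company: str
    location: Optional[str] = None
    source: Optional[str] = None

def build_label_map(jobs):
    """Map both the plain and the [source]-suffixed label to the same job."""
    label_to_job = {}
    for j in jobs:
        base = f"{j.title} — {j.company} ({j.location or 'N/A'})"
        label_to_job[base] = j
        label_to_job[f"{base} [{j.source or 'custom'}]"] = j
    return label_to_job
```

Both `"Dev — Acme (Remote)"` and `"Dev — Acme (Remote) [adzuna]"` then resolve to the same `Job`, which is why the UI can show source tags without breaking lookups.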
+
+ def regenerate_one(job_label: str):
+ """Regenerate documents for a single job"""
+ try:
+ if not job_label:
+ return "⚠️ Please select a job to regenerate", None
+
+ all_jobs = orch.get_saved_jobs() + STATE.get("custom_jobs", [])
+ # Dropdown labels carry a [source] suffix, so map both variants
+ label_to_job = {}
+ for j in all_jobs:
+ label = f"{j.title} — {j.company} ({j.location or 'N/A'})"
+ label_to_job[label] = j
+ label_to_job[f"{label} [{j.source or 'custom'}]"] = j
+ job = label_to_job.get(job_label)
+
+ if not job:
+ return f"❌ Job not found: {job_label}", None
+
+ start = time.time()
+ result = orch.regenerate_for_job(
+ job,
+ user_id=STATE.get("user_id", "default_user"),
+ cv_chat=STATE.get("cv_chat"),
+ cover_chat=STATE.get("cover_chat"),
+ cv_seed=STATE.get("cv_seed"),
+ cover_seed=STATE.get("cover_seed"),
+ agent2_notes=STATE.get("agent2_notes"),
+ inspiration_url=(STATE.get("inspiration_url") if STATE.get("use_inspiration") else None),
+ )
+ elapsed = time.time() - start
+
+ # Update state
+ new_results = []
+ for r in STATE.get("results", []):
+ if r.job.id == job.id:
+ new_results.append(result)
+ else:
+ new_results.append(r)
+ STATE["results"] = new_results
+
+ preview = f"### 🔄 Regenerated: {result.job.title} — {result.job.company}\n\n"
+ preview += "**Resume:**\n```\n" + result.resume.text[:1500] + "\n...```\n\n"
+ preview += "**Cover Letter:**\n```\n" + result.cover_letter.text[:1000] + "\n...```"
+
+ return preview, elapsed
+
+ except Exception as e:
+ logger.error(f"Error regenerating: {e}")
+ return f"❌ Error: {str(e)}", None
+
+
+ def export_to_powerpoint(job_index: int, template: str = "modern_blue"):
+ """Export resume to PowerPoint CV"""
+ try:
+ candidates = STATE.get("pptx_candidates", [])
+ if not candidates or job_index >= len(candidates):
+ return "❌ No resume available for export", None
+
+ resume, job = candidates[job_index]
+
+ # Import the PowerPoint CV generator
+ try:
+ from services.powerpoint_cv import convert_resume_to_powerpoint
+ pptx_path = convert_resume_to_powerpoint(resume, job, template)
+ if pptx_path:
+ return f"✅ PowerPoint CV created: {pptx_path}", pptx_path
+ return "❌ PowerPoint generation failed", None
+ except ImportError:
+ # Fallback to local generation
+ from pptx import Presentation
+ from pptx.util import Inches, Pt
+
+ prs = Presentation()
+
+ # Title slide
+ slide = prs.slides.add_slide(prs.slide_layouts[0])
+ slide.shapes.title.text = resume.sections.get("name", "Professional CV")
+ slide.placeholders[1].text = f"{resume.sections.get('title', '')}\n{resume.sections.get('email', '')}"
+
+ # Summary slide
+ slide = prs.slides.add_slide(prs.slide_layouts[1])
+ slide.shapes.title.text = "Professional Summary"
+ slide.placeholders[1].text = resume.sections.get("summary", "")[:500]
+
+ # Experience slide
+ slide = prs.slides.add_slide(prs.slide_layouts[1])
+ slide.shapes.title.text = "Professional Experience"
+ exp_text = []
+ for exp in resume.sections.get("experience", [])[:3]:
+ exp_text.append(f"• {exp.get('title', '')} @ {exp.get('company', '')}")
+ exp_text.append(f" {exp.get('dates', '')}")
+ slide.placeholders[1].text = "\n".join(exp_text)
+
+ # Skills slide
+ slide = prs.slides.add_slide(prs.slide_layouts[1])
+ slide.shapes.title.text = "Core Skills"
+ skills_text = []
+ for category, items in resume.sections.get("skills", {}).items():
+ if isinstance(items, list):
+ skills_text.append(f"{category}: {', '.join(items[:5])}")
+ slide.placeholders[1].text = "\n".join(skills_text)
+
+ # Save
+ output_path = f"cv_{job.company.replace(' ', '_')}_{template}.pptx"
+ prs.save(output_path)
+ return f"✅ PowerPoint CV created: {output_path}", output_path
+
+ except Exception as e:
+ logger.error(f"PowerPoint export error: {e}")
+ return f"❌ Export failed: {str(e)}", None
+
+
+ def extract_from_powerpoint(file_path: str):
+ """Extract content from uploaded PowerPoint"""
+ try:
+ from pptx import Presentation
+
+ prs = Presentation(file_path)
+ extracted_text = []
+
+ for slide in prs.slides:
+ for shape in slide.shapes:
+ if hasattr(shape, "text"):
+ text = shape.text.strip()
+ if text:
+ extracted_text.append(text)
+
+ combined_text = "\n".join(extracted_text)
+
+ # Use as CV seed
+ STATE["cv_seed"] = combined_text
+
+ return f"✅ Extracted {len(extracted_text)} text blocks from PowerPoint\n\nPreview:\n{combined_text[:500]}..."
+
+ except Exception as e:
+ logger.error(f"PowerPoint extraction error: {e}")
+ return f"❌ Extraction failed: {str(e)}"
+
+
+ def summary_table():
+ """Generate summary table"""
+ try:
+ import pandas as pd
+ res = STATE.get("results", [])
+ if not res:
+ return pd.DataFrame({"Status": ["No results yet. Generate documents first."]})
+
+ rows = []
+ for r in res:
+ m = r.metrics or {}
+ sal = m.get("salary", {})
+
+ # Handle different salary formats
+ usd = sal.get("USD", {})
+ gbp = sal.get("GBP", {})
+
+ rows.append({
+ "Job": f"{r.job.title} — {r.job.company}",
+ "Location": r.job.location or "N/A",
+ "USD": f"${usd.get('low', 0):,}-${usd.get('high', 0):,}" if usd else "N/A",
+ "GBP": f"£{gbp.get('low', 0):,}-£{gbp.get('high', 0):,}" if gbp else "N/A",
+ "Resume Score": f"{m.get('p_resume', 0):.1%}",
+ "Cover Score": f"{m.get('p_cover', 0):.1%}",
+ "Overall": f"{m.get('overall_p', 0):.1%}",
+ })
+ return pd.DataFrame(rows)
+ except ImportError:
+ # If pandas not available, return simple dict
+ return {"Error": ["pandas not installed - table view unavailable"]}
+ except Exception as e:
+ logger.error(f"Error generating summary: {e}")
+ return {"Error": [str(e)]}
+
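The USD/GBP cells above format a `{'low': ..., 'high': ...}` band per currency; a small helper (illustrative only, the band shape is assumed from the f-strings in `summary_table`) isolates that formatting:

```python
def format_salary_range(band: dict, symbol: str) -> str:
    """Render a {'low': int, 'high': int} band as e.g. '$90,000-$120,000'.

    Hypothetical helper mirroring the per-currency cells in summary_table;
    an empty or missing band collapses to 'N/A', as in the table code.
    """
    if not band:
        return "N/A"
    # The ',' format spec inserts thousands separators
    return f"{symbol}{band.get('low', 0):,}-{symbol}{band.get('high', 0):,}"
```

Factoring this out would also let the two currency columns share one code path instead of duplicating the conditional f-string.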
+
+ def build_app():
+ """Build the Gradio interface with LinkedIn OAuth and Adzuna integration"""
+ with gr.Blocks(
+ title="Job Application Assistant",
+ theme=gr.themes.Soft(),
+ css="""
+ .gradio-container { max-width: 1400px; margin: auto; }
+ """
+ ) as demo:
+ gr.Markdown("""
+ # 🚀 Multi-Agent Job Application Assistant
+ ### AI-Powered Resume & Cover Letter Generation with ATS Optimization
+ ### Now with LinkedIn OAuth + Adzuna Job Search!
+ """)
+
+ # System Status
+ status_items = []
+ if USE_SYSTEM_AGENTS:
+ status_items.append("✅ **Full System Mode**")
+ else:
+ status_items.append("⚠️ **Standalone Mode**")
+
+ if ADVANCED_FEATURES:
+ status_items.append("🚀 **Advanced AI Features**")
+
+ if LANGEXTRACT_AVAILABLE:
+ status_items.append("📊 **LangExtract Enhanced**")
+
+ if not MOCK_MODE and LINKEDIN_CLIENT_ID:
+ status_items.append("✅ **LinkedIn OAuth Ready**")
+ else:
+ status_items.append("⚠️ **LinkedIn in Mock Mode**")
+
+ if ADZUNA_APP_ID and ADZUNA_APP_KEY:
+ status_items.append("✅ **Adzuna API Active** (5000 jobs/month)")
+ else:
+ status_items.append("⚠️ **Adzuna Not Configured**")
+
+ gr.Markdown(" | ".join(status_items))
+
+ # Show advanced features if available; these helpers are module-level,
+ # so check globals() rather than locals()
+ if ADVANCED_FEATURES:
+ advanced_features = []
+ if 'parallel_executor' in globals():
+ advanced_features.append("⚡ Parallel Processing")
+ if 'temporal_tracker' in globals():
+ advanced_features.append("📊 Temporal Tracking")
+ if 'agent_tracer' in globals():
+ advanced_features.append("🔍 Observability")
+ if 'context_engineer' in globals():
+ advanced_features.append("🧠 Context Engineering")
+
+ if advanced_features:
+ gr.Markdown(f"**Advanced Features Available:** {' | '.join(advanced_features)}")
+
+ # Import enhanced UI components
+ try:
+ from services.enhanced_ui import (
+ create_enhanced_ui_components,
+ handle_resume_upload,
+ handle_linkedin_import,
+ handle_job_matching,
+ handle_document_export,
+ populate_ui_from_data,
+ format_job_matches_for_display,
+ generate_recommendations_markdown,
+ generate_skills_gap_analysis
+ )
+ ENHANCED_UI_AVAILABLE = True
+ except ImportError:
+ ENHANCED_UI_AVAILABLE = False
+ logger.warning("Enhanced UI components not available")
+
+ with gr.Row():
+ # Left column - Configuration
+ with gr.Column(scale=2):
+ gr.Markdown("## ⚙️ Configuration")
+
+ # Enhanced Resume Upload Section (if available)
+ if ENHANCED_UI_AVAILABLE:
+ ui_components = create_enhanced_ui_components()
+
+ # Create a wrapper function that properly handles the response
+ def process_resume_and_populate(file_path):
+ """Process resume upload and return extracted data for UI fields"""
+ if not file_path:
+ return populate_ui_from_data({})
+
+ try:
+ # Call handle_resume_upload to extract data
+ response = handle_resume_upload(file_path)
+
+ # Extract the data from the response
+ if response and isinstance(response, dict):
+ data = response.get('data', {})
+ # Return the populated fields
+ return populate_ui_from_data(data)
+ else:
+ return populate_ui_from_data({})
+ except Exception as e:
+ logger.error(f"Error processing resume: {e}")
+ return populate_ui_from_data({})
+
+ # Wire up the handlers - single function call
+ ui_components['extract_btn'].click(
+ fn=process_resume_and_populate,
+ inputs=[ui_components['resume_upload']],
+ outputs=[
+ ui_components['contact_name'],
+ ui_components['contact_email'],
+ ui_components['contact_phone'],
+ ui_components['contact_linkedin'],
+ ui_components['contact_location'],
+ ui_components['summary_text'],
+ ui_components['experience_data'],
+ ui_components['skills_list'],
+ ui_components['education_data']
+ ]
+ )
+
+ ui_components['linkedin_auto_fill'].click(
+ fn=handle_linkedin_import,
+ inputs=[ui_components['linkedin_url'], gr.State()],
+ outputs=[gr.State()]
+ ).then(
+ fn=populate_ui_from_data,
+ inputs=[gr.State()],
+ outputs=[
+ ui_components['contact_name'],
+ ui_components['contact_email'],
+ ui_components['contact_phone'],
+ ui_components['contact_linkedin'],
+ ui_components['contact_location'],
+ ui_components['summary_text'],
+ ui_components['experience_data'],
+ ui_components['skills_list'],
+ ui_components['education_data']
+ ]
+ )
+
+ # LinkedIn OAuth Section (keep existing)
+ elif not MOCK_MODE and LINKEDIN_CLIENT_ID:
+ with gr.Accordion("🔐 LinkedIn Authentication", open=True):
+ linkedin_status = gr.Textbox(
+ label="Status",
+ value="Not authenticated",
+ interactive=False
+ )
+ linkedin_btn = gr.Button("🔗 Sign in with LinkedIn", variant="primary")
+ linkedin_btn.click(
+ fn=linkedin_login,
+ outputs=[linkedin_status, gr.State()]
+ )
+
+ # Advanced AI Features Section
+ if ADVANCED_FEATURES:
+ with gr.Accordion("🚀 Advanced AI Features", open=True):
+ gr.Markdown("### AI Agent Enhancements")
+
+ with gr.Row():
+ parallel_mode = gr.Checkbox(
+ label="⚡ Parallel Processing (3-5x faster)",
+ value=STATE.get("parallel_mode", False)
+ )
+ track_apps = gr.Checkbox(
+ label="📊 Temporal Tracking",
+ value=STATE.get("track_applications", True)
+ )
+
+ with gr.Row():
+ observability = gr.Checkbox(
+ label="🔍 Observability & Tracing",
+ value=STATE.get("enable_observability", True)
+ )
+ context_eng = gr.Checkbox(
+ label="🧠 Context Engineering",
+ value=STATE.get("use_context_engineering", True)
+ )
+
+ def update_features(parallel, track, observe, context):
+ STATE["parallel_mode"] = parallel
+ STATE["track_applications"] = track
+ STATE["enable_observability"] = observe
+ STATE["use_context_engineering"] = context
+
+ features = []
+ if parallel: features.append("Parallel")
+ if track: features.append("Tracking")
+ if observe: features.append("Observability")
+ if context: features.append("Context Engineering")
+
+ return f"✅ Features enabled: {', '.join(features) if features else 'None'}"
+
+ features_status = gr.Textbox(label="Features Status", interactive=False)
+
+ # Pass all four checkboxes as inputs: reading `.value` inside a
+ # lambda only sees the initial value, not the live one
+ feature_inputs = [parallel_mode, track_apps, observability, context_eng]
+ for checkbox in feature_inputs:
+ checkbox.change(
+ fn=update_features,
+ inputs=feature_inputs,
+ outputs=features_status
+ )
+
+ with gr.Accordion("📝 Profile & Notes", open=True):
+ agent2_notes = gr.Textbox(
+ label="Additional Context",
+ value=STATE["agent2_notes"],
+ lines=4,
+ placeholder="E.g., visa requirements, years of experience, preferred technologies..."
+ )
+ def set_notes(n):
+ STATE["agent2_notes"] = n or ""
+ return "✅ Notes saved"
+ notes_result = gr.Textbox(label="Status", interactive=False)
+ agent2_notes.change(set_notes, inputs=agent2_notes, outputs=notes_result)
+
+ with gr.Accordion("📄 Resume Settings", open=False):
+ cv_chat = gr.Textbox(
+ label="Resume Instructions",
+ value=STATE["cv_chat"],
+ lines=3,
+ placeholder="E.g., Emphasize leadership experience..."
+ )
+
+ # PowerPoint Upload
+ gr.Markdown("### 📊 Upload PowerPoint to Extract Content")
+ pptx_upload = gr.File(
+ label="Upload PowerPoint (.pptx)",
+ file_types=[".pptx"],
+ type="filepath"
+ )
+ pptx_extract_btn = gr.Button("📥 Extract from PowerPoint")
+ pptx_extract_status = gr.Textbox(label="Extraction Status", interactive=False)
+
+ cv_seed = gr.Textbox(
+ label="Resume Template (optional)",
+ value=STATE["cv_seed"] or "",
+ lines=10,
+ placeholder="Paste your existing resume here or extract from PowerPoint..."
+ )
+
+ def set_cv(c, s):
+ STATE["cv_chat"] = c or ""
+ STATE["cv_seed"] = s or None
+ return "✅ Resume settings updated"
+
+ def handle_pptx_upload(file):
+ if file:
+ status = extract_from_powerpoint(file)
+ return status, STATE.get("cv_seed", "")
+ return "No file uploaded", STATE.get("cv_seed", "")
+
+ pptx_extract_btn.click(
+ fn=handle_pptx_upload,
+ inputs=pptx_upload,
+ outputs=[pptx_extract_status, cv_seed]
+ )
+
+ cv_info = gr.Textbox(label="Status", interactive=False)
+ # Pass both components as inputs; `.value` inside a lambda is stale
+ cv_chat.change(set_cv, inputs=[cv_chat, cv_seed], outputs=cv_info)
+ cv_seed.change(set_cv, inputs=[cv_chat, cv_seed], outputs=cv_info)
+
+ with gr.Accordion("✉️ Cover Letter Settings", open=False):
+ cover_chat = gr.Textbox(
+ label="Cover Letter Instructions",
+ value=STATE["cover_chat"],
+ lines=3,
+ placeholder="E.g., Professional tone, mention relocation..."
+ )
+ cover_seed = gr.Textbox(
+ label="Cover Letter Template (optional)",
+ value=STATE["cover_seed"] or "",
+ lines=10,
+ placeholder="Paste your existing cover letter here..."
+ )
+ def set_cover(c, s):
+ STATE["cover_chat"] = c or ""
+ STATE["cover_seed"] = s or None
+ return "✅ Cover letter settings updated"
+ cover_info = gr.Textbox(label="Status", interactive=False)
+ # As above, read live values by passing both components as inputs
+ cover_chat.change(set_cover, inputs=[cover_chat, cover_seed], outputs=cover_info)
+ cover_seed.change(set_cover, inputs=[cover_chat, cover_seed], outputs=cover_info)
+
+ gr.Markdown("## 💼 Jobs")
+
+ # Adzuna Job Search
+ if ADZUNA_APP_ID and ADZUNA_APP_KEY:
+ with gr.Accordion("🔍 Search Adzuna Jobs", open=True):
+ with gr.Row():
+ adzuna_query = gr.Textbox(
+ label="Job Title",
+ value="Software Engineer",
+ placeholder="e.g., Python Developer"
+ )
+ adzuna_location = gr.Textbox(
+ label="Location",
+ value="London",
+ placeholder="e.g., New York, Remote"
+ )
+
+ adzuna_search_btn = gr.Button("🔍 Search Adzuna", variant="primary")
+ adzuna_results = gr.Textbox(
+ label="Search Results",
+ lines=3,
+ interactive=False
+ )
+
+ def search_and_display(query, location):
+ jobs, message = search_adzuna_jobs(query, location)
+ # Add jobs to state
+ if jobs:
+ STATE["custom_jobs"].extend(jobs[:5])  # Add top 5 to available jobs
+ return message
+
+ adzuna_search_btn.click(
+ fn=search_and_display,
+ inputs=[adzuna_query, adzuna_location],
+ outputs=adzuna_results
+ )
+
+ with gr.Accordion("➕ Add Custom Job", open=True):
+ c_title = gr.Textbox(label="Job Title*", placeholder="e.g., Senior Software Engineer")
+ c_company = gr.Textbox(label="Company*", placeholder="e.g., Google")
+ c_loc = gr.Textbox(label="Location", placeholder="e.g., Remote, New York")
+ c_url = gr.Textbox(label="Job URL", placeholder="https://...")
+ c_desc = gr.Textbox(
+ label="Job Description*",
+ lines=8,
+ placeholder="Paste the complete job description here..."
+ )
+
+ with gr.Row():
+ add_job_btn = gr.Button("➕ Add Job", variant="primary")
+ load_example_btn = gr.Button("📝 Load Example")
+
+ add_job_info = gr.Textbox(label="Status", interactive=False)
+
+ def load_example():
+ return (
+ "Senior Software Engineer",
+ "Tech Corp",
+ "Remote",
+ "",
+ "We are looking for a Senior Software Engineer with 5+ years of experience in Python, AWS, and Docker. You will lead technical initiatives and build scalable systems."
+ )
+
+ load_example_btn.click(
+ fn=load_example,
+ outputs=[c_title, c_company, c_loc, c_url, c_desc]
+ )
+
+ add_job_btn.click(
+ fn=add_custom_job,
+ inputs=[c_title, c_company, c_loc, c_url, c_desc],
+ outputs=[add_job_info, c_title]
+ )
+
+ job_select = gr.CheckboxGroup(
+ choices=list_jobs_options(),
+ label="📋 Select Jobs to Process"
+ )
+ refresh_jobs = gr.Button("🔄 Refresh Job List")
+ refresh_jobs.click(lambda: gr.update(choices=list_jobs_options()), outputs=job_select)
+
+ # Right column - Generation
+ with gr.Column(scale=3):
+ gr.Markdown("## 📄 Document Generation")
+
+ gen_btn = gr.Button("🚀 Generate Documents", variant="primary", size="lg")
+ out_preview = gr.Markdown("Ready to generate documents...")
+ out_time = gr.Number(label="Processing Time (seconds)")
+
+ # PowerPoint Export Section
+ with gr.Accordion("📊 Export to PowerPoint CV", open=False, visible=False) as pptx_section:
+ gr.Markdown("### Convert your resume to a professional PowerPoint presentation")
+ with gr.Row():
+ pptx_job_select = gr.Number(
+ label="Job Index (1, 2, 3...)",
+ value=1,
+ minimum=1,
+ step=1
+ )
+ pptx_template = gr.Dropdown(
+ choices=["modern_blue", "corporate_gray", "elegant_green", "warm_red"],
+ value="modern_blue",
+ label="Template Style"
+ )
+
+ export_pptx_btn = gr.Button("📊 Create PowerPoint CV", variant="primary")
+ pptx_status = gr.Textbox(label="Export Status", interactive=False)
+ pptx_file = gr.File(label="Download PowerPoint", visible=False)
+
+ def handle_pptx_export(job_idx, template):
+ status, file_path = export_to_powerpoint(int(job_idx) - 1, template)
+ if file_path:
+ return status, gr.update(visible=True, value=file_path)
+ return status, gr.update(visible=False)
+
+ export_pptx_btn.click(
+ fn=handle_pptx_export,
+ inputs=[pptx_job_select, pptx_template],
+ outputs=[pptx_status, pptx_file]
+ )
+
+ # Word Document Export Section
+ with gr.Accordion("📝 Export to Word Documents", open=False, visible=False) as word_section:
+ gr.Markdown("### Generate professional Word documents")
+ with gr.Row():
+ word_job_select = gr.Number(
+ label="Job Index (1, 2, 3...)",
+ value=1,
+ minimum=1,
+ step=1
+ )
+ word_template = gr.Dropdown(
+ choices=["modern", "executive", "creative", "minimal", "academic"],
+ value="modern",
+ label="Document Style"
+ )
+
+ with gr.Row():
+ export_word_resume_btn = gr.Button("📄 Export Resume as Word", variant="primary")
+ export_word_cover_btn = gr.Button("✉️ Export Cover Letter as Word", variant="primary")
+
+ word_status = gr.Textbox(label="Export Status", interactive=False)
+ word_files = gr.File(label="Download Word Documents", visible=False, file_count="multiple")
+
+ def handle_word_export(job_idx, template, doc_type="resume"):
+ try:
+ from services.word_cv import WordCVGenerator
+ generator = WordCVGenerator()
+
+ candidates = STATE.get("pptx_candidates", [])
+ if not candidates or job_idx > len(candidates):
+ return "❌ No documents available", gr.update(visible=False)
+
+ resume, job = candidates[int(job_idx) - 1]
+
+ files = []
+ if doc_type == "resume" or doc_type == "both":
+ resume_path = generator.create_resume_document(resume, job, template)
+ if resume_path:
+ files.append(resume_path)
+
+ if doc_type == "cover" or doc_type == "both":
+ # Get cover letter from results
+ results = STATE.get("results", [])
+ cover_letter = None
+ for r in results:
+ if r.job.id == job.id:
+ cover_letter = r.cover_letter
+ break
+
+ if cover_letter:
+ cover_path = generator.create_cover_letter_document(cover_letter, job, template)
+ if cover_path:
+ files.append(cover_path)
+
+ if files:
+ return f"✅ Created {len(files)} Word document(s)", gr.update(visible=True, value=files)
+ return "❌ Failed to create documents", gr.update(visible=False)
+
+ except Exception as e:
+ return f"❌ Error: {str(e)}", gr.update(visible=False)
+
+ export_word_resume_btn.click(
+ fn=lambda idx, tmpl: handle_word_export(idx, tmpl, "resume"),
+ inputs=[word_job_select, word_template],
+ outputs=[word_status, word_files]
+ )
+
+ export_word_cover_btn.click(
+ fn=lambda idx, tmpl: handle_word_export(idx, tmpl, "cover"),
+ inputs=[word_job_select, word_template],
+ outputs=[word_status, word_files]
+ )
+
1142
+ # Excel Tracker Export
1143
+ with gr.Accordion("πŸ“Š Export Excel Tracker", open=False, visible=False) as excel_section:
1144
+ gr.Markdown("### Create comprehensive job application tracker")
1145
+
1146
+ export_excel_btn = gr.Button("πŸ“ˆ Generate Excel Tracker", variant="primary")
1147
+ excel_status = gr.Textbox(label="Export Status", interactive=False)
1148
+ excel_file = gr.File(label="Download Excel Tracker", visible=False)
1149
+
1150
+ def handle_excel_export():
1151
+ try:
1152
+ from services.excel_tracker import ExcelTracker
1153
+ tracker = ExcelTracker()
1154
+
1155
+ results = STATE.get("results", [])
1156
+ if not results:
1157
+ return "❌ No results to track", gr.update(visible=False)
1158
+
1159
+ tracker_path = tracker.create_tracker(results)
1160
+ if tracker_path:
1161
+ return f"βœ… Excel tracker created with {len(results)} applications", gr.update(visible=True, value=tracker_path)
1162
+ return "❌ Failed to create tracker", gr.update(visible=False)
1163
+
1164
+ except Exception as e:
1165
+ return f"❌ Error: {str(e)}", gr.update(visible=False)
1166
+
1167
+ export_excel_btn.click(
1168
+ fn=handle_excel_export,
1169
+ outputs=[excel_status, excel_file]
1170
+ )
1171
+
1172
+ gen_btn.click(fn=generate, inputs=[job_select], outputs=[out_preview, out_time, pptx_section, word_section, excel_section])
1173
+
+ gr.Markdown("## 🔄 Regenerate Individual Job")
+
+ with gr.Row():
+ job_single = gr.Dropdown(choices=list_jobs_options(), label="Select Job")
+ refresh_single = gr.Button("🔄")
+
+ refresh_single.click(lambda: gr.update(choices=list_jobs_options()), outputs=job_single)
+
+ regen_btn = gr.Button("🔄 Regenerate Selected Job")
+ regen_preview = gr.Markdown()
+ regen_time = gr.Number(label="Regeneration Time (seconds)")
+ regen_btn.click(fn=regenerate_one, inputs=[job_single], outputs=[regen_preview, regen_time])
+
+ gr.Markdown("## 📊 Results Summary")
+
+ update_summary = gr.Button("📊 Update Summary")
+ table = gr.Dataframe(value=summary_table(), interactive=False)
+ update_summary.click(fn=summary_table, outputs=table)
+
+ if 'kg_service' in globals() and kg_service and kg_service.is_enabled():
1195
+ with gr.Accordion("πŸ“Š Knowledge Graph & Application Tracking", open=False):
1196
+ gr.Markdown("""
1197
+ ### 🧠 Application Knowledge Graph
1198
+ Track your job applications, skills, and patterns over time.
1199
+ """)
1200
+
1201
+ with gr.Row():
1202
+ with gr.Column(scale=1):
1203
+ kg_user_name = gr.Textbox(
1204
+ label="Your Name",
1205
+ value=STATE.get("user_name", "User"),
1206
+ placeholder="Enter your name for tracking"
1207
+ )
1208
+
1209
+ def update_user_name(name):
1210
+ STATE["user_name"] = name
1211
+ return f"Tracking as: {name}"
1212
+
1213
+ kg_user_status = gr.Markdown("Enter your name to start tracking")
1214
+ kg_user_name.change(update_user_name, inputs=[kg_user_name], outputs=[kg_user_status])
1215
+
1216
+ gr.Markdown("### πŸ“ˆ Quick Actions")
1217
+
1218
+ show_history_btn = gr.Button("πŸ“œ Show My History", variant="primary", size="sm")
1219
+ show_trends_btn = gr.Button("πŸ“Š Show Skill Trends", variant="secondary", size="sm")
1220
+ show_insights_btn = gr.Button("πŸ’‘ Company Insights", variant="secondary", size="sm")
1221
+
1222
+ with gr.Column(scale=2):
1223
+ kg_output = gr.JSON(label="Knowledge Graph Data", visible=True)
1224
+
1225
+ def show_user_history(user_name):
1226
+ if kg_service and kg_service.is_enabled():
1227
+ history = kg_service.get_user_history(user_name)
1228
+ return history
1229
+ return {"error": "Knowledge graph not available"}
1230
+
1231
+ def show_skill_trends():
1232
+ if kg_service and kg_service.is_enabled():
1233
+ trends = kg_service.get_skill_trends()
1234
+ return trends
1235
+ return {"error": "Knowledge graph not available"}
1236
+
1237
+ def show_company_insights():
1238
+ if kg_service and kg_service.is_enabled():
1239
+ # Get insights for all companies user applied to
1240
+ history = kg_service.get_user_history(STATE.get("user_name", "User"))
1241
+ companies = set()
1242
+ for app in history.get("applications", []):
1243
+ if isinstance(app, dict) and "properties" in app:
1244
+ company = app["properties"].get("company")
1245
+ if company:
1246
+ companies.add(company)
1247
+
1248
+ insights = {}
1249
+ for company in list(companies)[:5]: # Limit to 5 companies
1250
+ insights[company] = kg_service.get_company_insights(company)
1251
+ return insights if insights else {"message": "No companies found in history"}
1252
+ return {"error": "Knowledge graph not available"}
1253
+
1254
+ show_history_btn.click(
1255
+ show_user_history,
1256
+ inputs=[kg_user_name],
1257
+ outputs=[kg_output]
1258
+ )
1259
+
1260
+ show_trends_btn.click(
1261
+ show_skill_trends,
1262
+ inputs=[],
1263
+ outputs=[kg_output]
1264
+ )
1265
+
1266
+ show_insights_btn.click(
1267
+ show_company_insights,
1268
+ inputs=[],
1269
+ outputs=[kg_output]
1270
+ )
1271
+
1272
+ gr.Markdown("""
1273
+ ### πŸ“Š Features:
1274
+ - **Application History**: Track all your job applications
1275
+ - **Skill Analysis**: See which skills are in demand
1276
+ - **Company Insights**: Learn about companies you've applied to
1277
+ - **Pattern Recognition**: Identify successful application patterns
1278
+ - All data stored locally in SQLite - no external dependencies!
1279
+ """)
1280
+
1281
+ # Enhanced Extraction with LangExtract
+ if LANGEXTRACT_AVAILABLE:
+ with gr.Accordion("🔍 Enhanced Job Analysis (LangExtract)", open=False):
+ gr.Markdown("### AI-Powered Job & Resume Analysis")
+
+ with gr.Tabs():
+ # Job Analysis Tab
+ with gr.TabItem("📋 Job Analysis"):
+ job_analysis_text = gr.Textbox(
+ label="Paste Job Description",
+ lines=10,
+ placeholder="Paste the full job description here for analysis..."
+ )
+ analyze_job_btn = gr.Button("🔍 Analyze Job", variant="primary")
+ job_analysis_output = gr.Markdown()
+
+ def analyze_job(text):
+ if not text:
+ return "Please paste a job description"
+
+ job = extract_job_info(text)
+ keywords = extract_ats_keywords(text)
+
+ output = create_extraction_summary(job)
+ output += "\n\n### 🎯 ATS Keywords\n"
+ output += f"**High Priority:** {', '.join(keywords.high_priority[:10]) or 'None'}\n"
+ output += f"**Medium Priority:** {', '.join(keywords.medium_priority[:10]) or 'None'}\n"
+
+ return output
+
+ analyze_job_btn.click(
+ fn=analyze_job,
+ inputs=job_analysis_text,
+ outputs=job_analysis_output
+ )
+
+ # ATS Optimization Tab
+ with gr.TabItem("🎯 ATS Optimizer"):
+ gr.Markdown("Compare your resume against job requirements")
+ with gr.Row():
+ ats_resume = gr.Textbox(
+ label="Your Resume",
+ lines=10,
+ placeholder="Paste your resume text..."
+ )
+ ats_job = gr.Textbox(
+ label="Job Description",
+ lines=10,
+ placeholder="Paste the job description..."
+ )
+
+ optimize_btn = gr.Button("🎯 Optimize for ATS", variant="primary")
+ ats_report = gr.Markdown()
+
+ def run_ats_optimization(resume, job):
+ if not resume or not job:
+ return "Please provide both resume and job description"
+
+ result = optimize_for_ats(resume, job)
+ return create_ats_report(result)
+
+ optimize_btn.click(
+ fn=run_ats_optimization,
+ inputs=[ats_resume, ats_job],
+ outputs=ats_report
+ )
+
+ # Bulk Analysis Tab
+ with gr.TabItem("📊 Bulk Analysis"):
+ gr.Markdown("Analyze multiple jobs at once")
+ bulk_jobs_text = gr.Textbox(
+ label="Paste Multiple Job Descriptions (separated by ---)",
+ lines=15,
+ placeholder="Job 1...\n---\nJob 2...\n---\nJob 3..."
+ )
+ bulk_analyze_btn = gr.Button("📊 Analyze All Jobs", variant="primary")
+ bulk_output = gr.Markdown()
+
+ def analyze_bulk_jobs(text):
+ if not text:
+ return "Please paste job descriptions"
+
+ jobs = text.split("---")
+ results = []
+
+ for i, job_text in enumerate(jobs, 1):
+ if job_text.strip():
+ job = extract_job_info(job_text)
+ results.append(f"### Job {i}: {job.title or 'Unknown'}")
+ results.append(f"**Company:** {job.company or 'Unknown'}")
+ results.append(f"**Skills:** {', '.join(job.skills[:5]) or 'None detected'}")
+ results.append("")
+
+ return "\n".join(results) if results else "No valid jobs found"
+
+ bulk_analyze_btn.click(
+ fn=analyze_bulk_jobs,
+ inputs=bulk_jobs_text,
+ outputs=bulk_output
+ )
+
+ # Advanced Features Results
+ if ADVANCED_FEATURES:
+ with gr.Accordion("🎯 Advanced Analytics", open=False):
+ with gr.Tabs():
+ # Execution Timeline Tab
+ with gr.TabItem("⚡ Execution Timeline"):
+ show_timeline_btn = gr.Button("📊 Generate Timeline")
+ timeline_image = gr.Image(label="Parallel Execution Timeline", visible=False)
+
+ def show_execution_timeline():
+ if parallel_executor and hasattr(parallel_executor, 'execution_history'):
+ try:
+ import matplotlib.pyplot as plt
+ fig = parallel_executor.plot_timeline()
+ timeline_path = "execution_timeline.png"
+ fig.savefig(timeline_path)
+ plt.close(fig)
+ return gr.update(visible=True, value=timeline_path)
+ except Exception as e:
+ logger.error(f"Timeline generation error: {e}")
+ return gr.update(visible=False)
+
+ show_timeline_btn.click(fn=show_execution_timeline, outputs=timeline_image)
+
+ # Application History Tab
+ with gr.TabItem("📜 Application History"):
+ history_btn = gr.Button("📋 Show History")
+ history_text = gr.Textbox(label="Application Timeline", lines=10, interactive=False)
+
+ def show_application_history():
+ if temporal_tracker:
+ try:
+ active = temporal_tracker.get_active_applications()
+ patterns = temporal_tracker.analyze_patterns()
+
+ history = "📊 Application Patterns:\n"
+ history += f"• Total applications: {patterns.get('total_applications', 0)}\n"
+ history += f"• This week: {patterns.get('applications_this_week', 0)}\n"
+ history += f"• Response rate: {patterns.get('response_rate', '0%')}\n\n"
+
+ history += "📋 Active Applications:\n"
+ for app in active[:5]:
+ history += f"• {app['company']} - {app['position']} ({app['status']})\n"
+
+ return history
+ except Exception as e:
+ return f"Error retrieving history: {e}"
+ return "Temporal tracking not available"
+
+ history_btn.click(fn=show_application_history, outputs=history_text)
+
+ # Observability Tab
+ with gr.TabItem("🔍 Agent Tracing"):
+ trace_btn = gr.Button("📝 Show Agent Trace")
+ trace_text = gr.Textbox(label="Agent Interaction Flow", lines=15, interactive=False)
+
+ def show_agent_trace():
+ if agent_tracer:
+ try:
+ import io
+ from contextlib import redirect_stdout
+
+ f = io.StringIO()
+ with redirect_stdout(f):
+ agent_tracer.print_interaction_flow()
+
+ trace_output = f.getvalue()
+
+ # Also get metrics
+ metrics = agent_tracer.get_metrics()
+ trace_output += f"\n\n📊 Metrics:\n"
+ trace_output += f"• Total events: {metrics['total_events']}\n"
+ trace_output += f"• Agents involved: {metrics['agents_involved']}\n"
+ trace_output += f"• Tool calls: {metrics['tool_calls']}\n"
+ trace_output += f"• Errors: {metrics['errors']}\n"
+
+ return trace_output
+ except Exception as e:
+ return f"Error generating trace: {e}"
+ return "Observability not available"
+
+ trace_btn.click(fn=show_agent_trace, outputs=trace_text)
+
+ # Context Engineering Tab
+ with gr.TabItem("🧠 Context Insights"):
+ context_btn = gr.Button("📊 Show Context Stats")
+ context_text = gr.Textbox(label="Context Engineering Insights", lines=10, interactive=False)
+
+ def show_context_insights():
+ if context_engineer:
+ try:
+ # Get flywheel recommendations
+ sample_query = "Generate resume for software engineer"
+ recommended = context_engineer.flywheel.get_recommended_sources(sample_query)
+
+ insights = "🧠 Context Engineering Insights:\n\n"
+ insights += f"📊 Flywheel Learning:\n"
+ insights += f"• Successful contexts: {len(context_engineer.flywheel.successful_contexts)}\n"
+ insights += f"• Pattern cache size: {len(context_engineer.flywheel.pattern_cache)}\n\n"
+
+ if recommended:
+ insights += f"💡 Recommended sources for '{sample_query}':\n"
+ for source in recommended:
+ insights += f" • {source}\n"
+
+ # Memory hierarchy stats
+ insights += f"\n📚 Memory Hierarchy:\n"
+ insights += f"• L1 Cache: {len(context_engineer.memory.l1_cache)} items\n"
+ insights += f"• L2 Memory: {len(context_engineer.memory.l2_memory)} items\n"
+ insights += f"• L3 Storage: {len(context_engineer.memory.l3_index)} indexed\n"
+
+ return insights
+ except Exception as e:
+ return f"Error getting insights: {e}"
+ return "Context engineering not available"
+
+ context_btn.click(fn=show_context_insights, outputs=context_text)
+
+ # Configuration status
+ config_status = []
+
+ # LinkedIn OAuth
+ if not MOCK_MODE and LINKEDIN_CLIENT_ID:
+ config_status.append(f"✅ LinkedIn OAuth ({LINKEDIN_CLIENT_ID[:8]}...)")
+
+ # Adzuna
+ if ADZUNA_APP_ID and ADZUNA_APP_KEY:
+ config_status.append(f"✅ Adzuna API ({ADZUNA_APP_ID})")
+
+ # Gemini
+ if os.getenv("GEMINI_API_KEY"):
+ config_status.append("✅ Gemini AI")
+
+ # Tavily
+ if os.getenv("TAVILY_API_KEY"):
+ config_status.append("✅ Tavily Research")
+
+ if not config_status:
+ config_status.append("ℹ️ Add API keys to .env for full functionality")
+
+ gr.Markdown(f"""
+ ---
+ ### 🔧 Active Services: {' | '.join(config_status)}
+
+ ### 💡 Quick Start:
+ 1. **Sign in** with LinkedIn (if configured)
+ 2. **Search** for jobs on Adzuna or add custom jobs
+ 3. **Configure** advanced features (if available)
+ 4. **Select** jobs and click "Generate Documents"
+ 5. **Review** AI-generated resume and cover letter
+ 6. **Export** to Word/PowerPoint/Excel
+ 7. **Analyze** with advanced analytics (if enabled)
+
+ ### 📊 Current Capabilities:
+ - **Job Sources**: {
+ 'Adzuna (5000/month)' if ADZUNA_APP_ID else 'Mock Data'
+ }
+ - **Authentication**: {
+ 'LinkedIn OAuth' if not MOCK_MODE and LINKEDIN_CLIENT_ID else 'Mock Mode'
+ }
+ - **AI Generation**: {
+ 'Gemini' if os.getenv("GEMINI_API_KEY") else 'Template Mode'
+ }
+ - **Advanced AI**: {
+ 'Parallel + Temporal + Observability + Context' if ADVANCED_FEATURES else 'Not Available'
+ }
+
+ ### 🚀 Performance Enhancements:
+ - **Parallel Processing**: 3-5x faster document generation
+ - **Temporal Tracking**: Complete application history with versioning
+ - **Observability**: Full agent tracing and debugging
+ - **Context Engineering**: Continuous learning and optimization
+ - **Memory Hierarchy**: L1/L2/L3 caching for instant retrieval
+ - **Compression**: Handle 1M+ tokens with intelligent scaling
+ """)
+
+ return demo
+
+
+ if __name__ == "__main__":
+ print("=" * 60)
+ print("Job Application Assistant - Gradio Interface")
+ print("=" * 60)
+
+ # Check configuration
+ if USE_SYSTEM_AGENTS:
+ print("✅ Full system mode - all features available")
+ else:
+ print("⚠️ Standalone mode - basic features only")
+ print(" Place this file in the project directory for full features")
+
+ if ADVANCED_FEATURES:
+ print("🚀 Advanced AI Agent Features Loaded:")
+ print(" ⚡ Parallel Processing (3-5x faster)")
+ print(" 📊 Temporal Tracking (complete history)")
+ print(" 🔍 Observability (full tracing)")
+ print(" 🧠 Context Engineering (continuous learning)")
+ print(" 📈 Context Scaling (1M+ tokens)")
+
+ if os.getenv("GEMINI_API_KEY"):
+ print("✅ Gemini API configured")
+ else:
+ print("ℹ️ No Gemini API key - using fallback generation")
+
+ if os.getenv("TAVILY_API_KEY"):
+ print("✅ Tavily API configured for web research")
+
+ if ADZUNA_APP_ID:
+ print("✅ Adzuna API configured for job search")
+
+ if LINKEDIN_CLIENT_ID:
+ print("✅ LinkedIn OAuth configured")
+
+ print("\nStarting Gradio app...")
+ print("=" * 60)
+
+ try:
+ app = build_app()
+ app.launch(
+ server_name="0.0.0.0",
+ server_port=int(os.getenv("PORT", 7860)),
+ share=False,
+ show_error=True
+ )
+ except Exception as e:
+ logger.error(f"Failed to start app: {e}")
+ print(f"\n❌ Error: {e}")
+ print("\nTroubleshooting:")
+ print("1. Install required packages: pip install gradio pandas python-dotenv")
+ print("2. Check your .env file exists and is valid")
+ print("3. Ensure port 7860 is not in use")
+ raise
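The startup banner above gates each service message on an environment variable and collects the results into a status list. That gate-and-collect pattern can be sketched standalone (`service_status` is an illustrative helper written for this sketch, not part of the repo; the variable names mirror the app):

```python
import os

def service_status(env: dict) -> list:
    # Map env var -> human-readable label; fall back to a hint when nothing is set.
    checks = [
        ("GEMINI_API_KEY", "Gemini AI"),
        ("TAVILY_API_KEY", "Tavily Research"),
    ]
    status = [f"OK {label}" for var, label in checks if env.get(var)]
    return status or ["Add API keys to .env for full functionality"]

# Demo: merge a fake key over the real environment.
print(service_status({**dict(os.environ), "GEMINI_API_KEY": "demo"}))
```

Keeping the checks in a list of pairs makes adding a new service a one-line change, which is why the app builds `config_status` the same way before rendering it.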
mcp/__init__.py ADDED
@@ -0,0 +1 @@
+ # mcp servers package
mcp/__pycache__/__init__.cpython-313.pyc ADDED
Binary file (114 Bytes)
mcp/__pycache__/cover_letter_server.cpython-313.pyc ADDED
Binary file (1.41 kB)
mcp/__pycache__/cv_owner_server.cpython-313.pyc ADDED
Binary file (1.37 kB)
mcp/__pycache__/orchestrator_server.cpython-313.pyc ADDED
Binary file (1.92 kB)
mcp/__pycache__/server_common.cpython-313.pyc ADDED
Binary file (1.73 kB)
mcp/cover_letter_server.py ADDED
@@ -0,0 +1,27 @@
+ from __future__ import annotations
+ from mcp.server import Server
+
+ from mcp.server_common import create_common_tools, run_server
+ from agents.cover_letter_agent import CoverLetterAgent
+ from agents.linkedin_manager import LinkedInManagerAgent
+
+
+ def build_server() -> Server:
+ server = Server("cover_letter_mcp")
+ create_common_tools(server)
+
+ agent = CoverLetterAgent()
+ li = LinkedInManagerAgent()
+
+ @server.tool()
+ async def draft_cover_letter(job_id: str, user_id: str = "default_user") -> str:
+ job = li.get_job(job_id)
+ profile = li.get_profile()
+ draft = agent.create_cover_letter(profile, job, user_id=user_id)
+ return draft.text
+
+ return server
+
+
+ if __name__ == "__main__":
+ run_server(build_server())
mcp/cv_owner_server.py ADDED
@@ -0,0 +1,27 @@
+ from __future__ import annotations
+ from mcp.server import Server
+
+ from mcp.server_common import create_common_tools, run_server
+ from agents.cv_owner import CVOwnerAgent
+ from agents.linkedin_manager import LinkedInManagerAgent
+
+
+ def build_server() -> Server:
+ server = Server("cv_owner_mcp")
+ create_common_tools(server)
+
+ cv = CVOwnerAgent()
+ li = LinkedInManagerAgent()
+
+ @server.tool()
+ async def draft_resume(job_id: str, user_id: str = "default_user") -> str:
+ job = li.get_job(job_id)
+ profile = li.get_profile()
+ draft = cv.create_resume(profile, job, user_id=user_id)
+ return draft.text
+
+ return server
+
+
+ if __name__ == "__main__":
+ run_server(build_server())
mcp/orchestrator_server.py ADDED
@@ -0,0 +1,31 @@
+ from __future__ import annotations
+ from typing import List
+ from mcp.server import Server
+
+ from mcp.server_common import create_common_tools, run_server
+ from agents.orchestrator import OrchestratorAgent
+ from models.schemas import JobPosting
+
+
+ def build_server() -> Server:
+ server = Server("orchestrator_mcp")
+ create_common_tools(server)
+
+ orch = OrchestratorAgent()
+
+ @server.tool()
+ async def list_jobs() -> List[dict]:
+ jobs: List[JobPosting] = orch.get_saved_jobs()
+ return [job.model_dump() for job in jobs]
+
+ @server.tool()
+ async def run_for_jobs(job_ids: List[str], user_id: str = "default_user") -> List[dict]:
+ jobs = [j for j in orch.get_saved_jobs() if j.id in job_ids]
+ results = orch.run_for_jobs(jobs, user_id=user_id)
+ return [r.model_dump() for r in results]
+
+ return server
+
+
+ if __name__ == "__main__":
+ run_server(build_server())
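The orchestrator tools return plain dicts by calling pydantic's `model_dump()` on each `JobPosting`, so the payload is JSON-serializable over the MCP transport. A minimal stand-in for that filter-then-serialize shape using stdlib dataclasses (the `JobPosting` fields here are illustrative, not the real schema, and `asdict()` plays the role of `model_dump()`):

```python
from dataclasses import dataclass, field, asdict
from typing import List

# Illustrative stand-in for models.schemas.JobPosting (the real model is pydantic).
@dataclass
class JobPosting:
    id: str
    title: str
    skills: List[str] = field(default_factory=list)

def run_for_jobs_payload(jobs: List[JobPosting], job_ids: List[str]) -> List[dict]:
    # Filter saved jobs by id, then serialize each to a plain dict.
    return [asdict(j) for j in jobs if j.id in job_ids]

jobs = [JobPosting("a1", "ML Engineer", ["python"]), JobPosting("b2", "Data Analyst")]
print(run_for_jobs_payload(jobs, ["a1"]))
```

Returning dicts rather than model instances keeps the tool contract independent of the pydantic version on the client side.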
mcp/server_common.py ADDED
@@ -0,0 +1,25 @@
+ from __future__ import annotations
+ import asyncio
+
+ from mcp.server import Server
+
+ from services.web_research import get_role_guidelines
+ from services.llm import llm
+
+
+ def create_common_tools(server: Server) -> None:
+ @server.tool()
+ async def research_guidelines(role_title: str, job_description: str) -> str:
+ """Fetch latest best-practice guidance for a role (uses Tavily if configured)."""
+ return get_role_guidelines(role_title, job_description)
+
+ @server.tool()
+ async def llm_refine(system_prompt: str, user_prompt: str, max_tokens: int = 800) -> str:
+ """Refine a text snippet using the configured LLM provider (OpenAI/Anthropic/Gemini)."""
+ return llm.generate(system_prompt, user_prompt, max_tokens=max_tokens)
+
+
+ def run_server(server: Server, host: str = "127.0.0.1", port: int = 8765) -> None:
+ # Minimal stdio run loop for development embedding;
+ # host/port are reserved for future transports and unused by stdio.
+ asyncio.run(server.run_stdio_async())
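`create_common_tools` assumes `Server.tool()` acts as a decorator that registers an async function under its own name, so every server built from it shares the same tool set. That registry mechanism can be modeled in a few lines (this `Server` class is a hypothetical stand-in for the `mcp` SDK, which differs in detail; `research_guidelines` here returns a canned string instead of calling Tavily):

```python
import asyncio
from typing import Any, Callable, Coroutine, Dict

class Server:
    """Toy stand-in for mcp.server.Server: a name -> coroutine registry."""
    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: Dict[str, Callable[..., Coroutine[Any, Any, Any]]] = {}

    def tool(self):
        def register(fn):
            self.tools[fn.__name__] = fn  # expose the coroutine under its name
            return fn
        return register

def create_common_tools(server: Server) -> None:
    # Same shape as the real helper: tools are closures over shared services.
    @server.tool()
    async def research_guidelines(role_title: str, job_description: str) -> str:
        return f"Guidelines for {role_title}"

server = Server("demo_mcp")
create_common_tools(server)
result = asyncio.run(server.tools["research_guidelines"]("Data Engineer", "..."))
print(result)  # Guidelines for Data Engineer
```

Because each server instance owns its own registry, calling `create_common_tools` on several servers (as the cover-letter, CV, and orchestrator servers do) never causes name collisions between them.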
memory/__init__.py ADDED
@@ -0,0 +1 @@
+ # memory package
memory/__pycache__/__init__.cpython-313.pyc ADDED
Binary file (186 Bytes)
memory/__pycache__/store.cpython-313.pyc ADDED
Binary file (7.15 kB)
memory/data/anthony_test__capco_lead_ai_2024__cover_letter.json ADDED
@@ -0,0 +1,9 @@
+ {
+ "job_id": "capco_lead_ai_2024",
+ "final": true,
+ "keywords_used": [
+ "architectures",
+ "agent"
+ ],
+ "draft": "With experience across Python, LLMs, GPT, Claude, Gemma, Multi-modal Models, RAG, Prompt Engineering, I can quickly contribute to your team. I value impact, ownership Relevant focus: mlops\n\nRelevant focus: agent, architectures"
+ }
memory/data/anthony_test__capco_lead_ai_2024__cv_owner.json ADDED
@@ -0,0 +1,45 @@
+ {
+ "job_id": "capco_lead_ai_2024",
+ "cycle": 1,
+ "coverage": 0.5384615384615384,
+ "conciseness": 1.0,
+ "keywords_used": [
+ "frameworks",
+ "architectures",
+ "agent",
+ "prompt engineering",
+ "financial",
+ "ai deployment",
+ "multi",
+ "advanced",
+ "rag",
+ "advanced prompt engineering",
+ "experience",
+ "model",
+ "prompt",
+ "deployment",
+ "solutions",
+ "production",
+ "advanced prompt",
+ "mlops",
+ "engineering",
+ "systems",
+ "agentic"
+ ],
+ "guidance": "Use concise, achievement-oriented bullets with metrics; prioritize recent, role-relevant skills; ensure ATS-friendly formatting; avoid images/tables; tailor keywords to the job posting; keep resume to 1-2 pages and cover letter to <= 1 page; reflect current tooling (e.g., modern cloud, MLOps/DevOps practices) only if you have real experience.",
+ "user_chat": "Emphasize multi-agent AI systems and production LLM deployment",
+ "agent2_notes": "British/Australian citizen, no visa required. CQF certified.",
+ "draft": "- CORE TECHNICAL COMPETENCIES\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\nβ€’ AI/ML Engineering: Python, LLMs (GPT, Claude, Gemma), Multi-modal Models, RAG, Prompt Engineering\nβ€’ Agentic Systems: Multi-agent AI Architectures, Autonomous Workflows, API Integration\nβ€’ MLOps & Deployment: Production AI Pipelines, Model Optimization, Cloud AI (AWS, GCP, Azure)\nβ€’ Scalable Systems: Full-stack Applications, API Development, Performance Optimization\nβ€’ Frameworks: Experience with LangChain/LlamaIndex patterns, Model Context Protocol\nβ€’ Financial Services: HSBC, AmEx, Quantitative Finance (CQF - 87%), Regulatory Compliance\n\nPROFESSIONAL EXPERIENCE\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\nCognizant, London, UK 2021 - Present\nAI Value Engineer - Associate Director | Lead GenAI Solution Architect\n\nProduction AI & MLOps Leadership:\nβ€’ Architected and deployed autonomous AI systems for Tier 1 financial institutions (HSBC, AmEx)\n implementing production-grade LLM solutions with 99.9% uptime\nβ€’ Built scalable MLOps pipelines processing Β£100k-Β£1M monthly transactions across government,\n healthcare, and financial services sectors\nβ€’ Pioneered multi-agent AI systems in August 2024, implementing agentic workflows before \n industry-wide adoption\n\nTechnical Innovation & Optimization:\nβ€’ Developed RAG architectures with advanced prompt engineering reducing response latency by 60%\nβ€’ Fine-tuned and optimized multi-modal models achieving 90% accuracy in specialized domains\nβ€’ Implemented Model Context Protocol for hallucination mitigation in production systems\nβ€’ Created full-stack AI applications integrating Claude, GPT, and custom models via APIs\n\nStrategic Partnership & Delivery:\nβ€’ Led cloud AI deployments across AWS, GCP, and Azure for enterprise financial services\nβ€’ Delivered AI programs consistently 4 weeks ahead of schedule through agile methodologies\nβ€’ Guided 
multidisciplinary teams of 8+ engineers through strategic AI architecture decisions\nβ€’ Published thought leadership on MCP vs RAG architectures and Federated Learning\n\nEDUCATION\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\nCertificate in Quantitative Finance (CQF) - 87% average 2020 - 2021\nFitch Learning / CQF Institute\n- ANTHONY LUI\nLead AI Engineer | GenAI Solution Architect\n\nTel: +44 7545 128 601 | Email: luianthony@yahoo.com\nLocation: London | Citizenship: British/Australian (no visa required)\n\nPROFESSIONAL SUMMARY\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\nLead AI Engineer and one of two primary GenAI Solution Architects at Cognizant, with 3+ years \ndeploying production-grade LLMs, multi-modal models, and agentic workflows for Tier 1 financial \ninstitutions including HSBC and AmEx.\n- Expert in architecting autonomous AI systems, implementing \nRAG architectures, and building scalable MLOps pipelines.\n- Proven track record of delivering \nenterprise GenAI solutions 4 weeks ahead of schedule with budgets ranging from Β£100k-Β£1M monthly.\n\nANTHONY LUI\nLead AI Engineer | GenAI Solution Architect\n\nTel: +44 7545 128 601 | Email: luianthony@yahoo.com\nLocation: London | Citizenship: British/Australian (no visa required)\n\nPROFESSIONAL SUMMARY\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\nLead AI Engineer and one of two primary GenAI Solution Architects at Cognizant, with 3+ years \ndeploying production-grade LLMs, multi-modal models, and agentic workflows for Tier 1 financial \ninstitutions including HSBC and AmEx. Expert in architecting autonomous AI systems, implementing \nRAG architectures, and building scalable MLOps pipelines. 
Proven track record of delivering \nenterprise GenAI solutions 4 weeks ahead of schedule with budgets ranging from Β£100k-Β£1M monthly.\n\nCORE TECHNICAL COMPETENCIES\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\nβ€’ AI/ML Engineering: Python, LLMs (GPT, Claude, Gemma), Multi-modal Models, RAG, Prompt Engineering\nβ€’ Agentic Systems: Multi-agent AI Architectures, Autonomous Workflows, API Integration\nβ€’ MLOps & Deployment: Production AI Pipelines, Model Optimization, Cloud AI (AWS, GCP, Azure)\nβ€’ Scalable Systems: Full-stack Applications, API Development, Performance Optimization\nβ€’ Frameworks: Experience with LangChain/LlamaIndex patterns, Model Context Protocol\nβ€’ Financial Services: HSBC, AmEx, Quantitative Finance (CQF - 87%), Regulatory Compliance\n\nPROFESSIONAL EXPERIENCE\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\nCognizant, London, UK 2021 - Present\nAI Value Engineer - Associate Director | Lead GenAI Solution Architect\n\nProduction AI & MLOps Leadership:\nβ€’ Architected and deployed autonomous AI systems for Tier 1 financial institutions (HSBC, AmEx)\n implementing production-grade LLM solutions with 99.9% uptime\nβ€’ Built scalable MLOps pipelines processing Β£100k-Β£1M monthly transactions across government,\n healthcare, and financial services sectors\nβ€’ Pioneered multi-agent AI systems in August 2024, implementing agentic workflows before \n industry-wide adoption\n\nTechnical Innovation & Optimization:\nβ€’ Developed RAG architectures with advanced prompt engineering reducing response latency by 60%\nβ€’ Fine-tuned and optimized multi-modal models achieving 90% accuracy in specialized domains\nβ€’ Implemented Model Context Protocol for hallucination mitigation in production systems\nβ€’ Created full-stack AI applications integrating Claude, GPT, and custom models via APIs\n\nStrategic Partnership & Delivery:\nβ€’ Led cloud AI deployments across AWS, GCP, and Azure for enterprise 
financial services\nβ€’ Delivered AI programs consistently 4 weeks ahead of schedule through agile methodologies\nβ€’ Guided multidisciplinary teams of 8+ engineers through strategic AI architecture decisions\nβ€’ Published thought leadership on MCP vs RAG architectures and Federated Learning\n\nEDUCATION\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\nCertificate in Quantitative Finance (CQF) - 87% average 2020 - 2021\nFitch Learning / CQF Institute\n",
+ "signals": {
+ "bullet_density": 0.038,
+ "quant_count": 124,
+ "email_ok": true,
+ "gap_years_flag": false,
+ "skills_split_hint": false,
+ "languages_section": false,
+ "links_present": false,
+ "action_verb_count": 7,
+ "approx_pages": 2.56,
+ "approx_one_page": false
+ }
+ }
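The memory files above persist per-job drafting state as flat JSON: `job_id`, `coverage`, `keywords_used`, the draft text, and quality `signals`. A sketch of loading and summarizing such a record, with a trimmed inline sample standing in for the file on disk (`summarize` is an illustrative helper, not part of the repo):

```python
import json

# Inline sample mirroring memory/data/*__cv_owner.json (trimmed for brevity).
record = json.loads("""
{
  "job_id": "capco_lead_ai_2024",
  "cycle": 1,
  "coverage": 0.5384615384615384,
  "keywords_used": ["rag", "mlops", "agentic", "frameworks"],
  "signals": {"approx_pages": 2.56, "approx_one_page": false}
}
""")

def summarize(rec: dict) -> str:
    # Compact one-line summary for dashboards or logs.
    kw = ", ".join(rec.get("keywords_used", [])[:3])
    return (f"{rec['job_id']} cycle {rec.get('cycle', '?')}: "
            f"coverage {rec.get('coverage', 0):.0%}, top keywords: {kw}")

print(summarize(record))  # capco_lead_ai_2024 cycle 1: coverage 54%, top keywords: rag, mlops, agentic
```

Because each record is a standalone JSON file keyed by user and job id, the store needs no database and the `signals` block (page count, bullet density) can drive the next revision cycle directly.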