| ╔════════════════════════════════════════════════════════════════════════════╗ | |
| ║ SKILL-TO-CODE MAPPING: Where Each Skill Applies in RagBot ║ | |
| ║ Reference guide showing skill application locations ║ | |
| ╚════════════════════════════════════════════════════════════════════════════╝ | |
| This document maps each of the 33 skills to specific code files and to the | |
| critical issues they resolve. Use it for quick lookup: "Where do I apply Skill #X?" | |
| ══════════════════════════════════════════════════════════════════════════════ | |
| CRITICAL ISSUES MAPPING TO SKILLS | |
| ══════════════════════════════════════════════════════════════════════════════ | |
| ISSUE #1: biomarker_flags & safety_alerts not propagating through workflow | |
| ────────────────────────────────────────────────────────────────────────────── | |
| Problem Location: src/state.py, src/agents/*.py, src/workflow.py | |
| Affected Code: | |
| ├─ GuildState (missing fields) | |
| ├─ BiomarkerAnalyzerAgent.invoke() (only returns biomarkers) | |
| ├─ ResponseSynthesizerAgent.invoke() (fields missing in input) | |
| └─ Workflow edges (state not fully passed) | |
| Primary Skills: | |
| ✓ #2 Workflow Orchestration Patterns → Fix state passing | |
| ✓ #3 Multi-Agent Orchestration → Ensure deterministic flow | |
| ✓ #16 Structured Output → Enforce complete schema | |
| Secondary Skills: | |
| • #22 Testing Patterns → Write tests for state flow | |
| • #27 Observability → Log state changes | |
| Action: Read src/state.py → identify missing fields → update all agents to | |
| return complete state → test end-to-end | |
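The state-passing fix described for Issue #1 can be sketched in plain Python. This is a minimal illustration, not the project's code: apart from `biomarker_flags` and `safety_alerts`, every field name, the two agent functions, and the threshold of 100 are assumptions made for the example.

```python
from typing import Any, Callable, TypedDict


class GuildState(TypedDict, total=False):
    # Only biomarker_flags and safety_alerts come from the guide;
    # the other fields are hypothetical.
    biomarkers: dict[str, float]
    biomarker_flags: list[str]
    safety_alerts: list[str]
    summary: str


Agent = Callable[[GuildState], dict[str, Any]]


def biomarker_analyzer(state: GuildState) -> dict[str, Any]:
    """Hypothetical agent: returns flags and alerts, not just biomarkers."""
    # The cutoff of 100 is an arbitrary illustrative value.
    flags = [name for name, value in state["biomarkers"].items() if value > 100]
    alerts = [f"{name} above threshold" for name in flags]
    return {"biomarker_flags": flags, "safety_alerts": alerts}


def response_synthesizer(state: GuildState) -> dict[str, Any]:
    """Hypothetical agent: reads the fields written by the analyzer."""
    flagged = ", ".join(state.get("biomarker_flags", [])) or "none"
    return {"summary": f"Flagged biomarkers: {flagged}"}


def run_workflow(state: GuildState, agents: list[Agent]) -> GuildState:
    """Run agents in a fixed order, merging each partial update into the state."""
    for agent in agents:
        update = agent(state)
        state = {**state, **update}  # earlier fields survive unless overwritten
    return state


final = run_workflow(
    {"biomarkers": {"glucose": 180.0, "hdl": 55.0}},
    [biomarker_analyzer, response_synthesizer],
)
print(final["safety_alerts"])
```

Merging partial updates rather than replacing the state is what keeps fields written by one agent visible to every later agent.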
| ISSUE #2: Schema mismatch between workflow output & API formatter | |
| ────────────────────────────────────────────────────────────────────────────── | |
| Problem Location: src/workflow.py, api/app/models/ (missing or inconsistent) | |
| Affected Code: | |
| ├─ ResponseSynthesizerAgent output structure (varies) | |
| ├─ api/app/services/ragbot.py format_response() (expects different keys) | |
| ├─ CLI scripts/chat.py (different field names) | |
| └─ Tests referencing old schema | |
| Primary Skills: | |
| ✓ #16 AI Wrapper/Structured Output → Create unified Pydantic model | |
| ✓ #22 Testing Patterns → Write schema validation tests | |
| Secondary Skills: | |
| • #27 Observability → Log schema mismatches (debugging) | |
| Action: Create api/app/models/response.py with BaseAnalysisResponse → | |
| update all agents to return it → validate in API | |
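A rough sketch of the single-schema idea from Issue #2 follows. To stay dependency-free it uses a standard-library dataclass as a stand-in for the Pydantic model the guide calls for; the field names and the body of `format_response()` are assumptions, since the guide names only `BaseAnalysisResponse` and `format_response()`.

```python
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class BaseAnalysisResponse:
    """Stdlib stand-in for the unified model; field names are assumptions."""

    prediction: str
    confidence: float
    biomarker_flags: list = field(default_factory=list)
    safety_alerts: list = field(default_factory=list)
    sources: list = field(default_factory=list)

    def __post_init__(self) -> None:
        if not isinstance(self.prediction, str) or not self.prediction:
            raise ValueError("prediction must be a non-empty string")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0, 1]")


def format_response(workflow_output: dict) -> dict:
    """Validate raw workflow output and emit the one canonical shape.

    Unknown keys raise TypeError, so a schema mismatch fails loudly
    instead of being silently dropped by the API or CLI.
    """
    return asdict(BaseAnalysisResponse(**workflow_output))


print(format_response({"prediction": "example_condition", "confidence": 0.8}))
```

With Pydantic, the same class would subclass `BaseModel` and the range check would move into field constraints; the point is that the API, CLI, and tests all import one definition.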
| ISSUE #3: Prediction confidence forced to 0.5 (dangerous for medical use) | |
| ────────────────────────────────────────────────────────────────────────────── | |
| Problem Location: src/agents/confidence_assessor.py, api/app/routes/analyze.py | |
| Affected Code: | |
| ├─ ConfidenceAssessorAgent.invoke() (ignores actual assessment) | |
| ├─ Default response in analyze endpoint (hardcoded 0.5) | |
| └─ CLI logic (no failure path for low confidence) | |
| Primary Skills: | |
| ✓ #13 Senior Prompt Engineer → Better reasoning in assessor | |
| ✓ #14 LLM Evaluation → Benchmark accuracy | |
| Secondary Skills: | |
| • #4 Agentic Development → Decision logic improvements | |
| • #22 Testing Patterns → Test confidence boundaries | |
| • #27 Observability → Track confidence distributions | |
| Action: Update confidence_assessor.py to use actual evidence → test with | |
| multiple biomarker scenarios → add high/medium/low confidence paths | |
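One possible shape for the high/medium/low paths in Issue #3 is sketched below. The averaging rule and the 0.8 / 0.5 cutoffs are placeholders chosen for illustration; the guide does not specify how evidence should be scored.

```python
def assess_confidence(evidence_scores: list[float]) -> float:
    """Average per-finding evidence strength (an assumed scoring rule).

    Raises instead of silently falling back to a hardcoded 0.5.
    """
    if not evidence_scores:
        raise ValueError("no evidence available; refusing to guess a confidence")
    if any(not 0.0 <= score <= 1.0 for score in evidence_scores):
        raise ValueError("evidence scores must be within [0, 1]")
    return sum(evidence_scores) / len(evidence_scores)


def confidence_path(confidence: float, high: float = 0.8, low: float = 0.5) -> str:
    """Map a confidence value to a handling path; cutoffs are hypothetical."""
    if confidence >= high:
        return "high"
    if confidence >= low:
        return "medium"
    return "low"


score = assess_confidence([0.9, 0.9])
print(score, confidence_path(score))
```

The boundary values (exactly 0.8 and exactly 0.5) are the cases the confidence tests under #22 would want to pin down.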
| ISSUE #4: Biomarker naming inconsistency (API vs CLI) | |
| ────────────────────────────────────────────────────────────────────────────── | |
| Problem Location: config/biomarker_references.json, src/agents/*, api/* | |
| Affected Code: | |
| ├─ config/biomarker_references.json (canonical list) | |
| ├─ BiomarkerAnalyzerAgent (validation against reference) | |
| ├─ CLI scripts/chat.py (different naming) | |
| └─ API endpoints (naming transformation) | |
| Primary Skills: | |
| ✓ #9 Chunking Strategy → Include standard names in embedding | |
| ✓ #16 Structured Output → Enforce standard field names | |
| Secondary Skills: | |
| • #10 Embedding Pipeline → Index with canonical names | |
| • #22 Testing Patterns → Test name transformation | |
| • #27 Observability → Log name mismatches | |
| Action: Create biomarker_normalizer() → apply in all code paths → add | |
| mapping tests | |
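The `biomarker_normalizer()` named in the action above might look roughly like this. The alias table is invented for the example; in the project the canonical names would presumably be loaded from config/biomarker_references.json, whose structure is not shown in this guide.

```python
import re

# Hypothetical alias table for illustration only.
_ALIASES = {
    "hba1c": "HbA1c",
    "a1c": "HbA1c",
    "glucose": "Glucose",
    "blood_glucose": "Glucose",
    "ldl": "LDL Cholesterol",
    "ldl_cholesterol": "LDL Cholesterol",
}


def biomarker_normalizer(name: str) -> str:
    """Map a user- or API-supplied biomarker name to its canonical form."""
    # Lowercase, trim, and collapse spaces/hyphens so "LDL-cholesterol"
    # and "ldl cholesterol" hit the same key.
    key = re.sub(r"[\s\-]+", "_", name.strip().lower())
    try:
        return _ALIASES[key]
    except KeyError:
        raise ValueError(f"unknown biomarker name: {name!r}") from None


print(biomarker_normalizer(" LDL-cholesterol "))
```

Raising on unknown names, rather than passing them through, makes naming drift between the API and CLI visible as soon as it appears.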
| ISSUE #5: JSON parsing breaks on malformed LLM output | |
| ────────────────────────────────────────────────────────────────────────────── | |
| Problem Location: api/app/services/extraction.py, src/agents/extraction code | |
| Affected Code: | |
| ├─ LLM.predict() returns text | |
| ├─ json.loads() has no error handling | |
| ├─ Invalid JSON crashes endpoint | |
| └─ No fallback strategy | |
| Primary Skills: | |
| ✓ #5 Tool/Function Calling → Use function calling instead | |
| ✓ #21 Python Error Handling → Graceful degradation | |
| Secondary Skills: | |
| • #16 Structured Output → Pydantic validation | |
| • #19 LLM Security → Prevent injection in JSON | |
| • #27 Observability → Log parsing failures | |
| • #14 LLM Evaluation → Track failure rate | |
| Action: Replace json.loads() with Pydantic validator → implement retry logic | |
| → add function calling as fallback | |
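The retry-and-degrade behaviour asked for in Issue #5 can be illustrated with the standard library alone. In this sketch `call_llm` is a hypothetical callable standing in for the real LLM client, and the key check is a simple stand-in for the Pydantic validation the guide recommends.

```python
import json
from typing import Callable, Optional


def extract_json_object(text: str) -> Optional[dict]:
    """Return the first JSON object embedded in text, or None if none parses."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            value, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)
    return None


def parse_with_retry(
    call_llm: Callable[[str], str],
    prompt: str,
    required_keys: set,
    max_attempts: int = 3,
) -> dict:
    """Ask again on malformed output; degrade to an error payload at the end."""
    for _ in range(max_attempts):
        parsed = extract_json_object(call_llm(prompt))
        if parsed is not None and required_keys.issubset(parsed):
            return parsed
    return {"error": "unparseable_llm_output", "attempts": max_attempts}


replies = iter(["Sure! {bad json", 'Result: {"glucose": 110} done'])
print(parse_with_retry(lambda _prompt: next(replies), "extract", {"glucose"}))
```

The endpoint never sees an exception from parsing: it receives either a validated dict or an explicit error payload it can log and report.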
| ISSUE #6: No citation enforcement in RAG outputs | |
| ────────────────────────────────────────────────────────────────────────────── | |
| Problem Location: src/agents/disease_explainer.py, response synthesis | |
| Affected Code: | |
| ├─ retriever.retrieve() returns docs but citations dropped | |
| ├─ DiseaseExplainerAgent doesn't track sources | |
| ├─ ResponseSynthesizerAgent loses citation info | |
| └─ API response has no source attribution | |
| Primary Skills: | |
| ✓ #11 RAG Implementation → Enforce citations in loop | |
| ✓ #8 Hybrid Search → Better relevance = better cites | |
| ✓ #12 Knowledge Graph → Link to authoritative sources | |
| Secondary Skills: | |
| • #1 LangChain Architecture → Tool for citation tracking | |
| • #7 RAG Agent Builder → Full RAG best practices | |
| • #14 LLM Evaluation → Test for hallucinations | |
| • #27 Observability → Track citation frequency | |
| Action: Modify disease_explainer.py to preserve doc metadata → add citation | |
| validation → return sources in API response | |
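Preserving document metadata through to the response, as the Issue #6 action describes, could be sketched as follows. `RetrievedDoc`, the `source` and `page` metadata keys, and `build_explanation()` are all assumptions for illustration; the real retriever's document type may differ.

```python
from dataclasses import dataclass, field


@dataclass
class RetrievedDoc:
    """Hypothetical retrieved chunk: text plus the metadata to preserve."""

    text: str
    metadata: dict = field(default_factory=dict)


def collect_sources(docs: list) -> list:
    """De-duplicated source list, kept in retrieval order."""
    sources, seen = [], set()
    for doc in docs:
        key = (doc.metadata.get("source"), doc.metadata.get("page"))
        if key[0] is not None and key not in seen:
            seen.add(key)
            sources.append({"source": key[0], "page": key[1]})
    return sources


def build_explanation(answer: str, docs: list) -> dict:
    """Attach sources to the answer; refuse to return an uncited explanation."""
    sources = collect_sources(docs)
    if not sources:
        raise ValueError("refusing to return an explanation without sources")
    return {"explanation": answer, "sources": sources}


docs = [
    RetrievedDoc("text a", {"source": "guideline.pdf", "page": 3}),
    RetrievedDoc("text b", {"source": "guideline.pdf", "page": 3}),
    RetrievedDoc("text c", {"source": "review.pdf", "page": 12}),
]
print(build_explanation("example answer", docs)["sources"])
```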
| ══════════════════════════════════════════════════════════════════════════════ | |
| SKILL-BY-SKILL APPLICATION GUIDE | |
| ══════════════════════════════════════════════════════════════════════════════ | |
| #1 LangChain Architecture | |
| Phase: 3, Week 7 | |
| Apply To: src/agents/, src/services/ | |
| Key Files: | |
| ├─ src/agents/base_agent.py (NEW) - Create BaseAgent with LangChain patterns | |
| ├─ src/agents/*/invoke() - Add callbacks, chains, tools | |
| └─ src/services/*.py - RunnableWithMessageHistory for conversation | |
| Integration: Advanced chain composition, callbacks for metrics | |
| Outcome: More sophisticated agent orchestration | |
| Effort: 3-4 hours | |
| #2 Workflow Orchestration Patterns | |
| Phase: 1, Week 1 / Phase 4, Week 12 (final review) | |
| Apply To: src/workflow.py, src/state.py | |
| Key Files: | |
| ├─ src/state.py - REFACTOR GuildState with all fields | |
| ├─ src/workflow.py - REFACTOR state passing between agents | |
| ├─ src/agents/biomarker_analyzer.py - Return complete state | |
| └─ src/agents/disease_explainer.py - Preserve incoming state | |
| Integration: Fix Issue #1 (state propagation) | |
| Outcome: biomarker_flags & safety_alerts flow through entire workflow | |
| Effort: 4-6 hours (Week 1) + 2 hours (Week 12 refine) | |
| #3 Multi-Agent Orchestration | |
| Phase: 1, Week 2 | |
| Apply To: src/workflow.py | |
| Key Files: | |
| ├─ src/workflow.py - Ensure deterministic agent order | |
| └─ Parallel execution order documentation | |
| Integration: Ensure agents execute in correct order with proper state passing | |
| Outcome: Deterministic workflow execution | |
| Effort: 3-4 hours | |
| #4 Agentic Development | |
| Phase: 2, Week 3 | |
| Apply To: src/agents/biomarker_analyzer.py, confidence_assessor.py | |
| Key Files: | |
| ├─ BiomarkerAnalyzerAgent.invoke() - Add confidence thresholds | |
| ├─ ConfidenceAssessorAgent - Better decision logic | |
| └─ Add reasoning trace to responses | |
| Integration: Better medical decisions, alternatives for low confidence | |
| Outcome: More reliable biomarker analysis | |
| Effort: 3-4 hours | |
| #5 Tool/Function Calling Patterns | |
| Phase: 2, Week 4 | |
| Apply To: api/app/services/extraction.py, src/agents/extraction.py | |
| Key Files: | |
| ├─ api/app/services/extraction.py - Define extraction tools/functions | |
| └─ src/agents/ - Use function returns instead of JSON parsing | |
| Integration: Fix Issue #5 (JSON parsing fragility) | |
| Outcome: Structured LLM outputs guaranteed valid | |
| Effort: 3-4 hours | |
| #6 LLM Application Dev with LangChain | |
| Phase: 4, Week 11 | |
| Apply To: src/agents/ (production patterns) | |
| Key Files: | |
| ├─ src/agents/base_agent.py - Implement lifecycle (setup, execute, cleanup) | |
| ├─ All agents - Add retry logic, graceful degradation | |
| └─ Agent composition patterns - Chain agents | |
| Integration: Production-ready agent code | |
| Outcome: Robust, maintainable agents with error recovery | |
| Effort: 4-5 hours | |
| #7 RAG Agent Builder | |
| Phase: 4, Week 12 | |
| Apply To: src/agents/ (full review) | |
| Key Files: | |
| ├─ src/agents/disease_explainer.py - RAG pattern review | |
| ├─ Ensure all responses cite sources | |
| └─ Verify accuracy benchmarks | |
| Integration: Full RAG agent validation before production | |
| Outcome: Production-ready RAG agents | |
| Effort: 4-5 hours | |
| #8 Hybrid Search Implementation | |
| Phase: 3, Week 6 | |
| Apply To: src/retrievers/ (NEW) | |
| Key Files: | |
| ├─ src/retrievers/hybrid_retriever.py (NEW) - Combine BM25 + FAISS | |
| └─ src/agents/disease_explainer.py - Use hybrid retriever | |
| Integration: Better document retrieval (semantic + keyword) | |
| Outcome: +15% recall on rare disease queries | |
| Effort: 4-6 hours | |
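The guide says to combine BM25 and FAISS results but not how. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than the project's chosen method. The two input lists stand in for ranked document IDs from the keyword and vector retrievers, and `k=60` is the constant conventionally used with RRF.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["a", "b", "c"]  # stand-in for BM25 results
vector_hits = ["b", "d", "a"]   # stand-in for FAISS results
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

RRF works on ranks only, which avoids having to normalize BM25 scores against vector distances.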
| #9 Chunking Strategy | |
| Phase: 3, Week 6 | |
| Apply To: src/chunking_strategy.py (NEW), src/pdf_processor.py | |
| Key Files: | |
| ├─ src/chunking_strategy.py (NEW) - Split by medical sections | |
| ├─ scripts/setup_embeddings.py - Use new chunking | |
| └─ Re-chunk and re-embed medical_knowledge.faiss | |
| Integration: Fix Issue #4 (naming), improve context window usage | |
| Outcome: Better semantic chunks, improved retrieval quality | |
| Effort: 4-5 hours | |
| #10 Embedding Pipeline Builder | |
| Phase: 3, Week 6 | |
| Apply To: src/llm_config.py, scripts/setup_embeddings.py | |
| Key Files: | |
| ├─ src/llm_config.py - Consider medical embedding models | |
| ├─ scripts/setup_embeddings.py - Use new embeddings | |
| └─ Benchmark embedding quality | |
| Integration: Better semantic search for medical terminology | |
| Outcome: Improved document relevance ranking | |
| Effort: 3-4 hours | |
| #11 RAG Implementation | |
| Phase: 3, Week 6 | |
| Apply To: src/agents/disease_explainer.py | |
| Key Files: | |
| ├─ src/agents/disease_explainer.py - Track and enforce citations | |
| ├─ src/models/response.py - Add sources field | |
| └─ api/app/routes/analyze.py - Return sources | |
| Integration: Fix Issue #6 (no citations), enforce medical accuracy | |
| Outcome: All claims backed by sources | |
| Effort: 3-4 hours | |
| #12 Knowledge Graph Builder | |
| Phase: 3, Week 7 | |
| Apply To: src/knowledge_graph.py (NEW) | |
| Key Files: | |
| ├─ src/knowledge_graph.py (NEW) - Disease → Biomarker → Treatment graph | |
| ├─ Extract entities from medical PDFs | |
| ├─ src/agents/biomarker_analyzer.py - Use knowledge graph | |
| └─ Create graph.html visualization | |
| Integration: Better disease prediction via relationships | |
| Outcome: Knowledge graph with 100+ nodes, 500+ edges | |
| Effort: 6-8 hours | |
| #13 Senior Prompt Engineer | |
| Phase: 2, Week 3 | |
| Apply To: src/agents/ (all agent prompts) | |
| Key Files: | |
| ├─ src/agents/biomarker_analyzer.py - Prompt: few-shot extraction | |
| ├─ src/agents/disease_explainer.py - Prompt: chain-of-thought reasoning | |
| ├─ src/agents/confidence_assessor.py - Prompt: decision logic | |
| └─ src/agents/clinical_guidelines.py - Prompt: evidence-based | |
| Integration: Fix Issue #3 (confidence), improve medical reasoning | |
| Outcome: +15% accuracy improvement | |
| Effort: 5-6 hours | |
| #14 LLM Evaluation | |
| Phase: 2, Week 4 | |
| Apply To: tests/evaluation_metrics.py (NEW) | |
| Key Files: | |
| ├─ tests/evaluation_metrics.py (NEW) - Benchmarking suite | |
| ├─ tests/fixtures/evaluation_patients.py - Test scenarios | |
| ├─ Benchmark Groq vs Gemini performance | |
| └─ Track before/after improvements | |
| Integration: Measure all improvements quantitatively | |
| Outcome: Clear metrics showing progress | |
| Effort: 4-5 hours | |
| #15 Cost-Aware LLM Pipeline | |
| Phase: 3, Week 8 | |
| Apply To: src/llm_config.py | |
| Key Files: | |
| ├─ src/llm_config.py - Model routing by complexity | |
| ├─ Implement caching (hash → result) | |
| ├─ Cost tracking and reporting | |
| └─ Target: -40% cost reduction | |
| Integration: Optimize API costs without sacrificing accuracy | |
| Outcome: Lower operational costs | |
| Effort: 4-5 hours | |
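The "hash → result" caching mentioned for #15 could be as simple as the following sketch. `LLMCache` and its `call` parameter are hypothetical names; a real deployment would likely add expiry and a persistent store.

```python
import hashlib
import json


class LLMCache:
    """In-memory cache keyed by a hash of (model, prompt)."""

    def __init__(self) -> None:
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(model: str, prompt: str) -> str:
        # Serialize deterministically so equal inputs always hash the same.
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, model: str, prompt: str, call) -> str:
        """Return the cached result, or invoke call(model, prompt) and store it."""
        cache_key = self.key(model, prompt)
        if cache_key in self._store:
            self.hits += 1
            return self._store[cache_key]
        self.misses += 1
        result = call(model, prompt)
        self._store[cache_key] = result
        return result


cache = LLMCache()
fake_llm = lambda model, prompt: f"{model} answered: {prompt}"
cache.get_or_call("small-model", "explain HbA1c", fake_llm)
cache.get_or_call("small-model", "explain HbA1c", fake_llm)
print(cache.hits, cache.misses)
```

The hit and miss counters give the cost-tracking item a number to report.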
| #16 AI Wrapper/Structured Output | |
| Phase: 1, Week 1 | |
| Apply To: api/app/models/ (NEW and REFACTORED) | |
| Key Files: | |
| ├─ api/app/models/response.py (NEW) - Create unified BaseAnalysisResponse | |
| ├─ api/app/services/ragbot.py - Use unified schema | |
| ├─ All agents - Match unified output | |
| └─ API responses - Validate with Pydantic | |
| Integration: Fix Issues #1, #2, #4, #5 (schema consistency) | |
| Outcome: Single canonical response format | |
| Effort: 3-5 hours | |
| #17 API Security Hardening | |
| Phase: 1, Week 1 | |
| Apply To: api/app/middleware/, api/main.py | |
| Key Files: | |
| ├─ api/app/middleware/auth.py (NEW) - JWT auth | |
| ├─ api/main.py - Add security middleware chain | |
| └─ CORS, headers, rate limiting | |
| Integration: Secure REST API endpoints | |
| Outcome: API hardened against common attacks | |
| Effort: 4-6 hours | |
| #18 OWASP Security Check | |
| Phase: 1, Week 1 | |
| Apply To: docs/ (audit report) | |
| Key Files: | |
| ├─ docs/SECURITY_AUDIT.md (NEW) - Security findings | |
| ├─ Scan api/ and src/ for vulnerabilities | |
| └─ Create tickets for each issue | |
| Integration: Establish security baseline | |
| Outcome: All vulnerabilities documented and prioritized | |
| Effort: 2-3 hours | |
| #19 LLM Security | |
| Phase: 1, Week 2 | |
| Apply To: api/app/middleware/input_validation.py (NEW) | |
| Key Files: | |
| ├─ api/app/middleware/input_validation.py (NEW) - Input sanitization | |
| ├─ Detect prompt injection attempts | |
| ├─ Validate biomarker inputs | |
| └─ Escape special characters | |
| Integration: Fix Issue #5 (JSON safety), prevent prompt injection | |
| Outcome: Inputs validated before LLM processing | |
| Effort: 3-4 hours | |
| #20 API Rate Limiting | |
| Phase: 1, Week 1 | |
| Apply To: api/app/middleware/rate_limiter.py (NEW) | |
| Key Files: | |
| ├─ api/app/middleware/rate_limiter.py (NEW) - Token bucket limiter | |
| ├─ api/main.py - Add to middleware chain | |
| └─ Tiered limits (free/pro based on API key) | |
| Integration: Protect API from abuse | |
| Outcome: Rate limiting in place | |
| Effort: 2-3 hours | |
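The token bucket limiter named for #20 can be sketched as below. The class and helper names and the per-tier numbers are hypothetical; the injectable clock exists only to make the behaviour testable without sleeping.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `refill_per_second`."""

    def __init__(self, capacity: float, refill_per_second: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then spend `cost` tokens if available."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# Hypothetical tiers: (burst capacity, tokens refilled per second).
TIER_LIMITS = {"free": (10, 1.0), "pro": (100, 10.0)}


def bucket_for_tier(tier: str, clock=time.monotonic) -> TokenBucket:
    """Build a bucket for the tier associated with an API key."""
    capacity, rate = TIER_LIMITS[tier]
    return TokenBucket(capacity, rate, clock)


bucket = bucket_for_tier("free")
print(bucket.allow())
```

In middleware, one bucket per API key would be kept in a dict (or an external store when running multiple workers), and a `False` result would map to an HTTP 429 response.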
| #21 Python Error Handling | |
| Phase: 1, Week 2 | |
| Apply To: src/exceptions.py (NEW), src/agents/ | |
| Key Files: | |
| ├─ src/exceptions.py (NEW) - Custom exception hierarchy | |
| ├─ RagBotException, BiomarkerValidationError, LLMTimeoutError, etc. | |
| ├─ All agents - Replace generic try-except | |
| └─ API - Proper error responses | |
| Integration: Graceful error handling throughout system | |
| Outcome: No uncaught exceptions, useful error messages | |
| Effort: 3-4 hours | |
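A minimal sketch of the exception hierarchy listed for #21 follows. The three class names come from the guide; the `status_code` values, the `context` attribute, and `to_response()` are assumptions about how the API layer might consume them.

```python
class RagBotException(Exception):
    """Base class for all application errors."""

    status_code = 500  # assumed default HTTP status

    def __init__(self, message: str, **context):
        super().__init__(message)
        self.message = message
        self.context = context  # extra detail for structured logs

    def to_response(self) -> dict:
        """Shape the error for an API response body (hypothetical format)."""
        return {
            "error": type(self).__name__,
            "message": self.message,
            "status_code": self.status_code,
        }


class BiomarkerValidationError(RagBotException):
    status_code = 422


class LLMTimeoutError(RagBotException):
    status_code = 504


try:
    raise BiomarkerValidationError("glucose must be positive", biomarker="glucose")
except RagBotException as exc:
    print(exc.to_response())
```

Catching `RagBotException` at the API boundary replaces the generic try-except blocks in agents while keeping genuinely unexpected errors distinguishable.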
| #22 Python Testing Patterns | |
| Phase: 1, Week 1 + Phase 2, Week 3 (primary), Week 4 | |
| Apply To: tests/ (throughout project) | |
| Key Files: | |
| ├─ tests/conftest.py - Shared fixtures | |
| ├─ tests/fixtures/ - auth, biomarkers, patients | |
| ├─ tests/test_api_auth.py - Auth tests (Week 1) | |
| ├─ tests/test_parametrized_*.py - 50+ parametrized tests (Week 3) | |
| ├─ tests/test_response_schema.py - Schema validation (Week 1) | |
| └─ 80-90% code coverage | |
| Integration: Comprehensive test suite ensures reliability | |
| Outcome: 125+ tests, 90%+ coverage | |
| Effort: 10-13 hours total | |
| #23 Code Review Excellence | |
| Phase: 4, Week 10 | |
| Apply To: docs/REVIEW_GUIDELINES.md (NEW), all PRs | |
| Key Files: | |
| ├─ docs/REVIEW_GUIDELINES.md (NEW) - Medical code review standards | |
| ├─ Apply to all Phase 1-3 pull requests | |
| └─ Self-review checklist | |
| Integration: Maintain code quality | |
| Outcome: Clear review guidelines | |
| Effort: 2-3 hours | |
| #24 GitHub Actions Templates | |
| Phase: 1, Week 2 | |
| Apply To: .github/workflows/ (NEW) | |
| Key Files: | |
| ├─ .github/workflows/test.yml - Run tests on PR | |
| ├─ .github/workflows/security.yml - Security checks | |
| └─ .github/workflows/docker.yml - Build Docker images | |
| Integration: Automated CI/CD pipeline | |
| Outcome: Tests run automatically | |
| Effort: 2-3 hours | |
| #25 FastAPI Templates | |
| Phase: 4, Week 9 | |
| Apply To: api/app/main.py, api/app/dependencies.py | |
| Key Files: | |
| ├─ api/app/main.py - REFACTOR with best practices | |
| ├─ Async patterns, dependency injection | |
| ├─ Connection pooling, caching headers | |
| └─ Health check endpoints | |
| Integration: Production-grade FastAPI configuration | |
| Outcome: Optimized API performance | |
| Effort: 3-4 hours | |
| #26 Python Design Patterns | |
| Phase: 2, Week 3 | |
| Apply To: src/agents/base_agent.py (NEW), src/agents/ | |
| Key Files: | |
| ├─ src/agents/base_agent.py (NEW) - Extract common pattern | |
| ├─ Factory pattern for agent creation | |
| ├─ Composition over inheritance | |
| └─ Refactor BiomarkerAnalyzerAgent, etc. | |
| Integration: Cleaner, more maintainable code | |
| Outcome: Reduced coupling, better abstractions | |
| Effort: 4-5 hours | |
| #27 Python Observability | |
| Phase: 1, Week 2 (logging) / Phase 2, Week 5 / Phase 4, Week 10 (metrics) | |
| Apply To: src/, api/app/ | |
| Key Files: | |
| ├─ src/observability.py (NEW) - Logging infrastructure (Week 2) | |
| ├─ All agents - Add structured JSON logging | |
| ├─ src/monitoring/ (NEW) - Prometheus metrics (Week 10) | |
| └─ Track latency, accuracy, costs | |
| Integration: Visibility into system behavior | |
| Outcome: JSON logs, metrics at /metrics | |
| Effort: 12-15 hours total | |
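Structured JSON logging, as listed for #27, can be done with the standard `logging` module. In this sketch `JsonFormatter`, `build_logger`, and the `fields` key passed through `extra` are conventions invented for the example, not an existing project API.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Structured fields arrive via logger.info(..., extra={"fields": {...}}).
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(name: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


stream = io.StringIO()
log = build_logger("ragbot.workflow", stream)
log.info(
    "state_updated",
    extra={"fields": {"agent": "BiomarkerAnalyzerAgent", "safety_alerts": 1}},
)
print(stream.getvalue().strip())
```

Logging state changes this way makes the propagation problem in Issue #1 visible: each agent's entry shows which fields it added.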
| #28 Memory Management | |
| Phase: 3, Week 7 | |
| Apply To: src/memory_manager.py (NEW) | |
| Key Files: | |
| ├─ src/memory_manager.py (NEW) - Sliding window memory | |
| ├─ Context compression for conversation history | |
| └─ Token usage optimization | |
| Integration: Handle long conversations without exceeding limits | |
| Outcome: 20-30% token savings | |
| Effort: 3-4 hours | |
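The sliding window memory described for #28 might be sketched like this. The four-characters-per-token estimate is a crude placeholder for a real tokenizer, and the class name and method names are hypothetical.

```python
from collections import deque


def estimate_tokens(text: str) -> int:
    """Very rough token estimate (about 4 characters per token); an assumption."""
    return max(1, len(text) // 4)


class SlidingWindowMemory:
    """Keep the most recent messages that fit within a token budget."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self._messages = deque()
        self._total = 0

    def add(self, role: str, content: str) -> None:
        tokens = estimate_tokens(content)
        self._messages.append((role, content, tokens))
        self._total += tokens
        # Drop the oldest messages until the budget is met,
        # but always keep the newest one.
        while self._total > self.max_tokens and len(self._messages) > 1:
            _, _, dropped = self._messages.popleft()
            self._total -= dropped

    def window(self) -> list:
        """Messages currently inside the window, oldest first."""
        return [(role, content) for role, content, _ in self._messages]


memory = SlidingWindowMemory(max_tokens=10)
memory.add("user", "a" * 20)
memory.add("assistant", "b" * 20)
memory.add("user", "c" * 20)
print([role for role, _ in memory.window()])
```

Context compression, the second item in the list, would replace the dropped messages with a summary instead of discarding them outright.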
| #29 API Docs Generator | |
| Phase: 4, Week 9 | |
| Apply To: api/app/routes/ (documentation) | |
| Key Files: | |
| ├─ api/app/routes/*.py - Enhance docstrings | |
| ├─ Add examples to endpoints | |
| └─ Auto-generates /docs (Swagger UI), /redoc | |
| Integration: API discoverable by developers | |
| Outcome: Interactive API documentation | |
| Effort: 2-3 hours | |
| #30 GitHub PR Review Workflow | |
| Phase: 4, Week 9 | |
| Apply To: .github/ (NEW) | |
| Key Files: | |
| ├─ .github/CODEOWNERS - Code ownership rules | |
| ├─ .github/pull_request_template.md - PR checklist | |
| └─ Branch protection rules | |
| Integration: Establish code review standards | |
| Outcome: Consistent PR quality | |
| Effort: 2-3 hours | |
| #31 CI-CD Best Practices | |
| Phase: 4, Week 10 | |
| Apply To: .github/workflows/deploy.yml (NEW) | |
| Key Files: | |
| ├─ .github/workflows/deploy.yml (NEW) - Deployment pipeline | |
| ├─ Build → Test → Staging → Canary → Production | |
| └─ Environment management (.env files) | |
| Integration: Automated, safe deployments | |
| Outcome: Confident production deployments | |
| Effort: 3-4 hours | |
| #32 Frontend Accessibility (OPTIONAL) | |
| Phase: 4, Week 10 | |
| Apply To: examples/web_interface/ (if building web UI) | |
| Key Files: | |
| └─ examples/web_interface/ - WCAG 2.1 AA compliance | |
| Integration: Accessible web interface (if needed) | |
| Outcome: Screen-reader friendly, keyboard navigable | |
| Effort: 2-3 hours (skip if CLI only) | |
| #33 Webhook Receiver Hardener (OPTIONAL) | |
| Phase: 4, Week 11 | |
| Apply To: api/app/webhooks/ (NEW, if integrations needed) | |
| Key Files: | |
| ├─ api/app/webhooks/ (NEW) - Webhook handlers | |
| └─ Signature verification, replay protection | |
| Integration: Secure webhook handling for EHR integrations | |
| Outcome: Protected webhook endpoints | |
| Effort: 2-3 hours (skip if no webhooks) | |
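Signature verification and replay protection for #33 can be illustrated with the standard `hmac` module. The signing scheme here (HMAC-SHA256 over `timestamp.body`, a 300-second tolerance, and an in-memory set of seen event IDs) is an assumption; a real integration must follow whatever scheme the sending system documents.

```python
import hashlib
import hmac
import time


def sign(secret: bytes, timestamp: int, body: bytes) -> str:
    """HMAC-SHA256 over "<timestamp>.<body>" (an assumed signing scheme)."""
    message = str(timestamp).encode("ascii") + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_webhook(
    secret: bytes,
    timestamp: int,
    body: bytes,
    signature: str,
    event_id: str,
    seen_ids: set,
    now: float = None,
    tolerance_seconds: int = 300,
) -> bool:
    """Accept a delivery only if it is fresh, unseen, and correctly signed."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > tolerance_seconds:
        return False  # too old or too far in the future
    if event_id in seen_ids:
        return False  # replayed delivery
    expected = sign(secret, timestamp, body)
    if not hmac.compare_digest(expected, signature):
        return False  # constant-time comparison failed
    seen_ids.add(event_id)
    return True


secret = b"example-secret"
body = b'{"event": "lab_result"}'
signature = sign(secret, 1000, body)
seen = set()
print(verify_webhook(secret, 1000, body, signature, "evt-1", seen, now=1010))
```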
| ══════════════════════════════════════════════════════════════════════════════ | |
| QUICK LOOKUP: BY FILE | |
| ══════════════════════════════════════════════════════════════════════════════ | |
| api/app/main.py | |
| ├─ #17 API Security Hardening (JWT middleware) | |
| ├─ #20 Rate Limiting (rate limiter middleware) | |
| ├─ #25 FastAPI Templates (async patterns) | |
| ├─ #24 GitHub Actions Templates (CI/CD workflow reference) | |
| └─ #29 API Docs Generator (docstrings) | |
| api/app/models/response.py (NEW) | |
| ├─ #16 AI Wrapper/Structured Output (unified schema) | |
| └─ #22 Testing Patterns (Pydantic validation) | |
| api/app/middleware/ (NEW) | |
| ├─ auth.py #17 API Security Hardening | |
| ├─ input_validation.py #19 LLM Security | |
| └─ rate_limiter.py #20 API Rate Limiting | |
| src/state.py | |
| ├─ #2 Workflow Orchestration (fix state fields) | |
| ├─ #16 Structured Output (enforce schema) | |
| └─ #22 Testing Patterns (state tests) | |
| src/workflow.py | |
| ├─ #2 Workflow Orchestration (state passing) | |
| ├─ #3 Multi-Agent Orchestration (agent order) | |
| └─ #27 Observability (logging) | |
| src/agents/base_agent.py (NEW) | |
| ├─ #26 Python Design Patterns (factory, composition) | |
| ├─ #6 LLM App Dev LangChain (lifecycle) | |
| ├─ #21 Error Handling (graceful degradation) | |
| └─ #27 Observability (logging) | |
| src/agents/biomarker_analyzer.py | |
| ├─ #4 Agentic Development (confidence thresholds) | |
| ├─ #13 Senior Prompt Engineer (prompt optimization) | |
| ├─ #2 Workflow Orchestration (return complete state) | |
| └─ #12 Knowledge Graph (use relationships) | |
| src/agents/disease_explainer.py | |
| ├─ #8 Hybrid Search (retriever) | |
| ├─ #11 RAG Implementation (enforcement) | |
| ├─ #13 Senior Prompt Engineer (chain-of-thought) | |
| ├─ #1 LangChain Architecture (advanced patterns) | |
| └─ #7 RAG Agent Builder (RAG best practices) | |
| src/agents/confidence_assessor.py | |
| ├─ #4 Agentic Development (decision logic) | |
| ├─ #13 Senior Prompt Engineer (better reasoning) | |
| ├─ #14 LLM Evaluation (benchmark) | |
| └─ #22 Testing Patterns (confidence tests) | |
| src/agents/clinical_guidelines.py | |
| ├─ #13 Senior Prompt Engineer (evidence-based) | |
| └─ #1 LangChain Architecture (advanced retrieval) | |
| src/exceptions.py (NEW) | |
| ├─ #21 Python Error Handling (exception hierarchy) | |
| └─ #27 Observability (error logging) | |
| src/retrievers/hybrid_retriever.py (NEW) | |
| ├─ #8 Hybrid Search Implementation (BM25 + FAISS) | |
| ├─ #9 Chunking Strategy (better chunks) | |
| ├─ #10 Embedding Pipeline (semantic search) | |
| └─ #27 Observability (retrieval metrics) | |
| src/chunking_strategy.py (NEW) | |
| ├─ #9 Chunking Strategy (medical section splitting) | |
| ├─ #10 Embedding Pipeline (prepare for embedding) | |
| └─ #4 Agentic Development (standardization) | |
| src/knowledge_graph.py (NEW) | |
| ├─ #12 Knowledge Graph Builder (extract relationships) | |
| ├─ #13 Senior Prompt Engineer (entity extraction prompt) | |
| └─ #1 LangChain Architecture (graph traversal) | |
| src/memory_manager.py (NEW) | |
| ├─ #28 Memory Management (sliding window, compression) | |
| └─ #15 Cost-Aware Pipeline (token optimization) | |
| src/llm_config.py | |
| ├─ #15 Cost-Aware LLM Pipeline (model routing, caching) | |
| ├─ #10 Embedding Pipeline (embedding model config) | |
| └─ #27 Observability (cost tracking) | |
| src/observability.py (NEW) | |
| ├─ #27 Python Observability (logging, metrics) | |
| ├─ #21 Error Handling (error tracking) | |
| └─ #14 LLM Evaluation (metric collection) | |
| src/monitoring/ (NEW) | |
| └─ #27 Python Observability (metrics, dashboards) | |
| tests/conftest.py | |
| └─ #22 Python Testing Patterns (shared fixtures) | |
| tests/fixtures/ | |
| ├─ auth.py #22 Testing Patterns | |
| ├─ biomarkers.py #22 Testing Patterns | |
| └─ evaluation_patients.py #14 LLM Evaluation | |
| tests/test_api_auth.py (NEW) | |
| ├─ #22 Python Testing Patterns | |
| ├─ #17 API Security Hardening | |
| └─ #25 FastAPI Templates | |
| tests/test_parametrized_*.py (NEW) | |
| └─ #22 Python Testing Patterns | |
| tests/evaluation_metrics.py (NEW) | |
| └─ #14 LLM Evaluation | |
| .github/workflows/ | |
| ├─ test.yml #24 GitHub Actions Templates | |
| ├─ security.yml #18 OWASP Check + #24 Actions | |
| ├─ docker.yml #24 Actions | |
| └─ deploy.yml #31 CI-CD Best Practices | |
| .github/ | |
| ├─ CODEOWNERS #30 GitHub PR Review Workflow | |
| ├─ pull_request_template.md #30 Workflow | |
| └─ branch protection rules | |
| docs/ | |
| ├─ SECURITY_AUDIT.md #18 OWASP Check | |
| ├─ REVIEW_GUIDELINES.md #23 Code Review Excellence | |
| └─ API.md (updated by #29 API Docs Generator) | |
| ══════════════════════════════════════════════════════════════════════════════ | |
| SKILL DEPENDENCY GRAPH | |
| ══════════════════════════════════════════════════════════════════════════════ | |
| Phase 1 must finish before Phase 2: | |
| #18, #17, #22, #2, #16, #20, #3, #19, #21, #27, #24 | |
| ↓ | |
| Phase 2 requires Phase 1: | |
| #22, #26, #4, #13, #14, #5 | |
| ↓ | |
| Phase 3 requires Phases 1-2: | |
| #8, #9, #10, #11, #12, #1, #28, #15 | |
| ↓ | |
| Phase 4 requires Phases 1-3: | |
| #25, #29, #30, #27, #23, #31, #32*, #6, #33*, #7 | |
| Within phases, some order dependencies: | |
| - #16 should complete before other Phase 1 work finalizes | |
| - #13 should complete before #14 evaluation | |
| - #8, #9, #10 should coordinate (hybrid search → chunking → embeddings) | |
| - #11 depends on #8 (retriever first) | |
| - #12 depends on #13 (prompt engineering for entity extraction) | |
| - #27 used 3 times (Week 2, Week 5, Week 10) | |
| - #22 used 2 times (Week 1, Weeks 3-4) | |
| ══════════════════════════════════════════════════════════════════════════════ | |
| DAILY WORKFLOW | |
| ══════════════════════════════════════════════════════════════════════════════ | |
| 1. Open the skill's SKILL.md in ~/.agents/skills/<skill-name>/ | |
| 2. Read the relevant section for your task | |
| 3. Apply to specific code files listed above | |
| 4. Write tests immediately (use #22 Testing Patterns) | |
| 5. Commit with clear message: "feat: [Skill #X] [Description]" | |
| 6. Track in IMPLEMENTATION_STATUS_TRACKER.md | |
| ══════════════════════════════════════════════════════════════════════════════ | |