Rooting Future - Architecture & Patterns

🏗️ System Architecture

High-Level Overview

┌──────────────────────────────────────────────────────────────┐
│                        Flask App (app.py)                    │
│  Routes | Auth | Webhooks | SSE Streaming | Admin Panel      │
└───────────────────────┬──────────────────────────────────────┘
                        │
        ┌───────────────┴───────────────┐
        │                               │
        ▼                               ▼
┌──────────────────┐          ┌──────────────────┐
│  Multi-Agent     │          │  Structured      │
│  Orchestrator    │          │  Orchestrator    │
│  (agents.py)     │          │  (structured_    │
│                  │          │   agent.py)      │
└────────┬─────────┘          └────────┬─────────┘
         │                             │
         │ spawns 6+1 agents           │ generates tables/data
         │                             │
         ▼                             ▼
┌────────────────────────────────────────────────┐
│            Gemini 2.0 Flash API                │
│         (google-generativeai SDK)              │
└────────────────────────────────────────────────┘
         │                             │
         │ stores/retrieves            │
         ▼                             ▼
┌─────────────────────┐    ┌──────────────────────┐
│  Knowledge Store    │    │   Export Pipeline    │
│  (SQLite + RAG)     │    │   (PDF/DOCX/HTML)    │
└─────────────────────┘    └──────────────────────┘

📦 Core Components

1. Multi-Agent Orchestrator (agents.py)

Purpose: Coordinates 6 specialized agents + 1 coordinator to generate strategic plans.

Architecture:

class MultiAgentOrchestrator:
    def __init__(self, knowledge_store, file_search_store_name):
        # One StrategicAgent per role; constructor arguments are elided
        self.agents = {
            AgentRole.COORDINATOR: StrategicAgent(...),
            AgentRole.STW_SPORTIVI: StrategicAgent(...),
            AgentRole.STW_STRUTTURALI: StrategicAgent(...),
            AgentRole.STW_MARKETING: StrategicAgent(...),
            AgentRole.STW_SOCIALI: StrategicAgent(...),
            AgentRole.FINANCIAL: StrategicAgent(...),
        }

    def generate_strategic_plan(self, club_data, parallel=True):
        # Execute agents
        # Synthesize with coordinator
        # Return complete plan
        ...

Key Methods:

  • generate_strategic_plan() - Main entry point
  • _generate_sequential() - Execute agents one by one
  • _generate_parallel() - Execute agents concurrently (⚠️ currently fake async)
  • generate_single_section() - Regenerate specific section

Agent Roles:

| Role | Responsibility | Output Section |
| --- | --- | --- |
| COORDINATOR | Executive summary synthesis | executive_summary |
| STW_SPORTIVI | 8 sports objectives (MACRO 1-8) | stw_sportivi |
| STW_STRUTTURALI | 2 infrastructure objectives | stw_strutturali |
| STW_MARKETING | 4 marketing objectives | stw_marketing |
| STW_SOCIALI | 7 social objectives | stw_sociali |
| FINANCIAL | Economic-financial plan | financial_plan |
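
The role-to-section mapping above could be captured in an enum whose values are the output section keys. This is a minimal sketch, not the project's actual definition: the document only shows the member names and `role.value` being used as a result key, so the values below are assumed from the Output Section column, and `section_roles()` and `empty_plan()` are hypothetical helpers.

```python
from enum import Enum


class AgentRole(Enum):
    """Agent roles; each value is assumed to be the plan section it writes."""
    COORDINATOR = "executive_summary"
    STW_SPORTIVI = "stw_sportivi"
    STW_STRUTTURALI = "stw_strutturali"
    STW_MARKETING = "stw_marketing"
    STW_SOCIALI = "stw_sociali"
    FINANCIAL = "financial_plan"


def section_roles() -> list[AgentRole]:
    """Roles that write content sections (every role except the coordinator)."""
    return [role for role in AgentRole if role is not AgentRole.COORDINATOR]


def empty_plan() -> dict[str, str]:
    """Plan skeleton keyed by output section, in declaration order."""
    return {role.value: "" for role in AgentRole}
```

Keeping the section key on the enum means the orchestrator can build its result dictionary from `role.value` without a separate lookup table.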

2. Strategic Agent (agents.py:StrategicAgent)

Purpose: Individual agent that generates content for one domain.

Pattern:

class StrategicAgent:
    def __init__(self, spec: AgentSpec, file_search_store_name: str):
        self.spec = spec  # System prompt, expertise
        self.model = genai.GenerativeModel(MODEL_CONFIG.name)
        self.cache = AICache()  # File-based cache
        self.sourcer = SourcedContentGenerator()

    def generate(self, club_data, research_data, rag_context):
        # 1. Build prompt with RAG context
        prompt = self._build_simple_prompt(club_data, rag_context)

        # 2. Check AI cache (return the same shape as a fresh result)
        cached = self.cache.get(prompt)
        if cached is not None:
            return {'content': cached, 'sources': [...]}

        # 3. Call Gemini
        response = self.model.generate_content(prompt)

        # 4. Post-process and cache
        content = self._post_process(response.text)
        self.cache.set(prompt, content)

        return {'content': content, 'sources': [...]}

Critical Details:

  • ⚠️ Gemini calls are synchronous - blocking I/O
  • ✅ AI Cache reduces redundant calls (~30% hit rate)
  • ✅ Fallback mechanism if tools fail (retry without tools)
  • ⚠️ No rate limiting - can hit API limits
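
The missing rate limiting could be addressed with a small client-side throttle placed in front of each Gemini call. The sketch below is an illustration only: `MinIntervalLimiter` is a hypothetical name that does not exist in the codebase, and the interval is a placeholder, not Gemini's actual quota.

```python
import threading
import time


class MinIntervalLimiter:
    """Hypothetical helper: allow at most one call per `min_interval` seconds."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until a call is allowed; return the number of seconds waited."""
        with self._lock:
            now = self._clock()
            delay = max(0.0, self._next_allowed - now)
            # Reserve the next slot before releasing the lock, so concurrent
            # callers queue up behind each other instead of firing together
            self._next_allowed = max(now, self._next_allowed) + self.min_interval
        if delay > 0:
            self._sleep(delay)
        return delay
```

A caller would invoke `limiter.wait()` immediately before `self.model.generate_content(prompt)`. The injectable `clock` and `sleep` arguments exist only to make the helper testable without real delays.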

3. RAG System (Knowledge Store + File Search)

Purpose: Learn from previous successful plans to improve quality.

Flow:

Previous Plans → SQLite → Embeddings → File Search Store (Google)
                                              ↓
New Plan Generation ← RAG Context Fetch ←─────┘

Implementation:

# knowledge_store.py
class SQLiteKnowledgeStore:
    def save_plan(self, plan_id, club_data, plan_content):
        # Save to SQLite
        # Upload to Gemini File Search Store for RAG
        ...

    def get_context_for_generation(self, club_category, section_type):
        # Query similar plans by category
        # Return 1-2 best examples
        ...

⚠️ Performance Issue:

  • No indices on category, section_type → linear scan
  • Queried for EVERY agent (6x per plan) → bottleneck
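
A composite index on the two filter columns would let SQLite seek instead of scanning. The sketch below assumes a hypothetical `plan_sections` table; only the `category` and `section_type` column names come from this document, and the real schema may differ.

```python
import sqlite3

# Hypothetical schema: only `category` and `section_type` are named in the doc.
SCHEMA = """
CREATE TABLE IF NOT EXISTS plan_sections (
    id INTEGER PRIMARY KEY,
    category TEXT NOT NULL,
    section_type TEXT NOT NULL,
    content TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_sections_category_type
    ON plan_sections (category, section_type);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and make sure the table and index exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def get_examples(conn, category: str, section_type: str, limit: int = 2) -> list[str]:
    """Return up to `limit` of the most recently stored matching sections."""
    rows = conn.execute(
        "SELECT content FROM plan_sections "
        "WHERE category = ? AND section_type = ? "
        "ORDER BY id DESC LIMIT ?",
        (category, section_type, limit),
    ).fetchall()
    return [row[0] for row in rows]


def query_plan(conn, category: str, section_type: str) -> str:
    """Return SQLite's query plan text, to confirm the index is used."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT content FROM plan_sections "
        "WHERE category = ? AND section_type = ?",
        (category, section_type),
    ).fetchall()
    return " ".join(str(row[-1]) for row in rows)
```

Checking `query_plan()` before and after adding the index is a quick way to verify that the lookup no longer scans the whole table.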

4. Export Pipeline

Current State (6 separate modules):

export_pdf_server.py    → WeasyPrint (HTML → PDF)
export_html.py          → Jinja2 templates
export_docx.py          → python-docx
export_paged.py         → Paged CSS media (print-optimized HTML)
export_onepager.py      → Infographic summary
export_package.py       → ZIP bundler (all formats)

Common Pattern (duplicated 6 times):

def export(club_name, plan_data, metadata):
    # 1. Apply branding (colors, logo)
    # 2. Render template
    # 3. Add methodology section
    # 4. Generate output file
    # 5. Return filepath
    ...

⚠️ Problem: 60% code duplication across modules.

Solution Needed: BaseExporter abstract class.
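
One possible shape for such a base class is a template method that fixes the five shared steps and leaves only rendering to each format. This is a sketch under assumptions: the method names, the `out_dir` parameter and the `branding` metadata key are hypothetical, and `PlainTextExporter` is a toy subclass used only for illustration.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class BaseExporter(ABC):
    """Hypothetical base class holding the five shared export steps."""

    extension = ""  # set by subclasses, e.g. "html"

    def export(self, club_name: str, plan_data: dict, metadata: dict, out_dir: Path) -> Path:
        branded = self.apply_branding(plan_data, metadata)  # step 1
        body = self.render(club_name, branded)              # step 2
        body = self.add_methodology(body)                   # step 3
        path = Path(out_dir) / f"{club_name}.{self.extension}"
        self.write(body, path)                              # step 4
        return path                                         # step 5

    def apply_branding(self, plan_data: dict, metadata: dict) -> dict:
        # Shared default: attach branding settings (colors, logo) to the data
        return {**plan_data, "branding": metadata.get("branding", {})}

    def add_methodology(self, body: str) -> str:
        # Placeholder text; the real methodology section is not shown here
        return body + "\n\nMethodology: ..."

    @abstractmethod
    def render(self, club_name: str, plan_data: dict) -> str:
        """Format-specific rendering, implemented by each exporter."""

    def write(self, body: str, path: Path) -> None:
        # Text formats can use this default; binary formats would override it
        path.write_text(body, encoding="utf-8")


class PlainTextExporter(BaseExporter):
    """Toy subclass used only to exercise the base class."""

    extension = "txt"

    def render(self, club_name: str, plan_data: dict) -> str:
        return f"{club_name}: {plan_data.get('executive_summary', '')}"
```

With this layout each existing module would shrink to a `render()` override (plus `write()` for binary formats such as PDF or DOCX), which is where the duplicated branding and methodology code would disappear.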

🔄 Data Flow

Plan Generation Flow

1. User Input (Web Form)
   ↓
2. app.py: /api/generate endpoint
   ↓
3. Background task: generate_strategic_plan_background()
   ↓
4. MultiAgentOrchestrator.generate_strategic_plan()
   ├─ RAG context fetch (6x parallel)
   ├─ Agent 1: STW Sportivi → Gemini API
   ├─ Agent 2: STW Strutturali → Gemini API
   ├─ Agent 3: STW Marketing → Gemini API
   ├─ Agent 4: STW Sociali → Gemini API
   ├─ Agent 5: Financial → Gemini API
   └─ Agent 6: Coordinator → Gemini API (synthesizes)
   ↓
5. Post-Production Editor (cleanup, validation)
   ↓
6. Knowledge Store: save plan + upload to File Search
   ↓
7. Export Pipeline: Generate PDF, DOCX, OnePager
   ↓
8. Response: URLs to download files

Timing (current):

  • Steps 1-3: ~2s
  • Step 4 (Generation): ~60s ⚠️ (should be 15-20s)
  • Steps 5-7: ~15s
  • Total: ~77s

🧩 Key Design Patterns

1. Multi-Agent Coordination

Pattern: Specialized agents with single responsibility.

Benefits:

  • ✅ Modular: Each agent can be improved independently
  • ✅ Testable: Can test individual agents
  • ✅ Scalable: Add new agents without touching existing

Trade-offs:

  • ⚠️ Agents effectively run one after another, so generation is slow (~60s)
  • ⚠️ Coordinator depends on all agents completing

2. RAG Learning Loop

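Pattern: Every generated plan is stored and becomes retrieval context for later generations. No model is retrained; agents simply see earlier plans as examples.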

# After generation
knowledge_store.save_plan(plan_id, club_data, plan_content)
knowledge_store.upload_to_file_search(plan_id)

# Before next generation
rag_context = knowledge_store.get_context_for_generation(category, section)
# Agent sees examples from similar clubs

Benefits:

  • ✅ Quality improves over time
  • ✅ Consistent tone and structure
  • ✅ Learns domain-specific patterns

3. AI Cache Layer

Pattern: File-based cache keyed by prompt MD5 hash.

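import hashlib
from typing import Optional

# cache_dir is a pathlib.Path configured elsewhere in the module (not shown)

def _cache_path(prompt: str):
    # MD5 is used only to derive a filename, not for security
    return cache_dir / f"{hashlib.md5(prompt.encode('utf-8')).hexdigest()}.txt"

class AICache:
    def get(self, prompt: str) -> Optional[str]:
        cache_path = _cache_path(prompt)
        if cache_path.exists():
            return cache_path.read_text()
        return None

    def set(self, prompt: str, response: str):
        _cache_path(prompt).write_text(response)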

Benefits:

  • ✅ Reduces API calls (~30% hit rate)
  • ✅ Faster iteration during development
  • ✅ Cost savings

Trade-offs:

  • ⚠️ Cache invalidation manual
  • ⚠️ No TTL (time to live)
  • ⚠️ Not distributed (single machine only)
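
The missing TTL could be added without changing the storage format by treating a cache file's modification time as its creation time. The following is a sketch of that idea, not existing code: `TTLFileCache` and its constructor parameters are hypothetical.

```python
import hashlib
import time
from pathlib import Path
from typing import Optional


class TTLFileCache:
    """Hypothetical variant of AICache that expires entries by file age."""

    def __init__(self, cache_dir, ttl_seconds: float):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.ttl_seconds = ttl_seconds

    def _path(self, prompt: str) -> Path:
        digest = hashlib.md5(prompt.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.txt"

    def get(self, prompt: str) -> Optional[str]:
        path = self._path(prompt)
        if not path.exists():
            return None
        age = time.time() - path.stat().st_mtime
        if age > self.ttl_seconds:
            path.unlink()  # expired: drop the entry so it gets regenerated
            return None
        return path.read_text(encoding="utf-8")

    def set(self, prompt: str, response: str) -> None:
        self._path(prompt).write_text(response, encoding="utf-8")
```

This still leaves the cache local to one machine; it only removes the need for manual invalidation of stale entries.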

4. Data Sourcing (3-Tier System)

Pattern: Transparent data provenance.

# Tier 1: Real data (highest trust)
if club_data.get(f"{field}_source") == "questionnaire":
    value = club_data[field]
    source = "(fonte: questionario)"

# Tier 2: Web research
elif research_data.get(field):
    value = research_data[field]
    source = "(fonte: ricerca web)"

# Tier 3: AI estimation (lowest trust)
else:
    value = estimate_from_benchmarks(field, category)
    source = "(fonte: stima AI)"

Output: Every data point tagged with source badge.
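
The three tiers can be wrapped in a single function that returns the value together with its source badge. This is a minimal sketch: `resolve_field` is a hypothetical name, and the `estimate` callable stands in for `estimate_from_benchmarks`, whose real signature is only partly shown above. The Italian labels (questionnaire, web research, AI estimate) are kept as they appear in the original snippet.

```python
from typing import Any, Callable, Tuple


def resolve_field(
    field: str,
    club_data: dict,
    research_data: dict,
    estimate: Callable[[str], Any],
) -> Tuple[Any, str]:
    """Return (value, source_label), trying the tiers in order of trust."""
    # Tier 1: value supplied by the club through the questionnaire
    if club_data.get(f"{field}_source") == "questionnaire" and field in club_data:
        return club_data[field], "(fonte: questionario)"
    # Tier 2: value found by web research
    if research_data.get(field):
        return research_data[field], "(fonte: ricerca web)"
    # Tier 3: benchmark-based estimate (lowest trust)
    return estimate(field), "(fonte: stima AI)"
```

Returning the label alongside the value keeps provenance attached to the data point all the way to the rendered badge.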

🚨 Known Gotchas

1. Gemini API Quirks

Issue: Tool declarations must match exact API schema.

# ❌ WRONG (causes "Unknown field" error)
tool = genai.protos.Tool(google_search={})

# βœ… CORRECT (no tools, or use supported tools only)
model = genai.GenerativeModel(MODEL_CONFIG.name)

Fix Applied: Removed google_search tool in v5.5.

2. WeasyPrint Performance

Issue: PDF generation slow (~10-30s per document).

# Slow operation
weasyprint.HTML(string=html_content).write_pdf(filepath)

Workarounds:

  • Use simpler CSS (avoid heavy calc())
  • Optimize images (compress before embedding)
  • Consider an alternative renderer such as wkhtmltopdf or Puppeteer

3. SQLite Locking

Issue: Concurrent writes block.

# Multiple agents writing logs simultaneously → lock
logger.info("Agent started")  # Writes to DB-backed log

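Solution: Enable WAL mode (OPT-001). WAL lets readers proceed while a write is in progress; writers are still serialized, so a busy timeout remains useful.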

PRAGMA journal_mode=WAL;
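
From Python, the pragma can be applied when the connection is opened. The helper below is a sketch using only the standard `sqlite3` module; `connect_wal` and the 5000 ms timeout are illustrative choices, not values taken from the project. WAL applies to file-backed databases, so an in-memory database would report a different journal mode.

```python
import sqlite3


def connect_wal(path: str, busy_timeout_ms: int = 5000) -> sqlite3.Connection:
    """Open a file-backed SQLite database with WAL journaling enabled."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode actually in effect
    mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    if mode.lower() != "wal":
        raise RuntimeError(f"WAL not enabled, journal_mode={mode}")
    # Wait for a competing writer instead of failing immediately
    conn.execute(f"PRAGMA busy_timeout={int(busy_timeout_ms)};")
    return conn
```

The journal mode is stored in the database file, so it only has to be set once, while the busy timeout is a per-connection setting and must be applied on every connection.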

4. Context Window Limits

Issue: Prompts can exceed the model's input token limit (about 1M tokens for Gemini 2.0 Flash).

Current Mitigation:

  • Truncate RAG context to 2 examples max
  • Summarize research data before injection

Better Solution: Stratified prompts (OPT-004).
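
The current mitigation amounts to fitting examples and research text into a token budget. A rough sketch of that idea follows. The four-characters-per-token ratio is a crude assumption rather than the real tokenizer, and `fit_context` is a hypothetical helper, not the project's code.

```python
CHARS_PER_TOKEN = 4  # rough heuristic, an assumption; real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Approximate token count by ceiling division of the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fit_context(examples, research, budget_tokens, max_examples=2):
    """Keep at most `max_examples` examples, then trim research to the rest."""
    kept = []
    used = 0
    for example in examples[:max_examples]:
        cost = estimate_tokens(example)
        if used + cost > budget_tokens:
            break
        kept.append(example)
        used += cost
    # Whatever budget remains is spent on the research text
    remaining_chars = (budget_tokens - used) * CHARS_PER_TOKEN
    return kept, research[:remaining_chars]
```

For exact counts, the SDK's `count_tokens` call on the model object could replace the heuristic, at the price of an extra API round trip per check.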

5. Async/Await Confusion

Issue: asyncio.gather() used but calls are synchronous.

# ❌ FAKE ASYNC (no actual parallelism)
async def run_agent(role, agent):
    output = agent.generate(club_data)  # Synchronous blocking call
    return role.value, output

results = await asyncio.gather(*tasks)  # Doesn't parallelize

Why it doesn't work: agent.generate() calls model.generate_content() which is synchronous I/O.

Fix (OPT-002):

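from concurrent.futures import ThreadPoolExecutor, as_completed

with ThreadPoolExecutor(max_workers=6) as executor:
    # Map each future to its role: as_completed() yields in completion order
    futures = {executor.submit(agent.generate, club_data): role for role, agent in agents.items()}
    results = {futures[f].value: f.result() for f in as_completed(futures)}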
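
The effect of the thread pool can be demonstrated without calling Gemini by substituting an agent that just sleeps. `FakeAgent` and `run_parallel` below are hypothetical stand-ins written for this illustration; the real `generate()` takes more arguments than shown here.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


class FakeAgent:
    """Stand-in for StrategicAgent: sleeps to imitate a blocking API call."""

    def __init__(self, name: str, delay: float):
        self.name = name
        self.delay = delay

    def generate(self, club_data: dict) -> dict:
        time.sleep(self.delay)  # blocking wait, like a synchronous network call
        return {"content": f"{self.name} for {club_data['name']}", "sources": []}


def run_parallel(agents: dict, club_data: dict, max_workers: int = 6) -> dict:
    """Run every agent in its own thread; results are keyed by role."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_role = {
            executor.submit(agent.generate, club_data): role
            for role, agent in agents.items()
        }
        # as_completed yields in finish order, so map each future back to its role
        return {future_to_role[f]: f.result() for f in as_completed(future_to_role)}
```

Threads help here because the work is blocking I/O: while one thread waits on the network, the others can proceed, so total time approaches that of the slowest agent instead of the sum of all agents.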

📊 Performance Characteristics

Current Bottlenecks

| Operation | Time | Bottleneck |
| --- | --- | --- |
| Agent generation (6 agents) | ~60s | Fake async |
| RAG context fetch (6x) | ~3-5s | No indices |
| PDF export | ~10-30s | WeasyPrint CPU-bound |
| Knowledge store save | ~2s | SQLite write locks |

Target Performance (v6.0)

| Operation | Current | Target | Improvement |
| --- | --- | --- | --- |
| Agent generation | 60s | 15-20s | 67-75% less time |
| RAG context fetch | 5s | 1-2s | 60-80% less time |
| Database queries | 500ms | 50ms | 90% less time |
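
The improvement figures follow from the current and target times as (current - target) / current. A short check, using only the numbers from the table above:

```python
def percent_reduction(current: float, target: float) -> float:
    """Share of the current time that is saved, in percent."""
    return 100 * (current - target) / current


# Times taken from the table above (seconds, except the last row in ms).
# Each tuple is (reduction at the slow end of the target, at the fast end).
agent_gen = (percent_reduction(60, 20), percent_reduction(60, 15))
rag_fetch = (percent_reduction(5, 2), percent_reduction(5, 1))
db_query = percent_reduction(500, 50)
```

Because the agent and RAG targets are ranges, the resulting improvement is a range too, with the lower bound corresponding to the slower end of each target.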

🔐 Security Patterns

1. License Enforcement

Pattern: Hardware-bound licensing.

# auth_manager.py
def check_license():
    hwid = get_machine_id()
    license_data = load_license_key()

    if not validate_license(hwid, license_data):
        abort(403, "Invalid license")

Trigger: Middleware on every request.

2. Multi-Tenancy Isolation

Pattern: User-scoped data access.

@login_required
def get_plan(plan_id):
    plan = db.get_plan(plan_id)

    if plan.owner_id != current_user.id:
        abort(403)  # Cannot access other users' plans

    return plan

3. Webhook Signature Validation

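Pattern: Shared-secret check for n8n webhooks. The handler compares the X-API-Key header with WEBHOOK_SECRET; it does not compute an HMAC over the request body.

import hmac

@app.route("/webhook/n8n", methods=["POST"])
def n8n_webhook():
    api_key = request.headers.get("X-API-Key", "")

    # Constant-time comparison avoids leaking the secret through timing
    if not hmac.compare_digest(api_key.encode(), WEBHOOK_SECRET.encode()):
        abort(401)

    # Process webhook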
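
If payload integrity also matters, the sender could sign the raw request body and the receiver could verify that signature. The sketch below uses only the standard `hmac` and `hashlib` modules. It assumes the sending workflow can be configured to compute and attach such a signature, which this document does not confirm, and both function names are hypothetical.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_body(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())
```

In a Flask handler the body would come from `request.get_data()` and the signature from a request header chosen by both sides. Unlike a static API key, the signature changes with the payload, so a modified body fails verification.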

🛠️ Development Patterns

1. Configuration Management

Pattern: Centralized config with environment overrides.

# config.py
import os

GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY", "")
MODEL_CONFIG = ModelConfig(
    name="gemini-2.0-flash",
    temperature=0.7,
    max_tokens=8192
)

# Usage
from config import GEMINI_API_KEY, MODEL_CONFIG
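
The snippet above reads only the API key from the environment. Extending the same override idea to the model settings might look like the sketch below. The `ModelConfig` fields mirror the ones shown above, but the dataclass definition and the `GEMINI_MODEL`, `GEMINI_TEMPERATURE` and `GEMINI_MAX_TOKENS` variable names are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    name: str
    temperature: float
    max_tokens: int


def load_model_config(env=os.environ) -> ModelConfig:
    """Defaults from this document; the override variable names are hypothetical."""
    return ModelConfig(
        name=env.get("GEMINI_MODEL", "gemini-2.0-flash"),
        temperature=float(env.get("GEMINI_TEMPERATURE", "0.7")),
        max_tokens=int(env.get("GEMINI_MAX_TOKENS", "8192")),
    )
```

Passing the environment as a parameter keeps the loader easy to test with a plain dictionary, and the frozen dataclass prevents accidental mutation of shared configuration at runtime.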

2. Logging Standards

Pattern: Structured logging with context.

import logging
logger = logging.getLogger(__name__)

# ✅ Good
logger.info(f"Agent {agent.spec.name} completed in {elapsed:.2f}s")
logger.error(f"Generation failed: {e}", exc_info=True)

# ❌ Bad
print("Done")  # Not logged
raise Exception(f"Error: {e}")  # No context
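
To make the context machine-readable as well as human-readable, the standard `logging` module accepts an `extra` mapping whose keys become attributes on the log record. A small sketch, with `log_agent_done` and the field names as hypothetical choices:

```python
import logging

logger = logging.getLogger(__name__)


def log_agent_done(agent_name: str, elapsed: float) -> None:
    # %-style arguments are formatted lazily, only if the record is emitted.
    # `extra` attaches fields to the LogRecord for structured handlers.
    logger.info(
        "Agent %s completed in %.2fs",
        agent_name,
        elapsed,
        extra={"agent": agent_name, "elapsed_s": elapsed},
    )
```

A JSON formatter or a log aggregator can then read `record.agent` and `record.elapsed_s` directly instead of parsing them out of the message text.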

3. Error Handling Strategy

Pattern: Fail gracefully with fallbacks.

from google.api_core.exceptions import InvalidArgument

try:
    response = model.generate_content(prompt)
except InvalidArgument as e:
    logger.warning(f"Tool error, retrying without tools: {e}")
    fallback_model = genai.GenerativeModel(MODEL_CONFIG.name)
    response = fallback_model.generate_content(prompt)
except Exception as e:
    logger.error(f"Generation failed: {e}")
    return {'content': 'Error generating content', 'sources': []}
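
One detail of the pattern above is worth noting: an exception raised by the fallback call inside the first `except` block is not caught by the sibling `except Exception` clause, so a failing fallback would propagate. The sketch below nests the fallback in its own `try`. It is a generic illustration in which `generate_with_fallback` is a hypothetical helper and `ValueError` stands in for `InvalidArgument`.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def error_result() -> dict:
    """Degraded response returned when every attempt fails."""
    return {"content": "Error generating content", "sources": []}


def generate_with_fallback(
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    prompt: str,
    retry_on: tuple = (ValueError,),
) -> dict:
    """Try `primary`; on a `retry_on` error try `fallback`; otherwise degrade."""
    try:
        return {"content": primary(prompt), "sources": []}
    except retry_on as exc:
        logger.warning("Primary call failed, retrying with fallback: %s", exc)
        try:
            return {"content": fallback(prompt), "sources": []}
        except Exception:
            # The fallback has its own handler, so its failure also degrades
            logger.exception("Fallback call failed")
            return error_result()
    except Exception:
        logger.exception("Generation failed")
        return error_result()
```

Here `primary` would wrap the tool-enabled model call and `fallback` the plain model call, with `retry_on=(InvalidArgument,)` passed by the caller.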

📚 Module Dependencies

Core Dependencies

agents.py
├─ config.py (MODEL_CONFIG, GEMINI_API_KEY)
├─ data_sourcing.py (SourcedContentGenerator)
├─ data_estimator.py (estimate_missing_financials)
└─ knowledge_store.py (SQLiteKnowledgeStore)

app.py
├─ agents.py (MultiAgentOrchestrator)
├─ structured_agent.py (StructuredOrchestrator)
├─ export_*.py (all exporters)
├─ knowledge_store.py
├─ auth_manager.py (login_required decorator)
└─ license_manager.py (check_license)

External Dependencies (Critical)

  • google-generativeai==0.8.5 - Gemini API client
  • Flask==3.1.2 - Web framework
  • weasyprint==67.0 - PDF generation
  • python-docx==1.2.0 - DOCX generation
  • pytest==9.0.2 - Testing

🔄 Refactoring Opportunities

High Priority

  1. Real Async (OPT-002)

    • Replace asyncio with ThreadPoolExecutor
    • Impact: 66% generation time reduction
  2. SQLite Indexing (OPT-001)

    • Add indices, enable WAL
    • Impact: 70% query speed improvement
  3. Export Unification (OPT-003)

    • Create BaseExporter class
    • Impact: -500 LOC, easier maintenance

Medium Priority

  1. Stratified Prompts (OPT-004)
  2. Circuit Breaker (OPT-005)
  3. Storage Layer Split (OPT-006)

Low Priority (Post-v6.0)

  1. FastAPI migration
  2. PostgreSQL + Redis
  3. Celery task queue

This document is living documentation. Update when patterns change.

Last Updated: 2026-01-23