Rooting Future - Architecture & Patterns

🏗️ System Architecture

High-Level Overview

┌─────────────────────────────────────────────────────────────┐
│                        Flask App (app.py)                    │
│  Routes | Auth | Webhooks | SSE Streaming | Admin Panel     │
└───────────────────────┬─────────────────────────────────────┘
                        │
        ┌───────────────┴───────────────┐
        │                               │
        ▼                               ▼
┌──────────────────┐          ┌──────────────────┐
│  Multi-Agent     │          │  Structured      │
│  Orchestrator    │          │  Orchestrator    │
│  (agents.py)     │          │  (structured_    │
│                  │          │   agent.py)      │
└────────┬─────────┘          └────────┬─────────┘
         │                             │
         │ spawns 6+1 agents           │ generates tables/data
         │                             │
         ▼                             ▼
┌────────────────────────────────────────────────┐
│            Gemini 2.0 Flash API                │
│         (google-generativeai SDK)              │
└────────────────────────────────────────────────┘
         │                             │
         │ stores/retrieves            │
         ▼                             ▼
┌─────────────────────┐    ┌──────────────────────┐
│  Knowledge Store    │    │   Export Pipeline    │
│  (SQLite + RAG)     │    │   (PDF/DOCX/HTML)    │
└─────────────────────┘    └──────────────────────┘

📦 Core Components

1. Multi-Agent Orchestrator (`agents.py`)

Purpose: Coordinates the specialized domain agents plus one coordinator to generate strategic plans. The role table below lists six roles in total, coordinator included.

Architecture:

class MultiAgentOrchestrator:
    def __init__(self, knowledge_store, file_search_store_name):
        # One StrategicAgent per role (constructor arguments elided)
        self.agents = {
            AgentRole.COORDINATOR: StrategicAgent(...),
            AgentRole.STW_SPORTIVI: StrategicAgent(...),
            AgentRole.STW_STRUTTURALI: StrategicAgent(...),
            AgentRole.STW_MARKETING: StrategicAgent(...),
            AgentRole.STW_SOCIALI: StrategicAgent(...),
            AgentRole.FINANCIAL: StrategicAgent(...),
        }

    def generate_strategic_plan(self, club_data, parallel=True):
        # Execute agents
        # Synthesize with coordinator
        # Return complete plan
        ...

Key Methods:

generate_strategic_plan() - Main entry point
_generate_sequential() - Execute agents one by one
_generate_parallel() - Execute agents concurrently (⚠️ currently fake async)
generate_single_section() - Regenerate specific section

Agent Roles:

Role	Responsibility	Output Section
COORDINATOR	Executive summary synthesis	`executive_summary`
STW_SPORTIVI	8 sports objectives (MACRO 1-8)	`stw_sportivi`
STW_STRUTTURALI	2 infrastructure objectives	`stw_strutturali`
STW_MARKETING	4 marketing objectives	`stw_marketing`
STW_SOCIALI	7 social objectives	`stw_sociali`
FINANCIAL	Economic-financial plan	`financial_plan`

2. Strategic Agent (`agents.py:StrategicAgent`)

Purpose: Individual agent that generates content for one domain.

Pattern:

import google.generativeai as genai

class StrategicAgent:
    def __init__(self, spec: AgentSpec, file_search_store_name: str):
        self.spec = spec  # System prompt, expertise
        self.model = genai.GenerativeModel(MODEL_CONFIG.name)
        self.cache = AICache()  # File-based cache
        self.sourcer = SourcedContentGenerator()

    def generate(self, club_data, research_data, rag_context):
        # 1. Build prompt with RAG context
        prompt = self._build_simple_prompt(club_data, rag_context)

        # 2. Check AI cache (it stores the post-processed text only)
        content = self.cache.get(prompt)

        if not content:
            # 3. Call Gemini (synchronous: blocks the calling thread)
            response = self.model.generate_content(prompt)

            # 4. Post-process and cache
            content = self._post_process(response.text)
            self.cache.set(prompt, content)

        # Same return shape on a cache hit and on a cache miss
        return {'content': content, 'sources': [...]}

Critical Details:

⚠️ Gemini calls are synchronous - blocking I/O
✅ AI Cache reduces redundant calls (~30% hit rate)
✅ Fallback mechanism if tools fail (retry without tools)
⚠️ No rate limiting - can hit API limits
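The missing rate limiting noted above could be addressed with a small client-side throttle. The following is a minimal sketch, not the project's implementation: `MinIntervalLimiter` and its `min_interval` parameter are hypothetical names, and the right interval depends on the quota that applies to the deployed API key.

```python
import threading
import time


class MinIntervalLimiter:
    """Allow at most one call every `min_interval` seconds, across threads."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the caller's slot is free; return the seconds slept."""
        with self._lock:
            now = time.monotonic()
            # Reserve the next free slot while holding the lock
            slot = max(now, self._next_allowed)
            self._next_allowed = slot + self.min_interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)  # sleep outside the lock so other threads can queue
        return delay


limiter = MinIntervalLimiter(min_interval=0.1)
first_delay = limiter.wait()   # first call passes immediately
second_delay = limiter.wait()  # second call waits for the next slot
```

An agent would call `limiter.wait()` immediately before `model.generate_content(prompt)`, sharing one limiter instance across all agents.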

3. RAG System (Knowledge Store + File Search)

Purpose: Learn from previous successful plans to improve quality.

Flow:

Previous Plans → SQLite → Embeddings → File Search Store (Google)
                                              ↓
New Plan Generation ← RAG Context Fetch ←─────┘

Implementation:

# knowledge_store.py
class SQLiteKnowledgeStore:
    def save_plan(self, plan_id, club_data, plan_content):
        # Save to SQLite
        # Upload to Gemini File Search Store for RAG
        ...

    def get_context_for_generation(self, club_category, section_type):
        # Query similar plans by category
        # Return 1-2 best examples
        ...

⚠️ Performance Issue:

No indices on category, section_type → linear scan
Queried for EVERY agent (6x per plan) → bottleneck
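A composite index on the two lookup columns is the usual remedy for the linear scan described above. The sketch below uses the standard `sqlite3` module; the table name `plan_sections`, its columns, and the index name are assumptions, since the real schema is not shown here.

```python
import sqlite3

# Hypothetical schema: the real table and column names are not shown in this document.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plan_sections ("
    " id INTEGER PRIMARY KEY, category TEXT, section_type TEXT, content TEXT)"
)

# One composite index covers the (category, section_type) lookup each agent performs.
conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_sections_category_type"
    " ON plan_sections (category, section_type)"
)

# EXPLAIN QUERY PLAN reports whether SQLite will use the index or scan the table.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT content FROM plan_sections"
    " WHERE category = ? AND section_type = ?",
    ("example_category", "stw_sportivi"),
).fetchall()
plan_detail = " ".join(row[3] for row in plan_rows)
print(plan_detail)
conn.close()
```

If the printed plan names the index rather than reporting a table scan, the lookup no longer grows linearly with the number of stored plans.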

4. Export Pipeline

Current State (6 separate modules):

export_pdf_server.py    → WeasyPrint (HTML → PDF)
export_html.py          → Jinja2 templates
export_docx.py          → python-docx
export_paged.py         → Paged CSS media (print-optimized HTML)
export_onepager.py      → Infographic summary
export_package.py       → ZIP bundler (all formats)

Common Pattern (duplicated 6 times):

def export(club_name, plan_data, metadata):
    # 1. Apply branding (colors, logo)
    # 2. Render template
    # 3. Add methodology section
    # 4. Generate output file
    # 5. Return filepath
    ...

⚠️ Problem: 60% code duplication across modules.

Solution Needed: BaseExporter abstract class.
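One possible shape for such a base class is the template-method pattern: the five shared steps live in `export()`, and each format overrides only rendering and writing. This is a sketch under assumptions, not the planned design; every method name other than `export`, the `out_dir` parameter, and the toy `PlainTextExporter` are hypothetical.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class BaseExporter(ABC):
    """Shared export steps live here; format-specific ones in subclasses."""

    extension = ""  # set by each subclass, e.g. "html"

    def export(self, club_name: str, plan_data: dict, metadata: dict, out_dir: Path) -> Path:
        branded = self.apply_branding(plan_data, metadata)    # 1. branding
        body = self.render(club_name, branded)                # 2. render template
        body += self.methodology_section(metadata)            # 3. methodology
        filepath = out_dir / f"{club_name}.{self.extension}"
        self.write(body, filepath)                            # 4. output file
        return filepath                                       # 5. return filepath

    def apply_branding(self, plan_data: dict, metadata: dict) -> dict:
        return {**plan_data, "primary_color": metadata.get("primary_color", "#000000")}

    def methodology_section(self, metadata: dict) -> str:
        return "\n\nMethodology: " + metadata.get("methodology", "n/a")

    @abstractmethod
    def render(self, club_name: str, plan_data: dict) -> str:
        """Return the format-specific body."""

    def write(self, body: str, filepath: Path) -> None:
        filepath.write_text(body, encoding="utf-8")


class PlainTextExporter(BaseExporter):
    """Toy subclass used only to exercise the base class."""

    extension = "txt"

    def render(self, club_name: str, plan_data: dict) -> str:
        return f"{club_name}\n{plan_data['executive_summary']}"


out = PlainTextExporter().export(
    "Demo FC",
    {"executive_summary": "Summary text"},
    {"methodology": "multi-agent"},
    Path(tempfile.mkdtemp()),
)
exported_text = out.read_text(encoding="utf-8")
print(exported_text)
```

A binary format such as PDF or DOCX would override `write()` as well, since its output is not plain text.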

🔄 Data Flow

Plan Generation Flow

1. User Input (Web Form)
   ↓
2. app.py: /api/generate endpoint
   ↓
3. Background task: generate_strategic_plan_background()
   ↓
4. MultiAgentOrchestrator.generate_strategic_plan()
   ├─ RAG context fetch (6x parallel)
   ├─ Agent 1: STW Sportivi → Gemini API
   ├─ Agent 2: STW Strutturali → Gemini API
   ├─ Agent 3: STW Marketing → Gemini API
   ├─ Agent 4: STW Sociali → Gemini API
   ├─ Agent 5: Financial → Gemini API
   └─ Agent 6: Coordinator → Gemini API (synthesizes)
   ↓
5. Post-Production Editor (cleanup, validation)
   ↓
6. Knowledge Store: save plan + upload to File Search
   ↓
7. Export Pipeline: Generate PDF, DOCX, OnePager
   ↓
8. Response: URLs to download files

Timing (current):

Steps 1-3: ~2s
Step 4 (Generation): ~60s ⚠️ (should be 15-20s)
Steps 5-7: ~15s
Total: ~77s

🧩 Key Design Patterns

1. Multi-Agent Coordination

Pattern: Specialized agents with single responsibility.

Benefits:

✅ Modular: Each agent can be improved independently
✅ Testable: Can test individual agents
✅ Scalable: Add new agents without touching existing

Trade-offs:

⚠️ Sequential execution slow (60s)
⚠️ Coordinator depends on all agents completing

2. RAG Learning Loop

Pattern: Every generated plan becomes retrieval context (RAG examples) for later generations; no model is retrained.

# After generation
knowledge_store.save_plan(plan_id, club_data, plan_content)
knowledge_store.upload_to_file_search(plan_id)

# Before next generation
rag_context = knowledge_store.get_context_for_generation(category, section)
# Agent sees examples from similar clubs

Benefits:

✅ Quality improves over time
✅ Consistent tone and structure
✅ Learns domain-specific patterns

3. AI Cache Layer

Pattern: File-based cache keyed by prompt MD5 hash.

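import hashlib
from pathlib import Path
from typing import Optional

cache_dir = Path(".ai_cache")  # illustrative location; the real path is not given here

class AICache:
    @staticmethod
    def _path(prompt: str) -> Path:
        # MD5 serves only as a cache key here, not as a security measure
        digest = hashlib.md5(prompt.encode("utf-8")).hexdigest()
        return cache_dir / f"{digest}.txt"

    def get(self, prompt: str) -> Optional[str]:
        cache_path = self._path(prompt)
        if cache_path.exists():
            return cache_path.read_text(encoding="utf-8")
        return None

    def set(self, prompt: str, response: str) -> None:
        cache_dir.mkdir(parents=True, exist_ok=True)
        self._path(prompt).write_text(response, encoding="utf-8")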

Benefits:

✅ Reduces API calls (~30% hit rate)
✅ Faster iteration during development
✅ Cost savings

Trade-offs:

⚠️ Cache invalidation manual
⚠️ No TTL (time to live)
⚠️ Not distributed (single machine only)
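The missing TTL could be approximated without extra metadata by comparing each cache file's modification time against a maximum age. The sketch below is one way to do that; `TTLFileCache` and `ttl_seconds` are hypothetical names, and the demo backdates a file with `os.utime` only to simulate an old entry.

```python
import hashlib
import os
import tempfile
import time
from pathlib import Path
from typing import Optional


class TTLFileCache:
    """File cache keyed by prompt hash; entries older than `ttl_seconds` count as misses."""

    def __init__(self, cache_dir: Path, ttl_seconds: float):
        self.cache_dir = cache_dir
        self.ttl_seconds = ttl_seconds
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, prompt: str) -> Path:
        digest = hashlib.md5(prompt.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.txt"

    def get(self, prompt: str) -> Optional[str]:
        path = self._path(prompt)
        if not path.exists():
            return None
        if time.time() - path.stat().st_mtime > self.ttl_seconds:
            path.unlink()  # expired: drop the file so the entry is regenerated
            return None
        return path.read_text(encoding="utf-8")

    def set(self, prompt: str, response: str) -> None:
        self._path(prompt).write_text(response, encoding="utf-8")


cache = TTLFileCache(Path(tempfile.mkdtemp()), ttl_seconds=3600)
cache.set("prompt A", "cached answer")
fresh = cache.get("prompt A")

# Simulate an old entry by moving the file's modification time two hours back
entry = cache._path("prompt A")
two_hours_ago = time.time() - 7200
os.utime(entry, (two_hours_ago, two_hours_ago))
stale = cache.get("prompt A")
```

This addresses expiry only; the cache remains local to a single machine.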

4. Data Sourcing (3-Tier System)

Pattern: Transparent data provenance.

# Tier 1: Real data (highest trust)
if club_data.get(f"{field}_source") == "questionnaire":
    value = club_data[field]
    source = "(fonte: questionario)"

# Tier 2: Web research
elif research_data.get(field):
    value = research_data[field]
    source = "(fonte: ricerca web)"

# Tier 3: AI estimation (lowest trust)
else:
    value = estimate_from_benchmarks(field, category)
    source = "(fonte: stima AI)"

Output: Every data point tagged with source badge.
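The three tiers can be wrapped in a single function that returns the value together with its source badge. This is a minimal sketch: `resolve_field` and the `estimator` callback are hypothetical stand-ins (the callback replaces `estimate_from_benchmarks`, whose signature beyond the fragment above is not known).

```python
from typing import Any, Callable, Tuple


def resolve_field(
    field: str,
    club_data: dict,
    research_data: dict,
    estimator: Callable[[str], Any],
) -> Tuple[Any, str]:
    """Return (value, source_badge), trying the three tiers in order of trust."""
    # Tier 1: value supplied by the club through the questionnaire
    if club_data.get(f"{field}_source") == "questionnaire" and field in club_data:
        return club_data[field], "(fonte: questionario)"
    # Tier 2: web research (as in the fragment above, falsy values fall through)
    if research_data.get(field):
        return research_data[field], "(fonte: ricerca web)"
    # Tier 3: AI estimate, lowest trust
    return estimator(field), "(fonte: stima AI)"


def placeholder_estimator(field: str) -> int:
    """Stand-in for a benchmark-based estimate."""
    return 0


club = {"members": 320, "members_source": "questionnaire"}
research = {"stadium_capacity": 1500}

tier1 = resolve_field("members", club, research, placeholder_estimator)
tier2 = resolve_field("stadium_capacity", club, research, placeholder_estimator)
tier3 = resolve_field("annual_budget", club, research, placeholder_estimator)
```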

🚨 Known Gotchas

1. Gemini API Quirks

Issue: Tool declarations must match exact API schema.

# ❌ WRONG (causes "Unknown field" error)
tool = genai.protos.Tool(google_search={})

# ✅ CORRECT (no tools, or use supported tools only)
model = genai.GenerativeModel(MODEL_CONFIG.name)

Fix Applied: Removed google_search tool in v5.5.

2. WeasyPrint Performance

Issue: PDF generation slow (~10-30s per document).

# Slow operation
weasyprint.HTML(string=html_content).write_pdf(filepath)

Workarounds:

Use simpler CSS (avoid heavy calc())
Optimize images (compress before embedding)
Consider alternative: wkhtmltopdf or puppeteer

3. SQLite Locking

Issue: Concurrent writes block.

# Multiple agents writing logs simultaneously → lock
logger.info("Agent started")  # Writes to DB-backed log

Solution: Enable WAL mode (OPT-001).

PRAGMA journal_mode=WAL;
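From Python, the pragma can be issued once when the connection is opened. The sketch below assumes a file-backed database (WAL does not apply to in-memory databases) and a hypothetical file name; the busy timeout is an extra safeguard that is not part of OPT-001 as described here.

```python
import sqlite3
import tempfile
from pathlib import Path

db_path = Path(tempfile.mkdtemp()) / "knowledge.db"  # hypothetical file name
conn = sqlite3.connect(db_path, timeout=5.0)

# The pragma returns the journal mode that is now in effect
mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]

# Wait up to 5 seconds for a lock instead of failing immediately
conn.execute("PRAGMA busy_timeout=5000;")
print(mode)
conn.close()
```

WAL lets readers proceed while a write is in progress, but SQLite still allows only one writer at a time, so heavy concurrent logging to the database can still queue.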

4. Context Window Limits

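Issue: Prompts can exceed the 200k-token limit assumed for Gemini in this project. (Gemini 2.0 Flash's documented input limit is about 1M tokens, so check which limit actually applies to the deployed model.)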

Current Mitigation:

Truncate RAG context to 2 examples max
Summarize research data before injection

Better Solution: Stratified prompts (OPT-004).
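The current mitigation of capping RAG context can be sketched as a helper that keeps at most two examples and enforces a size budget. The function name and the character-based budget are assumptions; an accurate limit would count tokens with the model's own tokenizer rather than characters.

```python
from typing import List


def build_rag_context(examples: List[str], max_examples: int = 2, max_chars: int = 8000) -> str:
    """Keep at most `max_examples` examples and cap their combined length in characters."""
    kept = []
    remaining = max_chars
    for example in examples[:max_examples]:
        if remaining <= 0:
            break
        piece = example[:remaining]  # truncate the example that crosses the budget
        kept.append(piece)
        remaining -= len(piece)
    return "\n\n---\n\n".join(kept)


context = build_rag_context(["a" * 10, "b" * 10, "c" * 10], max_chars=15)
print(context)
```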

5. Async/Await Confusion

Issue: asyncio.gather() used but calls are synchronous.

# ❌ FAKE ASYNC (no actual parallelism)
async def run_agent(role, agent):
    output = agent.generate(club_data)  # Synchronous blocking call
    return role.value, output

results = await asyncio.gather(*tasks)  # Doesn't parallelize

Why it doesn't work: agent.generate() calls model.generate_content() which is synchronous I/O.

Fix (OPT-002):

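from concurrent.futures import ThreadPoolExecutor, as_completed

# Threads overlap the blocking Gemini calls; each future is mapped back to its role
with ThreadPoolExecutor(max_workers=6) as executor:
    futures = {executor.submit(agent.generate, club_data): role
               for role, agent in agents.items()}
    results = {futures[f].value: f.result() for f in as_completed(futures)}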
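The effect of moving blocking calls onto threads can be checked without touching the API. In this sketch, `FakeAgent` is a hypothetical stand-in that sleeps instead of calling Gemini; five agents that each block for 0.2s finish in roughly the time of one call rather than five.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


class FakeAgent:
    """Stand-in for StrategicAgent: sleeps to mimic a blocking API call."""

    def __init__(self, name: str, delay: float):
        self.name = name
        self.delay = delay

    def generate(self, club_data: dict) -> dict:
        time.sleep(self.delay)  # blocking I/O placeholder
        return {"content": f"{self.name} for {club_data['name']}", "sources": []}


agents = {f"agent_{i}": FakeAgent(f"agent_{i}", 0.2) for i in range(5)}
club_data = {"name": "Demo FC"}

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=6) as executor:
    # Map each future back to its role so results stay labelled
    futures = {executor.submit(agent.generate, club_data): role
               for role, agent in agents.items()}
    results = {futures[f]: f.result() for f in as_completed(futures)}
elapsed = time.perf_counter() - start

print(f"{len(results)} sections in {elapsed:.2f}s")
```

The coordinator would still run after this pool completes, since it synthesizes the other agents' output.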

📊 Performance Characteristics

Current Bottlenecks

Operation	Time	Bottleneck
Agent generation (6 agents)	~60s	Fake async
RAG context fetch (6x)	~3-5s	No indices
PDF export	~10-30s	WeasyPrint CPU-bound
Knowledge store save	~2s	SQLite write locks

Target Performance (v6.0)

Operation	Current	Target	Improvement
Agent generation	60s	15-20s	67-75% less time
RAG context fetch	5s	1-2s	60-80% less time
Database queries	500ms	50ms	90% faster

🔐 Security Patterns

1. License Enforcement

Pattern: Hardware-bound licensing.

# auth_manager.py
def check_license():
    hwid = get_machine_id()
    license_data = load_license_key()

    if not validate_license(hwid, license_data):
        abort(403, "Invalid license")

Trigger: Middleware on every request.

2. Multi-Tenancy Isolation

Pattern: User-scoped data access.

@login_required
def get_plan(plan_id):
    plan = db.get_plan(plan_id)

    if plan.owner_id != current_user.id:
        abort(403)  # Cannot access other users' plans

    return plan

3. Webhook Signature Validation

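Pattern: Shared-secret check for n8n webhooks (the X-API-Key header is compared with the configured secret; no HMAC is computed over the request body).

import hmac

@app.route("/webhook/n8n", methods=["POST"])
def n8n_webhook():
    api_key = request.headers.get("X-API-Key", "")

    # Constant-time comparison avoids leaking the secret through timing
    if not hmac.compare_digest(api_key.encode(), WEBHOOK_SECRET.encode()):
        abort(401)

    # Process webhook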
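The section title refers to signature validation. A minimal sketch of HMAC-SHA256 verification over the raw request body follows, using only the standard library; the function names, the hex encoding, and the idea that the sender transmits the signature in a header are assumptions, not the project's current behavior.

```python
import hashlib
import hmac


def sign(body: bytes, secret: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_signature(body: bytes, received_signature: str, secret: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign(body, secret)
    return hmac.compare_digest(expected.encode(), received_signature.encode())


secret = b"example-secret"            # placeholder; load from configuration
body = b'{"event": "plan.completed"}'  # illustrative payload
good_signature = sign(body, secret)

accepted = is_valid_signature(body, good_signature, secret)
rejected_tampered = is_valid_signature(body + b" ", good_signature, secret)
rejected_wrong_key = is_valid_signature(body, sign(body, b"other"), secret)
```

Unlike a static key in a header, the signature changes with the payload, so a captured header cannot be reused to authenticate a modified body.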

🛠️ Development Patterns

1. Configuration Management

Pattern: Centralized config with environment overrides.

# config.py
import os

GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY", "")
MODEL_CONFIG = ModelConfig(
    name="gemini-2.0-flash",
    temperature=0.7,
    max_tokens=8192
)

# Usage
from config import GEMINI_API_KEY, MODEL_CONFIG
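`ModelConfig` itself is not shown above. One plausible definition is a frozen dataclass whose fields can be overridden from the environment; the sketch below assumes exactly that, and the environment variable names (`GEMINI_MODEL`, `GEMINI_TEMPERATURE`, `GEMINI_MAX_TOKENS`) and the `load_model_config` helper are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ModelConfig:
    name: str
    temperature: float = 0.7
    max_tokens: int = 8192


def load_model_config(env: Mapping[str, str] = os.environ) -> ModelConfig:
    """Build the config from defaults, letting environment variables override them."""
    return ModelConfig(
        name=env.get("GEMINI_MODEL", "gemini-2.0-flash"),
        temperature=float(env.get("GEMINI_TEMPERATURE", "0.7")),
        max_tokens=int(env.get("GEMINI_MAX_TOKENS", "8192")),
    )


default_config = load_model_config({})
overridden_config = load_model_config({"GEMINI_TEMPERATURE": "0.2"})
```

Passing the environment as a parameter keeps the loader testable without mutating `os.environ`.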

2. Logging Standards

Pattern: Structured logging with context.

import logging
logger = logging.getLogger(__name__)

# ✅ Good
logger.info(f"Agent {agent.spec.name} completed in {elapsed:.2f}s")
logger.error(f"Generation failed: {e}", exc_info=True)

# ❌ Bad
print("Done")  # Not logged
raise Exception(f"Error: {e}")  # No context

3. Error Handling Strategy

Pattern: Fail gracefully with fallbacks.

from google.api_core.exceptions import InvalidArgument  # assumed source of this exception

try:
    response = model.generate_content(prompt)
except InvalidArgument as e:
    # Tool declaration rejected by the API: retry once with a tool-free model
    logger.warning(f"Tool error, retrying without tools: {e}")
    fallback_model = genai.GenerativeModel(MODEL_CONFIG.name)
    response = fallback_model.generate_content(prompt)
except Exception as e:
    logger.error(f"Generation failed: {e}", exc_info=True)
    return {'content': 'Error generating content', 'sources': []}

📚 Module Dependencies

Core Dependencies

agents.py
├─ config.py (MODEL_CONFIG, GEMINI_API_KEY)
├─ data_sourcing.py (SourcedContentGenerator)
├─ data_estimator.py (estimate_missing_financials)
└─ knowledge_store.py (SQLiteKnowledgeStore)

app.py
├─ agents.py (MultiAgentOrchestrator)
├─ structured_agent.py (StructuredOrchestrator)
├─ export_*.py (all exporters)
├─ knowledge_store.py
├─ auth_manager.py (login_required decorator)
└─ license_manager.py (check_license)

External Dependencies (Critical)

google-generativeai==0.8.5 - Gemini API client
Flask==3.1.2 - Web framework
weasyprint==67.0 - PDF generation
python-docx==1.2.0 - DOCX generation
pytest==9.0.2 - Testing

🔄 Refactoring Opportunities

High Priority

Real Async (OPT-002)
- Replace asyncio with ThreadPoolExecutor
- Impact: roughly 67-75% generation time reduction (60s → 15-20s)
SQLite Indexing (OPT-001)
- Add indices, enable WAL
- Impact: 70% query speed improvement
Export Unification (OPT-003)
- Create BaseExporter class
- Impact: -500 LOC, easier maintenance

Medium Priority

Stratified Prompts (OPT-004)
Circuit Breaker (OPT-005)
Storage Layer Split (OPT-006)

Low Priority (Post-v6.0)

FastAPI migration
PostgreSQL + Redis
Celery task queue

This document is living documentation. Update when patterns change.

Last Updated: 2026-01-23
