
Configuration Save/Load Guide

Overview

The Pub/Sub Multi-Agent System now supports saving and loading complete configurations, allowing you to:

  • Save your entire setup (data sources + agents) as a JSON file
  • Share configurations with teammates
  • Version control your agent pipelines
  • Quickly switch between different workflows

Save Configuration

How to Save

  1. Configure your data sources and agents
  2. (Optional) Check "☑ Save results" to include execution results
  3. Click the "Save Config" button in the top-right header
  4. A JSON file will download automatically with the name pattern:
    pubsub-config-YYYY-MM-DD.json
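As a small illustration of the naming pattern above, the following sketch builds the same kind of filename in Python. The function name `default_config_filename` is hypothetical, and whether the application uses the local or the UTC date is not stated in this guide, so the date is passed in explicitly.

```python
from datetime import date


def default_config_filename(day: date) -> str:
    """Build a name following the pubsub-config-YYYY-MM-DD.json pattern."""
    # date.isoformat() yields YYYY-MM-DD
    return f"pubsub-config-{day.isoformat()}.json"


print(default_config_filename(date(2026, 2, 1)))
```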
    

Save Results Checkbox

The "☑ Save results" checkbox allows you to include execution results in the saved configuration.

When checked, the config includes:

  • All configuration data (agents, data sources)
  • Final Result box content
  • NER Result box content
  • Execution Log content

When unchecked (default):

  • Only configuration data is saved
  • No results or logs

Use cases for saving results:

  • Document successful executions
  • Share complete analysis with team
  • Archive results with configuration
  • Review past executions later

What Gets Saved

The configuration file includes:

  • Version: Configuration format version (currently 1.0)
  • Timestamp: When the config was saved
  • User Question: Current question text
  • Data Sources: All data sources with labels and content
  • Agents: All agent configurations including:
    • Title
    • Prompt template
    • Model selection
    • Subscribe/Publish topics
    • Show result checkbox state
  • Results (if "Save results" checked):
    • Final Result box content
    • NER Result box content
    • Execution Log content

Example Configuration File

Without Results:

{
  "version": "1.0",
  "timestamp": "2026-02-01T10:30:00.000Z",
  "userQuestion": "What are the top 10 customers?",
  "dataSources": [
    {
      "label": "Schema",
      "content": "Tables:\n- customers (id, name, email)\n- orders (id, customer_id, total)"
    }
  ],
  "agents": [
    {
      "title": "SQL Generator",
      "prompt": "Generate SQL for: {question}\nSchema: {schema}",
      "model": "phi4-mini",
      "subscribeTopic": "START",
      "publishTopic": "SQL_GENERATED",
      "showResult": true
    }
  ]
}

With Results (when "Save results" is checked):

{
  "version": "1.0",
  "timestamp": "2026-02-01T10:30:00.000Z",
  "userQuestion": "Extract medical entities from patient note",
  "dataSources": [...],
  "agents": [...],
  "results": {
    "finalResult": "--- Entity Extractor ---\n[{\"text\": \"diabetes\", \"entity_type\": \"PROBLEM\"}]",
    "nerResult": "Patient has [diabetes:PROBLEM] and takes [metformin:TREATMENT]",
    "executionLog": "[10:30:00] ℹ️ Starting...\n[10:30:05] ✅ Complete"
  }
}
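To review archived results outside the application, a script can read the optional `results` object. This is a minimal sketch that relies only on the field names shown in the example above; the helper name `extract_results` is hypothetical, and falling back to empty strings is an assumption about how missing results should be treated.

```python
import json


def extract_results(config: dict) -> dict:
    """Return the three result fields, using empty strings when absent."""
    results = config.get("results") or {}
    return {
        "finalResult": results.get("finalResult", ""),
        "nerResult": results.get("nerResult", ""),
        "executionLog": results.get("executionLog", ""),
    }


# A config saved with "Save results" checked (shortened for the example)
saved = json.loads(
    '{"version": "1.0", "dataSources": [], "agents": [], '
    '"results": {"nerResult": "Patient has [diabetes:PROBLEM]"}}'
)
print(extract_results(saved)["nerResult"])
```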

Load Configuration

How to Load

  1. Click the "Load Config" button in the top-right header
  2. Select a previously saved JSON configuration file
  3. The system will:
    • Clear current configuration
    • Load all data sources
    • Load all agents
    • Restore the user question
    • Display success message in logs

What Happens on Load

  • Current config is replaced: All existing data sources and agents are removed
  • New IDs assigned: Loaded items get new unique IDs
  • Results restored (if saved with results):
    • Final Result box populated
    • NER Result box populated
    • Execution Log populated
  • Empty boxes (if no results saved):
    • All result boxes cleared
  • Validation: File is checked for proper format before loading
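The load behaviour described above can be sketched as a function that builds a fresh state from the file alone. This is not the application's actual code: the state layout, the `id` key, and the use of `uuid` are assumptions made for illustration.

```python
import uuid


def new_state_from_config(config: dict) -> dict:
    """Build a fresh state from a loaded config; nothing from the old state survives."""
    results = config.get("results") or {}
    return {
        "userQuestion": config.get("userQuestion", ""),
        # Every loaded item gets a new unique ID
        "dataSources": [{**ds, "id": uuid.uuid4().hex} for ds in config["dataSources"]],
        "agents": [{**agent, "id": uuid.uuid4().hex} for agent in config["agents"]],
        # Result boxes are restored if present, otherwise cleared
        "finalResult": results.get("finalResult", ""),
        "nerResult": results.get("nerResult", ""),
        "executionLog": results.get("executionLog", ""),
    }
```

Because the function takes only the loaded config, the previous data sources and agents cannot leak into the new state.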

Error Handling

If the configuration file is invalid, you'll see an error message:

Failed to load configuration: Invalid configuration file

Common issues:

  • Wrong file format (not JSON)
  • Missing required fields (version, dataSources, agents)
  • Corrupted file
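Since the error message does not say which of these issues applies, a short script can narrow it down. The sketch below assumes only the required field names listed above; `diagnose_config_text` is a hypothetical helper and its messages are not the application's.

```python
import json

REQUIRED_FIELDS = ("version", "dataSources", "agents")


def diagnose_config_text(text: str) -> str:
    """Return 'ok' or a short description of why the file would be rejected."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        # Covers both non-JSON files and truncated or corrupted JSON
        return f"not valid JSON (line {exc.lineno}, column {exc.colno}): {exc.msg}"
    if not isinstance(config, dict):
        return "top level must be a JSON object"
    missing = [name for name in REQUIRED_FIELDS if name not in config]
    if missing:
        return "missing required fields: " + ", ".join(missing)
    return "ok"


print(diagnose_config_text('{"version": "1.0"}'))
```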

Use Cases

Use Case 1: Template Workflows

Save common workflows as templates:

sql-analysis-template.json

{
  "version": "1.0",
  "dataSources": [
    {"label": "Schema", "content": ""},
    {"label": "SampleData", "content": ""}
  ],
  "agents": [
    {"title": "Analyzer", "prompt": "...", ...},
    {"title": "Generator", "prompt": "...", ...},
    {"title": "Validator", "prompt": "...", ...}
  ]
}

Load this template and fill in the Schema and SampleData content.

Use Case 2: Team Collaboration

Share configurations with your team:

  1. Developer A creates optimal pipeline
  2. Saves config: customer-analysis-pipeline.json
  3. Commits to Git repository
  4. Developer B loads config
  5. Everyone uses same proven workflow

Use Case 3: A/B Testing Prompts

Compare different prompt strategies:

Workflow:

  1. Create pipeline with Approach A
  2. Save as approach-a.json
  3. Modify prompts for Approach B
  4. Save as approach-b.json
  5. Load each config and compare results
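When comparing the two saved files, it helps to see exactly which prompts differ. The following sketch assumes that agent titles are unique within a config, which this guide does not guarantee; `changed_prompts` is a hypothetical helper.

```python
def changed_prompts(config_a: dict, config_b: dict) -> dict:
    """Map agent title -> (prompt in A, prompt in B) for prompts that differ."""
    prompts_a = {agent["title"]: agent["prompt"] for agent in config_a["agents"]}
    prompts_b = {agent["title"]: agent["prompt"] for agent in config_b["agents"]}
    changed = {}
    for title in sorted(prompts_a.keys() | prompts_b.keys()):
        prompt_a, prompt_b = prompts_a.get(title), prompts_b.get(title)
        if prompt_a != prompt_b:
            # None marks an agent that exists in only one of the configs
            changed[title] = (prompt_a, prompt_b)
    return changed
```

Load `approach-a.json` and `approach-b.json` with `json.load` and pass both dictionaries in; an empty result means the prompts are identical.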

Use Case 4: Different Data Sources

Same agents, different data:

Workflow:

  1. Create agent pipeline once
  2. Save config with empty data sources
  3. For each new dataset:
    • Load config
    • Add new data sources
    • Execute
    • Save results
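The per-dataset step can also be scripted before loading the file in the application. This sketch fills the data sources of a template by label; `fill_data_sources` is a hypothetical helper, and rejecting unknown labels is a design choice of the example, not documented behaviour.

```python
import copy


def fill_data_sources(template: dict, contents: dict) -> dict:
    """Return a copy of `template` with content set for the matching labels."""
    config = copy.deepcopy(template)  # leave the template itself untouched
    known = {ds["label"] for ds in config["dataSources"]}
    unknown = set(contents) - known
    if unknown:
        raise KeyError(f"labels not in template: {sorted(unknown)}")
    for ds in config["dataSources"]:
        if ds["label"] in contents:
            ds["content"] = contents[ds["label"]]
    return config
```

Write the returned dictionary with `json.dump` to get one ready-to-load config per dataset.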

Use Case 5: Version Control

Track evolution of your pipelines:

git/
├── configs/
│   ├── v1-basic-sql.json
│   ├── v2-with-validation.json
│   ├── v3-multi-step.json
│   └── v4-production.json

Load previous versions to compare performance.

Best Practices

1. Naming Conventions

Use descriptive filenames:

✅ Good:
- medical-diagnosis-workflow-v2.json
- sql-generator-with-validation.json
- customer-analysis-pipeline.json

❌ Bad:
- config.json
- test.json
- backup.json

2. Documentation in Configs

JSON has no comment syntax, so put notes at the top of a data source's content:

{
  "label": "Schema",
  "content": "# Customer Database Schema v2.0\n# Last updated: 2026-02-01\n\nTables:\n- customers ..."
}

3. Version Your Configs

Include version info in data sources:

{
  "label": "ConfigInfo",
  "content": "Pipeline Version: 3.0\nAuthor: Jane Doe\nPurpose: SQL generation with validation\nLast Modified: 2026-02-01"
}
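If version info is kept in a data source like this, a script can read it back. The sketch below assumes the `Key: value` line layout and the `ConfigInfo` label from the example above; `read_config_info` is a hypothetical helper.

```python
def read_config_info(config: dict, label: str = "ConfigInfo") -> dict:
    """Parse 'Key: value' lines from the data source named `label`."""
    for ds in config["dataSources"]:
        if ds["label"] == label:
            info = {}
            for line in ds["content"].splitlines():
                # Split on the first colon only, so values may contain colons
                key, sep, value = line.partition(":")
                if sep:
                    info[key.strip()] = value.strip()
            return info
    return {}  # no such data source


example = {
    "dataSources": [
        {"label": "ConfigInfo", "content": "Pipeline Version: 3.0\nAuthor: Jane Doe"}
    ]
}
print(read_config_info(example)["Pipeline Version"])
```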

4. Organize by Purpose

Create folder structure:

configs/
├── sql-generation/
│   ├── basic.json
│   ├── with-validation.json
│   └── with-optimization.json
├── medical-analysis/
│   ├── symptom-analysis.json
│   └── diagnosis-support.json
└── data-analysis/
    ├── sales-report.json
    └── customer-segmentation.json

5. Template Strategy

Create base templates without data:

{
  "version": "1.0",
  "dataSources": [
    {"label": "Schema", "content": ""},
    {"label": "Data", "content": ""}
  ],
  "agents": [ /* fully configured */ ]
}

Load template, add data, execute!
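A template like this can be derived from any working config instead of being written by hand. The following is a sketch under the assumption that a template should keep the labels and agents but drop data source content and any saved results; `to_template` is a hypothetical helper.

```python
import copy


def to_template(config: dict) -> dict:
    """Return a copy of `config` with empty data sources and no saved results."""
    template = copy.deepcopy(config)
    for ds in template["dataSources"]:
        ds["content"] = ""  # keep the label, drop the data
    template.pop("results", None)  # results belong to one execution, not the template
    return template
```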

6. Backup Before Experiments

Before trying new approaches:

  1. Save current config
  2. Make experimental changes
  3. If it works: save new version
  4. If it fails: reload backup

Configuration File Structure

Required Fields

{
  "version": "1.0",           // Required: config format version
  "dataSources": [],          // Required: array (can be empty)
  "agents": []                // Required: array (can be empty)
}

Optional Fields

{
  "timestamp": "...",         // Optional: when saved
  "userQuestion": "..."       // Optional: user question text
}

Data Source Object

{
  "label": "string",          // Required: reference name
  "content": "string"         // Required: content (can be empty)
}

Agent Object

{
  "title": "string",          // Required: agent name
  "prompt": "string",         // Required: prompt template
  "model": "string",          // Required: model name
  "subscribeTopic": "string", // Required: topic to listen to
  "publishTopic": "string",   // Optional: topic to publish to (can be null/empty)
  "showResult": boolean       // Required: whether to show in results
}
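For scripts that build or inspect configs, the structure above can be written down as Python type hints. This is only a sketch derived from the field lists in this guide; the class names are hypothetical, and the shape of `results` follows the earlier example.

```python
from typing import List, Optional, TypedDict


class DataSource(TypedDict):
    label: str
    content: str


class _RequiredAgentFields(TypedDict):
    title: str
    prompt: str
    model: str
    subscribeTopic: str
    showResult: bool


class Agent(_RequiredAgentFields, total=False):
    publishTopic: Optional[str]  # optional; may be null or empty


class Results(TypedDict, total=False):
    finalResult: str
    nerResult: str
    executionLog: str


class _RequiredConfigFields(TypedDict):
    version: str
    dataSources: List[DataSource]
    agents: List[Agent]


class PipelineConfig(_RequiredConfigFields, total=False):
    timestamp: str
    userQuestion: str
    results: Results


# TypedDicts are plain dicts at runtime; a type checker such as mypy does the checking
minimal: PipelineConfig = {"version": "1.0", "dataSources": [], "agents": []}
```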

Advanced Usage

Programmatic Config Generation

Generate configs programmatically:

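import json


def load_schema_from_db():
    """Placeholder: replace with your own schema loader."""
    return "Tables:\n- customers (id, name, email)\n- orders (id, customer_id, total)"


def load_business_rules():
    """Placeholder: replace with your own business rules loader."""
    return ""


config = {
    "version": "1.0",
    "timestamp": "2026-02-01T10:00:00Z",
    "dataSources": [
        {"label": "Schema", "content": load_schema_from_db()},
        {"label": "Rules", "content": load_business_rules()}
    ],
    "agents": [
        {
            "title": "SQL Generator",
            "prompt": "...",
            "model": "phi4-mini",
            "subscribeTopic": "START",
            "publishTopic": "SQL",
            "showResult": True
        }
    ]
}

# Write UTF-8 and keep non-ASCII characters readable in the file
with open('auto-generated-config.json', 'w', encoding='utf-8') as f:
    json.dump(config, f, indent=2, ensure_ascii=False)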

Config Validation Script

Validate configs before loading:

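import json

REQUIRED_TOP_LEVEL = ("version", "dataSources", "agents")
REQUIRED_DATA_SOURCE = ("label", "content")
# publishTopic is optional, so it is not checked here
REQUIRED_AGENT = ("title", "prompt", "model", "subscribeTopic", "showResult")


def require_fields(obj, fields, where):
    """Raise ValueError if obj is not a JSON object or lacks any of the fields."""
    if not isinstance(obj, dict):
        raise ValueError(f"{where}: expected a JSON object")
    missing = [name for name in fields if name not in obj]
    if missing:
        raise ValueError(f"{where}: missing {', '.join(missing)}")


def validate_config(filepath):
    with open(filepath, encoding="utf-8") as f:
        config = json.load(f)

    # Explicit errors instead of assert: asserts are skipped under `python -O`
    require_fields(config, REQUIRED_TOP_LEVEL, "config")
    for i, ds in enumerate(config["dataSources"]):
        require_fields(ds, REQUIRED_DATA_SOURCE, f"dataSources[{i}]")
    for i, agent in enumerate(config["agents"]):
        require_fields(agent, REQUIRED_AGENT, f"agents[{i}]")

    print(f"✓ Config is valid: {len(config['dataSources'])} data sources, {len(config['agents'])} agents")

validate_config("my-config.json")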

Merge Configs

Combine multiple configs:

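import json

def merge_configs(config1_path, config2_path, output_path):
    """Concatenate the data sources and agents of two configs into one file."""
    with open(config1_path, encoding="utf-8") as f1, open(config2_path, encoding="utf-8") as f2:
        c1 = json.load(f1)
        c2 = json.load(f2)

    merged = {
        "version": "1.0",
        # Lists are concatenated as-is; duplicate labels or topics are not resolved
        "dataSources": c1["dataSources"] + c2["dataSources"],
        "agents": c1["agents"] + c2["agents"],
        # The user question is taken from the first config only
        "userQuestion": c1.get("userQuestion", "")
    }

    with open(output_path, 'w', encoding="utf-8") as f:
        json.dump(merged, f, indent=2, ensure_ascii=False)

merge_configs("pipeline-a.json", "pipeline-b.json", "merged-pipeline.json")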

Troubleshooting

Issue: "Invalid configuration file"

Cause: File format is incorrect
Solution:

  1. Open file in text editor
  2. Verify it's valid JSON
  3. Check required fields exist

Issue: Data sources empty after load

Cause: Content wasn't saved
Solution: Check original file has "content" fields populated

Issue: Agents not working after load

Cause: Model might not be available
Solution: Check agent "model" field matches available models (phi4-mini, cniongolo/biomistral)
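A quick way to spot this before loading is to compare each agent's model against the models you have available. The sketch below uses the two model names mentioned above as its default set; `agents_with_unknown_models` is a hypothetical helper, and the set should be adjusted to match your installation.

```python
AVAILABLE_MODELS = frozenset({"phi4-mini", "cniongolo/biomistral"})


def agents_with_unknown_models(config: dict, available=AVAILABLE_MODELS) -> list:
    """Return (title, model) pairs for agents whose model is not in `available`."""
    return [
        (agent["title"], agent["model"])
        for agent in config["agents"]
        if agent["model"] not in available
    ]


old_config = {"agents": [{"title": "SQL Generator", "model": "phi3"}]}
print(agents_with_unknown_models(old_config))
```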

Issue: Topics not matching after load

Cause: Topic names might have changed
Solution: Topic matching is now case-insensitive, so look for typos or renamed topics
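Typos in topic names can be found by checking that every subscribed topic is published somewhere. This sketch compares names case-insensitively and assumes that `START` is the entry topic, as in the examples in this guide; `unmatched_subscriptions` is a hypothetical helper.

```python
def unmatched_subscriptions(config: dict, entry_topic: str = "START") -> list:
    """Return (title, subscribeTopic) pairs that no agent publishes to."""
    published = {entry_topic.lower()}
    for agent in config["agents"]:
        topic = agent.get("publishTopic")  # optional, may be missing, null, or empty
        if topic:
            published.add(topic.lower())
    return [
        (agent["title"], agent["subscribeTopic"])
        for agent in config["agents"]
        if agent["subscribeTopic"].lower() not in published
    ]


pipeline = {
    "agents": [
        {"title": "Generator", "subscribeTopic": "START", "publishTopic": "SQL_GENERATED"},
        {"title": "Validator", "subscribeTopic": "SQL_GENERATD", "publishTopic": None},
    ]
}
print(unmatched_subscriptions(pipeline))
```

An agent listed here will never be triggered unless something outside the config publishes to its topic.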

Tips

  1. Always test after loading: Execute pipeline to verify everything works
  2. Keep configs small: Separate large data sources into multiple configs
  3. Use version control: Track configs in Git for history
  4. Document changes: Add comments in data source content
  5. Share wisely: Remove sensitive data before sharing configs
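For the last tip, a config can be cleaned by a script before it is shared. This is a minimal sketch that assumes sensitive material lives in saved results and in the content of specific data sources; `prepare_for_sharing` and its `redact_labels` parameter are hypothetical, and prompts or the user question may need a manual check as well.

```python
import copy


def prepare_for_sharing(config: dict, redact_labels=()) -> dict:
    """Return a copy without saved results and with the named data sources emptied."""
    shared = copy.deepcopy(config)  # never modify the original config
    shared.pop("results", None)  # results and logs may quote the source data
    for ds in shared["dataSources"]:
        if ds["label"] in redact_labels:
            ds["content"] = ""
    return shared
```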