sml-agents-publish-subscribe-dbvec / CONFIGURATION_GUIDE.md

santanche

refactor (model): replacing phi3 by phi4-mini

f369a7d about 1 month ago

	# Configuration Save/Load Guide

	## Overview

	The Pub/Sub Multi-Agent System now supports saving and loading complete configurations, allowing you to:
	- Save your entire setup (data sources + agents) as a JSON file
	- Share configurations with teammates
	- Version control your agent pipelines
	- Quickly switch between different workflows

	## Save Configuration

	### How to Save

	1. Configure your data sources and agents
	2. (Optional) Check "☑ Save results" to include execution results
	3. Click the "Save Config" button in the top-right header
	4. A JSON file will download automatically with the name pattern:
	```
	pubsub-config-YYYY-MM-DD.json
	```
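If you generate config files outside the UI and want them to follow the same naming pattern, a minimal sketch could look like this. The function name `default_config_filename` is hypothetical and not part of the application.

```python
from datetime import date


def default_config_filename(day: date) -> str:
    """Return a filename following the pubsub-config-YYYY-MM-DD.json pattern."""
    return f"pubsub-config-{day.isoformat()}.json"


print(default_config_filename(date(2026, 2, 1)))  # pubsub-config-2026-02-01.json
```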

	### Save Results Checkbox

	The "☑ Save results" checkbox allows you to include execution results in the saved configuration.

	When checked, the config includes:
	- All configuration data (agents, data sources)
	- Final Result box content
	- NER Result box content
	- Execution Log content

	When unchecked (default):
	- Only configuration data is saved
	- No results or logs

	Use cases for saving results:
	- Document successful executions
	- Share complete analysis with team
	- Archive results with configuration
	- Review past executions later
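If a config was saved with results but you want to share only the configuration, one option is to drop the `results` key from the file. The sketch below assumes that a file without `results` loads like one saved with the checkbox unchecked; `strip_results` is a hypothetical helper name.

```python
import copy


def strip_results(config: dict) -> dict:
    """Return a copy of the config without the optional "results" block."""
    stripped = copy.deepcopy(config)
    stripped.pop("results", None)  # no error if the key is absent
    return stripped


saved = {
    "version": "1.0",
    "dataSources": [],
    "agents": [],
    "results": {"finalResult": "...", "nerResult": "...", "executionLog": "..."},
}
print(sorted(strip_results(saved)))  # ['agents', 'dataSources', 'version']
```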

	### What Gets Saved

	The configuration file includes:
	- Version: Configuration format version (currently 1.0)
	- Timestamp: When the config was saved
	- User Question: Current question text
	- Data Sources: All data sources with labels and content
	- Agents: All agent configurations including:
	  - Title
	  - Prompt template
	  - Model selection
	  - Subscribe/Publish topics
	  - Show result checkbox state
	- Results (if "Save results" checked):
	  - Final Result box content
	  - NER Result box content
	  - Execution Log content

	### Example Configuration File

	Without Results:
	```json
	{
	  "version": "1.0",
	  "timestamp": "2026-02-01T10:30:00.000Z",
	  "userQuestion": "What are the top 10 customers?",
	  "dataSources": [
	    {
	      "label": "Schema",
	      "content": "Tables:\n- customers (id, name, email)\n- orders (id, customer_id, total)"
	    }
	  ],
	  "agents": [
	    {
	      "title": "SQL Generator",
	      "prompt": "Generate SQL for: {question}\nSchema: {schema}",
	      "model": "phi4-mini",
	      "subscribeTopic": "START",
	      "publishTopic": "SQL_GENERATED",
	      "showResult": true
	    }
	  ]
	}
	```

	With Results (when "Save results" is checked):
	```json
	{
	  "version": "1.0",
	  "timestamp": "2026-02-01T10:30:00.000Z",
	  "userQuestion": "Extract medical entities from patient note",
	  "dataSources": [...],
	  "agents": [...],
	  "results": {
	    "finalResult": "--- Entity Extractor ---\n[{\"text\": \"diabetes\", \"entity_type\": \"PROBLEM\"}]",
	    "nerResult": "Patient has [diabetes:PROBLEM] and takes [metformin:TREATMENT]",
	    "executionLog": "[10:30:00] ℹ️ Starting...\n[10:30:05] ✅ Complete"
	  }
	}
	```
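The prompt in the first example contains `{question}` and `{schema}` placeholders. This guide does not describe how the application resolves them, so the sketch below only lists the placeholders found in each agent's prompt, which can help when reviewing a saved file by hand. The helper name and the `{name}` pattern are assumptions for illustration.

```python
import re

# Matches simple {name} placeholders; anything else inside braces is ignored.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def prompt_placeholders(config: dict) -> dict:
    """Map each agent title to the {name} placeholders found in its prompt."""
    return {
        agent["title"]: PLACEHOLDER.findall(agent["prompt"])
        for agent in config["agents"]
    }


config = {
    "agents": [
        {
            "title": "SQL Generator",
            "prompt": "Generate SQL for: {question}\nSchema: {schema}",
        }
    ]
}
print(prompt_placeholders(config))  # {'SQL Generator': ['question', 'schema']}
```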

	## Load Configuration

	### How to Load

	1. Click the "Load Config" button in the top-right header
	2. Select a previously saved JSON configuration file
	3. The system will:
	   - Clear current configuration
	   - Load all data sources
	   - Load all agents
	   - Restore the user question
	   - Display success message in logs

	### What Happens on Load

	- Current config is replaced: All existing data sources and agents are removed
	- New IDs assigned: Loaded items get new unique IDs
	- Results restored (if saved with results):
	  - Final Result box populated
	  - NER Result box populated
	  - Execution Log populated
	- Empty boxes (if no results saved):
	  - All result boxes cleared
	- Validation: File is checked for proper format before loading
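The load behaviour described above can be modelled in a few lines. This is a hypothetical Python model of the semantics, not the application's actual code: the function name, the `id` field, and the flat state dictionary are all assumptions.

```python
import uuid


def state_from_config(config: dict) -> dict:
    """Build a fresh UI state from a loaded config.

    Nothing from the previous state is kept, every item gets a new ID,
    and result boxes are restored if present or cleared otherwise.
    """
    results = config.get("results") or {}
    return {
        "userQuestion": config.get("userQuestion", ""),
        "dataSources": [{**ds, "id": uuid.uuid4().hex} for ds in config["dataSources"]],
        "agents": [{**agent, "id": uuid.uuid4().hex} for agent in config["agents"]],
        "finalResult": results.get("finalResult", ""),
        "nerResult": results.get("nerResult", ""),
        "executionLog": results.get("executionLog", ""),
    }
```

Because the IDs are regenerated on every load, they should not be relied on to identify an agent across sessions.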

	### Error Handling

	If the configuration file is invalid, you'll see an error message:
	```
	Failed to load configuration: Invalid configuration file
	```

	Common issues:
	- Wrong file format (not JSON)
	- Missing required fields (version, dataSources, agents)
	- Corrupted file
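Since the UI reports only a generic message, a small script can help tell these cases apart. The following is a sketch that checks the three required fields named above; `diagnose_config_text` is a hypothetical helper and the wording of its messages is not what the application prints.

```python
import json

REQUIRED_FIELDS = ("version", "dataSources", "agents")


def diagnose_config_text(text: str) -> str:
    """Return "ok" or a short description of why the text is not a valid config."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        # Covers both non-JSON files and truncated or corrupted JSON.
        return f"not valid JSON (line {exc.lineno}, column {exc.colno}): {exc.msg}"
    if not isinstance(config, dict):
        return "top-level value must be a JSON object"
    missing = [name for name in REQUIRED_FIELDS if name not in config]
    if missing:
        return "missing required fields: " + ", ".join(missing)
    return "ok"


print(diagnose_config_text('{"version": "1.0", "agents": []}'))
# missing required fields: dataSources
```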

	## Use Cases

	### Use Case 1: Template Workflows

	Save common workflows as templates:

	sql-analysis-template.json
	```json
	{
	  "version": "1.0",
	  "dataSources": [
	    {"label": "Schema", "content": ""},
	    {"label": "SampleData", "content": ""}
	  ],
	  "agents": [
	    {"title": "Analyzer", "prompt": "...", ...},
	    {"title": "Generator", "prompt": "...", ...},
	    {"title": "Validator", "prompt": "...", ...}
	  ]
	}
	```

	Load this template and fill in the Schema and SampleData content.

	### Use Case 2: Team Collaboration

	Share configurations with your team:

	1. Developer A creates optimal pipeline
	2. Saves config: `customer-analysis-pipeline.json`
	3. Commits to Git repository
	4. Developer B loads config
	5. Everyone uses same proven workflow

	### Use Case 3: A/B Testing Prompts

	Compare different prompt strategies:

	Workflow:
	1. Create pipeline with Approach A
	2. Save as `approach-a.json`
	3. Modify prompts for Approach B
	4. Save as `approach-b.json`
	5. Load each config and compare results
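When comparing two saved approaches, it can help to see exactly which prompts differ. A minimal sketch, assuming agent titles are unique within each config and are the same in both files (the helper name is hypothetical):

```python
def changed_prompts(config_a: dict, config_b: dict) -> dict:
    """Map agent title -> (prompt in A, prompt in B) for agents whose prompts differ."""
    prompts_a = {agent["title"]: agent["prompt"] for agent in config_a["agents"]}
    prompts_b = {agent["title"]: agent["prompt"] for agent in config_b["agents"]}
    return {
        title: (prompts_a[title], prompts_b[title])
        for title in prompts_a
        if title in prompts_b and prompts_a[title] != prompts_b[title]
    }


approach_a = {"agents": [{"title": "Generator", "prompt": "Write SQL."}]}
approach_b = {"agents": [{"title": "Generator", "prompt": "Write SQL step by step."}]}
print(changed_prompts(approach_a, approach_b))
# {'Generator': ('Write SQL.', 'Write SQL step by step.')}
```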

	### Use Case 4: Different Data Sources

	Same agents, different data:

	Workflow:
	1. Create agent pipeline once
	2. Save config with empty data sources
	3. For each new dataset:
	   - Load config
	   - Add new data sources
	   - Execute
	   - Save results
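The "same agents, different data" workflow can also be prepared offline by filling a saved config's data sources before loading it. The sketch below matches data sources by their `label`; `fill_data_sources` is a hypothetical helper, not part of the application.

```python
import copy


def fill_data_sources(template: dict, contents: dict) -> dict:
    """Return a copy of the template with data source content filled in by label."""
    filled = copy.deepcopy(template)
    for source in filled["dataSources"]:
        if source["label"] in contents:
            source["content"] = contents[source["label"]]
    return filled


template = {
    "version": "1.0",
    "dataSources": [{"label": "Schema", "content": ""}],
    "agents": [],
}
filled = fill_data_sources(template, {"Schema": "Tables:\n- customers (id, name, email)"})
print(filled["dataSources"][0]["content"].splitlines()[0])  # Tables:
```

Labels with no entry in `contents` keep whatever content the template already had.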

	### Use Case 5: Version Control

	Track evolution of your pipelines:

	```bash
	git/
	├── configs/
	│   ├── v1-basic-sql.json
	│   ├── v2-with-validation.json
	│   ├── v3-multi-step.json
	│   └── v4-production.json
	```

	Load previous versions to compare performance.

	## Best Practices

	### 1. Naming Conventions

	Use descriptive filenames:
	```
	✅ Good:
	- medical-diagnosis-workflow-v2.json
	- sql-generator-with-validation.json
	- customer-analysis-pipeline.json

	❌ Bad:
	- config.json
	- test.json
	- backup.json
	```

	### 2. Documentation in Configs

	Add comments in data sources:
	```json
	{
	  "label": "Schema",
	  "content": "# Customer Database Schema v2.0\n# Last updated: 2026-02-01\n\nTables:\n- customers ..."
	}
	```

	### 3. Version Your Configs

	Include version info in data sources:
	```json
	{
	  "label": "ConfigInfo",
	  "content": "Pipeline Version: 3.0\nAuthor: Jane Doe\nPurpose: SQL generation with validation\nLast Modified: 2026-02-01"
	}
	```
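If you adopt the `ConfigInfo` convention, the metadata can be read back with a few lines of Python. This is a sketch that assumes one `Key: Value` pair per line, as in the example above; the helper name is hypothetical.

```python
def read_config_info(config: dict) -> dict:
    """Parse "Key: Value" lines from the data source labelled "ConfigInfo"."""
    for source in config.get("dataSources", []):
        if source["label"] == "ConfigInfo":
            info = {}
            for line in source["content"].splitlines():
                key, sep, value = line.partition(":")
                if sep:  # skip lines without a colon
                    info[key.strip()] = value.strip()
            return info
    return {}


config = {
    "dataSources": [
        {
            "label": "ConfigInfo",
            "content": "Pipeline Version: 3.0\nAuthor: Jane Doe",
        }
    ]
}
print(read_config_info(config))  # {'Pipeline Version': '3.0', 'Author': 'Jane Doe'}
```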

	### 4. Organize by Purpose

	Create folder structure:
	```
	configs/
	├── sql-generation/
	│   ├── basic.json
	│   ├── with-validation.json
	│   └── with-optimization.json
	├── medical-analysis/
	│   ├── symptom-analysis.json
	│   └── diagnosis-support.json
	└── data-analysis/
	    ├── sales-report.json
	    └── customer-segmentation.json
	```

	### 5. Template Strategy

	Create base templates without data:
	```json
	{
	  "version": "1.0",
	  "dataSources": [
	    {"label": "Schema", "content": ""},
	    {"label": "Data", "content": ""}
	  ],
	  "agents": [ /* fully configured */ ]
	}
	```

	Load template, add data, execute!
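A template can also be derived from a config that already contains data by blanking every data source. A minimal sketch, with `make_template` as a hypothetical helper name; it also drops any saved `results`, on the assumption that a template should not carry old output.

```python
import copy


def make_template(config: dict) -> dict:
    """Return a copy of the config with all data source content emptied."""
    template = copy.deepcopy(config)
    for source in template["dataSources"]:
        source["content"] = ""
    template.pop("results", None)  # assumption: templates carry no results
    return template


full = {
    "version": "1.0",
    "dataSources": [{"label": "Schema", "content": "Tables: ..."}],
    "agents": [],
}
print(make_template(full)["dataSources"])  # [{'label': 'Schema', 'content': ''}]
```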

	### 6. Backup Before Experiments

	Before trying new approaches:
	1. Save current config
	2. Make experimental changes
	3. If it works: save new version
	4. If it fails: reload backup
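If saved configs are kept as files on disk, the backup step can be scripted. The sketch below copies a config next to the original with a timestamp in the name; the `.backup-<timestamp>` naming scheme is an assumption for illustration, not something the application produces.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_config(path: str) -> Path:
    """Copy a config file to <name>.backup-<timestamp>.json in the same folder."""
    source = Path(path)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.stem}.backup-{stamp}{source.suffix}")
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target
```

To roll back after a failed experiment, load the backup file with the "Load Config" button.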

	## Configuration File Structure

	### Required Fields

	```json
	{
	  "version": "1.0",   // Required: config format version
	  "dataSources": [],  // Required: array (can be empty)
	  "agents": []        // Required: array (can be empty)
	}
	```

	### Optional Fields

	```json
	{
	  "timestamp": "...",    // Optional: when saved
	  "userQuestion": "..."  // Optional: user question text
	}
	```

	### Data Source Object

	```json
	{
	  "label": "string",   // Required: reference name
	  "content": "string"  // Required: content (can be empty)
	}
	```

	### Agent Object

	```json
	{
	  "title": "string",           // Required: agent name
	  "prompt": "string",          // Required: prompt template
	  "model": "string",           // Required: model name
	  "subscribeTopic": "string",  // Required: topic to listen to
	  "publishTopic": "string",    // Optional: topic to publish to (can be null/empty)
	  "showResult": boolean        // Required: whether to show in results
	}
	```
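For scripts that build or inspect configs, the structure above can be written down as Python types. This is a sketch using `typing.TypedDict`; the class names are hypothetical, and the optional `results` block is included as described in "What Gets Saved".

```python
from typing import List, Optional, TypedDict


class DataSource(TypedDict):
    label: str
    content: str


class Agent(TypedDict):
    title: str
    prompt: str
    model: str
    subscribeTopic: str
    publishTopic: Optional[str]  # may be null or empty
    showResult: bool


class Results(TypedDict, total=False):
    finalResult: str
    nerResult: str
    executionLog: str


class RequiredConfigFields(TypedDict):
    version: str
    dataSources: List[DataSource]
    agents: List[Agent]


class Config(RequiredConfigFields, total=False):
    # Optional top-level fields
    timestamp: str
    userQuestion: str
    results: Results


agent: Agent = {
    "title": "SQL Generator",
    "prompt": "...",
    "model": "phi4-mini",
    "subscribeTopic": "START",
    "publishTopic": None,
    "showResult": True,
}
config: Config = {"version": "1.0", "dataSources": [], "agents": [agent]}
```

These types are only checked by a static type checker such as mypy; at runtime the values are plain dictionaries.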

	## Advanced Usage

	### Programmatic Config Generation

	Generate configs programmatically:

	```python
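	import json


	def load_schema_from_db():
	    # Placeholder: replace with code that reads your real schema.
	    return "Tables:\n- customers (id, name, email)\n- orders (id, customer_id, total)"


	def load_business_rules():
	    # Placeholder: replace with code that loads your real business rules.
	    return ""


	config = {
	    "version": "1.0",
	    "timestamp": "2026-02-01T10:00:00Z",
	    "dataSources": [
	        {"label": "Schema", "content": load_schema_from_db()},
	        {"label": "Rules", "content": load_business_rules()},
	    ],
	    "agents": [
	        {
	            "title": "SQL Generator",
	            "prompt": "...",
	            "model": "phi4-mini",
	            "subscribeTopic": "START",
	            "publishTopic": "SQL",
	            "showResult": True,
	        }
	    ],
	}

	# Write the file that "Load Config" will read.
	with open("auto-generated-config.json", "w", encoding="utf-8") as f:
	    json.dump(config, f, indent=2)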
	```

	### Config Validation Script

	Validate configs before loading:

	```python
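	import json


	def validate_config(filepath):
	    # Saved configs can contain non-ASCII text (e.g. emoji in logs), so read as UTF-8.
	    with open(filepath, encoding="utf-8") as f:
	        config = json.load(f)

	    # Check required top-level fields
	    assert "version" in config, "missing 'version'"
	    assert "dataSources" in config, "missing 'dataSources'"
	    assert "agents" in config, "missing 'agents'"

	    # Validate data sources
	    for ds in config["dataSources"]:
	        assert "label" in ds, "data source missing 'label'"
	        assert "content" in ds, "data source missing 'content'"

	    # Validate agents (publishTopic is optional, so it is not checked)
	    for agent in config["agents"]:
	        assert "title" in agent, "agent missing 'title'"
	        assert "prompt" in agent, "agent missing 'prompt'"
	        assert "model" in agent, "agent missing 'model'"
	        assert "subscribeTopic" in agent, "agent missing 'subscribeTopic'"
	        assert "showResult" in agent, "agent missing 'showResult'"

	    print(f"✓ Config is valid: {len(config['dataSources'])} data sources, {len(config['agents'])} agents")


	validate_config("my-config.json")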
	```

	### Merge Configs

	Combine multiple configs:

	```python
	import json


	def merge_configs(config1_path, config2_path, output_path):
	    # Read both configs as UTF-8 so non-ASCII content survives on every platform.
	    with open(config1_path, encoding="utf-8") as f1, open(config2_path, encoding="utf-8") as f2:
	        c1 = json.load(f1)
	        c2 = json.load(f2)

	    # Concatenate data sources and agents; the user question comes from the first config.
	    merged = {
	        "version": "1.0",
	        "dataSources": c1["dataSources"] + c2["dataSources"],
	        "agents": c1["agents"] + c2["agents"],
	        "userQuestion": c1.get("userQuestion", ""),
	    }

	    with open(output_path, "w", encoding="utf-8") as f:
	        json.dump(merged, f, indent=2)


	merge_configs("pipeline-a.json", "pipeline-b.json", "merged-pipeline.json")
	```

	## Troubleshooting

	### Issue: "Invalid configuration file"

	Cause: File format is incorrect
	Solution:
	1. Open file in text editor
	2. Verify it's valid JSON
	3. Check required fields exist

	### Issue: Data sources empty after load

	Cause: Content wasn't saved
	Solution: Check that the original file has its "content" fields populated

	### Issue: Agents not working after load

	Cause: Model might not be available
	Solution: Check that each agent's "model" field matches an available model (phi4-mini, cniongolo/biomistral)
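
The model check can be done on the file before loading it. A minimal sketch, using the two model names listed in this guide as the default set; the helper name is hypothetical and the set should be adjusted to whatever your installation actually provides.

```python
# Model names taken from this guide; adjust to match your installation.
AVAILABLE_MODELS = frozenset({"phi4-mini", "cniongolo/biomistral"})


def agents_with_unknown_models(config: dict, available=AVAILABLE_MODELS) -> list:
    """Return (title, model) pairs for agents whose model is not in `available`."""
    return [
        (agent["title"], agent["model"])
        for agent in config["agents"]
        if agent["model"] not in available
    ]


config = {
    "agents": [
        {"title": "SQL Generator", "model": "phi4-mini"},
        {"title": "Old Agent", "model": "phi3"},
    ]
}
print(agents_with_unknown_models(config))  # [('Old Agent', 'phi3')]
```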

	### Issue: Topics not matching after load

	Cause: Topic names might have changed
	Solution: Topic matching is now case-insensitive, so look for typos rather than capitalization differences
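
Topic typos can be spotted by checking that every subscribed topic is published by some agent. The sketch below compares topics case-insensitively and treats `START` as the topic that begins a run, which is an assumption based on the examples in this guide; the helper name is hypothetical.

```python
def unmatched_subscriptions(config: dict, start_topic: str = "START") -> list:
    """Return (title, subscribeTopic) pairs that no agent publishes to."""
    published = {start_topic.lower()}
    for agent in config["agents"]:
        topic = agent.get("publishTopic")
        if topic:  # publishTopic may be null or empty
            published.add(topic.lower())
    return [
        (agent["title"], agent["subscribeTopic"])
        for agent in config["agents"]
        if agent["subscribeTopic"].lower() not in published
    ]


config = {
    "agents": [
        {"title": "Generator", "subscribeTopic": "START", "publishTopic": "SQL_GENERATED"},
        {"title": "Validator", "subscribeTopic": "sql_generated", "publishTopic": None},
        {"title": "Reporter", "subscribeTopic": "SQL_VALIDATD", "publishTopic": ""},
    ]
}
print(unmatched_subscriptions(config))  # [('Reporter', 'SQL_VALIDATD')]
```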

	## Tips

	1. Always test after loading: Execute pipeline to verify everything works
	2. Keep configs small: Separate large data sources into multiple configs
	3. Use version control: Track configs in Git for history
	4. Document changes: Add comments in data source content
	5. Share wisely: Remove sensitive data before sharing configs