# Configuration Save/Load Guide
## Overview
The Pub/Sub Multi-Agent System now supports saving and loading complete configurations, allowing you to:
- Save your entire setup (data sources + agents) as a JSON file
- Share configurations with teammates
- Version control your agent pipelines
- Quickly switch between different workflows
## Save Configuration
### How to Save
1. Configure your data sources and agents
2. **(Optional)** Check "☑ Save results" to include execution results
3. Click the **"Save Config"** button in the top-right header
4. A JSON file will download automatically with the name pattern:
```
pubsub-config-YYYY-MM-DD.json
```
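If you generate configuration files outside the UI and want them to follow the same naming pattern, the name can be built from a date. This is a minimal sketch; the helper name `default_config_filename` is hypothetical, and whether the app uses the local or the UTC date is not stated in this guide.

```python
from datetime import date


def default_config_filename(day: date) -> str:
    """Build a filename following the pubsub-config-YYYY-MM-DD.json pattern."""
    # date.isoformat() yields YYYY-MM-DD
    return f"pubsub-config-{day.isoformat()}.json"


print(default_config_filename(date(2026, 2, 1)))  # pubsub-config-2026-02-01.json
```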
### Save Results Checkbox
The **"☑ Save results"** checkbox allows you to include execution results in the saved configuration.
**When checked**, the config includes:
- All configuration data (agents, data sources)
- Final Result box content
- NER Result box content
- Execution Log content
**When unchecked** (default):
- Only configuration data is saved
- No results or logs
**Use cases for saving results**:
- Document successful executions
- Share complete analysis with team
- Archive results with configuration
- Review past executions later
### What Gets Saved
The configuration file includes:
- **Version**: Configuration format version (currently 1.0)
- **Timestamp**: When the config was saved
- **User Question**: Current question text
- **Data Sources**: All data sources with labels and content
- **Agents**: All agent configurations including:
  - Title
  - Prompt template
  - Model selection
  - Subscribe/Publish topics
  - Show result checkbox state
- **Results** (if "Save results" checked):
  - Final Result box content
  - NER Result box content
  - Execution Log content
### Example Configuration File
**Without Results**:
```json
{
  "version": "1.0",
  "timestamp": "2026-02-01T10:30:00.000Z",
  "userQuestion": "What are the top 10 customers?",
  "dataSources": [
    {
      "label": "Schema",
      "content": "Tables:\n- customers (id, name, email)\n- orders (id, customer_id, total)"
    }
  ],
  "agents": [
    {
      "title": "SQL Generator",
      "prompt": "Generate SQL for: {question}\nSchema: {schema}",
      "model": "phi4-mini",
      "subscribeTopic": "START",
      "publishTopic": "SQL_GENERATED",
      "showResult": true
    }
  ]
}
```
**With Results** (when "Save results" is checked):
```json
{
  "version": "1.0",
  "timestamp": "2026-02-01T10:30:00.000Z",
  "userQuestion": "Extract medical entities from patient note",
  "dataSources": [...],
  "agents": [...],
  "results": {
    "finalResult": "--- Entity Extractor ---\n[{\"text\": \"diabetes\", \"entity_type\": \"PROBLEM\"}]",
    "nerResult": "Patient has [diabetes:PROBLEM] and takes [metformin:TREATMENT]",
    "executionLog": "[10:30:00] ℹ️ Starting...\n[10:30:05] ✅ Complete"
  }
}
```
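To check from a script whether a saved file actually carries results, you can inspect the `results` object. This is a small sketch that relies only on the field names shown in the example above; the function name `populated_results` is hypothetical.

```python
import json

RESULT_KEYS = ("finalResult", "nerResult", "executionLog")


def populated_results(config: dict) -> list[str]:
    """Return the names of the result fields that are present and non-empty."""
    results = config.get("results") or {}
    return [key for key in RESULT_KEYS if results.get(key)]


saved = json.loads(
    '{"version": "1.0", "dataSources": [], "agents": [], '
    '"results": {"finalResult": "done", "nerResult": "", '
    '"executionLog": "[10:30:00] Starting..."}}'
)
print(populated_results(saved))  # ['finalResult', 'executionLog']
```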
## Load Configuration
### How to Load
1. Click the **"Load Config"** button in the top-right header
2. Select a previously saved JSON configuration file
3. The system will:
   - Clear current configuration
   - Load all data sources
   - Load all agents
   - Restore the user question
   - Display success message in logs
### What Happens on Load
- **Current config is replaced**: All existing data sources and agents are removed
- **New IDs assigned**: Loaded items get new unique IDs
- **Results restored** (if saved with results):
  - Final Result box populated
  - NER Result box populated
  - Execution Log populated
- **Empty boxes** (if no results saved):
  - All result boxes cleared
- **Validation**: File is checked for proper format before loading
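The load behavior described above can be modeled in a few lines. This is an illustrative Python sketch, not the application's own code: the in-memory `state` shape, the use of `uuid` for new IDs, and the empty-string default for a missing user question are all assumptions.

```python
import uuid

RESULT_KEYS = ("finalResult", "nerResult", "executionLog")


def apply_loaded_config(state: dict, config: dict) -> dict:
    """Replace the current state with the contents of a loaded config."""
    # The current configuration is replaced, not merged
    state.clear()
    state["userQuestion"] = config.get("userQuestion", "")  # assumed default
    # Loaded items get fresh IDs (uuid is an assumed ID scheme)
    state["dataSources"] = [{**ds, "id": uuid.uuid4().hex} for ds in config["dataSources"]]
    state["agents"] = [{**agent, "id": uuid.uuid4().hex} for agent in config["agents"]]
    # Result boxes are restored if saved, otherwise cleared
    results = config.get("results") or {}
    state["results"] = {key: results.get(key, "") for key in RESULT_KEYS}
    return state


state = {"userQuestion": "old question", "dataSources": [], "agents": [{"title": "Old agent"}]}
loaded = {
    "version": "1.0",
    "dataSources": [{"label": "Schema", "content": "Tables: customers"}],
    "agents": [{"title": "SQL Generator", "prompt": "...", "model": "phi4-mini",
                "subscribeTopic": "START", "showResult": True}],
}
apply_loaded_config(state, loaded)
print([agent["title"] for agent in state["agents"]])  # ['SQL Generator']
```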
### Error Handling
If the configuration file is invalid, you'll see an error message:
```
Failed to load configuration: Invalid configuration file
```
Common issues:
- Wrong file format (not JSON)
- Missing required fields (version, dataSources, agents)
- Corrupted file
## Use Cases
### Use Case 1: Template Workflows
Save common workflows as templates:
**sql-analysis-template.json**
```json
{
  "version": "1.0",
  "dataSources": [
    {"label": "Schema", "content": ""},
    {"label": "SampleData", "content": ""}
  ],
  "agents": [
    {"title": "Analyzer", "prompt": "...", ...},
    {"title": "Generator", "prompt": "...", ...},
    {"title": "Validator", "prompt": "...", ...}
  ]
}
```
Load this template, then fill in the Schema and SampleData contents.
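A template like this can also be derived from a working configuration by blanking the data source contents. The sketch below assumes only the file structure documented in this guide; the function name `make_template` is hypothetical.

```python
import copy


def make_template(config: dict) -> dict:
    """Return a reusable template: same agents, data source labels kept, contents emptied."""
    return {
        "version": config["version"],
        "dataSources": [{"label": ds["label"], "content": ""} for ds in config["dataSources"]],
        # Deep copy so later edits to the template do not touch the original config
        "agents": copy.deepcopy(config["agents"]),
    }


working = {
    "version": "1.0",
    "userQuestion": "What are the top 10 customers?",
    "dataSources": [{"label": "Schema", "content": "Tables:\n- customers (id, name, email)"}],
    "agents": [{"title": "Analyzer", "prompt": "...", "model": "phi4-mini",
                "subscribeTopic": "START", "publishTopic": "ANALYZED", "showResult": True}],
    "results": {"finalResult": "...", "nerResult": "", "executionLog": ""},
}
template = make_template(working)
print(template["dataSources"])  # [{'label': 'Schema', 'content': ''}]
```

Because the template is built from the required fields only, the user question, timestamp, and any saved results are left out.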
### Use Case 2: Team Collaboration
Share configurations with your team:
1. Developer A creates optimal pipeline
2. Saves config: `customer-analysis-pipeline.json`
3. Commits to Git repository
4. Developer B loads config
5. Everyone uses same proven workflow
### Use Case 3: A/B Testing Prompts
Compare different prompt strategies:
**Workflow:**
1. Create pipeline with Approach A
2. Save as `approach-a.json`
3. Modify prompts for Approach B
4. Save as `approach-b.json`
5. Load each config and compare results
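When comparing two saved approaches, it helps to confirm which prompts actually differ between the files. This is a minimal sketch that matches agents by title, which assumes titles are unique within a config; `changed_prompts` is a hypothetical helper name.

```python
def changed_prompts(config_a: dict, config_b: dict) -> list[str]:
    """Return the titles of agents present in both configs whose prompts differ."""
    prompts_a = {agent["title"]: agent["prompt"] for agent in config_a["agents"]}
    prompts_b = {agent["title"]: agent["prompt"] for agent in config_b["agents"]}
    shared_titles = prompts_a.keys() & prompts_b.keys()
    return sorted(title for title in shared_titles if prompts_a[title] != prompts_b[title])


approach_a = {"agents": [
    {"title": "Generator", "prompt": "Generate SQL for: {question}"},
    {"title": "Validator", "prompt": "Validate the SQL"},
]}
approach_b = {"agents": [
    {"title": "Generator", "prompt": "Think step by step, then generate SQL for: {question}"},
    {"title": "Validator", "prompt": "Validate the SQL"},
]}
print(changed_prompts(approach_a, approach_b))  # ['Generator']
```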
### Use Case 4: Different Data Sources
Same agents, different data:
**Workflow:**
1. Create agent pipeline once
2. Save config with empty data sources
3. For each new dataset:
   - Load config
   - Add new data sources
   - Execute
   - Save results
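If the datasets already exist as text, the per-dataset configs can be prepared ahead of time and then loaded one by one. The following sketch fills a template's data sources by label; `fill_template` and the sample contents are hypothetical.

```python
import copy


def fill_template(template: dict, contents: dict[str, str]) -> dict:
    """Return a copy of the template with data source contents filled in by label."""
    filled = copy.deepcopy(template)
    for ds in filled["dataSources"]:
        if ds["label"] in contents:
            ds["content"] = contents[ds["label"]]
    return filled


template = {
    "version": "1.0",
    "dataSources": [{"label": "Schema", "content": ""}, {"label": "Data", "content": ""}],
    "agents": [],
}
sales = fill_template(template, {"Schema": "Tables:\n- orders (id, customer_id, total)"})
print(sales["dataSources"][0]["content"])
```

Labels with no matching entry keep their empty content, and the template itself is left unchanged so it can be reused for the next dataset.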
### Use Case 5: Version Control
Track evolution of your pipelines:
```
git/
├── configs/
│   ├── v1-basic-sql.json
│   ├── v2-with-validation.json
│   ├── v3-multi-step.json
│   └── v4-production.json
```
Load previous versions to compare performance.
## Best Practices
### 1. Naming Conventions
Use descriptive filenames:
```
✅ Good:
- medical-diagnosis-workflow-v2.json
- sql-generator-with-validation.json
- customer-analysis-pipeline.json
❌ Bad:
- config.json
- test.json
- backup.json
```
### 2. Documentation in Configs
Add comments in data sources:
```json
{
  "label": "Schema",
  "content": "# Customer Database Schema v2.0\n# Last updated: 2026-02-01\n\nTables:\n- customers ..."
}
```
### 3. Version Your Configs
Include version info in data sources:
```json
{
  "label": "ConfigInfo",
  "content": "Pipeline Version: 3.0\nAuthor: Jane Doe\nPurpose: SQL generation with validation\nLast Modified: 2026-02-01"
}
```
### 4. Organize by Purpose
Create folder structure:
```
configs/
├── sql-generation/
│   ├── basic.json
│   ├── with-validation.json
│   └── with-optimization.json
├── medical-analysis/
│   ├── symptom-analysis.json
│   └── diagnosis-support.json
└── data-analysis/
    ├── sales-report.json
    └── customer-segmentation.json
```
### 5. Template Strategy
Create base templates without data:
```json
{
  "dataSources": [
    {"label": "Schema", "content": ""},
    {"label": "Data", "content": ""}
  ],
  "agents": [ /* fully configured */ ]
}
```
Load template, add data, execute!
### 6. Backup Before Experiments
Before trying new approaches:
1. Save current config
2. Make experimental changes
3. If it works: save new version
4. If it fails: reload backup
## Configuration File Structure
### Required Fields
```json
{
  "version": "1.0",   // Required: config format version
  "dataSources": [],  // Required: array (can be empty)
  "agents": []        // Required: array (can be empty)
}
```
### Optional Fields
```json
{
  "timestamp": "...",    // Optional: when saved
  "userQuestion": "..."  // Optional: user question text
}
```
### Data Source Object
```json
{
  "label": "string",   // Required: reference name
  "content": "string"  // Required: content (can be empty)
}
```
### Agent Object
```json
{
  "title": "string",           // Required: agent name
  "prompt": "string",          // Required: prompt template
  "model": "string",           // Required: model name
  "subscribeTopic": "string",  // Required: topic to listen to
  "publishTopic": "string",    // Optional: topic to publish to (can be null/empty)
  "showResult": boolean        // Required: whether to show in results
}
```
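When working with configs in Python, the agent object can be mirrored as a small typed structure so that a missing required field fails early. This is a sketch under the field rules listed above; the class name `AgentConfig` and its snake_case attribute names are illustrative choices, not part of the application.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentConfig:
    title: str
    prompt: str
    model: str
    subscribe_topic: str
    show_result: bool
    publish_topic: Optional[str] = None  # optional: may be null or empty in the file

    @classmethod
    def from_dict(cls, raw: dict) -> "AgentConfig":
        """Build an AgentConfig from one entry of the "agents" array."""
        return cls(
            title=raw["title"],  # raises KeyError if a required field is missing
            prompt=raw["prompt"],
            model=raw["model"],
            subscribe_topic=raw["subscribeTopic"],
            show_result=raw["showResult"],
            # Treat a missing, null, or empty publish topic the same way
            publish_topic=raw.get("publishTopic") or None,
        )


agent = AgentConfig.from_dict({
    "title": "Validator", "prompt": "...", "model": "phi4-mini",
    "subscribeTopic": "SQL_GENERATED", "publishTopic": "", "showResult": True,
})
print(agent.publish_topic)  # None
```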
## Advanced Usage
### Programmatic Config Generation
Generate configs programmatically:
```python
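import json


def load_schema_from_db():
    # Placeholder: replace with a real query against your database
    return "Tables:\n- customers (id, name, email)"


def load_business_rules():
    # Placeholder: replace with your own source of business rules
    return ""


config = {
    "version": "1.0",
    "timestamp": "2026-02-01T10:00:00Z",
    "dataSources": [
        {"label": "Schema", "content": load_schema_from_db()},
        {"label": "Rules", "content": load_business_rules()},
    ],
    "agents": [
        {
            "title": "SQL Generator",
            "prompt": "...",
            "model": "phi4-mini",
            "subscribeTopic": "START",
            "publishTopic": "SQL",
            "showResult": True,
        }
    ],
}

# Write the config so it can be loaded with the "Load Config" button
with open("auto-generated-config.json", "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2)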
```
### Config Validation Script
Validate configs before loading:
```python
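import json

REQUIRED_TOP_LEVEL = ("version", "dataSources", "agents")
REQUIRED_DATA_SOURCE = ("label", "content")
REQUIRED_AGENT = ("title", "prompt", "model", "subscribeTopic", "showResult")


def require_fields(obj, fields, where):
    # Raise instead of assert so the checks still run under `python -O`
    missing = [name for name in fields if name not in obj]
    if missing:
        raise ValueError(f"{where}: missing required field(s) {missing}")


def validate_config(filepath):
    with open(filepath, encoding="utf-8") as f:
        config = json.load(f)

    # Check required top-level fields, then each data source and agent
    require_fields(config, REQUIRED_TOP_LEVEL, "config")
    for i, ds in enumerate(config["dataSources"]):
        require_fields(ds, REQUIRED_DATA_SOURCE, f"dataSources[{i}]")
    for i, agent in enumerate(config["agents"]):
        require_fields(agent, REQUIRED_AGENT, f"agents[{i}]")

    print(f"✓ Config is valid: {len(config['dataSources'])} data sources, {len(config['agents'])} agents")


validate_config("my-config.json")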
```
### Merge Configs
Combine multiple configs:
```python
import json


def merge_configs(config1_path, config2_path, output_path):
    with open(config1_path, encoding="utf-8") as f1, open(config2_path, encoding="utf-8") as f2:
        c1 = json.load(f1)
        c2 = json.load(f2)

    # Concatenate data sources and agents; the user question comes from the first config
    merged = {
        "version": "1.0",
        "dataSources": c1["dataSources"] + c2["dataSources"],
        "agents": c1["agents"] + c2["agents"],
        "userQuestion": c1.get("userQuestion", ""),
    }

    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(merged, f, indent=2)


merge_configs("pipeline-a.json", "pipeline-b.json", "merged-pipeline.json")
```
## Troubleshooting
### Issue: "Invalid configuration file"
**Cause**: File format is incorrect
**Solution**:
1. Open file in text editor
2. Verify it's valid JSON
3. Check required fields exist
### Issue: Data sources empty after load
**Cause**: Content wasn't saved
**Solution**: Check that the "content" fields in the original file are populated
### Issue: Agents not working after load
**Cause**: Model might not be available
**Solution**: Check agent "model" field matches available models (phi4-mini, cniongolo/biomistral)
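A quick way to spot this before loading is to compare each agent's model against the models you have available. The sketch below uses the two model names mentioned in this guide as the default set; your installation may offer a different list, and `agents_with_unknown_models` is a hypothetical helper.

```python
# Model names taken from this guide; adjust to what your installation provides
KNOWN_MODELS = frozenset({"phi4-mini", "cniongolo/biomistral"})


def agents_with_unknown_models(config: dict, available=KNOWN_MODELS) -> list[tuple[str, str]]:
    """Return (title, model) pairs for agents whose model is not in the available set."""
    return [
        (agent["title"], agent["model"])
        for agent in config["agents"]
        if agent["model"] not in available
    ]


old_config = {"agents": [
    {"title": "SQL Generator", "model": "phi3"},
    {"title": "Entity Extractor", "model": "cniongolo/biomistral"},
]}
print(agents_with_unknown_models(old_config))  # [('SQL Generator', 'phi3')]
```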
### Issue: Topics not matching after load
**Cause**: Topic names might have changed
**Solution**: Topic names are now matched case-insensitively, so look for typos between each publish topic and the subscribe topic that should receive it
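Typos of this kind can be found by checking that every subscribe topic is published by some agent. This is a sketch with two loudly stated assumptions: that `START` is the initial topic (it is used that way in the examples in this guide, but the guide does not define it), and that lowercasing both sides reproduces the app's case-insensitive matching. The function name is hypothetical.

```python
def unmatched_subscriptions(config: dict, start_topic: str = "START") -> list[tuple[str, str]]:
    """Return (title, subscribeTopic) pairs that no agent publishes to."""
    # Assumption: the start topic is emitted by the system itself
    published = {start_topic.lower()}
    published |= {
        agent["publishTopic"].lower()
        for agent in config["agents"]
        if agent.get("publishTopic")  # publishTopic may be missing, null, or empty
    }
    return [
        (agent["title"], agent["subscribeTopic"])
        for agent in config["agents"]
        if agent["subscribeTopic"].lower() not in published
    ]


pipeline = {"agents": [
    {"title": "Generator", "subscribeTopic": "START", "publishTopic": "SQL_GENERATED"},
    {"title": "Validator", "subscribeTopic": "sql_generated", "publishTopic": None},
    {"title": "Reporter", "subscribeTopic": "SQL_VALIDATED"},
]}
print(unmatched_subscriptions(pipeline))  # [('Reporter', 'SQL_VALIDATED')]
```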
## Tips
1. **Always test after loading**: Execute pipeline to verify everything works
2. **Keep configs small**: Separate large data sources into multiple configs
3. **Use version control**: Track configs in Git for history
4. **Document changes**: Add comments in data source content
5. **Share wisely**: Remove sensitive data before sharing configs
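For tip 2, it can help to see which data sources dominate a config before deciding what to split out. A minimal sketch, assuming only the documented `label` and `content` fields; `data_source_sizes` is a hypothetical helper name.

```python
def data_source_sizes(config: dict) -> list[tuple[str, int]]:
    """Return (label, size in UTF-8 bytes) pairs, largest first."""
    sizes = [
        (ds["label"], len(ds["content"].encode("utf-8")))
        for ds in config["dataSources"]
    ]
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)


example = {"dataSources": [
    {"label": "Rules", "content": "abc"},
    {"label": "Schema", "content": "Tables: customers"},
]}
print(data_source_sizes(example))  # [('Schema', 17), ('Rules', 3)]
```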