# LM Studio Multi-Server Configuration

This directory contains configuration files for running multiple LM Studio servers simultaneously with the Felix Framework.

## Configuration Files

### `multi_model_config.json` ⭐ **RECOMMENDED**
Multi-model configuration for a single LM Studio server:
- **Research Model**: `qwen/qwen3-4b-2507` - Fast exploration
- **Analysis Model**: `qwen/qwen3-4b-thinking-2507` - Reasoning focused
- **Synthesis Model**: `google/gemma-3-12b` - High-quality output
- All models are served from the same endpoint: `http://127.0.0.1:1234/v1`
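The exact schema is defined by the Felix Framework's config loader, but a multi-model config along these lines maps each agent role to a model on the shared server (field names here are illustrative, not confirmed):

```json
{
  "base_url": "http://127.0.0.1:1234/v1",
  "models": {
    "research": "qwen/qwen3-4b-2507",
    "analysis": "qwen/qwen3-4b-thinking-2507",
    "synthesis": "google/gemma-3-12b"
  }
}
```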

### `server_config.json`
Multi-server configuration supporting up to 4 different LM Studio servers:
- **Creative Server** (port 1234): Fast model for research agents
- **Analytical Server** (port 1235): Balanced model for analysis/critic agents  
- **Synthesis Server** (port 1236): High-quality model for synthesis agents
- **Fallback Server** (port 1237): Fast backup model (disabled by default)

### `single_server_config.json`
Single-server configuration for comparison testing.

## Setting Up Multiple LM Studio Servers

1. **Start LM Studio instances on different ports:**
   ```bash
   # Terminal 1: Creative server
   lms load mistral-7b-instruct
   lms server start --port 1234

   # Terminal 2: Analytical server
   lms load llama-3.1-8b-instruct
   lms server start --port 1235

   # Terminal 3: Synthesis server
   lms load mixtral-8x7b-instruct
   lms server start --port 1236
   ```

2. **Or use LM Studio GUI:**
   - Launch multiple LM Studio instances
   - Set different ports in Settings → Developer → Port
   - Load different models in each instance
   - Start servers

## Agent Type Mappings

- **Research agents** → Creative server (Mistral) for broad exploration
- **Analysis agents** → Analytical server (Llama) for focused analysis  
- **Synthesis agents** → Synthesis server (Mixtral) for high-quality output
- **Critic agents** → Analytical server (Llama) for reasoning/validation
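In `server_config.json` these mappings would be expressed as something like the following (keys are illustrative; check the shipped config for the exact names):

```json
"agent_type_mapping": {
  "research": "creative",
  "analysis": "analytical",
  "synthesis": "synthesis",
  "critic": "analytical"
}
```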

## Usage

### With Multi-Model Config (Recommended):
```bash
python examples/blog_writer.py "AI safety" --server-config config/multi_model_config.json --debug
```

### Test Multi-Model Setup:
```bash
python examples/test_multi_model.py
```

### With Multi-Server Config:
```bash
python examples/blog_writer.py "AI safety" --server-config config/server_config.json --debug
```

### With Single-Server Config:
```bash
python examples/blog_writer.py "AI safety" --server-config config/single_server_config.json --debug
```

### Performance Comparison:
```bash
python examples/test_multi_server_performance.py
```

## Configuration Options

### Server Settings:
- `name`: Unique server identifier
- `url`: LM Studio server URL
- `model`: Model name to use
- `timeout`: Request timeout in seconds
- `max_concurrent`: Maximum concurrent requests
- `weight`: Load balancing weight
- `enabled`: Whether server is active
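Putting these fields together, a single server entry in `server_config.json` might look like this (values illustrative):

```json
{
  "name": "analytical",
  "url": "http://127.0.0.1:1235/v1",
  "model": "llama-3.1-8b-instruct",
  "timeout": 120,
  "max_concurrent": 4,
  "weight": 1.0,
  "enabled": true
}
```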

### Load Balancing Strategies:
- `agent_type_mapping`: Route each agent type to its mapped server (recommended)
- `round_robin`: Rotate between servers
- `least_busy`: Use server with lowest load
- `fastest_response`: Use server with best response time
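This is not the framework's actual code, but the non-mapping strategies reduce to simple selection rules over per-server stats; a minimal sketch in Python:

```python
from dataclasses import dataclass
from itertools import cycle

@dataclass
class Server:
    name: str
    active_requests: int = 0      # current in-flight requests
    avg_response_s: float = 0.0   # rolling average response time, seconds

def least_busy(servers):
    """Pick the server with the fewest in-flight requests."""
    return min(servers, key=lambda s: s.active_requests)

def fastest_response(servers):
    """Pick the server with the best average response time."""
    return min(servers, key=lambda s: s.avg_response_s)

servers = [
    Server("creative", active_requests=3, avg_response_s=0.8),
    Server("analytical", active_requests=1, avg_response_s=1.5),
]
rr = cycle(servers)  # round_robin: rotate between servers

print(least_busy(servers).name)        # analytical
print(fastest_response(servers).name)  # creative
print(next(rr).name, next(rr).name)    # creative analytical
```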

## Health Monitoring

The system automatically:
- Checks server health every 30 seconds
- Fails over to available servers
- Monitors response times and load
- Displays server status in debug mode
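A simplified view of the failover behavior (again a sketch, not the framework's implementation): a request for an unhealthy server falls through to the next healthy one in order.

```python
def pick_server(preferred, servers, health):
    """Return the preferred server if healthy, else the first healthy fallback.

    `servers` is an ordered list of server names; `health` maps name -> bool,
    as updated by the periodic health check. Raises if nothing is available.
    """
    if health.get(preferred):
        return preferred
    for name in servers:
        if health.get(name):
            return name
    raise RuntimeError("no healthy servers available")

health = {"creative": False, "analytical": True, "synthesis": True}
print(pick_server("creative", ["creative", "analytical", "synthesis"], health))
# analytical
```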

## Performance Benefits

Multi-server setup provides:
- **True parallelism**: Agents' requests are processed simultaneously on separate servers
- **Model specialization**: Each agent type uses the model best suited to its role
- **Load distribution**: Work is spread across multiple GPUs/servers
- **Fault tolerance**: Processing continues if one server fails
- **3-4x throughput**: Achievable with a properly provisioned multi-server setup