LM Studio Multi-Server Configuration
This directory contains configuration files for running multiple LM Studio servers simultaneously with the Felix Framework.
Configuration Files
multi_model_config.json ⭐ RECOMMENDED
Multi-model configuration for single LM Studio server:
- Research Model: qwen/qwen3-4b-2507 (fast exploration)
- Analysis Model: qwen/qwen3-4b-thinking-2507 (reasoning-focused)
- Synthesis Model: google/gemma-3-12b (high-quality output)
- All models use the same server: http://127.0.0.1:1234/v1
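A sketch of what multi_model_config.json might contain, based on the models listed above. The exact schema is an assumption; check the actual file for the real field names:

```json
{
  "models": {
    "research": {
      "model": "qwen/qwen3-4b-2507",
      "url": "http://127.0.0.1:1234/v1"
    },
    "analysis": {
      "model": "qwen/qwen3-4b-thinking-2507",
      "url": "http://127.0.0.1:1234/v1"
    },
    "synthesis": {
      "model": "google/gemma-3-12b",
      "url": "http://127.0.0.1:1234/v1"
    }
  }
}
```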
server_config.json
Multi-server configuration supporting up to 4 different LM Studio servers:
- Creative Server (port 1234): Fast model for research agents
- Analytical Server (port 1235): Balanced model for analysis/critic agents
- Synthesis Server (port 1236): High-quality model for synthesis agents
- Fallback Server (port 1237): Fast backup model (disabled by default)
single_server_config.json
Single-server configuration for comparison testing.
Setting Up Multiple LM Studio Servers
Start LM Studio instances on different ports:
# Terminal 1: Creative server
lms server start --port 1234 --model mistral-7b-instruct

# Terminal 2: Analytical server
lms server start --port 1235 --model llama-3.1-8b-instruct

# Terminal 3: Synthesis server
lms server start --port 1236 --model mixtral-8x7b-instruct

Or use the LM Studio GUI:
- Launch multiple LM Studio instances
- Set different ports in Settings → Developer → Port
- Load different models in each instance
- Start servers
Agent Type Mappings
- Research agents → Creative server (Mistral) for broad exploration
- Analysis agents → Analytical server (Llama) for focused analysis
- Synthesis agents → Synthesis server (Mixtral) for high-quality output
- Critic agents → Analytical server (Llama) for reasoning/validation
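In the multi-server configuration, these mappings would be expressed along these lines. The `agent_type_mapping` key name comes from the strategy list in this document; the rest of the shape is an assumption:

```json
{
  "load_balancing": "agent_type_mapping",
  "agent_type_mapping": {
    "research": "creative",
    "analysis": "analytical",
    "synthesis": "synthesis",
    "critic": "analytical"
  }
}
```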
Usage
With Multi-Model Config (Recommended):
python examples/blog_writer.py "AI safety" --server-config config/multi_model_config.json --debug
Test Multi-Model Setup:
python examples/test_multi_model.py
With Multi-Server Config:
python examples/blog_writer.py "AI safety" --server-config config/server_config.json --debug
With Single-Server Config:
python examples/blog_writer.py "AI safety" --server-config config/single_server_config.json --debug
Performance Comparison:
python examples/test_multi_server_performance.py
Configuration Options
Server Settings:
- name: Unique server identifier
- url: LM Studio server URL
- model: Model name to use
- timeout: Request timeout in seconds
- max_concurrent: Maximum concurrent requests
- weight: Load balancing weight
- enabled: Whether the server is active
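Putting those fields together, a single server entry might look like this (values are illustrative, not taken from the shipped config):

```json
{
  "name": "analytical",
  "url": "http://127.0.0.1:1235/v1",
  "model": "llama-3.1-8b-instruct",
  "timeout": 120,
  "max_concurrent": 4,
  "weight": 1.0,
  "enabled": true
}
```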
Load Balancing Strategies:
- agent_type_mapping: Use agent type mappings (recommended)
- round_robin: Rotate between servers
- least_busy: Use the server with the lowest load
- fastest_response: Use the server with the best response time
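The last three strategies can be sketched in a few lines of Python. This is a simplified illustration of the selection logic, not the framework's actual implementation; the `Server` class and its field names are assumptions:

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Server:
    """Hypothetical server record; fields mirror the settings above."""
    name: str
    active_requests: int = 0     # current load, used by least_busy
    avg_response_s: float = 0.0  # rolling average, used by fastest_response


servers = [
    Server("creative", active_requests=2, avg_response_s=0.8),
    Server("analytical", active_requests=0, avg_response_s=1.5),
    Server("synthesis", active_requests=5, avg_response_s=3.2),
]

# round_robin: rotate between servers
rr = cycle(servers)

# least_busy: pick the server with the lowest current load
def least_busy(pool):
    return min(pool, key=lambda s: s.active_requests)

# fastest_response: pick the server with the best average response time
def fastest_response(pool):
    return min(pool, key=lambda s: s.avg_response_s)

print(next(rr).name)                  # creative
print(least_busy(servers).name)       # analytical
print(fastest_response(servers).name) # creative
```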
Health Monitoring
The system automatically:
- Checks server health every 30 seconds
- Fails over to available servers
- Monitors response times and load
- Displays server status in debug mode
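The failover behavior can be sketched as follows: route to the preferred server for an agent type, and fall back to any healthy server when it is down. All names here are hypothetical; the framework's own health checker may work differently:

```python
# Hypothetical health table: server name -> healthy flag,
# as refreshed by the periodic health check.
health = {"creative": False, "analytical": True, "synthesis": True}

# Preferred server per agent type (mirrors the mappings above).
preferred = {
    "research": "creative",
    "analysis": "analytical",
    "synthesis": "synthesis",
    "critic": "analytical",
}


def pick_server(agent_type: str) -> str:
    """Return the preferred server if healthy, else the first healthy fallback."""
    want = preferred[agent_type]
    if health.get(want):
        return want
    for name, ok in health.items():
        if ok:
            return name  # failover to any available server
    raise RuntimeError("no healthy servers available")


print(pick_server("research"))  # creative is down, so falls back to analytical
```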
Performance Benefits
Multi-server setup provides:
- True parallelism: Agents process simultaneously
- Model specialization: Each agent type uses optimal model
- Load distribution: Spread across multiple GPUs/servers
- Fault tolerance: Continue if one server fails
- 3-4x speedup: Achievable with a properly provisioned multi-server setup