Spaces:

jkbennitt
/

felix-framework

Paused

App Files Files Community

felix-framework / scripts /config /README.md

jkbennitt

Clean hf-space branch and prepare for HuggingFace Spaces deployment

fb867c3 5 months ago

preview code

raw

history blame contribute delete

3.64 kB

	# LM Studio Multi-Server Configuration

	This directory contains configuration files for running multiple LM Studio servers simultaneously with the Felix Framework.

	## Configuration Files

	### `multi_model_config.json` ⭐ RECOMMENDED
	Multi-model configuration for single LM Studio server:
	- Research Model: `qwen/qwen3-4b-2507` - Fast exploration
	- Analysis Model: `qwen/qwen3-4b-thinking-2507` - Reasoning focused
	- Synthesis Model: `google/gemma-3-12b` - High-quality output
	- All models use same server: `http://127.0.0.1:1234/v1`

	### `server_config.json`
	Multi-server configuration supporting up to 4 different LM Studio servers:
	- Creative Server (port 1234): Fast model for research agents
	- Analytical Server (port 1235): Balanced model for analysis/critic agents
	- Synthesis Server (port 1236): High-quality model for synthesis agents
	- Fallback Server (port 1237): Fast backup model (disabled by default)

	### `single_server_config.json`
	Single-server configuration for comparison testing.

	## Setting Up Multiple LM Studio Servers

	1. Start LM Studio instances on different ports:
	```bash
	# Terminal 1: Creative server
	lms server start --port 1234 --model mistral-7b-instruct

	# Terminal 2: Analytical server
	lms server start --port 1235 --model llama-3.1-8b-instruct

	# Terminal 3: Synthesis server
	lms server start --port 1236 --model mixtral-8x7b-instruct
	```

	2. Or use LM Studio GUI:
	- Launch multiple LM Studio instances
	- Set different ports in Settings → Developer → Port
	- Load different models in each instance
	- Start servers

	## Agent Type Mappings

	- Research agents → Creative server (Mistral) for broad exploration
	- Analysis agents → Analytical server (Llama) for focused analysis
	- Synthesis agents → Synthesis server (Mixtral) for high-quality output
	- Critic agents → Analytical server (Llama) for reasoning/validation

	## Usage

	### With Multi-Model Config (Recommended):
	```bash
	python examples/blog_writer.py "AI safety" --server-config config/multi_model_config.json --debug
	```

	### Test Multi-Model Setup:
	```bash
	python examples/test_multi_model.py
	```

	### With Multi-Server Config:
	```bash
	python examples/blog_writer.py "AI safety" --server-config config/server_config.json --debug
	```

	### With Single-Server Config:
	```bash
	python examples/blog_writer.py "AI safety" --server-config config/single_server_config.json --debug
	```

	### Performance Comparison:
	```bash
	python examples/test_multi_server_performance.py
	```

	## Configuration Options

	### Server Settings:
	- `name`: Unique server identifier
	- `url`: LM Studio server URL
	- `model`: Model name to use
	- `timeout`: Request timeout in seconds
	- `max_concurrent`: Maximum concurrent requests
	- `weight`: Load balancing weight
	- `enabled`: Whether server is active

	### Load Balancing Strategies:
	- `agent_type_mapping`: Use agent type mappings (recommended)
	- `round_robin`: Rotate between servers
	- `least_busy`: Use server with lowest load
	- `fastest_response`: Use server with best response time

	## Health Monitoring

	The system automatically:
	- Checks server health every 30 seconds
	- Fails over to available servers
	- Monitors response times and load
	- Displays server status in debug mode

	## Performance Benefits

	Multi-server setup provides:
	- True parallelism: Agents process simultaneously
	- Model specialization: Each agent type uses optimal model
	- Load distribution: Spread across multiple GPUs/servers
	- Fault tolerance: Continue if one server fails
	- 3-4x performance: With proper server setup