	+++
	disableToc = false
	title = "⚙️ Runtime Settings"
	weight = 25
	url = '/features/runtime-settings'
	+++

	LocalAI provides a web-based interface for managing application settings at runtime. Settings changed through the web UI are automatically persisted to a configuration file and take effect immediately, without a restart.

	## Accessing Runtime Settings

	Open the Settings page from the management interface at `http://localhost:8080/manage`. The page groups the available options into the sections described below.

	## Available Settings

	### Watchdog Settings

	The watchdog monitors backend activity and can automatically stop idle or overly busy models to free up resources.

	- Watchdog Enabled: Master switch to enable/disable the watchdog
	- Watchdog Idle Enabled: Enable stopping backends that are idle longer than the idle timeout
	- Watchdog Busy Enabled: Enable stopping backends that are busy longer than the busy timeout
	- Watchdog Idle Timeout: Duration threshold for idle backends (default: `15m`)
	- Watchdog Busy Timeout: Duration threshold for busy backends (default: `5m`)

	Changes to watchdog settings are applied immediately by restarting the watchdog service.
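
The way these switches combine can be sketched in a few lines of Python. This is an illustration of the rules described above, not LocalAI's implementation: the `Backend` class, the `backends_to_stop` function, and the use of plain seconds for timeouts are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical view of a loaded backend, used only in this sketch."""
    name: str
    busy: bool
    seconds_in_state: float  # how long it has been continuously idle or busy


def backends_to_stop(backends, *, enabled=True, idle_enabled=True,
                     busy_enabled=False, idle_timeout=15 * 60,
                     busy_timeout=5 * 60):
    """Return the names of backends that the watchdog rules would stop."""
    if not enabled:
        return []  # master switch off: nothing is stopped
    stop = []
    for b in backends:
        if b.busy and busy_enabled and b.seconds_in_state > busy_timeout:
            stop.append(b.name)
        elif not b.busy and idle_enabled and b.seconds_in_state > idle_timeout:
            stop.append(b.name)
    return stop


backends = [
    Backend("llama", busy=False, seconds_in_state=20 * 60),   # idle for 20m
    Backend("whisper", busy=True, seconds_in_state=10 * 60),  # busy for 10m
    Backend("tts", busy=False, seconds_in_state=60),          # idle for 1m
]
print(backends_to_stop(backends))                     # ['llama']
print(backends_to_stop(backends, busy_enabled=True))  # ['llama', 'whisper']
```

With only the idle check enabled, the long-running busy backend is left alone. Enabling the busy check as well stops it once it exceeds the busy timeout.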

	### Backend Configuration

	- Max Active Backends: Maximum number of active backends (loaded models). When exceeded, the least recently used model is automatically evicted. Set to `0` for unlimited, `1` for single-backend mode
	- Parallel Backend Requests: Enable backends to handle multiple requests in parallel if supported
	- Force Eviction When Busy: Allow evicting models even when they have active API calls (default: disabled for safety). Warning: enabling this can interrupt in-flight requests.
	- LRU Eviction Max Retries: Maximum number of retries when waiting for busy models to become idle before eviction (default: 30)
	- LRU Eviction Retry Interval: Interval between retries when waiting for busy models (default: `1s`)

	> Note: The "Single Backend" setting is deprecated. Use "Max Active Backends" set to `1` for single-backend behavior.

	#### LRU Eviction Behavior

	By default, LocalAI does not evict models that have active API calls, so ongoing requests are not interrupted. When eviction is needed but every model is busy:

	1. LocalAI waits for a model to become idle
	2. It retries eviction up to the configured maximum number of retries
	3. It waits for the configured retry interval between attempts
	4. If all retries are exhausted, it proceeds without evicting (which may cause out-of-memory errors if resources are truly exhausted)

	You can configure these settings via the web UI or through environment variables. See [VRAM Management]({{%relref "advanced/vram-management" %}}) for more details.
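
The retry behavior can be pictured with a short Python sketch. It is not LocalAI's code: `evict_with_retries`, the model dictionaries, and the injectable `sleep` parameter are hypothetical. It also assumes that "up to N retries" means one initial attempt followed by at most N retries.

```python
import time


def evict_with_retries(models, *, force=False, max_retries=30,
                       retry_interval=1.0, sleep=time.sleep):
    """Try to evict the least recently used model that is not busy.

    `models` is ordered from least to most recently used. Each entry is a
    dict with the hypothetical keys "name" and "busy" (a zero-argument
    callable). Returns the evicted name, or None if retries were exhausted.
    """
    for attempt in range(max_retries + 1):
        for m in models:
            if force or not m["busy"]():
                models.remove(m)
                return m["name"]
        if attempt < max_retries:
            sleep(retry_interval)  # wait before checking again
    return None  # caller proceeds without eviction


# Simulate a model that reports busy for its first two checks.
checks = {"n": 0}


def busy_for_two_checks():
    checks["n"] += 1
    return checks["n"] <= 2


models = [{"name": "a", "busy": busy_for_two_checks}]
waits = []  # records each sleep instead of actually sleeping
print(evict_with_retries(models, sleep=waits.append))  # a
print(len(waits))                                      # 2
```

Passing `force=True` mirrors "Force Eviction When Busy": the first model in LRU order is evicted immediately, whether or not it is serving a request.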

	### Performance Settings

	- Threads: Number of threads used for parallel computation (recommended: number of physical cores)
	- Context Size: Default context size for models (default: `512`)
	- F16: Enable GPU acceleration using 16-bit floating point

	### Debug and Logging

	- Debug Mode: Enable debug logging (deprecated, use log-level instead)

	### API Security

	- CORS: Enable Cross-Origin Resource Sharing
	- CORS Allow Origins: Comma-separated list of allowed CORS origins
	- CSRF: Enable CSRF protection middleware
	- API Keys: Manage API keys for authentication (one per line or comma-separated)

	### P2P Settings

	Configure peer-to-peer networking for distributed inference:

	- P2P Token: Authentication token for P2P network
	- P2P Network ID: Network identifier for P2P connections
	- Federated Mode: Enable federated mode for P2P network

	Changes to P2P settings automatically restart the P2P stack with the new configuration.

	### Gallery Settings

	Manage model and backend galleries:

	- Model Galleries: JSON array of gallery objects with `url` and `name` fields
	- Backend Galleries: JSON array of backend gallery objects
	- Autoload Galleries: Automatically load model galleries on startup
	- Autoload Backend Galleries: Automatically load backend galleries on startup
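
Because the gallery fields take raw JSON, it can help to check a value before pasting it into the UI. The sketch below assumes only what is stated above, namely that each gallery object has `url` and `name` fields. The `parse_galleries` helper is hypothetical, and LocalAI's own validation may differ.

```python
import json


def parse_galleries(text):
    """Parse a JSON array of gallery objects and check `url` and `name`."""
    galleries = json.loads(text)  # raises ValueError on malformed JSON
    if not isinstance(galleries, list):
        raise ValueError("expected a JSON array")
    for i, g in enumerate(galleries):
        if not isinstance(g, dict):
            raise ValueError(f"entry {i} is not an object")
        for field in ("url", "name"):
            if not isinstance(g.get(field), str) or not g[field]:
                raise ValueError(f"entry {i} needs a non-empty {field!r}")
    return galleries


value = '[{"url": "github:mudler/LocalAI/gallery/index.yaml@master", "name": "localai"}]'
print(parse_galleries(value)[0]["name"])  # localai
```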

	## Configuration Persistence

	All settings are automatically saved to `runtime_settings.json` in the `LOCALAI_CONFIG_DIR` directory (default: `BASEPATH/configuration`). This file is watched for changes, so modifications made directly to the file will also be applied at runtime.

	## Environment Variable Precedence

	Environment variables take precedence over settings configured via the web UI or configuration files. If a setting is controlled by an environment variable, it cannot be modified through the web interface. The settings page will indicate when a setting is controlled by an environment variable.

	The precedence order is:
	1. Environment variables (highest priority)
	2. Configuration files (`runtime_settings.json`, `api_keys.json`)
	3. Default values (lowest priority)
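
A minimal Python sketch of this lookup order follows. The `resolve` helper, the `LOCALAI_` prefix used to derive variable names, and the default for `threads` are assumptions made for illustration. Check the actual variable name of each setting before relying on it.

```python
import os

# Illustrative defaults only; the `threads` value is made up for the example.
DEFAULTS = {"threads": 4, "context_size": 512, "f16": False}


def resolve(name, file_settings, env=os.environ, env_prefix="LOCALAI_"):
    """Return (value, source) using env > config file > default."""
    env_name = env_prefix + name.upper()  # assumed naming scheme
    if env_name in env:
        return env[env_name], "env"  # the UI would show this as locked
    if name in file_settings:
        return file_settings[name], "file"
    return DEFAULTS[name], "default"


file_settings = {"threads": 8, "context_size": 2048}
env = {"LOCALAI_CONTEXT_SIZE": "4096"}
print(resolve("context_size", file_settings, env))  # ('4096', 'env')
print(resolve("threads", file_settings, env))       # (8, 'file')
print(resolve("f16", file_settings, env))           # (False, 'default')
```

Environment values arrive as strings, so a real implementation would also have to convert them to the setting's type.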

	## Example Configuration

	The `runtime_settings.json` file follows this structure:

	```json
	{
	  "watchdog_enabled": true,
	  "watchdog_idle_enabled": true,
	  "watchdog_busy_enabled": false,
	  "watchdog_idle_timeout": "15m",
	  "watchdog_busy_timeout": "5m",
	  "max_active_backends": 0,
	  "parallel_backend_requests": true,
	  "force_eviction_when_busy": false,
	  "lru_eviction_max_retries": 30,
	  "lru_eviction_retry_interval": "1s",
	  "threads": 8,
	  "context_size": 2048,
	  "f16": false,
	  "debug": false,
	  "cors": true,
	  "csrf": false,
	  "cors_allow_origins": "*",
	  "p2p_token": "",
	  "p2p_network_id": "",
	  "federated": false,
	  "galleries": [
	    {
	      "url": "github:mudler/LocalAI/gallery/index.yaml@master",
	      "name": "localai"
	    }
	  ],
	  "backend_galleries": [
	    {
	      "url": "github:mudler/LocalAI/backend/index.yaml@master",
	      "name": "localai"
	    }
	  ],
	  "autoload_galleries": true,
	  "autoload_backend_galleries": true,
	  "api_keys": []
	}
	```
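
When a script edits this file instead of the web UI, writing it in one step avoids exposing a half-written file to the watcher. The sketch below uses the usual write-to-temp-then-rename pattern. `update_settings` is a hypothetical helper. Whether LocalAI's watcher reacts cleanly to a rename-based replace, or to the short-lived temporary file in the same directory, is an assumption worth checking on your setup.

```python
import json
import os
import tempfile


def update_settings(path, **changes):
    """Merge `changes` into the JSON file at `path`, replacing it atomically."""
    try:
        with open(path, encoding="utf-8") as f:
            settings = json.load(f)
    except FileNotFoundError:
        settings = {}  # start from an empty document
    settings.update(changes)

    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(settings, f, indent=2)
            f.write("\n")
        os.replace(tmp, path)  # atomic when both paths share a filesystem
    except BaseException:
        os.unlink(tmp)  # do not leave the temporary file behind
        raise
    return settings


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "runtime_settings.json")
    update_settings(path, watchdog_enabled=True)
    result = update_settings(path, watchdog_idle_timeout="30m")
    print(result)  # {'watchdog_enabled': True, 'watchdog_idle_timeout': '30m'}
```

`tempfile.mkstemp` creates the file readable and writable only by the current user, so the replaced file ends up with those restrictive permissions.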

	## API Keys Management

	API keys can be managed through the runtime settings interface. Keys can be entered one per line or comma-separated.

	Important Notes:
	- API keys from environment variables are always included and cannot be removed via the UI
	- Runtime API keys are stored in `runtime_settings.json`
	- For backward compatibility, API keys can also be managed via `api_keys.json`
	- Setting `api_keys` to an empty array clears all runtime API keys but preserves keys supplied via environment variables
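
The input format and the merge rule above can be illustrated as follows. Both helper names are hypothetical. Trimming whitespace and dropping duplicates are assumptions, not documented behavior.

```python
import re


def parse_api_keys(text):
    """Split input on newlines or commas, dropping blanks and duplicates."""
    keys = []
    for part in re.split(r"[,\n]", text):
        key = part.strip()
        if key and key not in keys:
            keys.append(key)
    return keys


def effective_keys(env_keys, runtime_keys):
    """Environment keys are always kept; runtime keys are added after them."""
    return list(dict.fromkeys([*env_keys, *runtime_keys]))


runtime = parse_api_keys("key-a, key-b\nkey-c\n\nkey-a")
print(runtime)                               # ['key-a', 'key-b', 'key-c']
print(effective_keys(["env-key"], runtime))  # ['env-key', 'key-a', 'key-b', 'key-c']
print(effective_keys(["env-key"], []))       # ['env-key']
```

The last call corresponds to clearing the runtime keys: the environment-provided key remains.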

	## Dynamic Configuration

	The runtime settings system supports dynamic configuration file watching. When `LOCALAI_CONFIG_DIR` is set, LocalAI monitors the following files for changes:

	- `runtime_settings.json` - Unified runtime settings
	- `api_keys.json` - API keys (for backward compatibility)
	- `external_backends.json` - External backend configurations

	Changes to these files are automatically detected and applied without requiring a restart.
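
To make the idea concrete, here is a small polling sketch that detects changes to the three files by comparing modification times. Polling stands in for whatever watching mechanism LocalAI actually uses. `changed_files` and the `last_seen` dictionary are hypothetical names.

```python
import os
import tempfile

WATCHED = ("runtime_settings.json", "api_keys.json", "external_backends.json")


def changed_files(config_dir, last_seen):
    """Return watched files whose modification time differs from `last_seen`.

    `last_seen` maps file name to the last observed mtime in nanoseconds
    (None for a missing file) and is updated in place.
    """
    changed = []
    for name in WATCHED:
        try:
            mtime = os.stat(os.path.join(config_dir, name)).st_mtime_ns
        except FileNotFoundError:
            mtime = None  # missing files are tracked as None
        if last_seen.get(name) != mtime:
            last_seen[name] = mtime
            changed.append(name)
    return changed


with tempfile.TemporaryDirectory() as d:
    seen = {}
    with open(os.path.join(d, "runtime_settings.json"), "w") as f:
        f.write("{}")
    first = changed_files(d, seen)   # the new file is reported once
    second = changed_files(d, seen)  # nothing changed since the last call
    print(first, second)  # ['runtime_settings.json'] []
```

A real watcher would call this in a loop with a short sleep and reload only the files it returns.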

	## Best Practices

	1. Use Environment Variables for Production: For production deployments, use environment variables for critical settings to ensure they cannot be accidentally changed via the web UI.

	2. Backup Configuration Files: Before making significant changes, consider backing up your `runtime_settings.json` file.

	3. Monitor Resource Usage: When enabling watchdog features, monitor your system to ensure the timeout values are appropriate for your workload.

	4. Secure API Keys: API keys are sensitive. Restrict file permissions on the configuration files so that only the user running the LocalAI process can read them.

	5. Test Changes: Some settings (like watchdog timeouts) may require testing to find optimal values for your specific use case.

	## Troubleshooting

	### Settings Not Applying

	If settings are not being applied:
	1. Check if the setting is controlled by an environment variable
	2. Verify that `LOCALAI_CONFIG_DIR` is set correctly
	3. Check file permissions on `runtime_settings.json`
	4. Review application logs for configuration errors
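
Checks 2 and 3 can be scripted. The sketch below is a hypothetical diagnostic, not a LocalAI tool. It assumes a POSIX system and covers only the directory, the file's presence, its permission bits, and whether it parses as JSON.

```python
import json
import os
import stat


def diagnose(config_dir):
    """Return a list of problems found with runtime_settings.json."""
    if not os.path.isdir(config_dir):
        return [f"{config_dir} is not a directory"]
    path = os.path.join(config_dir, "runtime_settings.json")
    if not os.path.isfile(path):
        return [f"{path} does not exist"]

    problems = []
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append(f"{path} is accessible by group or others (mode {mode:o})")
    try:
        with open(path, encoding="utf-8") as f:
            json.load(f)
    except PermissionError:
        problems.append(f"{path} is not readable by the current user")
    except json.JSONDecodeError as e:
        problems.append(f"{path} is not valid JSON: {e}")
    return problems


# Fall back to a placeholder path when the variable is unset.
config_dir = os.environ.get("LOCALAI_CONFIG_DIR", "./configuration")
for problem in diagnose(config_dir):
    print(problem)
```

Run it as the same user that runs LocalAI, since the readability check depends on who executes it.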

	### Watchdog Not Working

	If the watchdog is not functioning:
	1. Ensure "Watchdog Enabled" is turned on
	2. Verify at least one of the idle or busy watchdogs is enabled
	3. Check that timeout values are reasonable for your workload
	4. Review logs for watchdog-related messages

	### P2P Not Starting

	If P2P is not starting:
	1. Verify the P2P token is set (non-empty)
	2. Check network connectivity
	3. Ensure the P2P network ID matches across nodes (if using federated mode)
	4. Review logs for P2P-related errors