+++
disableToc = false
title = "⚙️ Runtime Settings"
weight = 25
url = '/features/runtime-settings'
+++

LocalAI provides a web-based interface for managing application settings at runtime. Settings changed through this interface are automatically persisted to a configuration file and take effect immediately, without requiring a restart.

## Accessing Runtime Settings

Navigate to the **Settings** page from the management interface at `http://localhost:8080/manage`. The settings page provides a comprehensive interface for configuring various aspects of LocalAI.

## Available Settings

### Watchdog Settings

The watchdog monitors backend activity and can automatically stop idle or overly busy models to free up resources.

- **Watchdog Enabled**: Master switch to enable/disable the watchdog
- **Watchdog Idle Enabled**: Enable stopping backends that are idle longer than the idle timeout
- **Watchdog Busy Enabled**: Enable stopping backends that are busy longer than the busy timeout
- **Watchdog Idle Timeout**: Duration threshold for idle backends (default: `15m`)
- **Watchdog Busy Timeout**: Duration threshold for busy backends (default: `5m`)

Changes to watchdog settings are applied immediately by restarting the watchdog service.
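
The timeout values use Go-style duration strings (`15m`, `5m`, `1s`). If you generate or validate these values from a script, a small converter can catch malformed entries before they reach the configuration. The Python sketch below handles only the common units and is not LocalAI's own parser (LocalAI parses durations in Go):

```python
import re

# Rough Python equivalent of Go duration parsing for simple cases,
# useful for sanity-checking timeout values before writing them to
# runtime_settings.json.
_UNITS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600}

def duration_to_seconds(value: str) -> float:
    parts = re.findall(r"(\d+(?:\.\d+)?)(ms|s|m|h)", value)
    # Reject input with leftover characters the pattern did not consume.
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"unrecognized duration: {value!r}")
    return sum(float(n) * _UNITS[u] for n, u in parts)

print(duration_to_seconds("15m"))    # 900.0
print(duration_to_seconds("1h30m"))  # 5400.0
```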

### Backend Configuration

- **Max Active Backends**: Maximum number of active backends (loaded models). When exceeded, the least recently used model is automatically evicted. Set to `0` for unlimited, `1` for single-backend mode
- **Parallel Backend Requests**: Enable backends to handle multiple requests in parallel if supported
- **Force Eviction When Busy**: Allow evicting models even when they have active API calls (default: disabled for safety). **Warning:** Enabling this can interrupt active requests
- **LRU Eviction Max Retries**: Maximum number of retries when waiting for busy models to become idle before eviction (default: 30)
- **LRU Eviction Retry Interval**: Interval between retries when waiting for busy models (default: `1s`)

> **Note:** The "Single Backend" setting is deprecated. Use "Max Active Backends" set to `1` for single-backend behavior.

#### LRU Eviction Behavior

By default, LocalAI will skip evicting models that have active API calls to prevent interrupting ongoing requests. When all models are busy and eviction is needed:

1. The system will wait for models to become idle
2. It will retry eviction up to the configured maximum number of retries
3. The retry interval determines how long to wait between attempts
4. If all retries are exhausted, the system will proceed (which may cause out-of-memory errors if resources are truly exhausted)

You can configure these settings via the web UI or through environment variables. See [VRAM Management]({{%relref "advanced/vram-management" %}}) for more details.
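
As a back-of-envelope check, the worst-case time an eviction can spend waiting before proceeding is the retry count times the retry interval. The sketch below assumes a fixed interval with no backoff, which is an assumption of this illustration rather than a documented guarantee:

```python
def max_eviction_wait_seconds(max_retries: int = 30,
                              retry_interval_s: float = 1.0) -> float:
    # Worst case: every retry finds the model still busy, so the total
    # time spent waiting before eviction proceeds anyway is
    # retries * interval. Defaults match the documented values.
    return max_retries * retry_interval_s

print(max_eviction_wait_seconds())         # 30.0 (the defaults: 30 x 1s)
print(max_eviction_wait_seconds(10, 0.5))  # 5.0
```

With the defaults, a fully busy system may therefore stall an incoming load request for roughly 30 seconds before eviction is forced.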

### Performance Settings

- **Threads**: Number of threads used for parallel computation (recommended: number of physical cores)
- **Context Size**: Default context size for models (default: `512`)
- **F16**: Enable GPU acceleration using 16-bit floating point
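
To pick a starting value for **Threads**, note that Python's `os.cpu_count()` reports *logical* CPUs; on machines with SMT/hyper-threading, halving that is a rough heuristic for the physical core count, not a substitute for checking your actual hardware:

```python
import os

# os.cpu_count() reports logical CPUs; on SMT/hyper-threaded machines
# the physical core count is often half of that. Heuristic only.
logical = os.cpu_count() or 1
suggested_threads = max(1, logical // 2)
print(f"logical CPUs: {logical}, suggested threads: {suggested_threads}")
```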

### Debug and Logging

- **Debug Mode**: Enable debug logging (deprecated, use log-level instead)

### API Security

- **CORS**: Enable Cross-Origin Resource Sharing
- **CORS Allow Origins**: Comma-separated list of allowed CORS origins
- **CSRF**: Enable CSRF protection middleware
- **API Keys**: Manage API keys for authentication (one per line or comma-separated)
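
Since keys may arrive one per line, comma-separated, or a mix of both, normalizing the input looks roughly like the helper below (an illustrative sketch, not LocalAI's own parsing code):

```python
def parse_api_keys(raw: str) -> list:
    # Accept keys one per line or comma-separated (or a mix),
    # trimming whitespace and dropping empty entries.
    keys = []
    for line in raw.splitlines():
        for part in line.split(","):
            part = part.strip()
            if part:
                keys.append(part)
    return keys

print(parse_api_keys("key-a, key-b\nkey-c\n"))  # ['key-a', 'key-b', 'key-c']
```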

### P2P Settings

Configure peer-to-peer networking for distributed inference:

- **P2P Token**: Authentication token for P2P network
- **P2P Network ID**: Network identifier for P2P connections
- **Federated Mode**: Enable federated mode for P2P network

Changes to P2P settings automatically restart the P2P stack with the new configuration.

### Gallery Settings

Manage model and backend galleries:

- **Model Galleries**: JSON array of gallery objects with `url` and `name` fields
- **Backend Galleries**: JSON array of backend gallery objects
- **Autoload Galleries**: Automatically load model galleries on startup
- **Autoload Backend Galleries**: Automatically load backend galleries on startup

## Configuration Persistence

All settings are automatically saved to `runtime_settings.json` in the `LOCALAI_CONFIG_DIR` directory (default: `BASEPATH/configuration`). This file is watched for changes, so modifications made directly to the file will also be applied at runtime.
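
Because the file is plain JSON, it can also be edited programmatically. A minimal sketch using Python's standard `json` module; the temporary directory here stands in for your real `LOCALAI_CONFIG_DIR`:

```python
import json
import tempfile
from pathlib import Path

# Toggle a setting by editing runtime_settings.json directly.
# A temporary directory stands in for LOCALAI_CONFIG_DIR.
path = Path(tempfile.mkdtemp()) / "runtime_settings.json"
path.write_text(json.dumps({"watchdog_enabled": False, "threads": 8}))

settings = json.loads(path.read_text())
settings["watchdog_enabled"] = True
path.write_text(json.dumps(settings, indent=2))

print(json.loads(path.read_text())["watchdog_enabled"])  # True
```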

## Environment Variable Precedence

Environment variables take precedence over settings configured via the web UI or configuration files. If a setting is controlled by an environment variable, it cannot be modified through the web interface. The settings page will indicate when a setting is controlled by an environment variable.

The precedence order is:
1. **Environment variables** (highest priority)
2. **Configuration files** (`runtime_settings.json`, `api_keys.json`)
3. **Default values** (lowest priority)
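
The lookup order above can be sketched as a small resolver. The `LOCALAI_THREADS` variable name is used for illustration; consult the CLI documentation for the authoritative environment variable names:

```python
import os

def resolve_setting(name: str, env_var: str, file_settings: dict, default):
    # Precedence: environment variable > configuration file > default.
    if env_var in os.environ:
        return os.environ[env_var]
    if name in file_settings:
        return file_settings[name]
    return default

file_settings = {"threads": 8}
os.environ.pop("LOCALAI_THREADS", None)
print(resolve_setting("threads", "LOCALAI_THREADS", file_settings, 4))  # 8
os.environ["LOCALAI_THREADS"] = "16"
print(resolve_setting("threads", "LOCALAI_THREADS", file_settings, 4))  # 16
```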

## Example Configuration

The `runtime_settings.json` file follows this structure:

```json
{
  "watchdog_enabled": true,
  "watchdog_idle_enabled": true,
  "watchdog_busy_enabled": false,
  "watchdog_idle_timeout": "15m",
  "watchdog_busy_timeout": "5m",
  "max_active_backends": 0,
  "parallel_backend_requests": true,
  "force_eviction_when_busy": false,
  "lru_eviction_max_retries": 30,
  "lru_eviction_retry_interval": "1s",
  "threads": 8,
  "context_size": 2048,
  "f16": false,
  "debug": false,
  "cors": true,
  "csrf": false,
  "cors_allow_origins": "*",
  "p2p_token": "",
  "p2p_network_id": "",
  "federated": false,
  "galleries": [
    {
      "url": "github:mudler/LocalAI/gallery/index.yaml@master",
      "name": "localai"
    }
  ],
  "backend_galleries": [
    {
      "url": "github:mudler/LocalAI/backend/index.yaml@master",
      "name": "localai"
    }
  ],
  "autoload_galleries": true,
  "autoload_backend_galleries": true,
  "api_keys": []
}
```

## API Keys Management

API keys can be managed through the runtime settings interface. Keys can be entered one per line or comma-separated. 

**Important Notes:**
- API keys from environment variables are always included and cannot be removed via the UI
- Runtime API keys are stored in `runtime_settings.json`
- For backward compatibility, API keys can also be managed via `api_keys.json`
- Empty arrays will clear all runtime API keys (but preserve environment variable keys)
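
The merge behavior described above can be sketched as follows (illustrative only, not LocalAI's implementation):

```python
def effective_api_keys(env_keys: list, runtime_keys: list) -> list:
    # Environment-provided keys are always present; runtime keys from
    # runtime_settings.json are added on top. Clearing runtime keys
    # (an empty list) leaves the environment keys untouched.
    seen, merged = set(), []
    for key in env_keys + runtime_keys:
        if key not in seen:
            seen.add(key)
            merged.append(key)
    return merged

print(effective_api_keys(["env-key"], []))          # ['env-key']
print(effective_api_keys(["env-key"], ["ui-key"]))  # ['env-key', 'ui-key']
```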

## Dynamic Configuration

The runtime settings system supports dynamic configuration file watching. When `LOCALAI_CONFIG_DIR` is set, LocalAI monitors the following files for changes:

- `runtime_settings.json` - Unified runtime settings
- `api_keys.json` - API keys (for backward compatibility)
- `external_backends.json` - External backend configurations

Changes to these files are automatically detected and applied without requiring a restart.
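
One simple way to reason about change detection is content comparison between polls. The sketch below hashes file contents with the standard library; LocalAI's actual watcher may rely on filesystem notifications instead:

```python
import hashlib
import tempfile
from pathlib import Path

def snapshot(paths):
    # Hash each watched file's contents; comparing hashes between polls
    # is a simple, filesystem-agnostic way to detect edits.
    return {p: hashlib.sha256(p.read_bytes()).hexdigest()
            for p in paths if p.exists()}

def changed_files(paths, previous):
    current = snapshot(paths)
    return [p for p in current if current[p] != previous.get(p)]

cfg = Path(tempfile.mkdtemp()) / "runtime_settings.json"
cfg.write_text("{}")
before = snapshot([cfg])
cfg.write_text('{"threads": 8}')
print(changed_files([cfg], before))  # the edited file is reported
```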

## Best Practices

1. **Use Environment Variables for Production**: For production deployments, use environment variables for critical settings to ensure they cannot be accidentally changed via the web UI.

2. **Backup Configuration Files**: Before making significant changes, consider backing up your `runtime_settings.json` file.

3. **Monitor Resource Usage**: When enabling watchdog features, monitor your system to ensure the timeout values are appropriate for your workload.

4. **Secure API Keys**: API keys are sensitive information. Ensure proper file permissions on configuration files (they should be readable only by the LocalAI process).

5. **Test Changes**: Some settings (like watchdog timeouts) may require testing to find optimal values for your specific use case.

## Troubleshooting

### Settings Not Applying

If settings are not being applied:
1. Check if the setting is controlled by an environment variable
2. Verify the `LOCALAI_CONFIG_DIR` is set correctly
3. Check file permissions on `runtime_settings.json`
4. Review application logs for configuration errors

### Watchdog Not Working

If the watchdog is not functioning:
1. Ensure "Watchdog Enabled" is turned on
2. Verify at least one of the idle or busy watchdogs is enabled
3. Check that timeout values are reasonable for your workload
4. Review logs for watchdog-related messages

### P2P Not Starting

If P2P is not starting:
1. Verify the P2P token is set (non-empty)
2. Check network connectivity
3. Ensure the P2P network ID matches across nodes (if using federated mode)
4. Review logs for P2P-related errors