# Model Configuration Guide

This guide covers configuration, settings management, parameter handling, and troubleshooting for the LLM providers used in the Recipe Chatbot project.

> 📚 **Looking for model recommendations?** See [Model Selection Guide](./model-selection-guide.md) for detailed model comparisons and use case recommendations.

## 🔧 Configuration System Overview

### Settings Architecture
The project uses a centralized configuration system in `config/settings.py` with environment variable overrides:

```text
# Configuration loading flow
Environment Variables (.env) → settings.py → LLM Service → Provider APIs
```

### Temperature Management
Each provider enforces its own temperature constraints. The service adjusts the configured value where it can:

| Provider | Range | Auto-Handling | Special Cases |
|----------|-------|---------------|---------------|
| **OpenAI** | 0.0 - 2.0 | ✅ GPT-5-nano → 1.0 | Nano models fixed |
| **Google** | 0.0 - 1.0 | ✅ Clamp to range | Strict validation |
| **Ollama** | 0.0 - 2.0 | ⚠️ Model dependent | Local processing |
| **HuggingFace** | Fixed ~0.7 | ❌ API ignores setting | Read-only |
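
The rules in the table can be collected into one helper. The following is a minimal sketch, not the project's actual code: `normalize_temperature` and `TEMPERATURE_RANGES` are hypothetical names, and clamping Ollama values to its listed range is an assumption, since the table marks Ollama as model dependent.

```python
import logging

logger = logging.getLogger(__name__)

# Ranges copied from the table above (hypothetical constant name).
TEMPERATURE_RANGES = {
    "openai": (0.0, 2.0),
    "google": (0.0, 1.0),
    "ollama": (0.0, 2.0),  # assumption: clamp, although limits are model dependent
}


def normalize_temperature(provider: str, model: str, requested: float) -> float:
    """Return the temperature that would effectively apply for this provider."""
    provider = provider.lower()
    if provider == "openai" and "gpt-5-nano" in model.lower():
        return 1.0  # nano models accept only 1.0
    if provider == "huggingface":
        return 0.7  # the Inference API typically ignores the configured value
    bounds = TEMPERATURE_RANGES.get(provider)
    if bounds is None:
        return requested  # unknown provider: pass the value through unchanged
    low, high = bounds
    clamped = max(low, min(high, requested))
    if clamped != requested:
        logger.info("Clamping temperature from %s to %s for %s", requested, clamped, provider)
    return clamped
```

For example, `normalize_temperature("google", "gemini-2.5-flash", 1.5)` yields `1.0`, matching the clamping behavior described for Google.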

## 🛠️ Provider Configuration Details

### OpenAI Configuration

#### Environment Variables
```bash
# Core settings
OPENAI_API_KEY=sk-proj-xxxxx
OPENAI_MODEL=gpt-4o-mini
OPENAI_TEMPERATURE=0.7
OPENAI_MAX_TOKENS=1000

# Advanced parameters (optional)
OPENAI_TOP_P=1.0
OPENAI_FREQUENCY_PENALTY=0.0
OPENAI_PRESENCE_PENALTY=0.0
```

#### Automatic Temperature Override
```python
# Implemented in services/llm_service.py
if "gpt-5-nano" in model_name.lower():
    temperature = 1.0  # Only supported value
    logger.info(f"Auto-adjusting temperature to 1.0 for {model_name}")
```

#### Parameter Validation
- **Temperature**: `0.0 - 2.0` (except nano models: fixed `1.0`)
- **Max Tokens**: `1 - 4096` (model-dependent)
- **Top P**: `0.0 - 1.0`
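
The ranges listed above could be checked before a request is sent. This is a rough sketch under stated assumptions: `validate_openai_params` is a hypothetical helper, and the `4096` upper bound is the figure from the list, which the guide itself notes is model-dependent.

```python
def validate_openai_params(model, temperature, max_tokens, top_p=1.0):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    if "gpt-5-nano" in model.lower():
        # Nano models accept a single temperature value.
        if temperature != 1.0:
            problems.append("temperature must be 1.0 for nano models")
    elif not 0.0 <= temperature <= 2.0:
        problems.append("temperature must be within 0.0 - 2.0")
    if not 1 <= max_tokens <= 4096:
        problems.append("max_tokens must be within 1 - 4096")
    if not 0.0 <= top_p <= 1.0:
        problems.append("top_p must be within 0.0 - 1.0")
    return problems
```

Returning a list instead of raising on the first problem lets a caller report every misconfigured value at once.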

### Google (Gemini) Configuration

#### Environment Variables
```bash
# Core settings
GOOGLE_API_KEY=AIzaSyxxxxx
GOOGLE_MODEL=gemini-2.5-flash
GOOGLE_TEMPERATURE=0.7
GOOGLE_MAX_TOKENS=1000

# Advanced parameters (optional)
GOOGLE_TOP_P=0.95
GOOGLE_TOP_K=40
```

#### Temperature Clamping
```python
# Auto-clamping to Google's range
google_temp = max(0.0, min(1.0, configured_temperature))
if google_temp != configured_temperature:
    logger.info(f"Clamping temperature from {configured_temperature} to {google_temp}")
```

#### Parameter Constraints
- **Temperature**: `0.0 - 1.0` (strictly enforced)
- **Max Tokens**: `1 - 8192`
- **Top K**: `1 - 40`

### Ollama Configuration

#### Environment Variables
```bash
# Core settings
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b
OLLAMA_TEMPERATURE=0.7
OLLAMA_MAX_TOKENS=1000

# Connection settings
OLLAMA_TIMEOUT=30
OLLAMA_KEEP_ALIVE=5m
```

#### Service Management
```bash
# Start Ollama service
ollama serve &

# Verify service status
curl http://localhost:11434/api/version

# Model management
ollama pull llama3.1:8b
ollama list
ollama rm unused_model
```

#### Parameter Flexibility
- **Temperature**: `0.0 - 2.0` (widest range)
- **Context Length**: Model-dependent (2K - 128K)
- **Custom Parameters**: Model-specific options available
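
As an illustration of how the `OLLAMA_*` settings above might map onto a request, here is a sketch that builds the JSON body for Ollama's `/api/generate` endpoint. `build_ollama_payload` is a hypothetical helper and the mapping of `OLLAMA_MAX_TOKENS` to `num_predict` is an assumption about how the project translates its setting; Ollama itself names the output-token cap `num_predict` and accepts sampling settings under `options`.

```python
import os


def build_ollama_payload(prompt: str, env=os.environ) -> dict:
    """Build a non-streaming /api/generate request body from OLLAMA_* settings."""
    return {
        "model": env.get("OLLAMA_MODEL", "llama3.1:8b"),
        "prompt": prompt,
        "stream": False,
        "keep_alive": env.get("OLLAMA_KEEP_ALIVE", "5m"),
        "options": {
            "temperature": float(env.get("OLLAMA_TEMPERATURE", "0.7")),
            # Assumed mapping: the project's max-token setting becomes num_predict.
            "num_predict": int(env.get("OLLAMA_MAX_TOKENS", "1000")),
        },
    }
```

The resulting dictionary can be serialized with `json.dumps` and posted to `OLLAMA_BASE_URL + "/api/generate"`.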

### HuggingFace Configuration

#### Environment Variables
```bash
# Core settings
HUGGINGFACE_API_KEY=hf_xxxxx
HUGGINGFACE_MODEL=microsoft/DialoGPT-medium
HUGGINGFACE_TEMPERATURE=0.7  # Often ignored
HUGGINGFACE_MAX_TOKENS=500

# API settings
HUGGINGFACE_WAIT_FOR_MODEL=true
HUGGINGFACE_USE_CACHE=true
```

#### API Limitations
```python
# Note: temperature is often ignored by the Inference API
def effective_huggingface_temperature(model_name):  # hypothetical helper name
    logger.warning(f"HuggingFace model {model_name} may ignore temperature setting")
    return 0.7  # API typically uses this default
```

## ⚙️ Advanced Configuration

### Dynamic Provider Switching
```python
# config/settings.py implementation
import os

def get_llm_config():
    provider = os.getenv("LLM_PROVIDER", "openai").lower()
    fallback = os.getenv("LLM_FALLBACK_PROVIDER", "google").lower()
    
    return {
        "provider": provider,
        "fallback_provider": fallback,
        **get_provider_config(provider)
    }

def get_provider_config(provider):
    """Get provider-specific configuration."""
    configs = {
        "openai": {
            "api_key": os.getenv("OPENAI_API_KEY"),
            "model": os.getenv("OPENAI_MODEL", "gpt-4o-mini"),
            "temperature": float(os.getenv("OPENAI_TEMPERATURE", "0.7")),
            "max_tokens": int(os.getenv("OPENAI_MAX_TOKENS", "1000")),
        },
        "google": {
            "api_key": os.getenv("GOOGLE_API_KEY"),
            "model": os.getenv("GOOGLE_MODEL", "gemini-2.5-flash"),
            "temperature": float(os.getenv("GOOGLE_TEMPERATURE", "0.7")),
            "max_tokens": int(os.getenv("GOOGLE_MAX_TOKENS", "1000")),
        },
        # ... other providers
    }
    return configs.get(provider, {})
```
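
One caveat with the pattern above: `float(os.getenv(...))` raises `ValueError` if a variable holds a malformed value. A defensive reader could look roughly like this; `env_float` is a hypothetical helper, not part of the project.

```python
import logging
import os

logger = logging.getLogger(__name__)


def env_float(name: str, default: float) -> float:
    """Read a float from the environment, falling back to `default` if unset or malformed."""
    raw = os.getenv(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return float(raw)
    except ValueError:
        logger.warning("Ignoring invalid %s=%r; using %s", name, raw, default)
        return default
```

With such a helper, `"temperature": env_float("OPENAI_TEMPERATURE", 0.7)` would keep the service starting even when the `.env` file contains a typo.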

### Fallback Configuration
```python
# Automatic fallback on provider failure
def get_llm_response(message):
    try:
        return primary_provider.chat_completion(message)
    except Exception as e:
        logger.warning(f"Primary provider failed: {e}")
        return fallback_provider.chat_completion(message)
```
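
The excerpt above relies on objects defined elsewhere in the service. A self-contained sketch of the same idea, generalized to an ordered list of providers, might look like the following. `complete_with_fallback` and the `(name, callable)` pairing are illustrative assumptions, not the project's interface.

```python
import logging
from typing import Callable, Sequence, Tuple

logger = logging.getLogger(__name__)


def complete_with_fallback(
    providers: Sequence[Tuple[str, Callable[[str], str]]], message: str
) -> str:
    """Try each (name, completion function) pair in order and return the first success."""
    errors = []
    for name, complete in providers:
        try:
            return complete(message)
        except Exception as exc:  # a real service would likely catch narrower error types
            logger.warning("Provider %s failed: %s", name, exc)
            errors.append(f"{name}: {exc}")
    # Every provider failed: surface all collected errors together.
    raise RuntimeError("All providers failed: " + "; ".join(errors))
```

Unlike the two-step version, this variant also reports a clear error when the fallback itself fails.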

### Environment-Specific Configs

#### Development (.env.development)
```bash
# Fast, free/cheap for testing
LLM_PROVIDER=google
GOOGLE_MODEL=gemini-2.5-flash
GOOGLE_TEMPERATURE=0.8  # More creative for testing
LLM_FALLBACK_PROVIDER=ollama
```

#### Production (.env.production)
```bash
# Reliable, consistent for production
LLM_PROVIDER=openai
OPENAI_MODEL=gpt-4o-mini
OPENAI_TEMPERATURE=0.7  # Consistent responses
LLM_FALLBACK_PROVIDER=google
```

#### Local Development (.env.local)
```bash
# Self-hosted for offline development
LLM_PROVIDER=ollama
OLLAMA_MODEL=llama3.1:8b
OLLAMA_TEMPERATURE=0.7
# No fallback - fully local
```
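
How these per-environment files get loaded is not shown in this guide. As a sketch, assuming the environment name is passed in explicitly, a small dependency-free loader could look like this. `parse_env_text` and `load_env_for` are hypothetical names, and treating ` #` as the start of an inline comment is an assumption that only suits unquoted values.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and comments, dropping trailing ` # ...` notes."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Treat " #" as the start of an inline comment (unquoted values only).
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def load_env_for(app_env: str, directory: str = ".") -> dict:
    """Load `.env.<app_env>` from `directory`; return an empty dict if it is absent."""
    path = Path(directory) / f".env.{app_env}"
    return parse_env_text(path.read_text()) if path.exists() else {}
```

Stripping inline comments matters here because a value such as `0.8  # More creative for testing` would otherwise fail to parse as a float.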

## 🚨 Configuration Troubleshooting

### Issue: GPT-5-nano Temperature Error
- **Error**: `Temperature must be 1.0 for gpt-5-nano`
- **Status**: ✅ Auto-fixed in `services/llm_service.py`
- **Verification**:
```bash
python -c "
import os
os.environ['OPENAI_MODEL'] = 'gpt-5-nano'
os.environ['OPENAI_TEMPERATURE'] = '0.5'
from services.llm_service import LLMService
LLMService()  # Should log temperature override
"
```

### Issue: Google Temperature Out of Range
- **Error**: `Temperature must be between 0.0 and 1.0`
- **Solution**: Automatic clamping is implemented
- **Test**:
```bash
python -c "
import os
os.environ['LLM_PROVIDER'] = 'google'
os.environ['GOOGLE_TEMPERATURE'] = '1.5'
from services.llm_service import LLMService
LLMService()  # Should clamp to 1.0
"
```

### Issue: Ollama Connection Failed
- **Error**: `ConnectionError: Could not connect to Ollama`
- **Diagnosis**:
```bash
# Check if Ollama is running
curl -f http://localhost:11434/api/version || echo "Ollama not running"

# Check if model exists
ollama list | grep "llama3.1:8b" || echo "Model not found"

# Check system resources
free -h  # RAM usage
df -h    # Disk space
```

**Fix**:
```bash
# Start Ollama service
ollama serve &

# Pull required model
ollama pull llama3.1:8b

# Test connection
curl -d '{"model":"llama3.1:8b","prompt":"test","stream":false}' \
     http://localhost:11434/api/generate
```

### Issue: HuggingFace Temperature Ignored
- **Symptom**: Changing the temperature setting has no effect on responses
- **Explanation**: This is expected behavior. The HuggingFace Inference API typically ignores the temperature setting
- **Workaround**: Use a different model or provider when temperature control is needed

### Issue: Missing API Keys
- **Error**: `AuthenticationError: Invalid API key`
- **Diagnosis**:
```bash
# Check environment variables
echo "OpenAI: ${OPENAI_API_KEY:0:10}..." 
echo "Google: ${GOOGLE_API_KEY:0:10}..."
echo "HuggingFace: ${HUGGINGFACE_API_KEY:0:10}..."

# Test API key validity
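# jq -e exits non-zero when the result is null, so the fallback message fires on an error response
curl -s -H "Authorization: Bearer $OPENAI_API_KEY" \
     https://api.openai.com/v1/models | jq -e '.data[0].id' || echo "Invalid OpenAI key"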
```
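
The same presence check can be done from Python without echoing secrets. This is a sketch only: `REQUIRED_KEYS`, `mask_key`, and `check_api_key` are hypothetical names, and the assumption that Ollama needs no key follows from its local setup described earlier.

```python
import os

# Environment variable holding each provider's key (None = no key needed).
REQUIRED_KEYS = {
    "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY",
    "huggingface": "HUGGINGFACE_API_KEY",
    "ollama": None,  # local service
}


def mask_key(value: str, visible: int = 4) -> str:
    """Show only the first few characters of a secret."""
    return value[:visible] + "..." if value else "<unset>"


def check_api_key(provider: str, env=os.environ) -> str:
    """Return a one-line status for the provider's API key without leaking it."""
    provider = provider.lower()
    if provider not in REQUIRED_KEYS:
        return f"{provider}: unknown provider"
    var = REQUIRED_KEYS[provider]
    if var is None:
        return f"{provider}: no API key required"
    value = env.get(var, "").strip()
    if not value:
        return f"{provider}: {var} is missing"
    return f"{provider}: {var} is set ({mask_key(value)})"
```

This only confirms that a key is present. Whether the key is valid still requires a request to the provider, as in the `curl` check above.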

## 🔍 Configuration Validation

### Automated Configuration Check
```bash
# Run comprehensive configuration validation
python -c "
from config.settings import get_llm_config
from services.llm_service import LLMService

print('🔧 Configuration Validation')
print('=' * 40)

# Load configuration
try:
    config = get_llm_config()
    print('✅ Configuration loaded successfully')
    print(f'Provider: {config.get(\"provider\")}')
    print(f'Model: {config.get(\"model\")}')
    print(f'Temperature: {config.get(\"temperature\")}')
except Exception as e:
    print(f'❌ Configuration error: {e}')
    raise SystemExit(1)

# Test service initialization
try:
    service = LLMService()
    print('✅ LLM Service initialized')
except Exception as e:
    print(f'❌ Service initialization failed: {e}')
    raise SystemExit(1)

# Test simple completion
try:
    response = service.simple_chat_completion('Test message')
    print('✅ Chat completion successful')
    print(f'Response length: {len(response)} characters')
except Exception as e:
    print(f'❌ Chat completion failed: {e}')
    raise SystemExit(1)

print('🎉 All configuration checks passed!')
"
```

### Provider-Specific Health Checks
```bash
# OpenAI health check
curl -H "Authorization: Bearer $OPENAI_API_KEY" \
     https://api.openai.com/v1/models | jq '.data | length'

# Google health check  
curl "https://generativelanguage.googleapis.com/v1beta/models?key=$GOOGLE_API_KEY" | jq '.models | length'

# Ollama health check
curl http://localhost:11434/api/tags | jq '.models | length'

# HuggingFace health check
curl -H "Authorization: Bearer $HUGGINGFACE_API_KEY" \
     https://huggingface.co/api/whoami | jq '.name'
```

### Configuration Diff Tool
```bash
# Compare current config with defaults
python -c "
import os
from config.settings import get_llm_config

defaults = {
    'openai': {'temperature': 0.7, 'max_tokens': 1000},
    'google': {'temperature': 0.7, 'max_tokens': 1000},
    'ollama': {'temperature': 0.7, 'max_tokens': 1000},
}

current = get_llm_config()
provider = current.get('provider')
default = defaults.get(provider, {})

print(f'Configuration for {provider}:')
for key, default_val in default.items():
    current_val = current.get(key)
    status = '✅' if current_val == default_val else '⚠️'
    print(f'{status} {key}: {current_val} (default: {default_val})')
"
```

## 📋 Configuration Templates

### Minimal Setup (Single Provider)
```bash
# .env.minimal
LLM_PROVIDER=google
GOOGLE_API_KEY=your_api_key
GOOGLE_MODEL=gemini-2.5-flash
```

### Robust Setup (Primary + Fallback)
```bash
# .env.robust  
LLM_PROVIDER=openai
OPENAI_API_KEY=your_primary_key
OPENAI_MODEL=gpt-4o-mini
LLM_FALLBACK_PROVIDER=google
GOOGLE_API_KEY=your_fallback_key
GOOGLE_MODEL=gemini-2.5-flash
```

### Local-First Setup
```bash
# .env.local-first
LLM_PROVIDER=ollama
OLLAMA_MODEL=llama3.1:8b
LLM_FALLBACK_PROVIDER=google
GOOGLE_API_KEY=your_cloud_backup_key
```

### Budget-Conscious Setup
```bash
# .env.budget
LLM_PROVIDER=openai
OPENAI_MODEL=gpt-5-nano
OPENAI_TEMPERATURE=1.0  # Fixed for nano
OPENAI_MAX_TOKENS=500   # Reduce costs
```

## 🔐 Security Best Practices

### API Key Management
```bash
# Use environment variables
export OPENAI_API_KEY="sk-..."

# Never commit keys to git (keep the .env.example template tracked)
echo ".env" >> .gitignore
echo ".env.*" >> .gitignore
echo "!.env.example" >> .gitignore

# Use different keys for different environments
cp .env.example .env.development
cp .env.example .env.production
```

### Rate Limiting Configuration
```python
# Add to config/settings.py
RATE_LIMITS = {
    "openai": {"rpm": 500, "tpm": 40000},
    "google": {"rpm": 60, "tpm": 32000},
    "ollama": {"rpm": None, "tpm": None},  # Local = unlimited
}
```
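
The `rpm` values above could be enforced with a sliding-window limiter. The sketch below is an assumption about how such limits might be applied, not the project's implementation: `RequestRateLimiter` is a hypothetical class, and it covers requests per minute only, not the `tpm` token budget.

```python
import time
from collections import deque


class RequestRateLimiter:
    """Sliding-window limiter: at most `rpm` requests in any 60-second window."""

    def __init__(self, rpm, clock=time.monotonic):
        self.rpm = rpm        # None means unlimited (e.g. local Ollama)
        self.clock = clock    # injectable so tests can control time
        self.sent = deque()   # timestamps of accepted requests

    def try_acquire(self) -> bool:
        """Record a request and return True if it fits in the current window."""
        if self.rpm is None:
            return True
        now = self.clock()
        # Drop timestamps that have left the 60-second window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) >= self.rpm:
            return False
        self.sent.append(now)
        return True
```

A caller would create one limiter per provider, for example `RequestRateLimiter(RATE_LIMITS["google"]["rpm"])`, and switch to the fallback provider when `try_acquire()` returns `False`.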

### Error Handling Strategy
```python
# Graceful degradation configuration
FALLBACK_CHAIN = [
    "primary_provider",
    "fallback_provider", 
    "local_provider",
    "cached_response"
]
```

## 🧪 Testing Configuration Changes

### Unit Tests for Configuration
```bash
# Test temperature overrides
python -m pytest tests/test_llm_temperature.py -v

# Test provider fallbacks
python -m pytest tests/test_llm_fallback.py -v

# Test API key validation
python -m pytest tests/test_api_keys.py -v
```

### Integration Tests
```bash
# Test each provider individually
python -c "
import os
providers = ['openai', 'google', 'ollama']

for provider in providers:
    os.environ['LLM_PROVIDER'] = provider
    try:
        from services.llm_service import LLMService
        service = LLMService()
        response = service.simple_chat_completion('Test')
        print(f'✅ {provider}: {len(response)} chars')
    except Exception as e:
        print(f'❌ {provider}: {e}')
"
```

### Performance Benchmarks
```bash
# Measure response times
python -c "
import time
from services.llm_service import LLMService

service = LLMService()
start = time.time()
response = service.simple_chat_completion('Quick recipe suggestion')
elapsed = time.time() - start

print(f'Response time: {elapsed:.2f}s')
print(f'Response length: {len(response)} characters')
print(f'Words per second: {len(response.split()) / elapsed:.1f}')
"
```
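
A single timed call is sensitive to network jitter and cold starts. A sketch that repeats the measurement and reports the spread is shown below; `benchmark` is a hypothetical helper that accepts any callable taking a prompt, so it makes no assumption about the service beyond the `simple_chat_completion` method used above.

```python
import statistics
import time


def benchmark(complete, prompt: str, runs: int = 5) -> dict:
    """Time `complete(prompt)` over several runs and summarize the latencies in seconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        complete(prompt)
        latencies.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "median_s": statistics.median(latencies),
        "min_s": min(latencies),
        "max_s": max(latencies),
    }
```

It could be called as `benchmark(LLMService().simple_chat_completion, 'Quick recipe suggestion')`. The median is usually a steadier figure to compare across providers than one sample.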

## 🔄 Configuration Migration

### Upgrading from Old Configuration
```bash
# Migrate old environment variables
# Old format → New format
mv .env .env.backup

# Update variable names (anchored to line start; avoids non-portable `sed -i`)
sed -e 's/^LLM_MODEL=/OPENAI_MODEL=/' \
    -e 's/^LLM_TEMPERATURE=/OPENAI_TEMPERATURE=/' \
    -e 's/^LLM_MAX_TOKENS=/OPENAI_MAX_TOKENS=/' \
    .env.backup > .env

echo "LLM_PROVIDER=openai" >> .env
```

### Version Compatibility Check
```python
import os

# Check if configuration is compatible
def check_config_version():
    required_vars = ["LLM_PROVIDER"]
    legacy_vars = ["LLM_MODEL", "LLM_TEMPERATURE"]
    
    has_new = all(os.getenv(var) for var in required_vars)
    has_legacy = any(os.getenv(var) for var in legacy_vars)
    
    if has_legacy and not has_new:
        raise ValueError("Legacy configuration detected. Please migrate to new format.")
    
    return has_new
```
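
The rename step of the migration can also be expressed in Python, which makes it easy to test before touching a real `.env` file. This is a sketch: `LEGACY_TO_NEW` and `migrate_env_text` are hypothetical names, and mapping the legacy variables to the `OPENAI_*` names mirrors the shell migration above.

```python
# Legacy variable names and their replacements (same mapping as the sed commands).
LEGACY_TO_NEW = {
    "LLM_MODEL": "OPENAI_MODEL",
    "LLM_TEMPERATURE": "OPENAI_TEMPERATURE",
    "LLM_MAX_TOKENS": "OPENAI_MAX_TOKENS",
}


def migrate_env_text(text: str) -> str:
    """Rename legacy LLM_* keys and append LLM_PROVIDER=openai if it is missing."""
    out = []
    has_provider = False
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        name = key.strip()
        if sep and name in LEGACY_TO_NEW:
            line = f"{LEGACY_TO_NEW[name]}={value}"
        elif sep and name == "LLM_PROVIDER":
            has_provider = True
        out.append(line)  # comments and unrelated keys pass through unchanged
    if not has_provider:
        out.append("LLM_PROVIDER=openai")
    return "\n".join(out) + "\n"
```

Running it twice is harmless: already-migrated keys are left alone and `LLM_PROVIDER` is not appended a second time.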

---

💡 **Next Steps**: After configuring your providers, see the [Model Selection Guide](./model-selection-guide.md) for choosing the best models for your use case.