# Environment Variables Reference

> **Last Updated**: 2025-12-06

Complete reference for all environment variables used by DeepBoner.

## Quick Reference

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `OPENAI_API_KEY` | No* | - | OpenAI API key |
| `HF_TOKEN` | No | - | HuggingFace token |
| `NCBI_API_KEY` | No | - | NCBI/PubMed API key |
| `LLM_PROVIDER` | No | `openai` | LLM backend |
| `MAX_ITERATIONS` | No | `10` | Max search iterations |
| `LOG_LEVEL` | No | `INFO` | Logging level |

\* At least one of `OPENAI_API_KEY` or `HF_TOKEN` is needed for full functionality.

## LLM Configuration

### OPENAI_API_KEY

OpenAI API key for premium features.

```bash
OPENAI_API_KEY=sk-proj-xxxx
```

- **Format:** Starts with `sk-` or `sk-proj-`
- **Source:** https://platform.openai.com/api-keys
- **Effect:** Enables OpenAI GPT-5 as the LLM backend
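
A quick sanity check on the documented prefix can catch a pasted-in wrong key before any API call is made. The sketch below is illustrative only; `looks_like_openai_key` is a hypothetical helper, not part of DeepBoner.

```python
def looks_like_openai_key(value: str) -> bool:
    """Return True if the value has the documented `sk-` prefix and a non-empty body."""
    value = value.strip()
    # `sk-proj-` keys also start with `sk-`, so one prefix check covers both forms.
    return value.startswith("sk-") and len(value) > len("sk-")


print(looks_like_openai_key("sk-proj-xxxx"))  # True
print(looks_like_openai_key("hf_xxxx"))       # False
```

An Anthropic key (`sk-ant-...`) also passes a bare `sk-` prefix test, so treat this as a typo guard rather than real validation.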

### ANTHROPIC_API_KEY

Anthropic API key (reserved for future use).

```bash
ANTHROPIC_API_KEY=sk-ant-xxxx
```

### LLM_PROVIDER

Explicitly select LLM provider.

```bash
LLM_PROVIDER=openai    # Use OpenAI
LLM_PROVIDER=huggingface  # Use HuggingFace
```

- **Default:** `openai`
- **Note:** Auto-detection is based on whether `OPENAI_API_KEY` is set
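
One plausible reading of the selection rule is sketched below: an explicit `LLM_PROVIDER` wins, otherwise the presence of `OPENAI_API_KEY` decides. The function name `resolve_provider` is hypothetical, and the fallback to `huggingface` when no key is present is an assumption that this reference does not confirm.

```python
from typing import Mapping


def resolve_provider(env: Mapping[str, str]) -> str:
    """Pick an LLM provider from environment-style settings (illustrative only)."""
    explicit = env.get("LLM_PROVIDER", "").strip().lower()
    if explicit:
        # An explicit choice always takes precedence over auto-detection.
        return explicit
    if env.get("OPENAI_API_KEY", "").strip():
        return "openai"
    # Assumption: with no OpenAI key, fall back to the HuggingFace backend.
    return "huggingface"


print(resolve_provider({"OPENAI_API_KEY": "sk-proj-xxxx"}))
print(resolve_provider({"LLM_PROVIDER": "huggingface", "OPENAI_API_KEY": "sk-proj-xxxx"}))
```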

### OPENAI_MODEL

OpenAI model name.

```bash
OPENAI_MODEL=gpt-5
OPENAI_MODEL=gpt-4o
```

- **Default:** `gpt-5`

### HUGGINGFACE_MODEL

HuggingFace model for free tier.

```bash
HUGGINGFACE_MODEL=Qwen/Qwen2.5-7B-Instruct
```

- **Default:** `Qwen/Qwen2.5-7B-Instruct`
- **Warning:** Large models (70B+) are routed to third-party inference providers, which can be unreliable

### HF_TOKEN

HuggingFace API token.

```bash
HF_TOKEN=hf_xxxx
```

- **Source:** https://huggingface.co/settings/tokens
- **Effect:** Enables gated models and higher rate limits

## Embedding Configuration

### OPENAI_EMBEDDING_MODEL

OpenAI embedding model for premium RAG.

```bash
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_EMBEDDING_MODEL=text-embedding-3-large
```

- **Default:** `text-embedding-3-small`
- **Requires:** `OPENAI_API_KEY`

### LOCAL_EMBEDDING_MODEL

Local sentence-transformers model.

```bash
LOCAL_EMBEDDING_MODEL=all-MiniLM-L6-v2
LOCAL_EMBEDDING_MODEL=all-mpnet-base-v2
```

- **Default:** `all-MiniLM-L6-v2`
- **Note:** Downloaded on first use

## External Services

### NCBI_API_KEY

NCBI API key for higher PubMed rate limits.

```bash
NCBI_API_KEY=xxxx
```

- **Source:** https://www.ncbi.nlm.nih.gov/account/settings/
- **Effect:** 10 requests/second instead of 3
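
The two rate limits translate directly into a minimum spacing between PubMed requests. The helper below is a hypothetical sketch of that arithmetic, not the project's actual throttling code.

```python
def min_request_interval(has_api_key: bool) -> float:
    """Seconds to wait between requests to stay under the documented limit."""
    # 10 requests/second with NCBI_API_KEY set, 3 requests/second without it.
    requests_per_second = 10 if has_api_key else 3
    return 1.0 / requests_per_second


print(min_request_interval(True))   # 0.1
print(round(min_request_interval(False), 3))
```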

### CHROMA_DB_PATH

ChromaDB storage location.

```bash
CHROMA_DB_PATH=./chroma_db
CHROMA_DB_PATH=/data/vectors
```

- **Default:** `./chroma_db`
- **Note:** Directory is created if it doesn't exist
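
The create-if-missing behaviour can be reproduced with `pathlib`. This is a minimal sketch; `ensure_chroma_dir` is a hypothetical name, and whether the directory is created by the application or by ChromaDB itself is not stated here.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping


def ensure_chroma_dir(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve CHROMA_DB_PATH (default ./chroma_db) and create it if missing."""
    path = Path(env.get("CHROMA_DB_PATH", "./chroma_db"))
    # parents=True creates intermediate directories; exist_ok=True makes reruns harmless.
    path.mkdir(parents=True, exist_ok=True)
    return path


# Demo against a throwaway directory so nothing is written to the working tree.
with tempfile.TemporaryDirectory() as tmp:
    created = ensure_chroma_dir({"CHROMA_DB_PATH": os.path.join(tmp, "data", "vectors")})
    print(created.is_dir())  # True
```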

## Agent Configuration

### MAX_ITERATIONS

Maximum search-judge loop iterations.

```bash
MAX_ITERATIONS=10
MAX_ITERATIONS=5   # Faster but less thorough
MAX_ITERATIONS=20  # More thorough
```

- **Default:** `10`
- **Range:** `1` to `50`

### ADVANCED_MAX_ROUNDS

Maximum multi-agent coordination rounds.

```bash
ADVANCED_MAX_ROUNDS=5
```

- **Default:** `5`
- **Range:** `1` to `20`

### ADVANCED_TIMEOUT

Timeout for advanced mode in seconds.

```bash
ADVANCED_TIMEOUT=600   # 10 minutes
ADVANCED_TIMEOUT=300   # 5 minutes
```

- **Default:** `600.0`
- **Range:** `60.0` to `900.0`

### SEARCH_TIMEOUT

Per-search operation timeout in seconds.

```bash
SEARCH_TIMEOUT=30
```

- **Default:** `30`
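
A per-operation timeout of this kind is commonly enforced with `asyncio.wait_for`. The sketch below assumes the searches are coroutines; `run_with_timeout` and `fake_search` are hypothetical stand-ins, and returning `None` on timeout is just one possible policy.

```python
import asyncio
import os

# Read the timeout once, falling back to the documented default of 30 seconds.
SEARCH_TIMEOUT = float(os.environ.get("SEARCH_TIMEOUT", "30"))


async def run_with_timeout(coro, timeout: float = SEARCH_TIMEOUT):
    """Await `coro`, returning None if it does not finish within `timeout` seconds."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        return None


async def fake_search(delay: float) -> str:
    """Stand-in for a real search call; sleeps, then returns a result."""
    await asyncio.sleep(delay)
    return "results"


print(asyncio.run(run_with_timeout(fake_search(0.01), timeout=1.0)))  # results
print(asyncio.run(run_with_timeout(fake_search(1.0), timeout=0.05)))  # None
```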

## Logging

### LOG_LEVEL

Logging verbosity.

```bash
LOG_LEVEL=DEBUG    # Verbose
LOG_LEVEL=INFO     # Normal
LOG_LEVEL=WARNING  # Errors and warnings
LOG_LEVEL=ERROR    # Errors only
```

- **Default:** `INFO`
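
The level names above map onto constants in Python's standard `logging` module. The sketch below shows that mapping with a hypothetical `level_from_env` helper. It covers only the four levels listed here and assumes exact upper-case names.

```python
import logging
import os
from typing import Mapping

# Only the levels documented above; other names raise KeyError in this sketch.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def level_from_env(env: Mapping[str, str] = os.environ) -> int:
    """Translate LOG_LEVEL (default INFO) into a numeric logging level."""
    return LEVELS[env.get("LOG_LEVEL", "INFO")]


# basicConfig only takes effect if the root logger has no handlers yet.
logging.basicConfig(level=level_from_env({"LOG_LEVEL": "WARNING"}))
logging.getLogger(__name__).warning("shown at WARNING and above")
```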

## Gradio Configuration

### GRADIO_SERVER_NAME

Server bind address.

```bash
GRADIO_SERVER_NAME=0.0.0.0  # All interfaces
GRADIO_SERVER_NAME=127.0.0.1  # Localhost only
```

- **Default:** Set in Dockerfile for containers

### GRADIO_SERVER_PORT

Server port.

```bash
GRADIO_SERVER_PORT=7860
```

- **Default:** `7860`

## Python Configuration

### PYTHONPATH

Python module search path.

```bash
PYTHONPATH=/app
```

- **Note:** Set automatically in Docker

## .env File Format

```bash
# Comments start with #
KEY=value           # No quotes needed for simple values
KEY="value"         # Quotes for values with spaces
KEY='value'         # Single quotes also work

# Empty lines are ignored

# Multi-line values are not supported; keep each value on a single line
```
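
To make the rules above concrete, here is a minimal parser for exactly this subset of the format: `#` comments, blank lines, and optional single or double quotes. `parse_env` is a hypothetical illustration, not the loader DeepBoner uses. Real loaders handle more cases, such as escapes, and may differ in detail.

```python
import re


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, rest = line.partition("=")
        if not sep:
            continue  # Not a KEY=value line; ignored in this sketch.
        rest = rest.strip()
        if rest[:1] in ("'", '"'):
            # Quoted value: take everything up to the matching closing quote.
            end = rest.find(rest[0], 1)
            value = rest[1:end] if end != -1 else rest[1:]
        else:
            # Unquoted value: an inline comment starts at whitespace followed by '#'.
            value = re.split(r"\s+#", rest, maxsplit=1)[0].strip()
        values[key.strip()] = value
    return values


sample = 'LOG_LEVEL=DEBUG   # verbose\nCHROMA_DB_PATH="/data/my vectors"\n\n# comment\n'
print(parse_env(sample))
```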

## Security Notes

1. **Never commit `.env` files** - They are already listed in `.gitignore`
2. **Use secrets for production** - HuggingFace Secrets, Docker secrets
3. **Rotate keys regularly** - Especially for production
4. **Limit permissions** - Use read-only keys where possible

## Validation

Variables are validated on application startup:

```bash
# Invalid values raise ValidationError
MAX_ITERATIONS=100  # Error: must be 1-50
LOG_LEVEL=TRACE     # Error: invalid level
```
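
The checks behind those two errors can be sketched in plain Python. This is a standard-library illustration that raises `ValueError`. The application itself reports `ValidationError`, and its real validation code is not shown in this reference. `validate_settings` is a hypothetical name.

```python
from typing import Mapping

ALLOWED_LOG_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR")


def validate_settings(env: Mapping[str, str]) -> dict:
    """Check MAX_ITERATIONS and LOG_LEVEL against the documented constraints."""
    raw = env.get("MAX_ITERATIONS", "10")
    try:
        max_iterations = int(raw)
    except ValueError:
        raise ValueError(f"MAX_ITERATIONS must be an integer, got {raw!r}") from None
    if not 1 <= max_iterations <= 50:
        raise ValueError(f"MAX_ITERATIONS must be 1-50, got {max_iterations}")

    log_level = env.get("LOG_LEVEL", "INFO")
    if log_level not in ALLOWED_LOG_LEVELS:
        raise ValueError(f"LOG_LEVEL must be one of {ALLOWED_LOG_LEVELS}, got {log_level!r}")

    return {"max_iterations": max_iterations, "log_level": log_level}


print(validate_settings({"MAX_ITERATIONS": "20", "LOG_LEVEL": "DEBUG"}))
for bad in ({"MAX_ITERATIONS": "100"}, {"LOG_LEVEL": "TRACE"}):
    try:
        validate_settings(bad)
    except ValueError as exc:
        print(f"rejected: {exc}")
```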

## Debugging

Check loaded configuration:

```bash
LOG_LEVEL=DEBUG uv run python -c "
from src.utils.config import settings
print(f'Provider: {settings.llm_provider}')
print(f'Has OpenAI: {settings.has_openai_key}')
print(f'Has HF: {settings.has_huggingface_key}')
print(f'Max Iterations: {settings.max_iterations}')
"
```

## Related Documentation

- [Configuration Reference](configuration.md)
- [Getting Started - Configuration](../getting-started/configuration.md)
- [Deployment - Docker](../deployment/docker.md)