# Configuration Guide

The HF EDA MCP Server uses a centralized configuration system that supports both environment variables and command-line arguments.

## Configuration Module

The configuration is managed by the `src/hf_eda_mcp/config.py` module, which provides:

- `ServerConfig` dataclass with all configuration options
- Environment variable loading with `ServerConfig.from_env()`
- Global configuration management with `get_config()` and `set_config()`
- Logging setup and validation utilities
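
As a rough illustration of how these pieces could fit together, the sketch below defines a reduced `ServerConfig` with a `from_env()` constructor and a pair of global accessors. It is not the actual contents of `config.py`: the field names and the lazy-loading behaviour of `get_config()` are assumptions made for the example, and only a subset of the options is shown.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerConfig:
    """Illustrative subset of the options; field names are assumptions."""

    port: int = 7860
    host: str = "127.0.0.1"
    hf_token: Optional[str] = None
    cache_dir: Optional[str] = None
    max_sample_size: int = 50000

    @classmethod
    def from_env(cls) -> "ServerConfig":
        # Fall back to the documented defaults when a variable is unset.
        return cls(
            port=int(os.environ.get("HF_EDA_PORT", "7860")),
            host=os.environ.get("HF_EDA_HOST", "127.0.0.1"),
            hf_token=os.environ.get("HF_TOKEN"),
            cache_dir=os.environ.get("HF_EDA_CACHE_DIR"),
            max_sample_size=int(os.environ.get("HF_EDA_MAX_SAMPLE_SIZE", "50000")),
        )


# Module-level holder for the active configuration (assumed design).
_config: Optional[ServerConfig] = None


def get_config() -> ServerConfig:
    """Return the global config, loading it from the environment on first use."""
    global _config
    if _config is None:
        _config = ServerConfig.from_env()
    return _config


def set_config(config: ServerConfig) -> None:
    """Replace the global config, e.g. after applying command-line overrides."""
    global _config
    _config = config
```

Keeping a single module-level instance means tools can call `get_config()` without having the configuration passed to them explicitly.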

## Configuration Options

### Server Settings
- `HF_EDA_PORT` (default: 7860) - Server port
- `HF_EDA_HOST` (default: 127.0.0.1) - Server host
- `HF_EDA_MCP_ENABLED` (default: true) - Enable MCP server functionality
- `HF_EDA_SHARE` (default: false) - Enable public sharing via Gradio

### Authentication
- `HF_TOKEN` - HuggingFace access token for private datasets

### Logging
- `HF_EDA_LOG_LEVEL` (default: INFO) - Logging level (DEBUG, INFO, WARNING, ERROR)

### Performance and Caching
- `HF_EDA_CACHE_DIR` - Directory for caching datasets (optional)
- `HF_EDA_MAX_CACHE_SIZE` (default: 1000) - Maximum cache size in MB
- `HF_EDA_MAX_SAMPLE_SIZE` (default: 50000) - Maximum sample size for tools
- `HF_EDA_MAX_CONCURRENT` (default: 10) - Maximum concurrent requests
- `HF_EDA_REQUEST_TIMEOUT` (default: 300) - Request timeout in seconds
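
Environment variables always arrive as strings, so boolean and integer options need to be converted before use. The helpers below are a minimal sketch of one way to do that; `env_bool` and `env_int` are hypothetical names, and the set of strings treated as true is an assumption rather than the server's documented behaviour.

```python
import os


def env_bool(name: str, default: bool) -> bool:
    """Read a boolean option; unset variables yield the default."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Assumed convention: these spellings (case-insensitive) mean "true".
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def env_int(name: str, default: int) -> int:
    """Read an integer option; unset or empty variables yield the default."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None


# Defaults below are the ones listed in this guide.
mcp_enabled = env_bool("HF_EDA_MCP_ENABLED", True)
share = env_bool("HF_EDA_SHARE", False)
request_timeout = env_int("HF_EDA_REQUEST_TIMEOUT", 300)
```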

## How Configuration is Used

### Server Startup
The server loads configuration from environment variables and applies command-line overrides:

```python
from hf_eda_mcp.config import ServerConfig
from hf_eda_mcp.server import launch_server

config = ServerConfig.from_env()
launch_server(config)
```

### Tools Integration
All EDA tools (metadata, sampling, analysis) use the global configuration:

```python
from hf_eda_mcp.config import get_config

config = get_config()
# Tools respect config.max_sample_size, config.cache_dir, config.hf_token
```
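
How a tool "respects" `max_sample_size` is not spelled out here. One plausible reading, sketched below, is that a requested sample size is capped at the configured maximum. The `clamp_sample_size` helper is hypothetical, and the real tools may reject oversized requests instead of clamping them.

```python
def clamp_sample_size(requested: int, max_sample_size: int) -> int:
    """Cap a requested sample size at the configured maximum (assumed policy)."""
    if requested < 1:
        raise ValueError("requested sample size must be at least 1")
    return min(requested, max_sample_size)
```

With the default limit, `clamp_sample_size(100000, 50000)` would return `50000`, while a request for `100` rows would pass through unchanged.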

### Dataset Service
The `DatasetService` is initialized with configuration values:

```python
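from hf_eda_mcp.config import get_config

# DatasetService comes from the project's service layer (import not shown here).
config = get_config()
service = DatasetService(
    cache_dir=config.cache_dir,
    token=config.hf_token,
)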
```

## Configuration Priority

1. Command-line arguments (highest priority)
2. Environment variables
3. Default values (lowest priority)
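
The precedence order above can be sketched for a single option as follows. This is an illustration only, not the server's actual argument handling: `resolve_max_sample_size` is a hypothetical helper, and using `default=None` on the parser is one common way to distinguish "flag not given" from an explicit value.

```python
import argparse
import os
from typing import Optional, Sequence

DEFAULT_MAX_SAMPLE_SIZE = 50000


def resolve_max_sample_size(argv: Optional[Sequence[str]] = None) -> int:
    """Resolve one option as: command line > environment > default."""
    parser = argparse.ArgumentParser()
    # default=None lets us tell "flag not given" apart from an explicit value.
    parser.add_argument("--max-sample-size", type=int, default=None)
    args, _unknown = parser.parse_known_args(argv)

    if args.max_sample_size is not None:
        return args.max_sample_size  # 1. command-line argument
    env_value = os.environ.get("HF_EDA_MAX_SAMPLE_SIZE")
    if env_value is not None:
        return int(env_value)  # 2. environment variable
    return DEFAULT_MAX_SAMPLE_SIZE  # 3. default value
```

Under this scheme, an environment variable set in a deployment can still be overridden for a single run by passing the flag explicitly.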

## Example Usage

### Environment Variables
```bash
export HF_TOKEN="your_token_here"
export HF_EDA_CACHE_DIR="/tmp/hf-cache"
export HF_EDA_MAX_SAMPLE_SIZE=25000
pdm run hf-eda-mcp
```

### Command Line
```bash
pdm run hf-eda-mcp --cache-dir /tmp/cache --max-sample-size 25000 --verbose
```

### Configuration File
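Copy `config.example.env` to `.env` and modify as needed. Export the variables when loading the file so that the server process inherits them:
```bash
set -a        # export every variable assigned while sourcing
source .env
set +a
pdm run hf-eda-mcp
```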

## Validation

The configuration system includes validation for:
- Port ranges (1024-65535)
- Cache directory permissions
- Sample size limits
- Timeout values

An invalid configuration causes the server to exit with an error message that describes the problem.
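
The checks listed above might look roughly like the sketch below. The port range comes from this guide; the other rules (positive sample size, positive timeout, writable cache directory when it already exists) are assumptions, and `validate_settings` is a hypothetical function rather than the module's real validator.

```python
import os
from typing import List, Optional


def validate_settings(
    port: int,
    cache_dir: Optional[str],
    max_sample_size: int,
    request_timeout: int,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []
    if not 1024 <= port <= 65535:
        errors.append(f"port must be between 1024 and 65535, got {port}")
    # Only check permissions when the directory already exists (assumed rule).
    if cache_dir is not None and os.path.isdir(cache_dir):
        if not os.access(cache_dir, os.W_OK):
            errors.append(f"cache directory is not writable: {cache_dir}")
    if max_sample_size <= 0:
        errors.append(f"max sample size must be positive, got {max_sample_size}")
    if request_timeout <= 0:
        errors.append(f"request timeout must be positive, got {request_timeout}")
    return errors
```

Collecting every problem before exiting lets the server report all configuration mistakes in one pass instead of stopping at the first.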