# SPARKNET Security Documentation

This document outlines security considerations, deployment options, and compliance
guidelines for the SPARKNET AI-Powered Technology Transfer Office Automation Platform.

## Overview

SPARKNET handles sensitive data including:
- Patent documents and IP information
- License agreements and financial terms
- Partner/stakeholder contact information
- Research data and findings

Proper security measures are essential for production deployments.

---

## Deployment Options

### 1. Fully Local Deployment (Maximum Privacy)

**Recommended for:** Organizations with strict data sovereignty requirements, classified research, or GDPR Article 17 (right to erasure) obligations.

```
┌─────────────────────────────────────────────────────────────┐
│                    Your Private Network                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │  SPARKNET   │──│   Ollama    │──│  Local Vector Store │  │
│  │  (Streamlit)│  │  (LLM)      │  │  (ChromaDB)         │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
│         │                                                   │
│  ┌─────────────┐  ┌─────────────────────────────────────┐   │
│  │  PostgreSQL │  │  Document Storage (NFS/S3-compat)   │   │
│  │  (metadata) │  │                                     │   │
│  └─────────────┘  └─────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
```

**Configuration:**
- Set no cloud API keys in `.env`
- System automatically uses Ollama for all inference
- All data remains within your network
- No external API calls for LLM inference
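The fallback described above can be pictured with a minimal sketch. It assumes the application decides on a backend by checking whether any cloud key is set; the function name `select_provider` and the tuple `CLOUD_KEY_NAMES` are hypothetical and not taken from the SPARKNET codebase.

```python
import os

# Hypothetical list of cloud key names. Only GROQ_API_KEY and GOOGLE_API_KEY
# are mentioned elsewhere in this document; the real list may differ.
CLOUD_KEY_NAMES = ("GROQ_API_KEY", "GOOGLE_API_KEY")


def select_provider(env=None):
    """Return "cloud" if any cloud key has a non-empty value, else "ollama"."""
    env = os.environ if env is None else env
    for name in CLOUD_KEY_NAMES:
        # An empty or whitespace-only value counts as "not configured".
        if env.get(name, "").strip():
            return "cloud"
    return "ollama"
```

With an empty `.env`, `select_provider()` returns `"ollama"`, which matches the behavior this deployment option relies on.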

**Setup:**
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull required models
ollama pull llama3.2:latest
ollama pull nomic-embed-text

# Configure SPARKNET
cp .env.example .env
# Leave cloud API keys empty

# Run
streamlit run demo/app.py
```

### 2. Hybrid Deployment (Balanced)

**Recommended for:** Organizations that want cloud LLM capabilities for non-sensitive operations while keeping sensitive data local.

```
┌─────────────────────────────────────────────────────────────┐
│                    Your Private Network                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │  SPARKNET   │──│   Ollama    │──│  Document Storage   │  │
│  │  (Streamlit)│  │  (Sensitive)│  │  (Encrypted)        │  │
│  └──────┬──────┘  └─────────────┘  └─────────────────────┘  │
└─────────│───────────────────────────────────────────────────┘
          │
          │ (Non-sensitive queries only)
          ▼
┌─────────────────────────────────────────────────────────────┐
│                    Cloud LLM Providers                      │
│  ┌─────────┐  ┌─────────┐  ┌─────────────┐  ┌───────────┐   │
│  │  Groq   │  │ Gemini  │  │ OpenRouter  │  │  GitHub   │   │
│  └─────────┘  └─────────┘  └─────────────┘  └───────────┘   │
└─────────────────────────────────────────────────────────────┘
```

**Configuration:**
- Configure cloud API keys for general queries
- Use document sensitivity classification
- Route sensitive documents to local Ollama
- Implement data anonymization for cloud queries

### 3. Cloud Deployment (Streamlit Cloud)

**Recommended for:** Public demos, non-sensitive research, or when local infrastructure is not available.

**Configuration:**
```toml
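# .streamlit/secrets.toml
# Top-level keys must come before any [table] header;
# otherwise TOML nests them under that table.
GROQ_API_KEY = "your-key"
GOOGLE_API_KEY = "your-key"

[auth]
password = "your-secure-password"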
```

**Security Checklist:**
- [ ] Use secrets management (never commit API keys)
- [ ] Enable authentication
- [ ] Review provider data processing policies
- [ ] Consider data anonymization
- [ ] Implement session timeouts

---

## GDPR Compliance

### Data Processing Principles

SPARKNET is designed to support GDPR compliance:

1. **Lawfulness, Fairness, Transparency**
   - Document all data processing activities
   - Obtain appropriate consent for personal data
   - Provide clear privacy notices

2. **Purpose Limitation**
   - Use data only for stated TTO purposes
   - Do not repurpose data without consent

3. **Data Minimization**
   - Only process necessary data
   - Anonymize data when possible
   - Implement data retention policies

4. **Accuracy**
   - CriticAgent validation helps ensure accuracy
   - Human-in-the-loop for critical decisions
   - Source verification for claims

5. **Storage Limitation**
   - Configure `DATA_RETENTION_DAYS` in `.env`
   - Implement automatic data purging
   - Support data deletion requests

6. **Integrity and Confidentiality**
   - Encrypt data at rest
   - Use TLS for data in transit
   - Implement access controls
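For the Storage Limitation principle above, automatic purging could look roughly like the following sketch. It assumes documents are stored as plain files in a single directory and that file modification time is an acceptable proxy for age; `purge_expired` is a hypothetical helper, not an existing SPARKNET function.

```python
import os
import time
from pathlib import Path


def purge_expired(directory, retention_days, now=None):
    """Delete regular files in `directory` older than `retention_days`.

    Age is judged by modification time. Returns the sorted names of the
    files that were removed. Subdirectories are left untouched.
    """
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # 86400 seconds per day
    removed = []
    for path in Path(directory).iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

A scheduled job could call `purge_expired("./data/uploads", int(os.environ["DATA_RETENTION_DAYS"]))`; the directory path here is an assumption. Records held in PostgreSQL or the vector store would need their own deletion step.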

### Data Subject Rights

Support for GDPR data subject rights:

| Right | Implementation |
|-------|----------------|
| Access | Export function for user data |
| Rectification | Edit capabilities in UI |
| Erasure | Delete user data on request |
| Portability | JSON/CSV export options |
| Objection | Opt-out from AI processing |
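As an illustration of the Access and Portability rows, the sketch below serializes a user's records to JSON or CSV. It assumes the records are already available as a list of flat dictionaries; `export_user_data` is a hypothetical name, and how SPARKNET actually gathers a user's data is not covered here.

```python
import csv
import io
import json


def export_user_data(records, fmt="json"):
    """Serialize a list of flat dicts to a JSON or CSV string."""
    if fmt == "json":
        return json.dumps(records, indent=2, sort_keys=True)
    if fmt == "csv":
        # Use the union of all keys so records with missing fields still fit.
        fieldnames = sorted({key for record in records for key in record})
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
        return buffer.getvalue()
    raise ValueError(f"unsupported export format: {fmt}")
```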

### Cross-Border Data Transfers

When using cloud LLM providers:

1. **EU-US Data Transfers:**
   - Review provider's Data Processing Agreement
   - Ensure Standard Contractual Clauses in place
   - Consider EU-hosted alternatives

2. **Recommended Approach:**
   - Use Ollama for EU data residency
   - Anonymize data before cloud API calls
   - Implement geographic routing

---

## Security Best Practices

### API Key Management

```python
import os

import streamlit as st

# GOOD: Load from environment/secrets
api_key = os.environ.get("GROQ_API_KEY")
# or
api_key = st.secrets.get("GROQ_API_KEY")

# BAD: Hardcoded keys
api_key = "gsk_abc123..."  # NEVER DO THIS
```

### Authentication

Configure authentication in `.streamlit/secrets.toml`:

```toml
[auth]
# Single user
password = "strong-password-here"

# Multi-user
[auth.users]
admin = "admin-password"
analyst = "analyst-password"
viewer = "viewer-password"
```
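When checking a submitted password against these entries, a constant-time comparison avoids leaking information through timing. The sketch below assumes the `[auth.users]` table has been loaded into a plain dictionary (for example from `st.secrets["auth"]["users"]`, which is an assumption about how the app reads it); `check_password` is a hypothetical helper.

```python
import hmac


def check_password(username, supplied, users):
    """Return True only if `username` exists and the password matches.

    `users` maps usernames to passwords, mirroring the [auth.users] table.
    hmac.compare_digest compares in constant time for equal-length inputs.
    """
    expected = users.get(username, "")
    matches = hmac.compare_digest(
        supplied.encode("utf-8"), expected.encode("utf-8")
    )
    # Unknown users are rejected even if the supplied password is empty.
    return username in users and matches
```

Storing salted password hashes instead of plaintext values would be a stronger design; this sketch follows the plaintext layout shown above only for continuity.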

### Audit Logging

Enable audit logging for compliance:

```env
AUDIT_LOG_ENABLED=true
AUDIT_LOG_PATH=./logs/audit.log
```

Audit log includes:
- User authentication events
- Document access
- AI query/response pairs
- Decision point approvals
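One way to record such events is a JSON-lines file written through the standard `logging` module. This is a sketch under assumptions: the logger name, the record fields, and the helper names `make_audit_logger` and `audit_event` are hypothetical, and the actual SPARKNET log format may differ.

```python
import json
import logging
from datetime import datetime, timezone


def make_audit_logger(path):
    """Create (or reuse) a logger that appends one line per event to `path`."""
    logger = logging.getLogger("sparknet.audit")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep audit lines out of the root logger
    if not logger.handlers:
        logger.addHandler(logging.FileHandler(path, encoding="utf-8"))
    return logger


def audit_event(logger, user, action, **details):
    """Write one audit record as a JSON line and return that line."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "details": details,
    }
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line
```

For example, `audit_event(logger, "analyst", "document_access", doc_id="D1")` appends a single line. The directory in `AUDIT_LOG_PATH` must exist before the handler is created. Because AI query/response pairs may contain personal data, the log file itself needs the same access controls as the documents.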

### Network Security

For production deployments:

1. **Firewall Rules:**
   - Restrict Ollama to internal network
   - Limit database access to app servers
   - Use VPN for remote access

2. **TLS/SSL:**
   - Enable HTTPS for Streamlit
   - Use encrypted database connections
   - Secure WebSocket connections

3. **Access Control:**
   - Implement role-based access
   - Use IP allowlisting where possible
   - Enable MFA for admin access

---

## Sensitive Data Handling

### Document Classification

SPARKNET can classify documents by sensitivity:

| Level | Description | Handling |
|-------|-------------|----------|
| Public | Non-confidential | Cloud LLM allowed |
| Internal | Business confidential | Prefer local |
| Confidential | Sensitive business | Local only |
| Restricted | Highly sensitive | Local + encryption |
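The handling column can be read as a routing policy. The following sketch encodes it under the assumption that "Prefer local" means falling back to a cloud provider only when no local model is available; the `Sensitivity` enum and `route` function are hypothetical names, not part of the SPARKNET API.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def route(level, local_available=True):
    """Return (backend, encrypt_at_rest) for a document sensitivity level."""
    if level == Sensitivity.PUBLIC:
        return ("cloud", False)
    if level == Sensitivity.INTERNAL:
        # "Prefer local": use the cloud only when no local model is available.
        return ("ollama" if local_available else "cloud", False)
    if not local_available:
        # Confidential and Restricted documents must never leave the network.
        raise RuntimeError("local inference required but not available")
    return ("ollama", level == Sensitivity.RESTRICTED)
```

Failing closed for Confidential and Restricted documents is a deliberate choice in this sketch: an unavailable local model stops the request rather than silently sending it to a cloud provider.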

### PII Detection

Enable PII detection:

```env
PII_DETECTION_ENABLED=true
```

Detected PII types:
- Names (persons)
- Email addresses
- Phone numbers
- Addresses
- ID numbers
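For the two most pattern-like types, email addresses and phone numbers, detection can be approximated with regular expressions, as in the sketch below. The patterns are deliberately simple assumptions and do not describe SPARKNET's detector; names, postal addresses, and ID numbers generally need a named-entity model or country-specific rules and are not covered.

```python
import re

# Simplified patterns for illustration; they will miss some valid formats
# and may flag other long digit sequences as phone numbers.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def find_pii(text):
    """Return the email addresses and phone-like numbers found in `text`."""
    return {
        "email": EMAIL_RE.findall(text),
        "phone": PHONE_RE.findall(text),
    }
```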

### Data Anonymization

For cloud API calls, implement anonymization:

```python
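# Pseudonymization example: swap known identifiers for placeholder tokens
def pseudonymize(text: str, real_name: str, company_name: str) -> str:
    text = text.replace(real_name, "[PERSON_1]")
    return text.replace(company_name, "[COMPANY_1]")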
```

---

## Incident Response

### Security Incident Procedure

1. **Detection:** Monitor audit logs and alerts
2. **Containment:** Isolate affected systems
3. **Investigation:** Determine scope and impact
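4. **Notification:** Inform stakeholders; under GDPR Article 33, notify the supervisory authority within 72 hours of becoming aware of a personal data breach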
5. **Recovery:** Restore from clean backups
6. **Lessons Learned:** Update security measures

### Contact

For security issues:
- Review issue privately before public disclosure
- Report to project maintainers
- Follow responsible disclosure practices

---

## Compliance Checklist

### Pre-Deployment

- [ ] API keys stored in secrets management
- [ ] Authentication configured
- [ ] Audit logging enabled
- [ ] Data retention policy defined
- [ ] Backup strategy implemented
- [ ] Network security reviewed

### GDPR Compliance

- [ ] Data processing register updated
- [ ] Privacy notice published
- [ ] Data subject rights procedures in place
- [ ] Cross-border transfer safeguards
- [ ] Data Protection Impact Assessment (if required)

### Ongoing

- [ ] Regular security audits
- [ ] Log review and monitoring
- [ ] Access control review
- [ ] Incident response testing
- [ ] Staff security training

---

## Additional Resources

- [GDPR Text and Guidance (gdpr.eu)](https://gdpr.eu/)
- [Ollama Documentation](https://ollama.com/)
- [Streamlit Security](https://docs.streamlit.io/deploy/streamlit-community-cloud/security)
- [OWASP Top 10](https://owasp.org/Top10/)

---

*SPARKNET - VISTA/Horizon EU Project*
*Last Updated: 2025*