# Getting Started with SPARKNET
This guide will help you get up and running with SPARKNET quickly.
## Prerequisites
- Python 3.10+ installed
- NVIDIA GPU with CUDA support
- Ollama installed and running
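You can check each prerequisite from a terminal with standard commands:
```bash
python --version    # should report 3.10 or newer
nvidia-smi          # lists your NVIDIA GPUs, driver, and CUDA version
ollama --version    # confirms the Ollama CLI is installed
curl http://localhost:11434/api/version   # confirms the Ollama server is running
```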
## Quick Start
### 1. Verify Installation
First, check that your GPUs are available:
```bash
cd /home/mhamdan/SPARKNET
python examples/gpu_monitor.py
```
This will show:
- All detected GPUs
- Memory usage for each GPU
- Temperature and utilization stats
- Best GPU selection based on available memory
### 2. Test Basic Functionality
Run the basic test to verify all components work:
```bash
python test_basic.py
```
This tests:
- GPU Manager
- Ollama Client
- Tool System
### 3. Run Your First Agent Task
Try a simple agent-based task:
```bash
# Coming soon - full agent example
python examples/simple_task.py
```
## Important: GPU Configuration
SPARKNET works best when Ollama runs on a GPU with sufficient free memory. An example status report from `examples/gpu_monitor.py` (your numbers will differ):
- **GPU 0**: 0.32 GB free - Nearly full
- **GPU 1**: 0.00 GB free - Full
- **GPU 2**: 6.87 GB free - Good for small/medium models
- **GPU 3**: 8.71 GB free - Best for larger models
To pin Ollama to a specific GPU (GPU 3 in this example, since it has the most free memory):
```bash
# Stop current Ollama
pkill -f "ollama serve"
# Start Ollama on GPU 3
CUDA_VISIBLE_DEVICES=3 ollama serve
```
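After restarting, the process table at the bottom of `nvidia-smi` should show the `ollama` process allocating memory only on the GPU you selected:
```bash
nvidia-smi   # check that ollama appears under the GPU you chose (here, GPU 3)
```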
## Available Models
This guide assumes the following models are installed (a CLI check follows the table):
| Model | Size | Best Use Case |
|-------|------|---------------|
| **gemma2:2b** | 1.6 GB | Fast inference, lightweight tasks |
| **llama3.2:latest** | 2.0 GB | Classification, simple QA |
| **phi3:latest** | 2.2 GB | Reasoning, structured output |
| **mistral:latest** | 4.4 GB | General tasks, creative writing |
| **llama3.1:8b** | 4.9 GB | Code generation, analysis |
| **qwen2.5:14b** | 9.0 GB | Complex reasoning, multi-step tasks |
| **nomic-embed-text** | 274 MB | Text embeddings |
| **mxbai-embed-large** | 669 MB | High-quality embeddings |
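Your installed models may differ from this table; you can confirm what is actually available, and fetch anything missing, with the Ollama CLI:
```bash
ollama list               # show installed models and their sizes
ollama pull qwen2.5:14b   # download a model from the table if it is missing
```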
## System Architecture
```
SPARKNET/
├── src/
│   ├── agents/        # AI agents (BaseAgent, ExecutorAgent, etc.)
│   ├── llm/           # Ollama integration
│   ├── tools/         # Tools for agents (file ops, code exec, GPU mon)
│   ├── utils/         # GPU manager, logging, config
│   ├── workflow/      # Task orchestration (coming soon)
│   └── memory/        # Vector memory (coming soon)
├── configs/           # YAML configurations
├── examples/          # Example scripts
└── tests/             # Unit tests (coming soon)
```
## Core Components
### 1. GPU Manager
```python
from src.utils.gpu_manager import get_gpu_manager
gpu_manager = get_gpu_manager()
# Monitor all GPUs
print(gpu_manager.monitor())
# Select best GPU with minimum memory requirement
best_gpu = gpu_manager.select_best_gpu(min_memory_gb=8.0)
# Use GPU context manager
with gpu_manager.gpu_context(min_memory_gb=4.0) as gpu_id:
    # Your model code here
    print(f"Using GPU {gpu_id}")
```
### 2. Ollama Client
```python
from src.llm.ollama_client import OllamaClient
client = OllamaClient(default_model="gemma2:2b")
# Simple generation
response = client.generate(
    prompt="Explain quantum computing in one sentence.",
    temperature=0.7,
)

# Chat with history
messages = [
    {"role": "user", "content": "What is AI?"},
]
response = client.chat(messages=messages)

# Generate embeddings
embeddings = client.embed(
    text="Hello world",
    model="nomic-embed-text:latest",
)
```
### 3. Tool System
```python
from src.tools import register_default_tools
# Register all default tools
registry = register_default_tools()
# List available tools
print(registry.list_tools())
# Output: ['file_reader', 'file_writer', 'file_search', 'directory_list',
# 'python_executor', 'bash_executor', 'gpu_monitor', 'gpu_select']
# Use a tool directly (await requires an async context; see the sketch below)
gpu_tool = registry.get_tool('gpu_monitor')
result = await gpu_tool.safe_execute()
print(result.output)
```
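Note that `safe_execute` is awaited, so it has to run inside a coroutine. A minimal runnable wrapper, assuming the registry API shown above:
```python
import asyncio

from src.tools import register_default_tools

async def main():
    # Build the registry and run one tool end to end
    registry = register_default_tools()
    gpu_tool = registry.get_tool('gpu_monitor')
    result = await gpu_tool.safe_execute()
    print(result.output)

asyncio.run(main())
```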
### 4. Agents
```python
from src.llm.ollama_client import OllamaClient
from src.agents.executor_agent import ExecutorAgent
from src.agents.base_agent import Task
# Initialize client and agent ('registry' comes from register_default_tools() above)
ollama_client = OllamaClient()
agent = ExecutorAgent(llm_client=ollama_client, model="gemma2:2b")
agent.set_tool_registry(registry)

# Create and execute a task
task = Task(
    id="task_1",
    description="Check GPU status and report available memory",
)
result = await agent.process_task(task)
print(f"Status: {result.status}")
print(f"Result: {result.result}")
```
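As with the tool example, `process_task` is awaited and needs an event loop. A minimal end-to-end script, assuming the classes behave as shown above:
```python
import asyncio

from src.agents.base_agent import Task
from src.agents.executor_agent import ExecutorAgent
from src.llm.ollama_client import OllamaClient
from src.tools import register_default_tools

async def main():
    # Wire up the agent with an Ollama client and the default tool registry
    agent = ExecutorAgent(llm_client=OllamaClient(), model="gemma2:2b")
    agent.set_tool_registry(register_default_tools())

    # Run a single task and report the outcome
    task = Task(id="task_1", description="Check GPU status and report available memory")
    result = await agent.process_task(task)
    print(f"Status: {result.status}")
    print(f"Result: {result.result}")

asyncio.run(main())
```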
## Configuration
Edit `configs/system.yaml` to customize:
```yaml
gpu:
  primary: 3                    # Use GPU 3 as primary
  fallback: [2, 1, 0]           # Fallback order
  max_memory_per_model: "8GB"

ollama:
  host: "localhost"
  port: 11434
  default_model: "gemma2:2b"
  timeout: 300

memory:
  vector_store: "chromadb"
  embedding_model: "nomic-embed-text:latest"
  max_context_length: 4096
```
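To read these values from your own scripts, a minimal sketch using PyYAML works; the path matches the repo layout above, and this makes no claim about SPARKNET's internal config loader:
```python
import yaml  # PyYAML: pip install pyyaml

# Load the system configuration into a plain dict
with open("configs/system.yaml") as f:
    config = yaml.safe_load(f)

print(config["gpu"]["primary"])           # -> 3
print(config["ollama"]["default_model"])  # -> "gemma2:2b"
```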
## Next Steps
### Phase 1 Complete ✅
- [x] Project structure
- [x] GPU manager with multi-GPU support
- [x] Ollama client integration
- [x] Base agent class
- [x] 8 essential tools
- [x] Configuration system
- [x] ExecutorAgent implementation
### Phase 2: Advanced Agents (Next)
- [ ] PlannerAgent - Task decomposition
- [ ] CriticAgent - Output validation
- [ ] MemoryAgent - Context management
- [ ] CoordinatorAgent - Multi-agent orchestration
- [ ] Agent communication protocol
### Phase 3: Advanced Features
- [ ] Vector-based memory (ChromaDB)
- [ ] Model router for task-appropriate selection
- [ ] Workflow engine
- [ ] Learning and feedback loops
- [ ] Comprehensive examples
## Troubleshooting
### Ollama Out of Memory Error
If you see "CUDA error: out of memory":
```bash
# Check GPU memory
python examples/gpu_monitor.py
# Restart Ollama on a GPU with more memory
pkill -f "ollama serve"
CUDA_VISIBLE_DEVICES=3 ollama serve # Use GPU with most free memory
```
### Model Not Found
Download missing models:
```bash
ollama pull gemma2:2b
ollama pull llama3.2:latest
ollama pull nomic-embed-text:latest
```
### Import Errors
Install missing dependencies:
```bash
cd /home/mhamdan/SPARKNET
pip install -r requirements.txt
```
## Examples
Check the `examples/` directory for more:
- `gpu_monitor.py` - GPU monitoring and management
- `simple_task.py` - Basic agent task execution (coming soon)
- `multi_agent_collab.py` - Multi-agent collaboration (coming soon)
## Support & Documentation
- **Full Documentation**: See `README.md`
- **Configuration Reference**: See `configs/` directory
- **API Reference**: Coming soon
- **Issues**: Report bugs and feature requests via the project's issue tracker
---
**Happy building with SPARKNET!**