# Getting Started with SPARKNET

This guide will help you get up and running with SPARKNET quickly.

## Prerequisites

✓ Python 3.10+ installed
✓ NVIDIA GPU with CUDA support
✓ Ollama installed and running

## Quick Start

### 1. Verify Installation

First, check that your GPUs are available:
```bash
cd /home/mhamdan/SPARKNET
python examples/gpu_monitor.py
```
This will show:

- All detected GPUs
- Memory usage for each GPU
- Temperature and utilization stats
- Best GPU selection based on available memory
### 2. Test Basic Functionality

Run the basic test to verify all components work:

```bash
python test_basic.py
```
This tests:

- GPU Manager
- Ollama Client
- Tool System
### 3. Run Your First Agent Task

Try a simple agent-based task:

```bash
# Coming soon - full agent example
python examples/simple_task.py
```
## Important: GPU Configuration

SPARKNET works best when Ollama uses a GPU with sufficient free memory. GPU status on this machine at the time of writing:

- **GPU 0**: 0.32 GB free - nearly full
- **GPU 1**: 0.00 GB free - full
- **GPU 2**: 6.87 GB free - good for small/medium models
- **GPU 3**: 8.71 GB free - best for larger models
To run Ollama on a specific GPU (GPU 3 recommended here):

```bash
# Stop the current Ollama instance
pkill -f "ollama serve"

# Start Ollama on GPU 3
CUDA_VISIBLE_DEVICES=3 ollama serve
```
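After restarting Ollama, it helps to confirm the server is accepting connections before pointing agents at it. The helper below is a hypothetical sketch (not part of SPARKNET) that does a plain TCP check against Ollama's default port using only the standard library:

```python
import socket

def ollama_is_up(host: str = "localhost", port: int = 11434, timeout: float = 2.0) -> bool:
    """Return True if something is accepting connections on the Ollama port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(ollama_is_up())
```

If this prints `False`, Ollama is not reachable on that host/port and model calls will fail.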
## Available Models

You currently have these models installed:

| Model | Size | Best Use Case |
|-------|------|---------------|
| **gemma2:2b** | 1.6 GB | Fast inference, lightweight tasks |
| **llama3.2:latest** | 2.0 GB | Classification, simple QA |
| **phi3:latest** | 2.2 GB | Reasoning, structured output |
| **mistral:latest** | 4.4 GB | General tasks, creative writing |
| **llama3.1:8b** | 4.9 GB | Code generation, analysis |
| **qwen2.5:14b** | 9.0 GB | Complex reasoning, multi-step tasks |
| **nomic-embed-text** | 274 MB | Text embeddings |
| **mxbai-embed-large** | 669 MB | High-quality embeddings |
## System Architecture

```
SPARKNET/
├── src/
│   ├── agents/      # AI agents (BaseAgent, ExecutorAgent, etc.)
│   ├── llm/         # Ollama integration
│   ├── tools/       # Tools for agents (file ops, code exec, GPU mon)
│   ├── utils/       # GPU manager, logging, config
│   ├── workflow/    # Task orchestration (coming soon)
│   └── memory/      # Vector memory (coming soon)
├── configs/         # YAML configurations
├── examples/        # Example scripts
└── tests/           # Unit tests (coming soon)
```
## Core Components

### 1. GPU Manager

```python
from src.utils.gpu_manager import get_gpu_manager

gpu_manager = get_gpu_manager()

# Monitor all GPUs
print(gpu_manager.monitor())

# Select best GPU with minimum memory requirement
best_gpu = gpu_manager.select_best_gpu(min_memory_gb=8.0)

# Use GPU context manager
with gpu_manager.gpu_context(min_memory_gb=4.0) as gpu_id:
    # Your model code here
    print(f"Using GPU {gpu_id}")
```
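The "best GPU" idea above can be sketched in isolation. This self-contained version mocks GPU stats instead of calling the real `GPUManager` (the dict shape and the sample numbers are illustrative only, echoing the status list earlier in this guide):

```python
# Mocked per-GPU stats; the real manager would query the driver for these.
gpus = [
    {"id": 0, "free_gb": 0.32},
    {"id": 1, "free_gb": 0.00},
    {"id": 2, "free_gb": 6.87},
    {"id": 3, "free_gb": 8.71},
]

def select_best_gpu(stats, min_memory_gb=0.0):
    """Pick the GPU with the most free memory that meets the minimum."""
    candidates = [g for g in stats if g["free_gb"] >= min_memory_gb]
    if not candidates:
        return None  # no GPU satisfies the requirement
    return max(candidates, key=lambda g: g["free_gb"])["id"]

print(select_best_gpu(gpus, min_memory_gb=8.0))   # → 3
print(select_best_gpu(gpus, min_memory_gb=16.0))  # → None
```

Returning `None` (rather than raising) lets callers fall back to CPU or a smaller model when nothing qualifies.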
### 2. Ollama Client

```python
from src.llm.ollama_client import OllamaClient

client = OllamaClient(default_model="gemma2:2b")

# Simple generation
response = client.generate(
    prompt="Explain quantum computing in one sentence.",
    temperature=0.7
)

# Chat with history
messages = [
    {"role": "user", "content": "What is AI?"},
]
response = client.chat(messages=messages)

# Generate embeddings
embeddings = client.embed(
    text="Hello world",
    model="nomic-embed-text:latest"
)
```
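For multi-turn chat, you keep appending turns to the same `messages` list between calls. The helper below is hypothetical (`append_turn` is not a SPARKNET API); only the `{"role": ..., "content": ...}` message shape comes from the example above:

```python
def append_turn(messages, user_text, assistant_text):
    """Record one completed user/assistant exchange in the history."""
    messages.append({"role": "user", "content": user_text})
    messages.append({"role": "assistant", "content": assistant_text})
    return messages

history = []
# After a call to client.chat(), store both sides of the exchange:
append_turn(history, "What is AI?", "AI is ...")
# The next call would send the full history plus the new question:
history.append({"role": "user", "content": "Give an example."})
print(len(history))  # → 3
```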
### 3. Tool System

```python
from src.tools import register_default_tools

# Register all default tools
registry = register_default_tools()

# List available tools
print(registry.list_tools())
# Output: ['file_reader', 'file_writer', 'file_search', 'directory_list',
#          'python_executor', 'bash_executor', 'gpu_monitor', 'gpu_select']

# Use a tool directly (must run inside an async function or event loop)
gpu_tool = registry.get_tool('gpu_monitor')
result = await gpu_tool.safe_execute()
print(result.output)
```
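Because `safe_execute()` is a coroutine, from a plain script you drive it with `asyncio.run`. This self-contained sketch uses a mock tool standing in for `registry.get_tool('gpu_monitor')`; only the `safe_execute()` / `result.output` shape is taken from the example above, and the output string is invented:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class ToolResult:
    output: str

class MockGPUTool:
    """Stand-in for a SPARKNET tool; the real one queries actual GPUs."""
    async def safe_execute(self) -> ToolResult:
        return ToolResult(output="2 GPUs detected")

async def main() -> str:
    result = await MockGPUTool().safe_execute()
    return result.output

print(asyncio.run(main()))  # → 2 GPUs detected
```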
### 4. Agents

```python
from src.llm.ollama_client import OllamaClient
from src.agents.executor_agent import ExecutorAgent
from src.agents.base_agent import Task
from src.tools import register_default_tools

# Initialize client, agent, and tool registry
ollama_client = OllamaClient()
agent = ExecutorAgent(llm_client=ollama_client, model="gemma2:2b")
agent.set_tool_registry(register_default_tools())

# Create and execute a task (inside an async function or event loop)
task = Task(
    id="task_1",
    description="Check GPU status and report available memory"
)
result = await agent.process_task(task)
print(f"Status: {result.status}")
print(f"Result: {result.result}")
```
## Configuration

Edit `configs/system.yaml` to customize:

```yaml
gpu:
  primary: 3              # Use GPU 3 as primary
  fallback: [2, 1, 0]     # Fallback order
  max_memory_per_model: "8GB"

ollama:
  host: "localhost"
  port: 11434
  default_model: "gemma2:2b"
  timeout: 300

memory:
  vector_store: "chromadb"
  embedding_model: "nomic-embed-text:latest"
  max_context_length: 4096
```
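A common pattern for configs like this is to layer user overrides on top of built-in defaults. The sketch below is hypothetical (`deep_merge` and the `DEFAULTS` dict are illustrative; SPARKNET's actual loader may work differently), with values mirroring the YAML above:

```python
# Defaults mirroring a subset of configs/system.yaml above.
DEFAULTS = {
    "ollama": {"host": "localhost", "port": 11434,
               "default_model": "gemma2:2b", "timeout": 300},
    "gpu": {"primary": 3, "fallback": [2, 1, 0]},
}

def deep_merge(base: dict, override: dict) -> dict:
    """Return base with override applied recursively; inputs are not mutated."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

user_cfg = {"ollama": {"default_model": "qwen2.5:14b"}}
cfg = deep_merge(DEFAULTS, user_cfg)
print(cfg["ollama"]["default_model"])  # → qwen2.5:14b
print(cfg["ollama"]["port"])           # → 11434
```

Merging recursively (instead of replacing whole sections) means a user who overrides one key keeps the defaults for the rest of that section.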
## Next Steps

### Phase 1 Complete ✓

- [x] Project structure
- [x] GPU manager with multi-GPU support
- [x] Ollama client integration
- [x] Base agent class
- [x] 8 essential tools
- [x] Configuration system
- [x] ExecutorAgent implementation

### Phase 2: Advanced Agents (Next)

- [ ] PlannerAgent - Task decomposition
- [ ] CriticAgent - Output validation
- [ ] MemoryAgent - Context management
- [ ] CoordinatorAgent - Multi-agent orchestration
- [ ] Agent communication protocol

### Phase 3: Advanced Features

- [ ] Vector-based memory (ChromaDB)
- [ ] Model router for task-appropriate selection
- [ ] Workflow engine
- [ ] Learning and feedback loops
- [ ] Comprehensive examples
## Troubleshooting

### Ollama Out of Memory Error

If you see "CUDA error: out of memory":

```bash
# Check GPU memory
python examples/gpu_monitor.py

# Restart Ollama on a GPU with more memory
pkill -f "ollama serve"
CUDA_VISIBLE_DEVICES=3 ollama serve  # Use the GPU with the most free memory
```

### Model Not Found

Download missing models:

```bash
ollama pull gemma2:2b
ollama pull llama3.2:latest
ollama pull nomic-embed-text:latest
```

### Import Errors

Install missing dependencies:

```bash
cd /home/mhamdan/SPARKNET
pip install -r requirements.txt
```
## Examples

Check the `examples/` directory for more:

- `gpu_monitor.py` - GPU monitoring and management
- `simple_task.py` - Basic agent task execution (coming soon)
- `multi_agent_collab.py` - Multi-agent collaboration (coming soon)
## Support & Documentation

- **Full Documentation**: See `README.md`
- **Configuration Reference**: See the `configs/` directory
- **API Reference**: Coming soon
- **Issues**: Report them via the project's issue tracker

---

**Happy building with SPARKNET!** 🚀