Getting Started with SPARKNET
This guide will help you get up and running with SPARKNET quickly.
Prerequisites
- Python 3.10+ installed
- NVIDIA GPU with CUDA support
- Ollama installed and running
Quick Start
1. Verify Installation
First, check that your GPUs are available:
cd /home/mhamdan/SPARKNET
python examples/gpu_monitor.py
This will show:
- All detected GPUs
- Memory usage for each GPU
- Temperature and utilization stats
- Best GPU selection based on available memory
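The same stats can be pulled without the helper script via nvidia-smi's CSV query mode. A minimal parsing sketch (the query flags are standard nvidia-smi options; the helper function and sample line below are illustrative, not part of SPARKNET):

```python
import subprocess

QUERY = "index,memory.total,memory.used,temperature.gpu,utilization.gpu"

def parse_gpu_line(line: str) -> dict:
    """Parse one row of `nvidia-smi --query-gpu=... --format=csv,noheader,nounits`."""
    idx, total, used, temp, util = [f.strip() for f in line.split(",")]
    return {
        "index": int(idx),
        "free_gb": (int(total) - int(used)) / 1024,  # MiB -> GiB
        "temp_c": int(temp),
        "util_pct": int(util),
    }

def query_gpus() -> list[dict]:
    """Query live stats for every visible GPU (requires nvidia-smi on PATH)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(line) for line in out.strip().splitlines()]

if __name__ == "__main__":
    # Parse a sample row: index 3, 11264 MiB total, 2345 MiB used, 41 C, 7 %
    print(parse_gpu_line("3, 11264, 2345, 41, 7"))
```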
2. Test Basic Functionality
Run the basic test to verify all components work:
python test_basic.py
This tests:
- GPU Manager
- Ollama Client
- Tool System
3. Run Your First Agent Task
Try a simple agent-based task:
# Coming soon - full agent example
python examples/simple_task.py
Important: GPU Configuration
SPARKNET works best when Ollama uses a GPU with sufficient free memory. Your current GPU status:
- GPU 0: 0.32 GB free - Nearly full
- GPU 1: 0.00 GB free - Full
- GPU 2: 6.87 GB free - Good for small/medium models
- GPU 3: 8.71 GB free - Best for larger models
To run Ollama on a specific GPU (recommended GPU 3):
# Stop current Ollama
pkill -f "ollama serve"
# Start Ollama on GPU 3
CUDA_VISIBLE_DEVICES=3 ollama serve
Available Models
You currently have these models installed:
| Model | Size | Best Use Case |
|---|---|---|
| gemma2:2b | 1.6 GB | Fast inference, lightweight tasks |
| llama3.2:latest | 2.0 GB | Classification, simple QA |
| phi3:latest | 2.2 GB | Reasoning, structured output |
| mistral:latest | 4.4 GB | General tasks, creative writing |
| llama3.1:8b | 4.9 GB | Code generation, analysis |
| qwen2.5:14b | 9.0 GB | Complex reasoning, multi-step tasks |
| nomic-embed-text | 274 MB | Text embeddings |
| mxbai-embed-large | 669 MB | High-quality embeddings |
System Architecture
SPARKNET/
├── src/
│   ├── agents/       # AI agents (BaseAgent, ExecutorAgent, etc.)
│   ├── llm/          # Ollama integration
│   ├── tools/        # Tools for agents (file ops, code exec, GPU mon)
│   ├── utils/        # GPU manager, logging, config
│   ├── workflow/     # Task orchestration (coming soon)
│   └── memory/       # Vector memory (coming soon)
├── configs/          # YAML configurations
├── examples/         # Example scripts
└── tests/            # Unit tests (coming soon)
Core Components
1. GPU Manager
from src.utils.gpu_manager import get_gpu_manager
gpu_manager = get_gpu_manager()
# Monitor all GPUs
print(gpu_manager.monitor())
# Select best GPU with minimum memory requirement
best_gpu = gpu_manager.select_best_gpu(min_memory_gb=8.0)
# Use GPU context manager
with gpu_manager.gpu_context(min_memory_gb=4.0) as gpu_id:
    # Your model code here
    print(f"Using GPU {gpu_id}")
2. Ollama Client
from src.llm.ollama_client import OllamaClient
client = OllamaClient(default_model="gemma2:2b")
# Simple generation
response = client.generate(
    prompt="Explain quantum computing in one sentence.",
    temperature=0.7
)
# Chat with history
messages = [
    {"role": "user", "content": "What is AI?"},
]
response = client.chat(messages=messages)
# Generate embeddings
embeddings = client.embed(
    text="Hello world",
    model="nomic-embed-text:latest"
)
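Under the hood, `OllamaClient` presumably wraps Ollama's HTTP API. A stdlib-only sketch of calling `/api/generate` directly (the endpoint and payload fields are Ollama's documented REST API; the helper names are ours):

```python
import json
import urllib.request

def build_generate_payload(prompt: str, model: str = "gemma2:2b",
                           temperature: float = 0.7) -> dict:
    """Shape of a non-streaming request to Ollama's /api/generate."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,                        # one JSON object instead of a stream
        "options": {"temperature": temperature},
    }

def ollama_generate(prompt: str, host: str = "http://localhost:11434", **kw) -> str:
    """POST the payload to a running Ollama server and return the response text."""
    payload = json.dumps(build_generate_payload(prompt, **kw)).encode()
    req = urllib.request.Request(
        f"{host}/api/generate", data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```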
3. Tool System
from src.tools import register_default_tools
# Register all default tools
registry = register_default_tools()
# List available tools
print(registry.list_tools())
# Output: ['file_reader', 'file_writer', 'file_search', 'directory_list',
# 'python_executor', 'bash_executor', 'gpu_monitor', 'gpu_select']
# Use a tool directly
gpu_tool = registry.get_tool('gpu_monitor')
result = await gpu_tool.safe_execute()
print(result.output)
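The registry pattern above is essentially a name-to-tool mapping with a crash-safe execute wrapper. A stripped-down sketch (the classes here are illustrative, not SPARKNET's actual ones):

```python
import asyncio
from dataclasses import dataclass

@dataclass
class ToolResult:
    success: bool
    output: str = ""

class Tool:
    name = "base"

    async def safe_execute(self, **kwargs) -> ToolResult:
        try:
            return ToolResult(True, await self.execute(**kwargs))
        except Exception as e:          # never let a failing tool crash the agent
            return ToolResult(False, f"error: {e}")

    async def execute(self, **kwargs) -> str:
        raise NotImplementedError

class EchoTool(Tool):
    name = "echo"

    async def execute(self, text: str = "") -> str:
        return text

class ToolRegistry:
    def __init__(self):
        self._tools: dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def list_tools(self) -> list[str]:
        return sorted(self._tools)

    def get_tool(self, name: str) -> Tool:
        return self._tools[name]

registry = ToolRegistry()
registry.register(EchoTool())
result = asyncio.run(registry.get_tool("echo").safe_execute(text="hi"))
print(registry.list_tools(), result.output)  # -> ['echo'] hi
```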
4. Agents
from src.llm.ollama_client import OllamaClient
from src.agents.executor_agent import ExecutorAgent
from src.agents.base_agent import Task
# Initialize client and agent
ollama_client = OllamaClient()
agent = ExecutorAgent(llm_client=ollama_client, model="gemma2:2b")
agent.set_tool_registry(registry)
# Create and execute a task
task = Task(
    id="task_1",
    description="Check GPU status and report available memory"
)
result = await agent.process_task(task)
print(f"Status: {result.status}")
print(f"Result: {result.result}")
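`process_task` presumably follows the classic think-act loop: ask the LLM which tool to use, invoke it, and report the outcome. A toy sketch of that flow with a stubbed LLM (all names here are illustrative, not SPARKNET's real classes):

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Task:
    id: str
    description: str

@dataclass
class TaskResult:
    status: str
    result: str

class StubLLM:
    """Stands in for OllamaClient; always 'decides' to run the gpu_monitor tool."""
    def generate(self, prompt: str) -> str:
        return "gpu_monitor"

class MiniAgent:
    def __init__(self, llm, tools: dict):
        self.llm = llm
        self.tools = tools

    async def process_task(self, task: Task) -> TaskResult:
        # Think: let the LLM pick a tool for the task description
        tool_name = self.llm.generate(f"Pick a tool for: {task.description}")
        # Act: run the chosen tool and report
        output = await self.tools[tool_name]()
        return TaskResult(status="completed", result=output)

async def fake_gpu_monitor() -> str:
    return "GPU 3: 8.71 GB free"

agent = MiniAgent(StubLLM(), {"gpu_monitor": fake_gpu_monitor})
result = asyncio.run(agent.process_task(Task("task_1", "Check GPU status")))
print(result.status, "-", result.result)
```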
Configuration
Edit configs/system.yaml to customize:
gpu:
  primary: 3              # Use GPU 3 as primary
  fallback: [2, 1, 0]     # Fallback order
  max_memory_per_model: "8GB"

ollama:
  host: "localhost"
  port: 11434
  default_model: "gemma2:2b"
  timeout: 300

memory:
  vector_store: "chromadb"
  embedding_model: "nomic-embed-text:latest"
  max_context_length: 4096
Next Steps
Phase 1 Complete ✅
- Project structure
- GPU manager with multi-GPU support
- Ollama client integration
- Base agent class
- 8 essential tools
- Configuration system
- ExecutorAgent implementation
Phase 2: Advanced Agents (Next)
- PlannerAgent - Task decomposition
- CriticAgent - Output validation
- MemoryAgent - Context management
- CoordinatorAgent - Multi-agent orchestration
- Agent communication protocol
Phase 3: Advanced Features
- Vector-based memory (ChromaDB)
- Model router for task-appropriate selection
- Workflow engine
- Learning and feedback loops
- Comprehensive examples
Troubleshooting
Ollama Out of Memory Error
If you see "CUDA error: out of memory":
# Check GPU memory
python examples/gpu_monitor.py
# Restart Ollama on a GPU with more memory
pkill -f "ollama serve"
CUDA_VISIBLE_DEVICES=3 ollama serve # Use GPU with most free memory
Model Not Found
Download missing models:
ollama pull gemma2:2b
ollama pull llama3.2:latest
ollama pull nomic-embed-text:latest
Import Errors
Install missing dependencies:
cd /home/mhamdan/SPARKNET
pip install -r requirements.txt
Examples
Check the examples/ directory for more:
- gpu_monitor.py - GPU monitoring and management
- simple_task.py - Basic agent task execution (coming soon)
- multi_agent_collab.py - Multi-agent collaboration (coming soon)
Support & Documentation
- Full Documentation: See README.md
- Configuration Reference: See the configs/ directory
- API Reference: Coming soon
- Issues: Report at your issue tracker
Happy building with SPARKNET!