MHamdan/SPARKNET

MHamdan committed on Jan 25

Commit

667e85c

1 Parent(s): 76c3b0a

Fix Streamlit Cloud deployment and related shortcomings

Files changed (5)

README.md +153 -228
demo/app.py +6 -6
demo/rag_config.py +12 -0
requirements-streamlit.txt +58 -0
requirements.txt +37 -43

README.md CHANGED

@@ -1,254 +1,185 @@
-<<<<<<< HEAD
 ---
 title: SPARKNET
 sdk: streamlit
 app_file: demo/app.py
 python_version: "3.10"
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
-=======
-# SPARKNET: Agentic AI Workflow System
-Multi-agent orchestration system leveraging local LLM models via Ollama with multi-GPU support.
-## Overview
-SPARKNET is an autonomous AI agent framework that enables:
-- **Multi-Agent Orchestration**: Specialized agents for planning, execution, and validation
-- **Local LLM Integration**: Uses Ollama for privacy-preserving AI inference
-- **Multi-GPU Support**: Efficiently utilizes 4x NVIDIA RTX 2080 Ti GPUs
-- **Tool-Augmented Agents**: Agents can use tools for file I/O, code execution, and system monitoring
-- **Memory Management**: Vector-based episodic and semantic memory
-- **Learning & Adaptation**: Feedback loops for continuous improvement
-## System Requirements
-### Hardware
-- NVIDIA GPUs with CUDA support (tested on 4x RTX 2080 Ti, 11GB VRAM each)
-- Minimum 16GB RAM
-- 50GB+ free disk space
-### Software
-- Python 3.10+
-- CUDA 12.0+
-- Ollama installed and running
-## Installation
-### 1. Install Ollama
-```bash
-# Install Ollama (if not already installed)
-curl -fsSL https://ollama.com/install.sh | sh
-# Start Ollama server
-ollama serve
-```
-### 2. Install SPARKNET
-```bash
-cd /home/mhamdan/SPARKNET
-# Install dependencies
-pip install -r requirements.txt
-# Install in development mode
-pip install -e .
-```
-### 3. Download Recommended Models
-```bash
-# Lightweight models
-ollama pull llama3.2:latest
-ollama pull phi3:latest
-# General purpose models
-ollama pull llama3.1:8b
-ollama pull mistral:latest
-# Large reasoning model
-ollama pull qwen2.5:14b
-# Embedding models
-ollama pull nomic-embed-text:latest
-ollama pull mxbai-embed-large:latest
-```
 ## Quick Start
-### Basic Usage
-```python
-from src.llm.ollama_client import OllamaClient
-from src.agents.executor_agent import ExecutorAgent
-from src.agents.base_agent import Task
-from src.tools import register_default_tools
-import asyncio
-# Initialize
-ollama_client = OllamaClient()
-tool_registry = register_default_tools()
-# Create agent
-agent = ExecutorAgent(llm_client=ollama_client)
-agent.set_tool_registry(tool_registry)
-# Create and execute task
-task = Task(
-    id="task_1",
-    description="List all Python files in the current directory",
-)
-async def run():
-    result = await agent.process_task(task)
-    print(f"Status: {result.status}")
-    print(f"Result: {result.result}")
-asyncio.run(run())
 ```
-### Running Examples
 ```bash
-# Simple agent with tool usage
-python examples/simple_task.py
-# Multi-agent collaboration
-python examples/multi_agent_collab.py
-# GPU monitoring
-python examples/gpu_monitor.py
-# Patent Wake-Up workflow (VISTA Scenario 1)
-python test_patent_wakeup.py
 ```
-## Patent Wake-Up Workflow (Phase 2C)
-SPARKNET now includes a complete **Patent Wake-Up workflow** for VISTA Scenario 1, which transforms dormant patents into commercialization opportunities.
-### Quick Start
 ```bash
-# 1. Ensure required models are available
-ollama pull llama3.1:8b
-ollama pull mistral:latest
-ollama pull qwen2.5:14b
-# 2. Run the Patent Wake-Up workflow
-python test_patent_wakeup.py
-```
-### Workflow Steps
-The Patent Wake-Up pipeline executes four specialized agents sequentially:
-1. **DocumentAnalysisAgent** - Analyzes patent structure and assesses Technology Readiness Level (TRL)
-2. **MarketAnalysisAgent** - Identifies market opportunities with size/growth data
-3. **MatchmakingAgent** - Matches with potential partners using semantic search
-4. **OutreachAgent** - Generates professional valorization briefs (PDF format)
-### Example Output
 ```
-Patent: AI-Powered Drug Discovery Platform
-TRL Level: 7/9
-Market Opportunities: 4 identified ($150B+ addressable market)
-Stakeholder Matches: 10 partners (investors, companies, universities)
-Output: outputs/valorization_brief_[patent_id]_[date].pdf
-```
-### Specialized Agents
-| Agent | Purpose | Model | Output |
-|-------|---------|-------|--------|
-| DocumentAnalysisAgent | Patent extraction & TRL assessment | llama3.1:8b | PatentAnalysis object |
-| MarketAnalysisAgent | Market opportunity identification | mistral:latest | MarketAnalysis object |
-| MatchmakingAgent | Stakeholder matching with scoring | qwen2.5:14b | List of StakeholderMatch |
-| OutreachAgent | Valorization brief generation | llama3.1:8b | ValorizationBrief + PDF |
-See `PHASE_2C_COMPLETE_SUMMARY.md` for full implementation details.
-## Architecture
-### Core Components
-1. **Agents** (`src/agents/`)
-   - `BaseAgent`: Core agent interface
-   - `ExecutorAgent`: Task execution with tools
-   - `PlannerAgent`: Task decomposition (coming soon)
-   - `CriticAgent`: Output validation (coming soon)
-2. **LLM Integration** (`src/llm/`)
-   - `OllamaClient`: Interface to local Ollama models
-   - Model routing based on task complexity
-3. **Tools** (`src/tools/`)
-   - File operations: read, write, search
-   - Code execution: Python, bash
-   - GPU monitoring and selection
-4. **Utilities** (`src/utils/`)
-   - GPU manager for resource allocation
-   - Logging and configuration
-   - Memory management
-### Configuration
-Configuration files in `configs/`:
-- `system.yaml`: System-wide settings
-- `models.yaml`: Model routing rules
-- `agents.yaml`: Agent configurations
-## Available Models
-| Model | Size | Use Case |
-|-------|------|----------|
-| llama3.2:latest | 2.0 GB | Classification, routing, simple QA |
-| phi3:latest | 2.2 GB | Quick reasoning, structured output |
-| mistral:latest | 4.4 GB | General tasks, creative writing |
-| llama3.1:8b | 4.9 GB | General tasks, code generation |
-| qwen2.5:14b | 9.0 GB | Complex reasoning, multi-step tasks |
-| nomic-embed-text | 274 MB | Text embeddings, semantic search |
-| mxbai-embed-large | 669 MB | High-quality embeddings, RAG |
-## GPU Management
-SPARKNET automatically manages GPU resources:
-```python
-from src.utils.gpu_manager import get_gpu_manager
-gpu_manager = get_gpu_manager()
-# Monitor all GPUs
-print(gpu_manager.monitor())
-# Select best GPU with 8GB+ free
-with gpu_manager.gpu_context(min_memory_gb=8.0) as gpu_id:
-    # Your model code here
-    print(f"Using GPU {gpu_id}")
-```
 ## Development
-### Project Structure
-```
-SPARKNET/
-├── src/
-│   ├── agents/         # Agent implementations
-│   ├── llm/           # LLM client and routing
-│   ├── workflow/      # Task orchestration (coming soon)
-│   ├── memory/        # Memory systems (coming soon)
-│   ├── tools/         # Agent tools
-│   └── utils/         # Utilities
-├── configs/           # Configuration files
-├── examples/          # Example scripts
-├── tests/            # Unit tests
-└── Dataset/          # Data directory
-```
 ### Running Tests
 ```bash
 pytest tests/
@@ -260,36 +191,20 @@ black src/
 flake8 src/
 ```
 ## Roadmap
-### Phase 1: Foundation ✅
-- [x] Project structure
-- [x] GPU manager
-- [x] Ollama client
-- [x] Base agent
-- [x] Basic tools
-- [x] Configuration system
-### Phase 2: Multi-Agent System (In Progress)
-- [x] ExecutorAgent
-- [ ] PlannerAgent
-- [ ] CriticAgent
-- [ ] MemoryAgent
-- [ ] CoordinatorAgent
-- [ ] Agent communication protocol
-### Phase 3: Advanced Features
-- [ ] Vector-based memory (ChromaDB)
-- [ ] Learning and feedback mechanisms
-- [ ] Model router
-- [ ] Workflow engine
-- [ ] Monitoring dashboard
-### Phase 4: Optimization
-- [ ] Multi-GPU parallelization
-- [ ] Performance optimization
-- [ ] Comprehensive testing
-- [ ] Documentation
 ## Contributing
@@ -300,23 +215,33 @@ Contributions are welcome! Please:
 4. Run tests
 5. Submit a pull request
-## License
-MIT License - see LICENSE file for details
 ## Acknowledgments
-- Ollama for local LLM inference
-- NVIDIA for CUDA and GPU support
-- The open-source AI community
 ## Support
-For issues and questions:
-- GitHub Issues: [Your repo URL]
-- Documentation: [Docs URL]
 ---
-Built with ❤️ for autonomous AI systems
->>>>>>> e692211 (Initial commit: SPARKNET framework)

 ---
 title: SPARKNET
+emoji: 🔥
+colorFrom: red
+colorTo: blue
 sdk: streamlit
+sdk_version: 1.28.0
 app_file: demo/app.py
 python_version: "3.10"
+pinned: false
 ---
+# 🔥 SPARKNET: AI-Powered Technology Transfer Office Automation
+**Multi-agent AI platform for research valorization and IP management**
+[![Streamlit App](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://sparknet.streamlit.app)
+[![VISTA Project](https://img.shields.io/badge/VISTA-Horizon%20EU-blue)](https://vista-project.eu)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+---
+## Overview
+SPARKNET is an enterprise-grade **Technology Transfer Office (TTO) Automation Platform** that combines multi-agent AI orchestration with document intelligence to automate key TTO workflows. Built for the VISTA/Horizon EU project.
+### 🎯 Core TTO Scenarios
+| Scenario | Status | Description |
+|----------|--------|-------------|
+| 💡 **Patent Wake-Up** | ✅ Live | Transform dormant patents into commercialization opportunities |
+| ⚖️ **Agreement Safety** | ✅ Live | AI-assisted legal document review with risk detection |
+| 🤝 **Partner Matching** | ✅ Live | Intelligent stakeholder matching for technology transfer |
+| 📋 **License Compliance** | 🔨 Dev | Payment tracking, milestone verification, revenue alerts |
+| 🏆 **Award Identification** | 🔨 Dev | Funding opportunity scanning and nomination assistance |
+### 📊 Coverage Dashboard
+- **3 Fully Covered** - Production-ready scenarios
+- **5 Partially Covered** - In development
+- **2 Not Covered** - Planned for future
+---
+## Features
+### 🛡️ AI Quality Assurance
+- **CriticAgent Validation**: Every AI output validated against VISTA quality standards
+- **Confidence Scoring**: Automatic abstention for low-confidence results
+- **Source Verification**: Hallucination mitigation with evidence grounding
+- **Human-in-the-Loop**: Critical decisions require human approval
+### 🤖 Multi-Agent Architecture
+- **PlannerAgent**: Task decomposition and workflow planning
+- **ExecutorAgent**: Task execution with tool usage
+- **CriticAgent**: Output validation and refinement
+- **MemoryAgent**: Context management and retrieval
+### 📄 Document Intelligence
+- OCR with PaddleOCR/Tesseract
+- Layout detection and semantic chunking
+- Schema-driven field extraction
+- Visual evidence grounding (bbox, page, confidence)
+### 💬 RAG Q&A
+- Vector search with ChromaDB
+- Grounded retrieval with citations
+- Multi-document querying
+- Citation generation
+---
 ## Quick Start
+### Streamlit Cloud (Recommended)
+The app is deployed on Streamlit Cloud. Visit:
+```
+https://sparknet.streamlit.app
 ```
+### Local Installation
 ```bash
+# Clone repository
+git clone https://github.com/MHHamdan/SPARKNET.git
+cd SPARKNET
+# Install dependencies
+pip install -r requirements.txt
+# Run Streamlit app
+streamlit run demo/app.py
 ```
+### With Local LLM (Ollama)
+For privacy-preserving local inference:
 ```bash
+# Install Ollama
+curl -fsSL https://ollama.com/install.sh | sh
+# Pull models
+ollama pull llama3.2:latest
+ollama pull nomic-embed-text
+# Run SPARKNET
+streamlit run demo/app.py
 ```
+---
+## Configuration
+### API Keys
+Configure in `.streamlit/secrets.toml` or environment variables:
+```toml
+[auth]
+password = "your-password"
+GROQ_API_KEY = "your-groq-key"
+GOOGLE_API_KEY = "your-google-key"
+OPENROUTER_API_KEY = "your-openrouter-key"
+```
+See `.env.example` for all available configuration options.
+### Supported LLM Providers
+| Provider | Free Tier | Notes |
+|----------|-----------|-------|
+| Groq | 14,400 req/day | Fastest inference |
+| Google Gemini | 15 req/min | Good for general use |
+| OpenRouter | Many free models | Multi-model access |
+| GitHub Models | Free GPT-4o | Requires GitHub token |
+| HuggingFace | Thousands of models | Good for embeddings |
+| Ollama | Unlimited (local) | Maximum privacy |
+---
+## Project Structure
+```
+SPARKNET/
+├── demo/                    # Streamlit application
+│   ├── app.py              # Main app
+│   ├── auth.py             # Authentication
+│   ├── llm_providers.py    # LLM provider management
+│   └── pages/              # Multi-page app
+├── src/
+│   ├── agents/             # Agent implementations
+│   │   ├── scenario1/      # Patent Wake-Up
+│   │   ├── scenario3/      # License Compliance
+│   │   └── scenario4/      # Award Identification
+│   ├── rag/                # RAG subsystem
+│   ├── workflow/           # LangGraph workflows
+│   └── document_intelligence/  # Document processing
+├── configs/                # Configuration files
+├── .streamlit/             # Streamlit config
+└── SECURITY.md            # Security documentation
+```
+---
+## Security & GDPR
+SPARKNET supports GDPR-compliant deployments:
+- **Local Inference**: Use Ollama for 100% on-premise processing
+- **Data Isolation**: Configure data retention policies
+- **Audit Logging**: Track all AI interactions
+- **Private Deployment**: Enterprise deployment options
+See [SECURITY.md](SECURITY.md) for detailed security documentation.
+---
 ## Development
 ### Running Tests
 ```bash
 pytest tests/
 flake8 src/
 ```
+---
 ## Roadmap
+- [x] Patent Wake-Up workflow
+- [x] Agreement Safety review
+- [x] Partner Matching
+- [x] CriticAgent validation
+- [ ] License Compliance Monitoring (in progress)
+- [ ] Award Identification (in progress)
+- [ ] Grant Writing Assistant
+- [ ] Negotiation Support
+---
 ## Contributing
 4. Run tests
 5. Submit a pull request
+---
 ## Acknowledgments
+- **Ollama** for local LLM inference
+- **NVIDIA** for CUDA and GPU support
+- **LangChain** for LLM orchestration
+- **Streamlit** for the web framework
+- **The open-source AI community**
+---
 ## Support
+- **GitHub Issues**: [github.com/MHHamdan/SPARKNET/issues](https://github.com/MHHamdan/SPARKNET/issues)
+- **Documentation**: See `/docs` folder
+---
+## License
+MIT License - see [LICENSE](LICENSE) file for details.
 ---
+<p align="center">
+  <strong>🔥 SPARKNET</strong><br>
+  AI-Powered Technology Transfer Office Automation<br>
+  <em>VISTA/Horizon EU Project</em>
+</p>
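
Note on the new Configuration section: the README says keys can come from `.streamlit/secrets.toml` or from environment variables. In the TOML sample, the three `*_API_KEY` entries follow the `[auth]` header, so a TOML parser would place them inside the `auth` table rather than at the top level. The diff does not show how the app reads them, so the following is only a minimal sketch of one possible lookup. The helper name `resolve_api_key`, the precedence order (environment first, then top-level secrets, then the `auth` table), and the use of a plain dict in place of the parsed secrets file are all assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    name: str,
    secrets: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: look a key up in env vars, then in parsed secrets."""
    env = os.environ if env is None else env

    # 1. Environment variable wins (assumed precedence, not taken from the repo).
    value = env.get(name)
    if value:
        return value

    # 2. Top-level entry in the parsed secrets mapping.
    if name in secrets:
        return str(secrets[name])

    # 3. Entry nested under [auth], which is where the README sample puts it.
    auth = secrets.get("auth", {})
    if isinstance(auth, Mapping) and name in auth:
        return str(auth[name])

    return None


# Plain-dict stand-in for the parsed README sample (keys nested under "auth").
sample_secrets = {
    "auth": {"password": "your-password", "GROQ_API_KEY": "your-groq-key"}
}
print(resolve_api_key("GROQ_API_KEY", sample_secrets, env={}))
```

If the app expects the API keys at the top level of the secrets file, they would need to appear before the `[auth]` header in the TOML sample.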

demo/app.py CHANGED

@@ -660,24 +660,24 @@ def render_home_page():
     with col1:
         st.markdown("""
         <div style="background: #f0f9ff; border-radius: 8px; padding: 1rem; border: 1px solid #bae6fd;">
-            <h4 style="margin: 0 0 0.5rem 0;">🔍 CriticAgent Validation</h4>
-            <p style="font-size: 0.9rem; margin: 0;">Every AI output is validated against VISTA quality standards with dimension-based scoring.</p>
         </div>
         """, unsafe_allow_html=True)
     with col2:
         st.markdown("""
         <div style="background: #f0fdf4; border-radius: 8px; padding: 1rem; border: 1px solid #bbf7d0;">
-            <h4 style="margin: 0 0 0.5rem 0;">📊 Confidence Scoring</h4>
-            <p style="font-size: 0.9rem; margin: 0;">All extractions include confidence scores with automatic abstention for low-confidence results.</p>
         </div>
         """, unsafe_allow_html=True)
     with col3:
         st.markdown("""
         <div style="background: #fefce8; border-radius: 8px; padding: 1rem; border: 1px solid #fef08a;">
-            <h4 style="margin: 0 0 0.5rem 0;">👤 Human-in-the-Loop</h4>
-            <p style="font-size: 0.9rem; margin: 0;">Critical decisions require human approval with clear decision points throughout workflows.</p>
         </div>
         """, unsafe_allow_html=True)

     with col1:
         st.markdown("""
         <div style="background: #f0f9ff; border-radius: 8px; padding: 1rem; border: 1px solid #bae6fd;">
+            <h4 style="margin: 0 0 0.5rem 0; color: #1e3a5f;">🔍 CriticAgent Validation</h4>
+            <p style="font-size: 0.9rem; margin: 0; color: #334155;">Every AI output is validated against VISTA quality standards with dimension-based scoring.</p>
         </div>
         """, unsafe_allow_html=True)
     with col2:
         st.markdown("""
         <div style="background: #f0fdf4; border-radius: 8px; padding: 1rem; border: 1px solid #bbf7d0;">
+            <h4 style="margin: 0 0 0.5rem 0; color: #14532d;">📊 Confidence Scoring</h4>
+            <p style="font-size: 0.9rem; margin: 0; color: #334155;">All extractions include confidence scores with automatic abstention for low-confidence results.</p>
         </div>
         """, unsafe_allow_html=True)
     with col3:
         st.markdown("""
         <div style="background: #fefce8; border-radius: 8px; padding: 1rem; border: 1px solid #fef08a;">
+            <h4 style="margin: 0 0 0.5rem 0; color: #713f12;">👤 Human-in-the-Loop</h4>
+            <p style="font-size: 0.9rem; margin: 0; color: #334155;">Critical decisions require human approval with clear decision points throughout workflows.</p>
         </div>
         """, unsafe_allow_html=True)
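
Note on this hunk: the only change is an explicit `color` on each card heading and paragraph. The commit does not state the motivation; a plausible one is that text inheriting its color from the app theme can become hard to read on these fixed light backgrounds. One way to sanity-check the chosen pairs is the WCAG 2.x contrast ratio, sketched below. The function names are illustrative and not part of the repository.

```python
def _linear(channel_8bit: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#rrggbb' color."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical) up to 21.0 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


# (text color, card background) pairs taken from the hunk above.
PAIRS = [
    ("#1e3a5f", "#f0f9ff"),  # CriticAgent heading
    ("#14532d", "#f0fdf4"),  # Confidence Scoring heading
    ("#713f12", "#fefce8"),  # Human-in-the-Loop heading
    ("#334155", "#f0f9ff"),  # body text on the first card
]
for fg, bg in PAIRS:
    print(fg, "on", bg, "->", round(contrast_ratio(fg, bg), 2))
```

WCAG AA asks for a ratio of at least 4.5:1 for normal-size text, so each printed value can be compared against that threshold.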

demo/rag_config.py CHANGED

@@ -53,6 +53,18 @@ def get_unified_rag_system():
     This is cached at the Streamlit level so all pages share the same instance.
     """
     try:
         from src.rag.agentic import AgenticRAG, RAGConfig
         from src.rag.store import get_vector_store, VectorStoreConfig, reset_vector_store
         from src.rag.embeddings import get_embedding_adapter, EmbeddingConfig, reset_embedding_adapter

     This is cached at the Streamlit level so all pages share the same instance.
     """
     try:
+        # Check for required dependencies first
+        try:
+            import pydantic
+        except ImportError:
+            return {
+                "status": "error",
+                "error": "Required dependency 'pydantic' is not installed. Please check requirements.txt.",
+                "rag": None,
+                "store": None,
+                "embedder": None,
+            }
         from src.rag.agentic import AgenticRAG, RAGConfig
         from src.rag.store import get_vector_store, VectorStoreConfig, reset_vector_store
         from src.rag.embeddings import get_embedding_adapter, EmbeddingConfig, reset_embedding_adapter
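
Note on this hunk: the new guard imports `pydantic` and, on failure, returns an error dict with the keys `status`, `error`, `rag`, `store`, and `embedder`. If more packages ever need the same treatment, the check could be generalized roughly as follows. This is a sketch only: `check_dependencies` is a hypothetical helper, not something in the repository, and it probes with `importlib.util.find_spec` instead of importing each module.

```python
from importlib.util import find_spec
from typing import Iterable, Optional


def check_dependencies(modules: Iterable[str]) -> Optional[dict]:
    """Return an error dict shaped like the one in the hunk, or None if all is well.

    Only top-level module names are expected here; for dotted names,
    find_spec imports the parent package and may raise instead.
    """
    missing = [name for name in modules if find_spec(name) is None]
    if not missing:
        return None
    return {
        "status": "error",
        "error": (
            "Required dependencies are not installed: "
            + ", ".join(missing)
            + ". Please check requirements.txt."
        ),
        "rag": None,
        "store": None,
        "embedder": None,
    }


# Usage: bail out early before the heavier src.rag imports are attempted.
problem = check_dependencies(["json", "surely_not_an_installed_module_xyz"])
if problem is not None:
    print(problem["error"])
```

Probing with `find_spec` avoids running a package's import-time code just to learn whether it is installed; the `try`/`except ImportError` form in the commit is equally valid and also catches packages that are installed but fail to import.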

requirements-streamlit.txt ADDED

@@ -0,0 +1,58 @@

+# SPARKNET Requirements for Streamlit Cloud
+# Lighter version without heavy ML dependencies
+# ==============================================================================
+# Streamlit Web Framework
+# ==============================================================================
+streamlit>=1.28.0
+# ==============================================================================
+# LLM Orchestration (LangChain Ecosystem)
+# ==============================================================================
+langchain>=0.1.0
+langchain-community>=0.0.20
+ollama>=0.1.0
+# ==============================================================================
+# Vector Stores & Embeddings
+# ==============================================================================
+chromadb>=0.4.0
+sentence-transformers>=2.2.0
+# ==============================================================================
+# Data Validation & Configuration
+# ==============================================================================
+pydantic>=2.0.0
+pydantic-settings>=2.0.0
+pyyaml>=6.0
+python-dotenv>=1.0.0
+typing-extensions>=4.0.0
+# ==============================================================================
+# Observability & Logging
+# ==============================================================================
+loguru>=0.7.0
+rich>=13.0.0
+# ==============================================================================
+# System Monitoring
+# ==============================================================================
+psutil>=5.9.0
+# ==============================================================================
+# Web & HTTP
+# ==============================================================================
+requests>=2.31.0
+httpx>=0.25.0
+# ==============================================================================
+# PDF & Document Processing
+# ==============================================================================
+reportlab>=4.0.0
+pymupdf>=1.23.0
+# ==============================================================================
+# Caching & Performance
+# ==============================================================================
+cachetools>=5.3.0
+tenacity>=8.2.0

requirements.txt CHANGED

@@ -1,92 +1,86 @@
 # SPARKNET Requirements
-# Organized by category with strict version pinning for production stability
 # ==============================================================================
-# Core ML/AI Framework
 # ==============================================================================
-torch>=2.0.0,<3.0.0
-transformers>=4.35.0,<5.0.0
 # ==============================================================================
 # LLM Orchestration (LangChain Ecosystem)
 # ==============================================================================
-langchain>=0.1.0,<0.3.0
-langchain-community>=0.0.20,<1.0.0
 langchain-ollama>=0.0.1
-langgraph>=0.0.20,<1.0.0
-ollama>=0.1.0,<1.0.0
 # ==============================================================================
 # Vector Stores & Embeddings
 # ==============================================================================
-chromadb>=0.4.0,<0.5.0
-faiss-cpu>=1.7.4,<2.0.0
-sentence-transformers>=2.2.0,<3.0.0
-# ==============================================================================
-# Workflow & Task Management
-# ==============================================================================
-networkx>=3.0,<4.0
-redis>=5.0.0,<6.0.0
 # ==============================================================================
 # Data Validation & Configuration
 # ==============================================================================
-pydantic>=2.0.0,<3.0.0
 pydantic-settings>=2.0.0
-pyyaml>=6.0,<7.0
 python-dotenv>=1.0.0
 # ==============================================================================
 # Observability & Logging
 # ==============================================================================
-loguru>=0.7.0,<1.0.0
-rich>=13.0.0,<14.0.0
 # ==============================================================================
-# GPU & System Monitoring
 # ==============================================================================
-nvidia-ml-py3>=7.352.0
-psutil>=5.9.0,<6.0.0
 # ==============================================================================
 # Web & HTTP
 # ==============================================================================
-requests>=2.31.0,<3.0.0
-beautifulsoup4>=4.12.0,<5.0.0
-httpx>=0.25.0,<1.0.0
 # ==============================================================================
 # PDF & Document Processing
 # ==============================================================================
-reportlab>=4.0.0,<5.0.0
 pymupdf>=1.23.0
 # ==============================================================================
-# API Framework
 # ==============================================================================
-fastapi>=0.104.0,<1.0.0
-uvicorn[standard]>=0.24.0,<1.0.0
 python-multipart>=0.0.6
 # ==============================================================================
 # Caching & Performance
 # ==============================================================================
-cachetools>=5.3.0,<6.0.0
-tenacity>=8.2.0,<9.0.0
 # ==============================================================================
-# Testing
 # ==============================================================================
-pytest>=7.4.0,<8.0.0
-pytest-asyncio>=0.21.0,<1.0.0
-pytest-cov>=4.1.0
 # ==============================================================================
-# Development Tools
 # ==============================================================================
-black>=23.0.0
-flake8>=6.0.0
-mypy>=1.0.0
-isort>=5.12.0
-pre-commit>=3.5.0

 # SPARKNET Requirements
+# Compatible with Streamlit Cloud deployment
 # ==============================================================================
+# Streamlit Web Framework
 # ==============================================================================
+streamlit>=1.28.0
 # ==============================================================================
 # LLM Orchestration (LangChain Ecosystem)
 # ==============================================================================
+langchain>=0.1.0
+langchain-community>=0.0.20
 langchain-ollama>=0.0.1
+langgraph>=0.0.20
+ollama>=0.1.0
 # ==============================================================================
 # Vector Stores & Embeddings
 # ==============================================================================
+chromadb>=0.4.0
+faiss-cpu>=1.7.4
+sentence-transformers>=2.2.0
 # ==============================================================================
 # Data Validation & Configuration
 # ==============================================================================
+pydantic>=2.0.0
 pydantic-settings>=2.0.0
+pyyaml>=6.0
 python-dotenv>=1.0.0
+typing-extensions>=4.0.0
 # ==============================================================================
 # Observability & Logging
 # ==============================================================================
+loguru>=0.7.0
+rich>=13.0.0
 # ==============================================================================
+# System Monitoring
 # ==============================================================================
+psutil>=5.9.0
 # ==============================================================================
 # Web & HTTP
 # ==============================================================================
+requests>=2.31.0
+beautifulsoup4>=4.12.0
+httpx>=0.25.0
 # ==============================================================================
 # PDF & Document Processing
 # ==============================================================================
+reportlab>=4.0.0
 pymupdf>=1.23.0
 # ==============================================================================
+# API Framework (optional, for backend)
 # ==============================================================================
+fastapi>=0.104.0
+uvicorn>=0.24.0
 python-multipart>=0.0.6
 # ==============================================================================
 # Caching & Performance
 # ==============================================================================
+cachetools>=5.3.0
+tenacity>=8.2.0
+# ==============================================================================
+# Workflow & Task Management
+# ==============================================================================
+networkx>=3.0
 # ==============================================================================
+# ML/AI (Optional - uncomment for full functionality)
 # ==============================================================================
+# torch>=2.0.0
+# transformers>=4.35.0
 # ==============================================================================
+# Testing (development only)
 # ==============================================================================
+# pytest>=7.4.0
+# pytest-asyncio>=0.21.0
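
Note on the two requirements files: after this commit, `requirements.txt` and `requirements-streamlit.txt` overlap heavily, and the full file keeps a few extras such as `faiss-cpu`, `langgraph`, and `fastapi`. A small script can list the difference when the files drift. The sketch below uses a deliberately simple parser with hypothetical helper names; it handles plain `name>=version` lines, extras such as `uvicorn[standard]`, and comments, but not the full pip requirements syntax (no `-r` includes, URLs, or environment markers).

```python
import re

# A package name at the start of a line: letters, digits, '.', '_' or '-'.
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def requirement_names(text: str) -> set:
    """Collect lower-cased package names from requirements-file content."""
    names = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = _NAME.match(line)
        if match:
            names.add(match.group(0).lower())
    return names


# Tiny excerpts standing in for the two files (contents abbreviated).
full_file = """
streamlit>=1.28.0
langgraph>=0.0.20
faiss-cpu>=1.7.4
# torch>=2.0.0
"""
light_file = """
streamlit>=1.28.0
"""

only_in_full = requirement_names(full_file) - requirement_names(light_file)
print(sorted(only_in_full))
```

In practice the two strings would be replaced with the contents of the real files, read for example with `pathlib.Path(...).read_text()`.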