Spaces: holistic-ai/AgentGraph


wu981526092 committed on Sep 1, 2025

Commit

286c429

1 Parent(s): cac81df

add


Files changed (7)

backend/database/README_sample_data.md +0 -165
backend/database/sample_data.py +208 -336
backend/database/samples/README.md +157 -0
backend/database/samples/add_algorithm_sample_example.py +162 -0
backend/database/samples/knowledge_graphs/kg_python_documentation_enhanced.json +216 -0
backend/database/samples/samples_config.json +33 -0
backend/database/samples/traces/python_documentation_inquiry.json +110 -0

backend/database/README_sample_data.md DELETED

@@ -1,165 +0,0 @@
-# Enhanced Sample Data System
-## Overview
-The enhanced sample data system automatically inserts curated examples showcasing AgentGraph's complete feature set into new databases. Instead of starting with an empty system, users immediately see examples of traces and knowledge graphs with failure detection, optimization recommendations, and advanced content referencing capabilities.
-## Features
-### 📊 Automatic Insertion
-- Triggered when initializing an empty database
-- Non-destructive: skips insertion if existing data is found
-- Logs all operations for transparency
-### 🎯 Enhanced Examples
-The system includes a carefully selected example showcasing AgentGraph's advanced capabilities:
-**Python Documentation Assistant** (Comprehensive)
-   - Type: `documentation_search`
-   - Example: RAG-powered assistant processing multi-turn programming inquiry with knowledge search and failure detection
-   - 6 entities, 5 relations, 1 failure, 2 optimizations
-   - Features: Multi-step workflow, educational interactions, content references, quality scoring
-### 🕸️ Enhanced Knowledge Graph Examples
-Each trace comes with a pre-generated knowledge graph showcasing AgentGraph's complete feature set:
-- **Agent interactions and roles** with detailed prompts and content references
-- **Task decomposition** with clear importance levels
-- **Information flow** with specific interaction prompts
-- **RAG-powered knowledge search** retrieving relevant documents and context
-- **Failure detection** identifying real issues (spelling errors, system gaps)
-- **Optimization recommendations** providing actionable improvements
-- **Quality assessment** with confidence scores and metadata
-- **System summaries** with natural language descriptions using pronouns
-## Technical Implementation
-### Files
-- `backend/database/sample_data.py` - Contains sample data and insertion logic
-- `backend/database/init_db.py` - Modified to call sample data insertion
-- `backend/database/README_sample_data.md` - This documentation
-### Database Integration
-- Insertion happens after table creation in `init_database()`
-- Only triggers when `trace_count == 0` (empty database)
-- Uses existing `save_trace()` and `save_knowledge_graph()` functions
-- Full transaction support with rollback on errors
-### Data Structure
-```python
-SAMPLE_TRACES = [
-    {
-        "filename": "sample_basic_question.txt",
-        "title": "Basic Q&A: California Great America Season Pass",
-        "description": "Simple arithmetic calculation...",
-        "trace_type": "conversation",
-        "trace_source": "sample_data",
-        "tags": ["arithmetic", "simple", "calculation"],
-        "content": "User: ... Assistant: ..."
-    }
-]
-SAMPLE_KNOWLEDGE_GRAPHS = [
-    {
-        "filename": "kg_basic_question_001.json",
-        "trace_index": 0,  # Links to first trace
-        "graph_data": {
-            "entities": [...],
-            "relations": [...]
-        }
-    }
-]
-```
-## Usage
-### Automatic (Default)
-Sample data is inserted automatically when:
-- Creating a new database
-- Resetting an existing database with `--reset --force`
-- Database has zero traces
-### Manual Control
-```python
-from backend.database.sample_data import insert_sample_data, get_sample_data_info
-# Get information about available samples
-info = get_sample_data_info()
-print(f"Available: {info['traces_count']} traces, {info['knowledge_graphs_count']} KGs")
-# Manual insertion (with force to override existing data check)
-with get_session() as session:
-    results = insert_sample_data(session, force_insert=True)
-    print(f"Inserted: {results['traces_inserted']} traces, {results['knowledge_graphs_inserted']} KGs")
-```
-### Disabling Sample Data
-To disable automatic sample data insertion, modify `init_db.py`:
-```python
-# Comment out this section in init_database():
-# if trace_count == 0:
-#     # ... sample data insertion code ...
-```
-## Benefits for Users
-1. **Immediate Value**: New users see AgentGraph's complete capabilities immediately
-2. **Learning**: Example demonstrates RAG search, failure detection, optimization suggestions, and advanced features
-3. **Testing**: Users can test all AgentGraph features including quality assessment and content referencing
-4. **Reference**: Examples serve as high-quality templates showcasing best practices
-5. **Feature Discovery**: Users understand the full potential of knowledge graph enhancement
-6. **Quality Standards**: Examples demonstrate what production-ready knowledge graphs should contain
-## Quality Assurance
-- All sample traces are realistic and demonstrate real-world scenarios
-- Knowledge graphs are hand-crafted to showcase AgentGraph's complete feature set
-- Examples include actual failure detection (spelling errors, system gaps)
-- RAG search capabilities demonstrate knowledge retrieval workflows
-- Optimization recommendations are practical and actionable
-- Content references are accurate and support proper traceability
-- Quality scores reflect realistic assessment metrics
-- Content is appropriate and safe for all audiences
-- Regular validation ensures data integrity and feature completeness
-## Maintenance
-To update sample data:
-1. Modify `SAMPLE_TRACES` and `SAMPLE_KNOWLEDGE_GRAPHS` in `sample_data.py`
-2. Ensure trace_index links are correct between trace and KG
-3. Test with a fresh database initialization
-4. Update this documentation if needed
-## Troubleshooting
-### Sample Data Not Appearing
-- Check logs for "Sample data already exists, skipping insertion"
-- Verify database is actually empty: `SELECT COUNT(*) FROM traces;`
-- Force insertion manually with `force_insert=True`
-### Insertion Errors
-- Check logs for specific error messages
-- Verify database schema is up to date
-- Ensure all required tables exist
-- Check for foreign key constraint issues
-### Performance Impact
-- Sample data insertion adds ~2-3 seconds to database initialization
-- Total size: ~4KB of text content + ~15KB of JSON data
-- Negligible impact on production systems
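
The Maintenance steps in the removed README ask maintainers to check that each `trace_index` points at a real trace. A minimal sketch of that check follows, assuming the list-of-dicts layout shown under "Data Structure"; the helper name `validate_trace_links` and the `kg_orphan.json` entry are hypothetical and do not appear in the repository.

```python
from typing import Any, Dict, List


def validate_trace_links(
    traces: List[Dict[str, Any]],
    knowledge_graphs: List[Dict[str, Any]],
) -> List[str]:
    """Return a description of every broken trace_index link (empty if none)."""
    problems = []
    for kg in knowledge_graphs:
        name = kg.get("filename", "<unnamed>")
        index = kg.get("trace_index")
        # bool is a subclass of int, so reject it explicitly
        if not isinstance(index, int) or isinstance(index, bool):
            problems.append(f"{name}: trace_index is missing or not an integer")
        elif not 0 <= index < len(traces):
            problems.append(
                f"{name}: trace_index {index} is out of range "
                f"for {len(traces)} trace(s)"
            )
    return problems


# Illustrative data only: one trace, one valid link, one dangling link.
sample_traces = [{"filename": "sample_basic_question.txt"}]
sample_kgs = [
    {"filename": "kg_basic_question_001.json", "trace_index": 0},
    {"filename": "kg_orphan.json", "trace_index": 3},  # hypothetical bad entry
]

problems = validate_trace_links(sample_traces, sample_kgs)
for problem in problems:
    print(problem)
```

Running a check like this before a fresh database initialization would surface a dangling link early, rather than leaving a knowledge graph silently unlinked.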

backend/database/sample_data.py CHANGED

@@ -1,342 +1,170 @@
 #!/usr/bin/env python
 """
-Sample data for database initialization.
-Provides curated examples of traces and knowledge graphs for new users.
 """
 import json
 import logging
 from typing import Dict, List, Any
 logger = logging.getLogger(__name__)
-# Enhanced sample traces showcasing AgentGraph's full capabilities
-SAMPLE_TRACES = [
-    {
-        "filename": "python_documentation_inquiry.json",
-        "title": "Python Documentation Assistant Demo",
-        "description": "Comprehensive example showing RAG-powered AI assistant handling multi-turn programming inquiry with knowledge search, detailed explanations, code examples, performance analysis, and interactive learning",
-        "trace_type": "documentation_search",
-        "trace_source": "sample_data",
-        "tags": ["programming", "rag_assistant", "documentation", "failure_detection", "optimization"],
-        "content": """{
-  "id": "doc_trace_demo_001",
-  "timestamp": "2025-01-27T00:00:00",
-  "metadata": {
-    "source": "AgentGraph_Demo",
-    "row_index": 0,
-    "converted_at": "2025-01-27T12:00:00.000000"
-  },
-  "data": {
-    "total_observations": 4,
-    "summary": "Python documentation inquiry with RAG-powered assistant response including knowledge search, explanation, and follow-up code examples"
-  },
-  "observations": [
-    {
-      "id": "demo_obs_001",
-      "type": "user_query",
-      "timestamp": "2025-01-27T00:00:00",
-      "input": "Hello! I'm learning Python and I keep seeing this syntax with square brackets that looks different from regular loops. Can you help me understand what Python list comprehensions are used for and when I should use them?",
-      "output": "I'll help you understand Python list comprehensions! Let me search our documentation to give you a comprehensive explanation.",
-      "metadata": {
-        "request_date": "2025-01-27T00:00:00",
-        "interaction_type": "initial_query",
-        "user_level": "beginner",
-        "topic_category": "python_syntax"
-      },
-      "latency": 0.124
-    },
-    {
-      "id": "demo_obs_002",
-      "type": "knowledge_search",
-      "timestamp": "2025-01-27T00:00:05",
-      "input": "Searching Python documentation for: list comprehensions, syntax, use cases, performance benefits",
-      "output": "Found relevant documentation: ['Python list comprehensions - efficient way to create lists', 'List comprehension vs loops performance comparison', 'Python built-in functions - len(), range(), enumerate()', 'Nested list comprehensions and filtering', 'Python data structures - lists, dictionaries, sets, tuples']",
-      "metadata": {
-        "search_query": "python list comprehensions syntax use cases",
-        "documents_found": 5,
-        "search_time_ms": 23,
-        "relevance_score": 0.94
-      },
-      "latency": 0.089
-    },
-    {
-      "id": "demo_obs_003",
-      "type": "inference",
-      "timestamp": "2025-01-27T00:00:08",
-      "input": "You are a helpful Python programming assistant designed to answer questions accurately based on the provided documentation. Use the information retrieved from the knowledge base below to respond concisely and correctly.\\n\\n### Retrieved Documents:\\n['Python list comprehensions - efficient way to create lists', 'List comprehension vs loops performance comparison', 'Python built-in functions - len(), range(), enumerate()', 'Nested list comprehensions and filtering']\\n\\n### Question:\\nwhat are python list comprehensions used for and when should I use them?\\n\\n### Answer:",
-      "output": "Python list comprehensions are a powerful and concise way to create lists! Here's what they're used for:\\n\\n**Main Uses:**\\n1. **Creating lists from existing iterables** - Transform elements from sequences\\n2. **Filtering data** - Include only elements that meet specific conditions\\n3. **Mathematical operations** - Apply calculations efficiently\\n4. **Combining operations** - Filter and transform in one line\\n\\n**When to use them:**\\n- When you need to create a new list from an existing iterable\\n- For simple transformations (keep it readable!)\\n- When performance matters (they're faster than equivalent for loops)\\n- To make code more Pythonic and concise\\n\\n**Basic syntax:** `[expression for item in iterable if condition]`",
-      "metadata": {
-        "request_date": "2025-01-27T00:00:00",
-        "databricks_request_id": "demo-request-003",
-        "request_time": "2025-01-27T12:00:08.000",
-        "status_code": 200,
-        "sampling_fraction": 1,
-        "request_raw": "{\"messages\":[{\"role\":\"system\",\"content\":\"You are a helpful Python programming assistant\"},{\"role\":\"user\",\"content\":\"You are a helpful Python programming assistant designed to answer questions accurately based on the provided documentation. Use the information retrieved from the knowledge base below to respond concisely and correctly.\\\\n\\\\n### Retrieved Documents:\\\\n['Python list comprehensions - efficient way to create lists', 'List comprehension vs loops performance comparison', 'Python built-in functions - len(), range(), enumerate()', 'Nested list comprehensions and filtering']\\\\n\\\\n### Question:\\\\nwhat are python list comprehensions used for and when should I use them?\\\\n\\\\n### Answer:\"}]}",
-        "response_raw": "{\"choices\":[{\"content_filter_results\":{\"hate\":{\"filtered\":false,\"severity\":\"safe\"},\"self_harm\":{\"filtered\":false,\"severity\":\"safe\"},\"sexual\":{\"filtered\":false,\"severity\":\"safe\"},\"violence\":{\"filtered\":false,\"severity\":\"safe\"}},\"finish_reason\":\"stop\",\"index\":0,\"logprobs\":null,\"message\":{\"content\":\"Python list comprehensions are a powerful and concise way to create lists! Here's what they're used for:\\\\n\\\\n**Main Uses:**\\\\n1. **Creating lists from existing iterables** - Transform elements from sequences\\\\n2. **Filtering data** - Include only elements that meet specific conditions\\\\n3. **Mathematical operations** - Apply calculations efficiently\\\\n4. **Combining operations** - Filter and transform in one line\\\\n\\\\n**When to use them:**\\\\n- When you need to create a new list from an existing iterable\\\\n- For simple transformations (keep it readable!)\\\\n- When performance matters (they're faster than equivalent for loops)\\\\n- To make code more Pythonic and concise\\\\n\\\\n**Basic syntax:** `[expression for item in iterable if condition]`\",\"refusal\":null,\"role\":\"assistant\"}}],\"created\":1737979208,\"id\":\"chatcmpl-DEMO003\",\"model\":\"gpt-4o-2024-11-20\",\"object\":\"chat.completion\",\"system_fingerprint\":\"fp_demo_003\",\"usage\":{\"completion_tokens\":156,\"completion_tokens_details\":{\"accepted_prediction_tokens\":0,\"audio_tokens\":0,\"reasoning_tokens\":0,\"rejected_prediction_tokens\":0},\"prompt_tokens\":184,\"prompt_tokens_details\":{\"audio_tokens\":0,\"cached_tokens\":0},\"total_tokens\":340}}",
-        "served_entity_id": "demo-entity-003",
-        "logging_error_codes": "[]",
-        "requester": "demo-user-001"
-      },
-      "latency": 1.247,
-      "usage": {
-        "completion_tokens": 156,
-        "completion_tokens_details": {
-          "accepted_prediction_tokens": 0,
-          "audio_tokens": 0,
-          "reasoning_tokens": 0,
-          "rejected_prediction_tokens": 0
-        },
-        "prompt_tokens": 184,
-        "prompt_tokens_details": {
-          "audio_tokens": 0,
-          "cached_tokens": 0
-        },
-        "total_tokens": 340
-      },
-      "model": "gpt-4o-2024-11-20"
-    },
-    {
-      "id": "demo_obs_004",
-      "type": "follow_up",
-      "timestamp": "2025-01-27T00:00:15",
-      "input": "That's really helpful! Could you show me some practical examples? I'd love to see the difference between a regular for loop and a list comprehension.",
-      "output": "Absolutely! Here are some practical examples:\\n\\n**Example 1: Squaring numbers**\\n```python\\n# Traditional for loop\\nsquares = []\\nfor x in range(5):\\n    squares.append(x**2)\\n# Result: [0, 1, 4, 9, 16]\\n\\n# List comprehension\\nsquares = [x**2 for x in range(5)]\\n# Same result: [0, 1, 4, 9, 16]\\n```\\n\\n**Example 2: Filtering even numbers**\\n```python\\n# Traditional approach\\nevens = []\\nfor x in range(10):\\n    if x % 2 == 0:\\n        evens.append(x)\\n\\n# List comprehension\\nevens = [x for x in range(10) if x % 2 == 0]\\n# Result: [0, 2, 4, 6, 8]\\n```\\n\\n**Example 3: String processing**\\n```python\\nwords = ['hello', 'world', 'python']\\n# Capitalize and get lengths\\nlengths = [len(word.upper()) for word in words]\\n# Result: [5, 5, 6]\\n```\\n\\nList comprehensions are not only more concise but also typically 20-30% faster than equivalent for loops!",
-      "metadata": {
-        "interaction_type": "follow_up_examples",
-        "code_examples_provided": 3,
-        "response_time": "2025-01-27T12:00:15.000"
-      },
-      "latency": 0.892,
-      "usage": {
-        "completion_tokens": 287,
-        "total_tokens": 445
-      }
-    }
-  ]
-}"""
-    }
-]
-# Enhanced knowledge graphs demonstrating AgentGraph's complete feature set
-SAMPLE_KNOWLEDGE_GRAPHS = [
-    {
-        "filename": "kg_python_documentation_enhanced.json",
-        "trace_index": 0,  # Links to first trace
-        "graph_data": {
-            "system_name": "Python Documentation Assistant",
-            "system_summary": "This intelligent assistant processes user inquiries about Python programming through a comprehensive multi-step workflow. When users submit questions, the agent performs knowledge search, delivers detailed explanations with code examples, and engages in follow-up interactions to ensure thorough understanding of Python concepts, syntax, and performance considerations.",
-            "entities": [
-                {
-                    "id": "agent_001",
-                    "type": "Agent",
-                    "name": "Python Documentation Agent",
-                    "importance": "HIGH",
-                    "raw_prompt": "You are a helpful Python programming assistant designed to answer questions accurately based on retrieved documentation context. Use the search results to provide precise responses.",
-                    "raw_prompt_ref": [
-                        {
-                            "line_start": 31,
-                            "line_end": 32
-                        }
-                    ]
-                },
-                {
-                    "id": "task_001",
-                    "type": "Task",
-                    "name": "Programming Question Processing",
-                    "importance": "HIGH",
-                    "raw_prompt": "Process user inquiry about Python programming and generate an accurate, contextual response based on available documentation and programming best practices.",
-                    "raw_prompt_ref": [
-                        {
-                            "line_start": 26,
-                            "line_end": 28
-                        }
-                    ]
-                },
-                {
-                    "id": "input_001",
-                    "type": "Input",
-                    "name": "User Programming Query",
-                    "importance": "HIGH",
-                    "raw_prompt": "Hello! I'm learning Python and I keep seeing this syntax with square brackets that looks different from regular loops. Can you help me understand what Python list comprehensions are used for and when I should use them?",
-                    "raw_prompt_ref": [
-                        {
-                            "line_start": 19,
-                            "line_end": 19
-                        }
-                    ]
-                },
-                {
-                    "id": "output_001",
-                    "type": "Output",
-                    "name": "Programming Concept Explanation",
-                    "importance": "HIGH",
-                    "raw_prompt": "Python list comprehensions are a powerful and concise way to create lists! Here's what they're used for: Main Uses: 1. Creating lists from existing iterables, 2. Filtering data, 3. Mathematical operations, 4. Combining operations. When to use them: For simple transformations, when performance matters, to make code more Pythonic and concise.",
-                    "raw_prompt_ref": [
-                        {
-                            "line_start": 20,
-                            "line_end": 20
-                        }
-                    ]
-                },
-                {
-                    "id": "human_001",
-                    "type": "Human",
-                    "name": "Python Developer",
-                    "importance": "MEDIUM",
-                    "raw_prompt": "Developer seeking Python programming guidance and documentation",
-                    "raw_prompt_ref": [
-                        {
-                            "line_start": 31,
-                            "line_end": 31
-                        }
-                    ]
-                },
-                {
-                    "id": "tool_001",
-                    "type": "Tool",
-                    "name": "Python Documentation Search",
-                    "importance": "HIGH",
-                    "raw_prompt": "Retrieval-Augmented Generation (RAG) system that searches Python documentation knowledge base for relevant concepts, syntax examples, and best practices to provide contextual information.",
-                    "raw_prompt_ref": [
-                        {
-                            "line_start": 49,
-                            "line_end": 49
-                        }
-                    ]
-                }
-            ],
-            "relations": [
-                {
-                    "id": "rel_001",
-                    "source": "input_001",
-                    "target": "agent_001",
-                    "type": "CONSUMED_BY",
-                    "importance": "HIGH",
-                    "interaction_prompt": "Extended user inquiry about Python list comprehensions received and processed through multi-step RAG workflow",
-                    "interaction_prompt_ref": [
-                        {
-                            "line_start": 19,
-                            "line_end": 19
-                        }
-                    ]
-                },
-                {
-                    "id": "rel_002",
-                    "source": "agent_001",
-                    "target": "task_001",
-                    "type": "PERFORMS",
-                    "importance": "HIGH",
-                    "interaction_prompt": "Agent executes comprehensive programming question processing including knowledge search, explanation, and code examples",
-                    "interaction_prompt_ref": [
-                        {
-                            "line_start": 26,
-                            "line_end": 28
-                        }
-                    ]
-                },
-                {
-                    "id": "rel_003",
-                    "source": "task_001",
-                    "target": "output_001",
-                    "type": "PRODUCES",
-                    "importance": "HIGH",
-                    "interaction_prompt": "Processing task generates detailed multi-part explanation with examples, performance analysis, and interactive follow-ups",
-                    "interaction_prompt_ref": [
-                        {
-                            "line_start": 20,
-                            "line_end": 20
-                        }
-                    ]
-                },
-                {
-                    "id": "rel_004",
-                    "source": "output_001",
-                    "target": "human_001",
-                    "type": "DELIVERS_TO",
-                    "importance": "HIGH",
-                    "interaction_prompt": "Comprehensive programming tutorial with examples and performance insights delivered to developer",
-                    "interaction_prompt_ref": [
-                        {
-                            "line_start": 20,
-                            "line_end": 20
-                        }
-                    ]
-                },
-                {
-                    "id": "rel_005",
-                    "source": "agent_001",
-                    "target": "tool_001",
-                    "type": "USES",
-                    "importance": "HIGH",
-                    "interaction_prompt": "Agent performs multi-step knowledge search retrieving documentation, examples, and performance comparisons for comprehensive response",
-                    "interaction_prompt_ref": [
-                        {
-                            "line_start": 49,
-                            "line_end": 49
-                        }
-                    ]
-                }
-            ],
-            "failures": [
-                {
-                    "id": "failure_001",
-                    "risk_type": "HALLUCINATION",
-                    "description": "Initial query could benefit from more specific learning objectives, though the multi-turn interaction successfully addressed this through follow-up questions.",
-                    "raw_text": "Hello! I'm learning Python and I keep seeing this syntax with square brackets that looks different from regular loops. Can you help me understand what Python list comprehensions are used for and when I should use them?",
-                    "raw_text_ref": [
-                        {
-                            "line_start": 19,
-                            "line_end": 19
-                        }
-                    ],
-                    "affected_id": "input_001"
                 }
-            ],
-            "optimizations": [
-                {
-                    "id": "opt_001",
-                    "recommendation_type": "PROMPT_REFINEMENT",
-                    "description": "Enhance initial query processing to identify learning level and tailor explanations accordingly. The current multi-turn approach works well but could be optimized with upfront user profiling.",
-                    "affected_ids": ["agent_001"],
-                    "raw_text_ref": [
-                        {
-                            "line_start": 31,
-                            "line_end": 32
-                        }
-                    ]
-                },
-                {
-                    "id": "opt_002",
-                    "recommendation_type": "TOOL_ENHANCEMENT",
-                    "description": "Integrate real-time code execution environment for testing examples, and expand knowledge base to include performance benchmarks and best practice recommendations.",
-                    "affected_ids": ["tool_001"],
-                    "raw_text_ref": [
-                        {
-                            "line_start": 49,
-                            "line_end": 49
-                        }
-                    ]
                 }
-            ],
-            "metadata": {
-                "creation_timestamp": "2025-01-27T12:00:00Z",
-                "schema_version": "2.1.0",
-                "quality_score": 0.89,
-                "entity_count": 6,
-                "relation_count": 5,
-                "failure_count": 1,
-                "optimization_count": 2,
-                "interaction_depth": "multi_turn",
-                "educational_value": "high",
-                "processing_method": "production_enhanced",
-                "content_source": "documentation_trace",
-                "language": "en",
-                "domain": "programming_documentation"
-            }
         }
-    }
-]
 def insert_sample_data(session, force_insert=False):
@@ -360,6 +188,10 @@ def insert_sample_data(session, force_insert=False):
         "errors": []
     }
     # Check if sample data already exists
     if not force_insert:
         existing_sample = session.query(Trace).filter(
@@ -368,13 +200,13 @@ def insert_sample_data(session, force_insert=False):
         if existing_sample:
             logger.info("Sample data already exists, skipping insertion")
-            results["skipped"] = len(SAMPLE_TRACES)
             return results
     try:
         # Insert sample traces
         trace_ids = []
-        for i, trace_data in enumerate(SAMPLE_TRACES):
             try:
                 trace = save_trace(
                     session=session,
@@ -395,7 +227,7 @@ def insert_sample_data(session, force_insert=False):
                 results["errors"].append(error_msg)
         # Insert corresponding knowledge graphs
-        for kg_data in SAMPLE_KNOWLEDGE_GRAPHS:
             try:
                 trace_index = kg_data["trace_index"]
                 if trace_index < len(trace_ids):
@@ -433,11 +265,51 @@ def get_sample_data_info():
     Returns:
         Dict with sample data statistics
     """
-    return {
-        "traces_count": len(SAMPLE_TRACES),
-        "knowledge_graphs_count": len(SAMPLE_KNOWLEDGE_GRAPHS),
-        "trace_types": list(set(t["trace_type"] for t in SAMPLE_TRACES)),
-        "complexity_levels": ["enhanced", "simple"],
-        "features": ["rag_search", "failure_detection", "optimization_recommendations", "content_references", "quality_scoring"],
-        "description": "Comprehensive AgentGraph example showcasing Python Documentation RAG-powered assistant with multi-turn interactions, detailed knowledge search, educational content delivery, failure detection, optimization suggestions, and advanced knowledge graph features"
-    }
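
The rewritten module that follows reads its samples from `samples/samples_config.json` instead of in-module constants. Judging only from the keys the loader accesses, a config entry might look roughly like the sketch below. All values here are illustrative assumptions (the actual config file is not shown in this view), and `derive_trace_filename` and `missing_keys` are hypothetical helpers that mirror the loader's expressions.

```python
from typing import Any, Dict, List

# Keys the loader reads with sample[...] (required) versus sample.get(...) (optional).
REQUIRED_SAMPLE_KEYS = {
    "name", "description", "trace_type", "trace_source",
    "tags", "trace_file", "knowledge_graph_file",
}
OPTIONAL_SAMPLE_KEYS = {"features", "complexity"}

# Illustrative config; the values are assumptions, not the repository's data.
example_config: Dict[str, Any] = {
    "metadata": {"description": "Example sample set", "version": "1.0.0"},
    "samples": [
        {
            "name": "Python Documentation Assistant Demo",
            "description": "RAG-powered assistant answering a Python question",
            "trace_type": "documentation_search",
            "trace_source": "sample_data",
            "tags": ["programming", "documentation"],
            "trace_file": "traces/python_documentation_inquiry.json",
            "knowledge_graph_file": "knowledge_graphs/kg_python_documentation_enhanced.json",
            "features": ["rag_search", "failure_detection"],
            "complexity": "enhanced",
        }
    ],
}


def derive_trace_filename(name: str) -> str:
    """Same expression the loader uses to build a trace filename from a name."""
    return name.replace(" ", "_").lower() + ".json"


def missing_keys(sample: Dict[str, Any]) -> List[str]:
    """List required keys that a config entry lacks, in sorted order."""
    return sorted(REQUIRED_SAMPLE_KEYS - sample.keys())


first = example_config["samples"][0]
print(derive_trace_filename(first["name"]))
print(missing_keys(first))
```

Because the loader indexes the required keys directly, an entry that lacks one of them would raise `KeyError` at load time; a pre-check such as `missing_keys` is one way to report that more clearly.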

 #!/usr/bin/env python
 """
+Sample data loader for database initialization.
+Loads curated examples of traces and knowledge graphs from JSON files for new users.
 """
 import json
 import logging
+import os
+from pathlib import Path
 from typing import Dict, List, Any
 logger = logging.getLogger(__name__)
+# Get the directory where this file is located
+CURRENT_DIR = Path(__file__).parent
+SAMPLES_DIR = CURRENT_DIR / "samples"
+CONFIG_FILE = SAMPLES_DIR / "samples_config.json"
+class SampleDataLoader:
+    """Loads sample data from JSON files."""
+    def __init__(self):
+        self._config = None
+        self._traces = None
+        self._knowledge_graphs = None
+    def _load_config(self) -> Dict[str, Any]:
+        """Load the samples configuration."""
+        if self._config is None:
+            try:
+                with open(CONFIG_FILE, 'r', encoding='utf-8') as f:
+                    self._config = json.load(f)
+                logger.info(f"Loaded sample data configuration from {CONFIG_FILE}")
+            except FileNotFoundError:
+                logger.error(f"Configuration file not found: {CONFIG_FILE}")
+                raise
+            except json.JSONDecodeError as e:
+                logger.error(f"Invalid JSON in configuration file: {e}")
+                raise
+        return self._config
+    def _load_trace(self, trace_file: str) -> Dict[str, Any]:
+        """Load a single trace from JSON file."""
+        trace_path = SAMPLES_DIR / trace_file
+        try:
+            with open(trace_path, 'r', encoding='utf-8') as f:
+                return json.load(f)
+        except FileNotFoundError:
+            logger.error(f"Trace file not found: {trace_path}")
+            raise
+        except json.JSONDecodeError as e:
+            logger.error(f"Invalid JSON in trace file {trace_path}: {e}")
+            raise
+    def _load_knowledge_graph(self, kg_file: str) -> Dict[str, Any]:
+        """Load a single knowledge graph from JSON file."""
+        kg_path = SAMPLES_DIR / kg_file
+        try:
+            with open(kg_path, 'r', encoding='utf-8') as f:
+                return json.load(f)
+        except FileNotFoundError:
+            logger.error(f"Knowledge graph file not found: {kg_path}")
+            raise
+        except json.JSONDecodeError as e:
+            logger.error(f"Invalid JSON in knowledge graph file {kg_path}: {e}")
+            raise
+    def get_traces(self) -> List[Dict[str, Any]]:
+        """Get all sample traces in the expected format."""
+        if self._traces is None:
+            config = self._load_config()
+            self._traces = []
+            for sample in config["samples"]:
+                # Load the trace data
+                trace_data = self._load_trace(sample["trace_file"])
+                # Convert to the expected format
+                trace_entry = {
+                    "filename": sample["name"].replace(" ", "_").lower() + ".json",
+                    "title": sample["name"],
+                    "description": sample["description"],
+                    "trace_type": sample["trace_type"],
+                    "trace_source": sample["trace_source"],
+                    "tags": sample["tags"],
+                    "content": json.dumps(trace_data["content"])  # Convert content back to JSON string
                 }
+                self._traces.append(trace_entry)
+            logger.info(f"Loaded {len(self._traces)} sample traces")
+        return self._traces
+    def get_knowledge_graphs(self) -> List[Dict[str, Any]]:
+        """Get all sample knowledge graphs in the expected format."""
+        if self._knowledge_graphs is None:
+            config = self._load_config()
+            self._knowledge_graphs = []
+            for i, sample in enumerate(config["samples"]):
+                # Load the knowledge graph data
+                kg_data = self._load_knowledge_graph(sample["knowledge_graph_file"])
+                # Convert to the expected format
+                kg_entry = {
+                    "filename": sample["knowledge_graph_file"].split("/")[-1],  # Get just the filename
+                    "trace_index": i,  # Links to trace by index
+                    "graph_data": kg_data["graph_data"]
                 }
+                self._knowledge_graphs.append(kg_entry)
+            logger.info(f"Loaded {len(self._knowledge_graphs)} sample knowledge graphs")
+        return self._knowledge_graphs
+    def get_sample_info(self) -> Dict[str, Any]:
+        """Get information about the available sample data."""
+        config = self._load_config()
+        traces = self.get_traces()
+        knowledge_graphs = self.get_knowledge_graphs()
+        # Extract unique features from all samples
+        all_features = set()
+        for sample in config["samples"]:
+            all_features.update(sample.get("features", []))
+        return {
+            "traces_count": len(traces),
+            "knowledge_graphs_count": len(knowledge_graphs),
+            "trace_types": list(set(t["trace_type"] for t in traces)),
+            "complexity_levels": list(set(sample.get("complexity", "standard") for sample in config["samples"])),
+            "features": list(all_features),
+            "description": config["metadata"]["description"],
+            "version": config["metadata"]["version"]
         }
+# Create a global loader instance
+_loader = SampleDataLoader()
+# Maintain backward compatibility by exposing the same interface
+def get_sample_traces() -> List[Dict[str, Any]]:
+    """Get sample traces (backward compatibility)."""
+    return _loader.get_traces()
+def get_sample_knowledge_graphs() -> List[Dict[str, Any]]:
+    """Get sample knowledge graphs (backward compatibility)."""
+    return _loader.get_knowledge_graphs()
+# Legacy module-level names for backward compatibility.
+# @property only takes effect on class attributes, so the names are bound by
+# plain assignment; this loads the sample files once, at import time.
+SAMPLE_TRACES = _loader.get_traces()
+SAMPLE_KNOWLEDGE_GRAPHS = _loader.get_knowledge_graphs()
 def insert_sample_data(session, force_insert=False):
         "errors": []
     }
+    # Get sample data from loader
+    sample_traces = _loader.get_traces()
+    sample_knowledge_graphs = _loader.get_knowledge_graphs()
     # Check if sample data already exists
     if not force_insert:
         existing_sample = session.query(Trace).filter(
         if existing_sample:
             logger.info("Sample data already exists, skipping insertion")
+            results["skipped"] = len(sample_traces)
             return results
     try:
         # Insert sample traces
         trace_ids = []
+        for i, trace_data in enumerate(sample_traces):
             try:
                 trace = save_trace(
                     session=session,
                 results["errors"].append(error_msg)
         # Insert corresponding knowledge graphs
+        for kg_data in sample_knowledge_graphs:
             try:
                 trace_index = kg_data["trace_index"]
                 if trace_index < len(trace_ids):
     Returns:
         Dict with sample data statistics
     """
+    return _loader.get_sample_info()
+# Additional utility functions for managing samples
+def add_sample(sample_id: str, name: str, description: str, trace_file: str,
+               knowledge_graph_file: str, tags: List[str], trace_type: str = "custom",
+               trace_source: str = "sample_data", complexity: str = "standard",
+               features: List[str] = None):
+    """
+    Add a new sample to the configuration (utility function for future use).
+    Args:
+        sample_id: Unique identifier for the sample
+        name: Human-readable name
+        description: Description of the sample
+        trace_file: Path to trace JSON file relative to samples directory
+        knowledge_graph_file: Path to KG JSON file relative to samples directory
+        tags: List of tags
+        trace_type: Type of trace
+        trace_source: Source of trace
+        complexity: Complexity level
+        features: List of features demonstrated
+    """
+    # Placeholder: writing the new entry to samples_config.json is not implemented yet.
+    logger.info(f"Add sample feature called for: {sample_id}")
+def list_available_samples() -> List[Dict[str, Any]]:
+    """List all available samples with their metadata."""
+    config = _loader._load_config()
+    return config["samples"]
+if __name__ == "__main__":
+    # Quick test of the loader
+    try:
+        info = get_sample_data_info()
+        print("Sample Data Info:", json.dumps(info, indent=2))
+        traces = get_sample_traces()
+        print(f"Loaded {len(traces)} traces")
+        kgs = get_sample_knowledge_graphs()
+        print(f"Loaded {len(kgs)} knowledge graphs")
+    except Exception as e:
+        print(f"Error testing sample data loader: {e}")
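Review note: `add_sample` above is left as a stub. A minimal sketch of one way it could persist an entry, assuming the config layout shown in `samples_config.json` (a top-level `samples` list). The helper name `append_sample_entry`, the duplicate-id check, and the write-then-replace step are illustrative assumptions, not part of this commit.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def append_sample_entry(config_path: Path, entry: Dict[str, Any]) -> Dict[str, Any]:
    """Append one sample entry to a samples_config.json-style file (hypothetical helper)."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    samples = config.setdefault("samples", [])

    # Refuse duplicate ids so an existing sample is never silently shadowed.
    if any(s.get("id") == entry["id"] for s in samples):
        raise ValueError(f"Sample id already exists: {entry['id']}")

    samples.append(entry)

    # Write to a sibling temp file, then replace, so a crash cannot leave a half-written config.
    tmp_path = config_path.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8")
    tmp_path.replace(config_path)
    return config


def demo_append() -> Dict[str, Any]:
    """Run the helper against a throwaway config in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "samples_config.json"
        path.write_text(
            json.dumps({"samples": [], "metadata": {"version": "1.0.0"}}),
            encoding="utf-8",
        )
        return append_sample_entry(path, {"id": "new_sample_id", "name": "New Sample Name"})
```

Because `SampleDataLoader` caches what it has loaded, a process that has already read the samples would still need a fresh loader to see an entry added this way.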

backend/database/samples/README.md ADDED

	@@ -0,0 +1,157 @@

+# AgentGraph Sample Data System
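+This is the refactored sample data system. It stores samples in JSON files instead of hard-coded Python data, which makes them easier to maintain and extend.
+## 📁 File Structure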
+```
+samples/
+├── README.md                          # This document
+├── samples_config.json                # Sample configuration file
+├── traces/                            # Trace data directory
+│   └── python_documentation_inquiry.json
+└── knowledge_graphs/                  # Knowledge graph data directory
+    └── kg_python_documentation_enhanced.json
+```
+## 🔧 Configuration System
+### `samples_config.json`
+The main configuration file. It defines every available sample:
+```json
+{
+  "samples": [
+    {
+      "id": "python_documentation_demo",
+      "name": "Python Documentation Assistant Demo",
+      "description": "...",
+      "trace_file": "traces/python_documentation_inquiry.json",
+      "knowledge_graph_file": "knowledge_graphs/kg_python_documentation_enhanced.json",
+      "tags": ["programming", "rag_assistant", "documentation"],
+      "complexity": "enhanced",
+      "trace_type": "documentation_search",
+      "trace_source": "sample_data",
+      "features": [
+        "rag_search",
+        "failure_detection",
+        "optimization_recommendations"
+      ]
+    }
+  ],
+  "metadata": {
+    "version": "1.0.0",
+    "created": "2025-01-27",
+    "description": "..."
+  }
+}
+```
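+## 📄 Data File Formats
+### Trace Files
+- Location: the `traces/` directory
+- Format: standard JSON with `filename`, `title`, `description`, `content`, and related fields
+- The `content` field holds the complete trace data (`observations`, `metadata`, and so on)
+### Knowledge Graph Files
+- Location: the `knowledge_graphs/` directory
+- Format: standard JSON with `filename`, `trace_index`, `graph_data`, and related fields
+- `graph_data` contains `entities`, `relations`, `failures`, `optimizations`, and `metadata`
+## 🔄 Backward Compatibility
+The new `sample_data.py` remains fully compatible with the old API: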
+```python
+# These imports still work as before
+from backend.database.sample_data import SAMPLE_TRACES, SAMPLE_KNOWLEDGE_GRAPHS
+from backend.database.sample_data import insert_sample_data, get_sample_data_info
+```
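+## ✨ New Features
+### Dynamic Loading
+- Samples are read from JSON when the loader first runs, so adding one only requires editing the JSON files
+- JSON syntax is validated automatically on load
+- Better error handling and logging
+### Configuration Management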
+```python
+from backend.database.sample_data import list_available_samples, get_sample_data_info
+# List all available samples
+samples = list_available_samples()
+# Get detailed information
+info = get_sample_data_info()
+```
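+## 🚀 Adding a New Sample
+### 1. Prepare the data files
+Create the trace and knowledge graph JSON files and place them in the matching directories.
+### 2. Update the configuration
+Add a new entry to `samples_config.json`: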
+```json
+{
+  "id": "new_sample_id",
+  "name": "New Sample Name",
+  "description": "Description of the sample",
+  "trace_file": "traces/new_trace.json",
+  "knowledge_graph_file": "knowledge_graphs/new_kg.json",
+  "tags": ["tag1", "tag2"],
+  "complexity": "standard",
+  "trace_type": "custom",
+  "trace_source": "sample_data",
+  "features": ["feature1", "feature2"]
+}
+```
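+### 3. Automatic loading
+The loader picks up the new sample the next time it reads the configuration. Loaded samples are cached per process, so a process that has already loaded them needs a restart to see the change.
+## 🎯 Advantages
+1. **Easy to maintain**: data is separated from code, so changing a sample requires no Python changes
+2. **Version-control friendly**: JSON diffs are clearer and easier to review
+3. **Extensible**: adding a sample only requires adding JSON files
+4. **Type safety**: JSON schema validation (not implemented yet; can be added later)
+5. **Backward compatible**: existing code needs no changes
+## 🛠️ Development Tools
+### Test the new system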
+```bash
+cd backend/database
+python sample_data.py
+```
+### Validate JSON format
+```bash
+python -m json.tool samples/traces/new_trace.json
+python -m json.tool samples/knowledge_graphs/new_kg.json
+```
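+## 📊 Migrating from algorithm-generated.jsonl
+When we are ready to select samples from algorithm-generated.jsonl:
+1. Run `multi_agent_knowledge_extractor.py` to generate the KG
+2. Save the trace and the KG as separate JSON files
+3. Add a configuration entry to `samples_config.json`
+4. The sample is then integrated into the system automatically
+This structure makes it straightforward to add real samples in bulk.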
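Review note: the README lists schema validation as a possible extension. A minimal sketch of a pre-flight check, assuming only the keys that `get_traces()` and `get_knowledge_graphs()` index directly. The function name `check_samples_config` and the message wording are hypothetical, and this is a plain-Python check rather than real JSON Schema validation.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict, List

# Keys the loader reads with sample[...] in get_traces() / get_knowledge_graphs().
REQUIRED_KEYS = (
    "name", "description", "trace_file", "knowledge_graph_file",
    "tags", "trace_type", "trace_source",
)


def check_samples_config(config: Dict[str, Any], samples_dir: Path) -> List[str]:
    """Return human-readable problems; an empty list means the config looks usable."""
    problems: List[str] = []
    seen_ids = set()
    for i, sample in enumerate(config.get("samples", [])):
        label = sample.get("id", f"#{i}")
        for key in REQUIRED_KEYS:
            if key not in sample:
                problems.append(f"{label}: missing key '{key}'")
        if label in seen_ids:
            problems.append(f"{label}: duplicate id")
        seen_ids.add(label)
        # Referenced files are resolved relative to the samples directory.
        for key in ("trace_file", "knowledge_graph_file"):
            rel = sample.get(key)
            if rel and not (samples_dir / rel).is_file():
                problems.append(f"{label}: {key} not found: {rel}")
    return problems


def demo_check() -> List[str]:
    """Check a config whose trace file exists but whose KG file does not."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "traces").mkdir()
        (root / "traces" / "t.json").write_text("{}", encoding="utf-8")
        config = {"samples": [{
            "id": "demo", "name": "Demo", "description": "d",
            "trace_file": "traces/t.json",
            "knowledge_graph_file": "knowledge_graphs/missing.json",
            "tags": [], "trace_type": "custom", "trace_source": "sample_data",
        }]}
        return check_samples_config(config, root)
```

Running a check like this before committing a new entry would surface a mistyped path earlier than the `FileNotFoundError` raised at load time.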

backend/database/samples/add_algorithm_sample_example.py ADDED

	@@ -0,0 +1,162 @@

+#!/usr/bin/env python
+"""
+Example script: how to add new samples from algorithm-generated.jsonl to the system
+"""
+import json
+import sys
+from pathlib import Path
+def extract_algorithm_sample(jsonl_path: str, sample_id: int = 0):
+    """
+    Extract the given sample from algorithm-generated.jsonl and convert it to our format.
+    Args:
+        jsonl_path: Path to the algorithm-generated.jsonl file
+        sample_id: ID of the sample to extract
+    """
+    # Read the JSONL file
+    samples = []
+    with open(jsonl_path, 'r', encoding='utf-8') as f:
+        for line in f:
+            if line.strip():
+                samples.append(json.loads(line))
+    if sample_id >= len(samples):
+        print(f"Error: sample ID {sample_id} is out of range; the maximum ID is {len(samples)-1}")
+        return
+    sample = samples[sample_id]
+    # Build the trace data
+    trace_data = {
+        "filename": f"algorithm_sample_{sample_id}.json",
+        "title": f"Algorithm Sample {sample_id}: {sample['question'][:50]}...",
+        "description": f"Multi-agent collaboration sample from algorithm-generated dataset. Agents: {', '.join(sample['agents'])}. Question: {sample['question'][:100]}...",
+        "trace_type": "multi_agent_collaboration",
+        "trace_source": "algorithm_generated",
+        "tags": ["multi_agent", "algorithm_generated", "real_failure"] + sample.get('agents', []),
+        "content": {
+            "id": f"algorithm_trace_{sample_id}",
+            "timestamp": "2025-01-27T00:00:00",
+            "metadata": {
+                "source": "algorithm-generated.jsonl",
+                "original_id": sample['id'],
+                "mistake_step": sample.get('mistake_step', 0),
+                "mistake_agent": sample.get('mistake_agent', 'unknown'),
+                "mistake_reason": sample.get('mistake_reason', 'unknown'),
+                "ground_truth": sample.get('ground_truth', 'unknown'),
+                "is_correct": sample.get('is_correct', False)
+            },
+            "data": {
+                "question": sample['question'],
+                "agents": sample['agents'],
+                "total_observations": len(json.loads(sample['trace'])) if isinstance(sample['trace'], str) else len(sample['trace'])
+            },
+            "observations": json.loads(sample['trace']) if isinstance(sample['trace'], str) else sample['trace']
+        }
+    }
+    print(f"✅ Extracted sample {sample_id}")
+    print(f"   Question: {sample['question'][:100]}...")
+    print(f"   Agents: {', '.join(sample['agents'])}")
+    print(f"   Observations: {len(trace_data['content']['observations'])}")
+    print(f"   Mistake step: {sample.get('mistake_step', 'N/A')}")
+    print(f"   Mistake agent: {sample.get('mistake_agent', 'N/A')}")
+    return trace_data
+def create_sample_config_entry(sample_id: int, trace_data: dict):
+    """Create a sample configuration entry."""
+    sample_config = {
+        "id": f"algorithm_sample_{sample_id}",
+        "name": f"Algorithm Generated Sample {sample_id}",
+        "description": trace_data["description"],
+        "trace_file": f"traces/algorithm_sample_{sample_id}.json",
+        "knowledge_graph_file": f"knowledge_graphs/kg_algorithm_sample_{sample_id}.json",
+        "tags": trace_data["tags"],
+        "complexity": "advanced",
+        "trace_type": trace_data["trace_type"],
+        "trace_source": trace_data["trace_source"],
+        "features": [
+            "multi_agent_collaboration",
+            "real_failure_analysis",
+            "complex_reasoning",
+            "tool_usage",
+            "error_patterns"
+        ]
+    }
+    return sample_config
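+def demo_algorithm_sample_extraction():
+    """Walk through the process of extracting algorithm samples."""
+    print("🔍 AgentGraph Sample Data System - Algorithm Sample Integration Demo")
+    print("=" * 60)
+    # Simulate extracting samples from algorithm-generated.jsonl
+    print("\n1️⃣ Select the most valuable samples from algorithm-generated.jsonl:")
+    sample_recommendations = [
+        {"id": 0, "reason": "Math calculation + multi-agent collaboration; relatively simple but real"},
+        {"id": 1, "reason": "Geographic query + complex search-and-verify flow; shows web service integration"},
+        {"id": 2, "reason": "Failed API call; a typical authentication and web service problem"}
+    ]
+    for rec in sample_recommendations:
+        print(f"   📝 Sample #{rec['id']}: {rec['reason']}")
+    print("\n2️⃣ Data extraction and conversion flow:")
+    print("   ✅ Extract the raw trace data from the JSONL file")
+    print("   ✅ Convert it to the AgentGraph standard format")
+    print("   ✅ Add metadata and classification tags")
+    print("   ✅ Generate the JSON file")
+    print("\n3️⃣ Knowledge graph generation:")
+    print("   🤖 Run multi_agent_knowledge_extractor.py")
+    print("   📊 Analyze agent roles and interactions")
+    print("   ⚠️  Identify failure patterns and causes")
+    print("   🚀 Generate optimization recommendations")
+    print("\n4️⃣ System integration:")
+    print("   📁 Save the trace and KG as JSON files")
+    print("   ⚙️  Update samples_config.json")
+    print("   🔄 Load automatically into the AgentGraph system")
+    print("\n5️⃣ Expected results:")
+    print("   🎯 Real multi-agent failure cases")
+    print("   📈 More complex and realistic than the existing Python documentation example")
+    print("   🛠️  Demonstrates AgentGraph's ability to analyze complex systems")
+    print("   🌟 Gives users production-ready examples")
+    print("\n6️⃣ Next steps:")
+    print("   1. Select 3-5 of the most representative algorithm samples")
+    print("   2. Run knowledge graph extraction")
+    print("   3. Integrate them into the new JSON system")
+    print("   4. Test and refine sample quality")
+    print("\n" + "=" * 60)
+    print("✨ The new system is ready to receive algorithm-generated samples!")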
+if __name__ == "__main__":
+    demo_algorithm_sample_extraction()
+    # If a JSONL file path is given, run a real extraction
+    if len(sys.argv) > 1:
+        jsonl_path = sys.argv[1]
+        sample_id = int(sys.argv[2]) if len(sys.argv) > 2 else 0
+        print(f"\n🔄 Extracting sample {sample_id} from {jsonl_path}")
+        trace_data = extract_algorithm_sample(jsonl_path, sample_id)
+        if trace_data:
+            config_entry = create_sample_config_entry(sample_id, trace_data)
+            print("\n📋 Generated configuration entry:")
+            print(json.dumps(config_entry, indent=2, ensure_ascii=False))
+            print("\n💾 To save this sample:")
+            print(f"   1. Save the trace data to: samples/traces/algorithm_sample_{sample_id}.json")
+            print(f"   2. Run KG extraction to generate: samples/knowledge_graphs/kg_algorithm_sample_{sample_id}.json")
+            print(f"   3. Add the configuration entry to: samples/samples_config.json")
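Review note: `extract_algorithm_sample` parses every line of the JSONL file before indexing into the list. For a large file, a sketch like the following reads only up to the requested record. The helper name `read_jsonl_record` is hypothetical, and it assumes, as the script does, that blank lines are skipped and records are counted from zero.

```python
import json
import tempfile
from itertools import islice
from pathlib import Path
from typing import Any, Dict, Optional


def read_jsonl_record(jsonl_path: str, index: int) -> Optional[Dict[str, Any]]:
    """Return the index-th non-blank JSONL record, or None if the file is too short."""
    with open(jsonl_path, "r", encoding="utf-8") as f:
        non_blank = (line for line in f if line.strip())
        # islice skips the first `index` records without JSON-parsing them.
        line = next(islice(non_blank, index, None), None)
    return json.loads(line) if line is not None else None


def demo_read():
    """Read an existing and a missing record from a small throwaway file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "algorithm-generated.jsonl"
        path.write_text('{"id": "a"}\n\n{"id": "b"}\n', encoding="utf-8")
        return read_jsonl_record(str(path), 1), read_jsonl_record(str(path), 5)
```

Returning `None` for an out-of-range index matches how the script's caller already treats a missing result (`if trace_data:`).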

backend/database/samples/knowledge_graphs/kg_python_documentation_enhanced.json ADDED

	@@ -0,0 +1,216 @@

+{
+  "filename": "kg_python_documentation_enhanced.json",
+  "trace_index": 0,
+  "graph_data": {
+    "system_name": "Python Documentation Assistant",
+    "system_summary": "This intelligent assistant processes user inquiries about Python programming through a comprehensive multi-step workflow. When users submit questions, the agent performs knowledge search, delivers detailed explanations with code examples, and engages in follow-up interactions to ensure thorough understanding of Python concepts, syntax, and performance considerations.",
+    "entities": [
+      {
+        "id": "agent_001",
+        "type": "Agent",
+        "name": "Python Documentation Agent",
+        "importance": "HIGH",
+        "raw_prompt": "You are a helpful Python programming assistant designed to answer questions accurately based on retrieved documentation context. Use the search results to provide precise responses.",
+        "raw_prompt_ref": [
+          {
+            "line_start": 31,
+            "line_end": 32
+          }
+        ]
+      },
+      {
+        "id": "task_001",
+        "type": "Task",
+        "name": "Programming Question Processing",
+        "importance": "HIGH",
+        "raw_prompt": "Process user inquiry about Python programming and generate an accurate, contextual response based on available documentation and programming best practices.",
+        "raw_prompt_ref": [
+          {
+            "line_start": 26,
+            "line_end": 28
+          }
+        ]
+      },
+      {
+        "id": "input_001",
+        "type": "Input",
+        "name": "User Programming Query",
+        "importance": "HIGH",
+        "raw_prompt": "Hello! I'm learning Python and I keep seeing this syntax with square brackets that looks different from regular loops. Can you help me understand what Python list comprehensions are used for and when I should use them?",
+        "raw_prompt_ref": [
+          {
+            "line_start": 19,
+            "line_end": 19
+          }
+        ]
+      },
+      {
+        "id": "output_001",
+        "type": "Output",
+        "name": "Programming Concept Explanation",
+        "importance": "HIGH",
+        "raw_prompt": "Python list comprehensions are a powerful and concise way to create lists! Here's what they're used for: Main Uses: 1. Creating lists from existing iterables, 2. Filtering data, 3. Mathematical operations, 4. Combining operations. When to use them: For simple transformations, when performance matters, to make code more Pythonic and concise.",
+        "raw_prompt_ref": [
+          {
+            "line_start": 20,
+            "line_end": 20
+          }
+        ]
+      },
+      {
+        "id": "human_001",
+        "type": "Human",
+        "name": "Python Developer",
+        "importance": "MEDIUM",
+        "raw_prompt": "Developer seeking Python programming guidance and documentation",
+        "raw_prompt_ref": [
+          {
+            "line_start": 31,
+            "line_end": 31
+          }
+        ]
+      },
+      {
+        "id": "tool_001",
+        "type": "Tool",
+        "name": "Python Documentation Search",
+        "importance": "HIGH",
+        "raw_prompt": "Retrieval-Augmented Generation (RAG) system that searches Python documentation knowledge base for relevant concepts, syntax examples, and best practices to provide contextual information.",
+        "raw_prompt_ref": [
+          {
+            "line_start": 49,
+            "line_end": 49
+          }
+        ]
+      }
+    ],
+    "relations": [
+      {
+        "id": "rel_001",
+        "source": "input_001",
+        "target": "agent_001",
+        "type": "CONSUMED_BY",
+        "importance": "HIGH",
+        "interaction_prompt": "Extended user inquiry about Python list comprehensions received and processed through multi-step RAG workflow",
+        "interaction_prompt_ref": [
+          {
+            "line_start": 19,
+            "line_end": 19
+          }
+        ]
+      },
+      {
+        "id": "rel_002",
+        "source": "agent_001",
+        "target": "task_001",
+        "type": "PERFORMS",
+        "importance": "HIGH",
+        "interaction_prompt": "Agent executes comprehensive programming question processing including knowledge search, explanation, and code examples",
+        "interaction_prompt_ref": [
+          {
+            "line_start": 26,
+            "line_end": 28
+          }
+        ]
+      },
+      {
+        "id": "rel_003",
+        "source": "task_001",
+        "target": "output_001",
+        "type": "PRODUCES",
+        "importance": "HIGH",
+        "interaction_prompt": "Processing task generates detailed multi-part explanation with examples, performance analysis, and interactive follow-ups",
+        "interaction_prompt_ref": [
+          {
+            "line_start": 20,
+            "line_end": 20
+          }
+        ]
+      },
+      {
+        "id": "rel_004",
+        "source": "output_001",
+        "target": "human_001",
+        "type": "DELIVERS_TO",
+        "importance": "HIGH",
+        "interaction_prompt": "Comprehensive programming tutorial with examples and performance insights delivered to developer",
+        "interaction_prompt_ref": [
+          {
+            "line_start": 20,
+            "line_end": 20
+          }
+        ]
+      },
+      {
+        "id": "rel_005",
+        "source": "agent_001",
+        "target": "tool_001",
+        "type": "USES",
+        "importance": "HIGH",
+        "interaction_prompt": "Agent performs multi-step knowledge search retrieving documentation, examples, and performance comparisons for comprehensive response",
+        "interaction_prompt_ref": [
+          {
+            "line_start": 49,
+            "line_end": 49
+          }
+        ]
+      }
+    ],
+    "failures": [
+      {
+        "id": "failure_001",
+        "risk_type": "HALLUCINATION",
+        "description": "Initial query could benefit from more specific learning objectives, though the multi-turn interaction successfully addressed this through follow-up questions.",
+        "raw_text": "Hello! I'm learning Python and I keep seeing this syntax with square brackets that looks different from regular loops. Can you help me understand what Python list comprehensions are used for and when I should use them?",
+        "raw_text_ref": [
+          {
+            "line_start": 19,
+            "line_end": 19
+          }
+        ],
+        "affected_id": "input_001"
+      }
+    ],
+    "optimizations": [
+      {
+        "id": "opt_001",
+        "recommendation_type": "PROMPT_REFINEMENT",
+        "description": "Enhance initial query processing to identify learning level and tailor explanations accordingly. The current multi-turn approach works well but could be optimized with upfront user profiling.",
+        "affected_ids": ["agent_001"],
+        "raw_text_ref": [
+          {
+            "line_start": 31,
+            "line_end": 32
+          }
+        ]
+      },
+      {
+        "id": "opt_002",
+        "recommendation_type": "TOOL_ENHANCEMENT",
+        "description": "Integrate real-time code execution environment for testing examples, and expand knowledge base to include performance benchmarks and best practice recommendations.",
+        "affected_ids": ["tool_001"],
+        "raw_text_ref": [
+          {
+            "line_start": 49,
+            "line_end": 49
+          }
+        ]
+      }
+    ],
+    "metadata": {
+      "creation_timestamp": "2025-01-27T12:00:00Z",
+      "schema_version": "2.1.0",
+      "quality_score": 0.89,
+      "entity_count": 6,
+      "relation_count": 5,
+      "failure_count": 1,
+      "optimization_count": 2,
+      "interaction_depth": "multi_turn",
+      "educational_value": "high",
+      "processing_method": "production_enhanced",
+      "content_source": "documentation_trace",
+      "language": "en",
+      "domain": "programming_documentation"
+    }
+  }
+}
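Review note: the knowledge graph file carries redundant information (ids referenced from relations, failures and optimizations, plus counts in `metadata`) that can drift when edited by hand. A minimal sketch of a consistency check over a `graph_data` dictionary shaped like the one above; the function name `check_graph_data` and the message wording are assumptions for illustration.

```python
from typing import Any, Dict, List

# metadata key -> the graph_data section it is expected to count
COUNTED_SECTIONS = (
    ("entity_count", "entities"),
    ("relation_count", "relations"),
    ("failure_count", "failures"),
    ("optimization_count", "optimizations"),
)


def check_graph_data(graph: Dict[str, Any]) -> List[str]:
    """Return a list of inconsistencies found in a graph_data dictionary."""
    problems: List[str] = []
    entity_ids = {entity["id"] for entity in graph.get("entities", [])}

    # Every relation endpoint should name an existing entity.
    for rel in graph.get("relations", []):
        for end in ("source", "target"):
            if rel.get(end) not in entity_ids:
                problems.append(f"{rel['id']}: unknown {end} '{rel.get(end)}'")

    # Failures and optimizations should point at existing entities as well.
    for failure in graph.get("failures", []):
        if failure.get("affected_id") not in entity_ids:
            problems.append(f"{failure['id']}: unknown affected_id '{failure.get('affected_id')}'")
    for opt in graph.get("optimizations", []):
        for ref in opt.get("affected_ids", []):
            if ref not in entity_ids:
                problems.append(f"{opt['id']}: unknown affected id '{ref}'")

    # Declared counts in metadata should match the actual list lengths.
    metadata = graph.get("metadata", {})
    for count_key, section in COUNTED_SECTIONS:
        actual = len(graph.get(section, []))
        if count_key in metadata and metadata[count_key] != actual:
            problems.append(
                f"metadata.{count_key} is {metadata[count_key]} but {section} has {actual} item(s)"
            )
    return problems


demo_graph = {
    "entities": [{"id": "agent_001"}, {"id": "tool_001"}],
    "relations": [
        {"id": "rel_001", "source": "agent_001", "target": "tool_001"},
        {"id": "rel_002", "source": "agent_001", "target": "task_404"},
    ],
    "optimizations": [{"id": "opt_001", "affected_ids": ["tool_001"]}],
    "metadata": {"entity_count": 2, "relation_count": 1},
}
```

Calling `check_graph_data(demo_graph)` reports the dangling `task_404` target and the stale `relation_count`.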

backend/database/samples/samples_config.json ADDED

	@@ -0,0 +1,33 @@

+{
+  "samples": [
+    {
+      "id": "python_documentation_demo",
+      "name": "Python Documentation Assistant Demo",
+      "description": "Comprehensive example showing RAG-powered AI assistant handling multi-turn programming inquiry with knowledge search, detailed explanations, code examples, performance analysis, and interactive learning",
+      "trace_file": "traces/python_documentation_inquiry.json",
+      "knowledge_graph_file": "knowledge_graphs/kg_python_documentation_enhanced.json",
+      "tags": [
+        "programming",
+        "rag_assistant",
+        "documentation",
+        "failure_detection",
+        "optimization"
+      ],
+      "complexity": "enhanced",
+      "trace_type": "documentation_search",
+      "trace_source": "sample_data",
+      "features": [
+        "rag_search",
+        "failure_detection",
+        "optimization_recommendations",
+        "content_references",
+        "quality_scoring"
+      ]
+    }
+  ],
+  "metadata": {
+    "version": "1.0.0",
+    "created": "2025-01-27",
+    "description": "Comprehensive AgentGraph sample data showcasing real multi-agent interactions, failures, and optimizations"
+  }
+}
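Review note: `get_traces()` does not reuse the trace file's name when it builds the stored `filename`. It derives one from the sample's `name`. The sketch below mirrors that expression in a standalone function (the function name `derived_trace_filename` is hypothetical) to show what the config entry above produces.

```python
def derived_trace_filename(sample_name: str) -> str:
    """Mirror the filename expression used in SampleDataLoader.get_traces()."""
    return sample_name.replace(" ", "_").lower() + ".json"


# The "name" value from the samples_config.json entry above.
print(derived_trace_filename("Python Documentation Assistant Demo"))
# prints python_documentation_assistant_demo.json
```

The derived name differs from `python_documentation_inquiry.json`, the file the trace is actually read from, so two samples sharing a `name` would also share a stored filename.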

backend/database/samples/traces/python_documentation_inquiry.json ADDED

	@@ -0,0 +1,110 @@

+{
+  "filename": "python_documentation_inquiry.json",
+  "title": "Python Documentation Assistant Demo",
+  "description": "Comprehensive example showing RAG-powered AI assistant handling multi-turn programming inquiry with knowledge search, detailed explanations, code examples, performance analysis, and interactive learning",
+  "trace_type": "documentation_search",
+  "trace_source": "sample_data",
+  "tags": [
+    "programming",
+    "rag_assistant",
+    "documentation",
+    "failure_detection",
+    "optimization"
+  ],
+  "content": {
+    "id": "doc_trace_demo_001",
+    "timestamp": "2025-01-27T00:00:00",
+    "metadata": {
+      "source": "AgentGraph_Demo",
+      "row_index": 0,
+      "converted_at": "2025-01-27T12:00:00.000000"
+    },
+    "data": {
+      "total_observations": 4,
+      "summary": "Python documentation inquiry with RAG-powered assistant response including knowledge search, explanation, and follow-up code examples"
+    },
+    "observations": [
+      {
+        "id": "demo_obs_001",
+        "type": "user_query",
+        "timestamp": "2025-01-27T00:00:00",
+        "input": "Hello! I'm learning Python and I keep seeing this syntax with square brackets that looks different from regular loops. Can you help me understand what Python list comprehensions are used for and when I should use them?",
+        "output": "I'll help you understand Python list comprehensions! Let me search our documentation to give you a comprehensive explanation.",
+        "metadata": {
+          "request_date": "2025-01-27T00:00:00",
+          "interaction_type": "initial_query",
+          "user_level": "beginner",
+          "topic_category": "python_syntax"
+        },
+        "latency": 0.124
+      },
+      {
+        "id": "demo_obs_002",
+        "type": "knowledge_search",
+        "timestamp": "2025-01-27T00:00:05",
+        "input": "Searching Python documentation for: list comprehensions, syntax, use cases, performance benefits",
+        "output": "Found relevant documentation: ['Python list comprehensions - efficient way to create lists', 'List comprehension vs loops performance comparison', 'Python built-in functions - len(), range(), enumerate()', 'Nested list comprehensions and filtering', 'Python data structures - lists, dictionaries, sets, tuples']",
+        "metadata": {
+          "search_query": "python list comprehensions syntax use cases",
+          "documents_found": 5,
+          "search_time_ms": 23,
+          "relevance_score": 0.94
+        },
+        "latency": 0.089
+      },
+      {
+        "id": "demo_obs_003",
+        "type": "inference",
+        "timestamp": "2025-01-27T00:00:08",
+        "input": "You are a helpful Python programming assistant designed to answer questions accurately based on the provided documentation. Use the information retrieved from the knowledge base below to respond concisely and correctly.\\n\\n### Retrieved Documents:\\n['Python list comprehensions - efficient way to create lists', 'List comprehension vs loops performance comparison', 'Python built-in functions - len(), range(), enumerate()', 'Nested list comprehensions and filtering']\\n\\n### Question:\\nwhat are python list comprehensions used for and when should I use them?\\n\\n### Answer:",
+        "output": "Python list comprehensions are a powerful and concise way to create lists! Here's what they're used for:\\n\\n**Main Uses:**\\n1. **Creating lists from existing iterables** - Transform elements from sequences\\n2. **Filtering data** - Include only elements that meet specific conditions\\n3. **Mathematical operations** - Apply calculations efficiently\\n4. **Combining operations** - Filter and transform in one line\\n\\n**When to use them:**\\n- When you need to create a new list from an existing iterable\\n- For simple transformations (keep it readable!)\\n- When performance matters (they're faster than equivalent for loops)\\n- To make code more Pythonic and concise\\n\\n**Basic syntax:** `[expression for item in iterable if condition]`",
+        "metadata": {
+          "request_date": "2025-01-27T00:00:00",
+          "databricks_request_id": "demo-request-003",
+          "request_time": "2025-01-27T12:00:08.000",
+          "status_code": 200,
+          "sampling_fraction": 1,
+          "request_raw": "{\"messages\":[{\"role\":\"system\",\"content\":\"You are a helpful Python programming assistant\"},{\"role\":\"user\",\"content\":\"You are a helpful Python programming assistant designed to answer questions accurately based on the provided documentation. Use the information retrieved from the knowledge base below to respond concisely and correctly.\\\\n\\\\n### Retrieved Documents:\\\\n['Python list comprehensions - efficient way to create lists', 'List comprehension vs loops performance comparison', 'Python built-in functions - len(), range(), enumerate()', 'Nested list comprehensions and filtering']\\\\n\\\\n### Question:\\\\nwhat are python list comprehensions used for and when should I use them?\\\\n\\\\n### Answer:\"}]}",
+          "response_raw": "{\"choices\":[{\"content_filter_results\":{\"hate\":{\"filtered\":false,\"severity\":\"safe\"},\"self_harm\":{\"filtered\":false,\"severity\":\"safe\"},\"sexual\":{\"filtered\":false,\"severity\":\"safe\"},\"violence\":{\"filtered\":false,\"severity\":\"safe\"}},\"finish_reason\":\"stop\",\"index\":0,\"logprobs\":null,\"message\":{\"content\":\"Python list comprehensions are a powerful and concise way to create lists! Here's what they're used for:\\\\n\\\\n**Main Uses:**\\\\n1. **Creating lists from existing iterables** - Transform elements from sequences\\\\n2. **Filtering data** - Include only elements that meet specific conditions\\\\n3. **Mathematical operations** - Apply calculations efficiently\\\\n4. **Combining operations** - Filter and transform in one line\\\\n\\\\n**When to use them:**\\\\n- When you need to create a new list from an existing iterable\\\\n- For simple transformations (keep it readable!)\\\\n- When performance matters (they're faster than equivalent for loops)\\\\n- To make code more Pythonic and concise\\\\n\\\\n**Basic syntax:** `[expression for item in iterable if condition]`\",\"refusal\":null,\"role\":\"assistant\"}}],\"created\":1737979208,\"id\":\"chatcmpl-DEMO003\",\"model\":\"gpt-4o-2024-11-20\",\"object\":\"chat.completion\",\"system_fingerprint\":\"fp_demo_003\",\"usage\":{\"completion_tokens\":156,\"completion_tokens_details\":{\"accepted_prediction_tokens\":0,\"audio_tokens\":0,\"reasoning_tokens\":0,\"rejected_prediction_tokens\":0},\"prompt_tokens\":184,\"prompt_tokens_details\":{\"audio_tokens\":0,\"cached_tokens\":0},\"total_tokens\":340}}",
+          "served_entity_id": "demo-entity-003",
+          "logging_error_codes": "[]",
+          "requester": "demo-user-001"
+        },
+        "latency": 1.247,
+        "usage": {
+          "completion_tokens": 156,
+          "completion_tokens_details": {
+            "accepted_prediction_tokens": 0,
+            "audio_tokens": 0,
+            "reasoning_tokens": 0,
+            "rejected_prediction_tokens": 0
+          },
+          "prompt_tokens": 184,
+          "prompt_tokens_details": {
+            "audio_tokens": 0,
+            "cached_tokens": 0
+          },
+          "total_tokens": 340
+        },
+        "model": "gpt-4o-2024-11-20"
+      },
+      {
+        "id": "demo_obs_004",
+        "type": "follow_up",
+        "timestamp": "2025-01-27T00:00:15",
+        "input": "That's really helpful! Could you show me some practical examples? I'd love to see the difference between a regular for loop and a list comprehension.",
+        "output": "Absolutely! Here are some practical examples:\\n\\n**Example 1: Squaring numbers**\\n```python\\n# Traditional for loop\\nsquares = []\\nfor x in range(5):\\n    squares.append(x**2)\\n# Result: [0, 1, 4, 9, 16]\\n\\n# List comprehension\\nsquares = [x**2 for x in range(5)]\\n# Same result: [0, 1, 4, 9, 16]\\n```\\n\\n**Example 2: Filtering even numbers**\\n```python\\n# Traditional approach\\nevens = []\\nfor x in range(10):\\n    if x % 2 == 0:\\n        evens.append(x)\\n\\n# List comprehension\\nevens = [x for x in range(10) if x % 2 == 0]\\n# Result: [0, 2, 4, 6, 8]\\n```\\n\\n**Example 3: String processing**\\n```python\\nwords = ['hello', 'world', 'python']\\n# Capitalize and get lengths\\nlengths = [len(word.upper()) for word in words]\\n# Result: [5, 5, 6]\\n```\\n\\nList comprehensions are not only more concise but also typically 20-30% faster than equivalent for loops!",
+        "metadata": {
+          "interaction_type": "follow_up_examples",
+          "code_examples_provided": 3,
+          "response_time": "2025-01-27T12:00:15.000"
+        },
+        "latency": 0.892,
+        "usage": {
+          "completion_tokens": 287,
+          "total_tokens": 445
+        }
+      }
+    ]
+  }
+}
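Review note: the trace file repeats some facts in more than one place (`data.total_observations` against the `observations` list, and per-observation token totals). A minimal sketch of a sanity check over the `content` object; the function name `check_trace_content` is hypothetical. The token check is skipped when any of the three counters is absent, as in `demo_obs_004` above, which has no `prompt_tokens`.

```python
from typing import Any, Dict, List

TOKEN_KEYS = ("prompt_tokens", "completion_tokens", "total_tokens")


def check_trace_content(content: Dict[str, Any]) -> List[str]:
    """Return a list of inconsistencies found in a trace's content object."""
    problems: List[str] = []
    observations = content.get("observations", [])

    # The declared observation count should match the list length.
    declared = content.get("data", {}).get("total_observations")
    if declared is not None and declared != len(observations):
        problems.append(
            f"total_observations is {declared} but {len(observations)} observation(s) are present"
        )

    # Where all three counters exist, prompt + completion should equal total.
    for obs in observations:
        usage = obs.get("usage", {})
        if all(key in usage for key in TOKEN_KEYS):
            if usage["prompt_tokens"] + usage["completion_tokens"] != usage["total_tokens"]:
                problems.append(f"{obs['id']}: token counts do not add up")
    return problems


demo_content = {
    "data": {"total_observations": 3},
    "observations": [
        {"id": "obs_a", "usage": {"prompt_tokens": 184, "completion_tokens": 156, "total_tokens": 340}},
        {"id": "obs_b", "usage": {"completion_tokens": 287, "total_tokens": 445}},
    ],
}
```

Here `check_trace_content(demo_content)` flags only the observation count: `obs_a` sums correctly (184 + 156 = 340) and `obs_b` is skipped because it lacks `prompt_tokens`.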