
API Response Changes Required

This document lists all changes required to the MasterLLM API responses, based on the corrections defined in API_RESPONSES_correction.md.


1. Session Management APIs

1.1 GET /api/v2/sessions (with include_stats=true)

Changes Required:

  1. Sort sessions by last_activity (most recent first)

    • Current: No sorting specified
    • Required: Sessions array must be sorted by last_activity in descending order
  2. Update state field logic

    • Current: "state": "initial"
    • Required: State should indicate pipeline/conversation status:
      • "in_progress" - if pipeline is currently running
      • "completed" - if pipeline completed successfully
      • "initial" - if no pipeline has been run
      • "failed" - if pipeline failed
  3. Remove current_file field

    • Current: "current_file": "s3://my-bucket/masterllm/550e8400-e29b-41d4-a716-446655440000/report.pdf"
    • Required: This field is not needed for session list

1.2 GET /api/v2/sessions/{session_id}/history

Changes Required:

  1. Add message_id to each history item

    • Current: No message_id field
    • Required: Each message in history array needs a unique message_id
    {
      "message_id": "msg_abc123",
      "role": "user",
      "content": "...",
      ...
    }
    
  2. Ensure content contains only conversation text

    • Current: Content field is correct
    • Required: Content should NOT contain pipeline output (pipeline output goes in pipelines_history)
  3. Standardize file attachment fields

    • Current: Has both file_data object AND separate file, fileName, fileUrl fields
    • Required: Keep both structures for backward compatibility, but ensure consistency:
      • If file_data.has_file is true, then file must be true
      • file_data.file_url and fileUrl should both be a presigned URL with 7-day validity
      • If no file, all file-related fields should be false/null
  4. Update pipeline history - rename result_preview to result

    • Current: "result_preview": "The Q3 report indicates..."
    • Required:
      "result": "The Q3 report indicates a 15% growth in revenue..."
      
    • Note: If pipeline completed successfully, populate with output. If failed/in-progress, set to null
  5. Add error handling fields to pipeline history

    • Current: No error fields
    • Required: Add to each pipeline in pipelines_history:
      {
        "pipeline_id": "pipe_987654321",
        "hasError": false,
        "error": null,
        ...
      }
      
    • If pipeline failed:
      {
        "pipeline_id": "pipe_987654321",
        "hasError": true,
        "error": {
          "error_code": "EXTRACTION_FAILED",
          "message": "Failed to extract text from PDF",
          "details": "..."
        },
        ...
      }
      
  6. Remove pipeline_s3_key from response

    • Current: "pipeline_s3_key": "masterllm/pipelines/pipe_987654321.json"
    • Required: Store internally but don't include in API response
  7. Add component-level details to each component

    • Current: Components only have basic info
    • Required: Each component needs:
      {
        "component_id": "comp_001",
        "step_id": 1,
        "tool_name": "extract_text",
        "description": "Extract text from the PDF",
        "status": "success",  // or "failed", "running"
        "component_output": "Extracted text content here...",
        "hasError": false,
        "error": null,
        "metadata": {  // Optional
          "duration_ms": 1500,
          "tokens_consumed": 250,
          "model_used": "claude-3-sonnet"
        },
        "parameters": {
          "file_path": "s3://..."
        }
      }
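Changes 4 to 6 for pipelines_history (rename result_preview to result, add hasError and error, hide pipeline_s3_key) could be applied in a single serialization step. The sketch below assumes the internal record is a plain dict that still stores the output under `result_preview` and any failure under `error`; both are assumptions about storage, not confirmed details.

```python
# Keys that are stored internally or replaced in the public response.
_INTERNAL_KEYS = ("pipeline_s3_key", "result_preview", "error")


def serialize_pipeline(record):
    """Build the public pipelines_history entry from an internal record (a dict)."""
    public = {k: v for k, v in record.items() if k not in _INTERNAL_KEYS}
    status = record.get("status")
    # result is populated only for completed pipelines; otherwise it is null.
    public["result"] = record.get("result_preview") if status == "completed" else None
    has_error = status == "failed"
    public["hasError"] = has_error
    public["error"] = record.get("error") if has_error else None
    return public
```

Building the public entry by filtering keeps pipeline_s3_key available internally while guaranteeing it never reaches the response.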
      

1.3 GET /api/v2/sessions/{session_id}/pipelines

Changes Required:

  1. Add component details
    • Current: Only basic pipeline info
    • Required: Include full component data as specified in /history endpoint
    {
      "pipeline_id": "pipe_987654321",
      "pipeline_name": "Document Summarization",
      "status": "completed",
      "created_at": "2023-11-15T08:31:05Z",
      "final_output_url": "s3://bucket/results/pipe_987654321_output.json",
      "components": [
        {
          "component_id": "comp_001",
          "step_id": 1,
          "tool_name": "extract_text",
          "status": "success",
          "component_output": "...",
          "hasError": false,
          "error": null,
          "metadata": { ... },
          "parameters": { ... }
        }
      ]
    }
    

2. Unified Chat (Non-Streaming) - POST /api/v2/chat/unified

2.1 All Scenarios - Add message_id

Changes Required:

  1. Add message_id to assistant responses
    • Current: No message_id field
    • Required: Every response needs a unique message ID
    {
      "message_id": "msg_xyz789",
      "assistant_response": "...",
      ...
    }
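One possible way to generate the IDs, shown only as a sketch: the "msg_" prefix follows the examples in this document, while the use of UUID4 and the helper names are assumptions.

```python
import uuid


def new_message_id():
    """Return a unique ID in the 'msg_' style used by the examples."""
    return f"msg_{uuid.uuid4().hex}"


def with_message_id(payload):
    """Return a copy of a response or history item that carries a message_id."""
    if payload.get("message_id"):
        # Keep an ID that was already assigned (for example, when re-serializing history).
        return dict(payload)
    return {"message_id": new_message_id(), **payload}
```

Generating the ID once when the message is stored, rather than on every read, keeps it stable across the chat and history endpoints.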
    

2.2 Scenario 4: Pipeline Generated (Proposal)

Changes Required:

  1. Add component_id to each component
    • Current: Components don't have unique IDs
    • Required:
    "components": [
      {
        "component_id": "comp_001",
        "tool_name": "extract_text",
        "args": { "pages": "all" }
      },
      {
        "component_id": "comp_002",
        "tool_name": "summarize_text",
        "args": { "style": "brief" }
      }
    ]
    

2.3 Scenario 5: Pipeline Completed (Execution Success)

Changes Required:

  1. Add error handling fields

    • Current: Only has exception: null
    • Required: Add hasError field
    {
      "assistant_response": "🎉 Pipeline completed successfully!",
      "hasError": false,
      "exception": null,
      ...
    }
    
  2. Add output download capability to final_output

    • Current: Only has text and result
    • Required:
    "final_output": {
      "output_id": "output_abc123",
      "download_url": "https://s3.amazonaws.com/.../output.json?signature=...",
      "text": "The Q3 financial report highlights...",
      "result": {
        "summary": "...",
        "metadata": { ... }
      }
    }
    
    • Note: download_url is optional for now; output_id is required for the separate download endpoint
  3. Remove nested result object from final_output

    • Current: Has both text and result with duplicate data
    • Required: Simplify structure - move result data to component level
    "final_output": {
      "output_id": "output_abc123",
      "download_url": "https://s3.amazonaws.com/.../output.json?signature=...",
      "text": "The Q3 financial report highlights a strong performance..."
    }
    
  4. Add component-level details in api_response.pipeline.components

    • Current: Components only have tool names
    • Required: Full component details with status, output, errors
    "components": [
      {
        "component_id": "comp_001",
        "tool_name": "extract_text",
        "status": "success",
        "component_output": "Full extracted text...",
        "hasError": false,
        "error": null,
        "metadata": {
          "duration_ms": 1200,
          "tokens_consumed": 300
        }
      },
      {
        "component_id": "comp_002",
        "tool_name": "summarize_text",
        "status": "success",
        "component_output": "The Q3 financial report highlights...",
        "hasError": false,
        "error": null,
        "metadata": {
          "duration_ms": 800,
          "tokens_consumed": 150
        }
      }
    ]
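The top-level fields described in this section could be assembled as follows. This is a sketch under assumptions: the helper names are hypothetical, and the exception string format simply mirrors the "ValueError: Input file is empty." example used in this document.

```python
def build_final_output(output_id, text, download_url=None):
    """Build final_output: output_id and text are required, download_url is optional."""
    if not output_id:
        raise ValueError("output_id is required")
    # No nested "result" object: per-step data lives on the components instead.
    final_output = {"output_id": output_id, "text": text}
    if download_url is not None:
        final_output["download_url"] = download_url
    return final_output


def error_fields(exception=None):
    """Return the hasError/exception pair for a chat response."""
    return {
        "hasError": exception is not None,
        "exception": None if exception is None else f"{type(exception).__name__}: {exception}",
    }
```

For example, `error_fields()` yields the success pair, and `error_fields(ValueError("Input file is empty."))` yields the pair used in the error scenario.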
    

2.4 Scenario 6: Error

Changes Required:

  1. Add hasError field

    • Current: Only has exception
    • Required:
    {
      "assistant_response": "❌ Pipeline execution failed: ValueError: Input file is empty.",
      "hasError": true,
      "exception": "ValueError: Input file is empty.",
      ...
    }
    
  2. Keep api_response format unchanged

    • Note: Don't change the error response format in api_response
    • Required: If error is in a specific component, send error at component level (in component's error field)
  3. Component-level error example:

    "api_response": {
      "type": "error",
      "error_code": "PIPELINE_EXECUTION_FAILED",
      "message": "Component 'extract_text' failed",
      "pipeline": {
        "pipeline_id": "pipe_123abc",
        "components": [
          {
            "component_id": "comp_001",
            "tool_name": "extract_text",
            "status": "failed",
            "hasError": true,
            "error": {
              "error_code": "FILE_READ_ERROR",
              "message": "ValueError: Input file is empty.",
              "details": "The uploaded PDF file contains no readable content"
            }
          }
        ]
      }
    }
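Attaching the error to the failing component, as in the example above, might look like this. The function names are hypothetical, and the sketch assumes components are plain dicts keyed by `component_id`.

```python
def component_error(error_code, message, details=None):
    """Build the error object used at pipeline and component level."""
    return {"error_code": error_code, "message": message, "details": details}


def mark_failed(components, failed_component_id, error):
    """Return copies of the components with the error set on the failing one only."""
    marked = []
    for component in components:
        item = dict(component)
        if item["component_id"] == failed_component_id:
            item.update(status="failed", hasError=True, error=error)
        else:
            # Other components keep their status and report no error.
            item.update(hasError=False, error=None)
        marked.append(item)
    return marked
```

Working on copies leaves the stored pipeline definition untouched, so the same components can be serialized again for other endpoints.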
    

3. Workflows APIs

3.1 GET /api/v2/workflows/{workflow_id}

Changes Required:

  1. Add component details to workflow definition
    • Current: Components only have tool name and args
    • Required: When workflow is executed, include full component details:
    "definition": {
      "pipeline_name": "Quarterly Report Analyzer",
      "components": [
        {
          "component_id": "comp_001",
          "tool_name": "extract_text",
          "args": {},
          "status": "success",  // Only if workflow was executed
          "component_output": "...",  // Only if workflow was executed
          "hasError": false,
          "error": null,
          "metadata": {  // Optional
            "duration_ms": 1500,
            "tokens_consumed": 250
          }
        }
      ]
    }
    

4. Summary of All Changes by Category

A. New Fields to Add

| Field | Location | Type | Description |
| --- | --- | --- | --- |
| message_id | All chat messages | string | Unique identifier for each message |
| component_id | All components | string | Unique identifier for each pipeline component |
| hasError | Pipeline/Component responses | boolean | Indicates if error occurred |
| error | Pipeline/Component responses | object/null | Error details if hasError is true |
| status | Components | string | Component execution status (success/failed/running) |
| component_output | Components | string/object | Output from component execution |
| metadata | Components (optional) | object | Execution metadata (duration, tokens, etc.) |
| output_id | final_output | string | ID for downloading output |
| download_url | final_output (optional) | string | Direct download URL for output |

B. Fields to Remove/Hide

| Field | Location | Reason |
| --- | --- | --- |
| current_file | Session list (include_stats=true) | Not needed for session list |
| pipeline_s3_key | Pipeline history | Internal use only, don't expose |
| Nested result in final_output | Chat responses | Redundant, move to component level |

C. Fields to Rename

| Old Name | New Name | Location |
| --- | --- | --- |
| result_preview | result | Pipeline history |

D. Behavioral Changes

| Change | Location | Description |
| --- | --- | --- |
| Sort by last_activity | Session list | Most recent sessions first |
| Dynamic state values | Sessions & pipelines | Reflect actual pipeline status |
| 7-day presigned URLs | File attachments | Extend URL validity |
| Component-level errors | Error responses | Show which component failed |

5. Implementation Priority

High Priority (Critical for UI)

  1. ✅ Add message_id to all messages
  2. ✅ Add component_id to all components
  3. ✅ Add component status, hasError, and error fields
  4. ✅ Rename result_preview to result
  5. ✅ Sort sessions by last_activity
  6. ✅ Add output_id to final_output

Medium Priority (Important for UX)

  1. ✅ Add component_output to components
  2. ✅ Remove current_file from session list
  3. ✅ Remove pipeline_s3_key from responses
  4. ✅ Add download_url to final_output
  5. ✅ Update state field logic

Low Priority (Nice to Have)

  1. ✅ Add component metadata (duration, tokens, etc.)
  2. ✅ Extend presigned URL validity to 7 days

6. Code Files to Modify

Based on these changes, the following files will need modifications:

  1. api_routes_v2.py - Update all endpoint response structures
  2. session_manager.py - Add message_id generation, update session retrieval logic
  3. pipeline_executor.py - Add component_id, status tracking, error handling
  4. schemas.py (if exists) - Update Pydantic models for responses
  5. Database models - May need to add fields for message_id, component_id, etc.

7. Testing Checklist

After implementing changes, verify:

  • All messages have unique message_id
  • All components have unique component_id
  • Sessions are sorted by last_activity (descending)
  • state field reflects actual pipeline status
  • result field (renamed from result_preview) shows correct data
  • Component errors are properly captured and returned
  • hasError flag is set correctly
  • output_id is generated for completed pipelines
  • File presigned URLs have 7-day validity
  • current_file is removed from session list
  • pipeline_s3_key is not exposed in responses
  • Component-level output is captured and returned
  • Error responses include component-level details
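Several of these checks can be automated against a /history response. The function below is a hypothetical test helper, not part of the API; it covers only the structural items (unique IDs, hidden and renamed fields, hasError consistency) and assumes the response is already parsed into a dict.

```python
def _all_unique(values):
    """True when no value is missing (None) and no value repeats."""
    return None not in values and len(set(values)) == len(values)


def check_history_response(response):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    message_ids = [m.get("message_id") for m in response.get("history", [])]
    if not _all_unique(message_ids):
        problems.append("message_id missing or not unique")
    for pipeline in response.get("pipelines_history", []):
        pid = pipeline.get("pipeline_id")
        components = pipeline.get("components", [])
        if "pipeline_s3_key" in pipeline:
            problems.append(f"{pid}: pipeline_s3_key is exposed")
        if "result_preview" in pipeline:
            problems.append(f"{pid}: result_preview was not renamed to result")
        if not _all_unique([c.get("component_id") for c in components]):
            problems.append(f"{pid}: component_id missing or not unique")
        # hasError must be true exactly when an error object is present.
        for entry in [pipeline, *components]:
            if bool(entry.get("hasError")) != (entry.get("error") is not None):
                problems.append(f"{pid}: hasError and error disagree")
    return problems
```

Returning a list of messages instead of raising on the first failure makes it easier to see every regression in a single test run.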

8. Example: Complete Corrected Response

GET /api/v2/sessions/{session_id}/history (Corrected)

{
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "history": [
    {
      "message_id": "msg_001",
      "role": "user",
      "content": "Uploaded file: report.pdf",
      "timestamp": "2023-11-15T08:30:00Z",
      "file_data": {
        "has_file": true,
        "file_name": "report.pdf",
        "file_url": "https://s3.amazonaws.com/bucket/key?signature=..."
      },
      "file": true,
      "fileName": "report.pdf",
      "fileUrl": "https://s3.amazonaws.com/bucket/key?signature=..."
    },
    {
      "message_id": "msg_002",
      "role": "assistant",
      "content": "File uploaded successfully.",
      "timestamp": "2023-11-15T08:30:05Z",
      "file_data": {
        "has_file": false
      },
      "file": false,
      "fileName": null,
      "fileUrl": null
    }
  ],
  "count": 2,
  "limit": 50,
  "chat_name": "Q3 Financial Report Analysis",
  "pipelines_history": [
    {
      "pipeline_id": "pipe_987654321",
      "pipeline_name": "Document Summarization",
      "status": "completed",
      "created_at": "2023-11-15T08:31:05Z",
      "created_from": "request",
      "model_provider": "bedrock",
      "model_name": "anthropic.claude-3-sonnet-20240229-v1:0",
      "result": "The Q3 report indicates a 15% growth in revenue...",
      "hasError": false,
      "error": null,
      "updated_at": "2023-11-15T08:32:00Z",
      "tools": ["extract_text", "summarize_text"],
      "component_count": 2,
      "components": [
        {
          "component_id": "comp_001",
          "step_id": 1,
          "tool_name": "extract_text",
          "description": "Extract text from the PDF",
          "status": "success",
          "component_output": "Full extracted text from the PDF document...",
          "hasError": false,
          "error": null,
          "metadata": {
            "duration_ms": 1200,
            "tokens_consumed": 300
          },
          "parameters": {
            "file_path": "s3://..."
          }
        },
        {
          "component_id": "comp_002",
          "step_id": 2,
          "tool_name": "summarize_text",
          "description": "Summarize the extracted content",
          "status": "success",
          "component_output": "The Q3 report indicates a 15% growth in revenue...",
          "hasError": false,
          "error": null,
          "metadata": {
            "duration_ms": 800,
            "tokens_consumed": 150
          },
          "parameters": {
            "max_length": 500
          }
        }
      ]
    }
  ]
}
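The file-attachment fields in the example above follow the consistency rules from section 1.2. A small helper could produce both structures from one source so they cannot drift apart. This is a sketch: the helper name is hypothetical, and it assumes the caller passes a URL that was already presigned with the 7-day lifetime.

```python
# 7 days expressed in seconds, the lifetime a presigned URL would be requested with.
PRESIGNED_URL_TTL_SECONDS = 7 * 24 * 60 * 60


def file_fields(file_name=None, file_url=None):
    """Return file_data plus the flat file/fileName/fileUrl fields, kept consistent."""
    has_file = file_name is not None
    file_data = {"has_file": has_file}
    if has_file:
        file_data.update(file_name=file_name, file_url=file_url)
    return {
        "file_data": file_data,
        "file": has_file,
        "fileName": file_name if has_file else None,
        "fileUrl": file_url if has_file else None,
    }
```

A history item can then be built with `{"message_id": ..., "role": ..., "content": ..., **file_fields(name, url)}`.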

9. Next Steps

  1. Review this document with the development team
  2. Create implementation tasks for each change category
  3. Update API documentation
  4. Implement changes in backend code
  5. Update frontend to consume new response structure
  6. Test all endpoints thoroughly
  7. Deploy and monitor