
CMW Platform Agent API Endpoints Documentation

Date: 2025-10-10
Version: 1.0
Status: Production Ready

Overview

The CMW Platform Agent now exposes two REST API endpoints that allow external applications to interact with the agent programmatically. These endpoints support both single-turn and multi-turn conversations with session persistence.

Base URL

http://localhost:7860

Authentication

No authentication is required for these endpoints. All requests are processed with session isolation.

Endpoints

1. /ask - Final Answer Endpoint

Returns the complete assistant response after processing is finished.

Method: POST
Path: /gradio_api/call/ask
Content-Type: application/json

Request Format

{
  "data": ["Your question here", "username", "password", "base_url"],
  "session_hash": "optional-session-id"
}

Parameters

  • data[0] (string, required): The user's question
  • data[1] (string, optional): Username for Comindware Platform authentication
  • data[2] (string, optional): Password for Comindware Platform authentication
  • data[3] (string, optional): Base URL of the Comindware Platform (e.g., "https://your-platform.com")
  • session_hash (string, optional): Session identifier for multi-turn conversations

Response Format

Success Response:

{
  "event_id": "unique-event-id"
}

Final Result (via GET, delivered as a server-sent event):

event: complete
data: ["Complete assistant response"]

Example Usage

cURL:

# Submit question with authentication
curl -X POST http://localhost:7860/gradio_api/call/ask \
  -H "Content-Type: application/json" \
  -d '{"data": ["Hello, who are you?", "myuser", "mypass", "https://my-platform.com"]}'

# Get result (replace EVENT_ID with actual ID)
curl -N http://localhost:7860/gradio_api/call/ask/EVENT_ID

Python Client:

from gradio_client import Client

client = Client("http://localhost:7860/")
result = client.predict(
    question="Hello, who are you?",
    username="myuser",
    password="mypass", 
    base_url="https://my-platform.com",
    api_name="/ask"
)
print(result)

Using Environment Variables:

import os
from dotenv import load_dotenv
from gradio_client import Client

# Load from root .env file
load_dotenv()

client = Client("http://localhost:7860/")
result = client.predict(
    question="Hello, who are you?",
    username=os.getenv("CMW_LOGIN"),
    password=os.getenv("CMW_PASSWORD"), 
    base_url=os.getenv("CMW_BASE_URL"),
    api_name="/ask"
)
print(result)

2. /ask_stream - Streaming Endpoint

Returns incremental chunks of the assistant response as it's being generated.

Method: POST
Path: /gradio_api/call/ask_stream
Content-Type: application/json

Request Format

{
  "data": ["Your question here", "username", "password", "base_url"],
  "session_hash": "optional-session-id"
}

Parameters

  • data[0] (string, required): The user's question
  • data[1] (string, optional): Username for Comindware Platform authentication
  • data[2] (string, optional): Password for Comindware Platform authentication
  • data[3] (string, optional): Base URL of the Comindware Platform (e.g., "https://your-platform.com")
  • session_hash (string, optional): Session identifier for multi-turn conversations

Response Format

Success Response:

{
  "event_id": "unique-event-id"
}

Streaming Results (via GET):

event: generating
data: ["Hello"]

event: generating
data: ["Hello, w"]

event: generating
data: ["Hello, wo"]

event: generating
data: ["Hello, wor"]

event: generating
data: ["Hello, worl"]

event: generating
data: ["Hello, world!"]

event: complete
data: ["Hello, world!"]

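As the stream above shows, each generating event carries the cumulative response so far, not just the newly added characters. A client that wants only the incremental text can diff consecutive chunks; a minimal sketch (chunk values taken from the example stream above):

```python
def chunk_deltas(chunks):
    """Yield only the text appended by each cumulative chunk."""
    previous = ""
    for chunk in chunks:
        yield chunk[len(previous):]
        previous = chunk

stream = ["Hello", "Hello, w", "Hello, wo", "Hello, world!"]
print("".join(chunk_deltas(stream)))  # → Hello, world!
```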
Example Usage

cURL:

# Submit question
curl -X POST http://localhost:7860/gradio_api/call/ask_stream \
  -H "Content-Type: application/json" \
  -d '{"data": ["Stream this please"]}'

# Get streaming result (replace EVENT_ID with actual ID)
curl -N http://localhost:7860/gradio_api/call/ask_stream/EVENT_ID

Python Client:

from gradio_client import Client

client = Client("http://localhost:7860/")
job = client.submit(
    question="Stream this please",
    api_name="/ask_stream"
)

# Iterate through streaming chunks
for chunk in job:
    print(f"Chunk: {chunk}")

Session Management

Multi-turn Conversations

Both endpoints support session persistence using the session_hash parameter:

# First message in a session
client = Client("http://localhost:7860/")
result1 = client.predict(
    question="What is 2+2?",
    api_name="/ask",
    session_hash="my-session-123"
)

# Follow-up message in the same session
result2 = client.predict(
    question="What about 3+3?",
    api_name="/ask", 
    session_hash="my-session-123"
)

Session Behavior

  • With session_hash: Messages are part of the same conversation context
  • Without session_hash: Each request is treated as a new conversation
  • Session isolation: Different session hashes maintain separate conversation histories
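For raw HTTP clients, the rules above come down to including or omitting the top-level session_hash field in the request body. A small helper sketch (the function name is illustrative, not part of the API):

```python
def build_payload(question, session_hash=None):
    """Build the request body for the ask endpoints.

    Omitting session_hash starts a fresh conversation; reusing the same
    value continues the same conversation context.
    """
    payload = {"data": [question]}
    if session_hash is not None:
        payload["session_hash"] = session_hash
    return payload

print(build_payload("What about 3+3?", "my-session-123"))
```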

Error Handling

Common Error Responses

Connection Error:

{
  "error": "Connection refused"
}

Timeout Error:

{
  "error": "Request timeout"
}

Invalid Request:

{
  "error": "Invalid request format"
}
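Connection and timeout errors like these are often transient and worth retrying with backoff. A minimal retry wrapper, as a sketch (the attempt count and delays are illustrative, not part of the API):

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying on exceptions with simple exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the last error
            time.sleep(base_delay * (2 ** attempt))

# Usage (assuming a gradio_client Client as in the examples above):
# result = with_retries(lambda: client.predict(question="Hello", api_name="/ask"))
```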

Error Event (Streaming)

For streaming endpoints, errors are returned as events:

event: error
data: ["Error message here"]
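A client consuming the raw stream can pair each event: line with its following data: line and treat error events specially. A minimal parser sketch (the function name is illustrative):

```python
import json

def parse_sse(text):
    """Parse an SSE body into (event, data) pairs; data lines are JSON arrays."""
    events = []
    current_event = "message"  # SSE default when no event: line is given
    for line in text.splitlines():
        if line.startswith("event:"):
            current_event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            events.append((current_event, json.loads(line[len("data:"):])))
    return events

body = 'event: generating\ndata: ["Hel"]\n\nevent: error\ndata: ["Error message here"]\n'
for event, data in parse_sse(body):
    if event == "error":
        print(f"agent error: {data[0]}")
```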

Rate Limiting

  • Concurrent requests: Limited by Gradio's queue system (default: 1)
  • Rate limits: No explicit rate limiting implemented
  • Queue timeout: 30 seconds per request

Response Times

  • Final endpoint (/ask): 2-10 seconds depending on complexity
  • Streaming endpoint (/ask_stream): First chunk within 1-2 seconds, then incremental updates

Best Practices

1. Use Appropriate Endpoint

  • Use /ask for simple queries where you need the complete response
  • Use /ask_stream for better user experience with real-time feedback

2. Session Management

  • Always use session_hash for multi-turn conversations
  • Generate unique session IDs for different users/conversations
  • Reuse session_hash within the same conversation thread
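One simple way to generate the unique session IDs recommended above is a UUID, optionally namespaced per user; a sketch using Python's standard library:

```python
import uuid

def new_session_hash(user_id):
    """Create a unique session hash, namespaced by user for readability."""
    return f"{user_id}-{uuid.uuid4()}"

session_hash = new_session_hash("myuser")
# Reuse this value for every request in the same conversation thread.
```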

3. Error Handling

try:
    result = client.predict(question="Hello", api_name="/ask")
    print(result)
except Exception as e:
    print(f"Error: {e}")
    # Handle error appropriately

4. Streaming Best Practices

# For streaming, always iterate through chunks
job = client.submit(question="Stream this", api_name="/ask_stream")
for chunk in job:
    # Process each chunk
    print(f"Received: {chunk}")

Integration Examples

JavaScript/Node.js

const fetch = require('node-fetch');

// Final endpoint
async function askQuestion(question) {
    const response = await fetch('http://localhost:7860/gradio_api/call/ask', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ data: [question] })
    });

    const { event_id } = await response.json();

    // The result endpoint streams server-sent events; parse the last data line
    const resultResponse = await fetch(`http://localhost:7860/gradio_api/call/ask/${event_id}`);
    const text = await resultResponse.text();
    const dataLines = text.split('\n').filter(line => line.startsWith('data:'));
    const result = JSON.parse(dataLines[dataLines.length - 1].slice('data:'.length));

    return result[0];
}

Python with requests

import json
import requests

def ask_question(question, session_hash=None):
    # Submit question
    payload = {"data": [question]}
    if session_hash:
        payload["session_hash"] = session_hash

    response = requests.post(
        "http://localhost:7860/gradio_api/call/ask",
        headers={"Content-Type": "application/json"},
        json=payload,
        timeout=30,
    )
    event_id = response.json()["event_id"]

    # The result endpoint streams server-sent events; keep the last data line
    result_response = requests.get(
        f"http://localhost:7860/gradio_api/call/ask/{event_id}", timeout=60
    )
    data_lines = [
        line for line in result_response.text.splitlines()
        if line.startswith("data:")
    ]
    return json.loads(data_lines[-1][len("data:"):])[0]

Testing

Test Script

A test script is available at agent_ng/_tests/api_test.py:

# Run tests
python agent_ng/_tests/api_test.py

# With custom URL
BASE_URL=http://your-server:7860/ python agent_ng/_tests/api_test.py

# With session hash
SESSION_HASH=test-session-123 python agent_ng/_tests/api_test.py

Manual Testing

  1. Start the agent:

    python -m agent_ng.app_ng_modular
    
  2. Test final endpoint:

    curl -X POST http://localhost:7860/gradio_api/call/ask \
      -H "Content-Type: application/json" \
      -d '{"data": ["Hello"]}'
    
  3. Test streaming endpoint:

    curl -X POST http://localhost:7860/gradio_api/call/ask_stream \
      -H "Content-Type: application/json" \
      -d '{"data": ["Stream this"]}'
    

Troubleshooting

Common Issues

  1. "Application is initializing..."

    • Wait for the agent to fully initialize
    • Check logs for initialization errors
  2. Connection refused

    • Ensure the agent is running on the correct port
    • Check firewall settings
  3. Timeout errors

    • Increase timeout values
    • Check server performance
  4. Empty responses

    • Verify the question is not empty
    • Check agent configuration

Debug Mode

Enable debug logging by setting environment variables:

export GRADIO_DEBUG=1
export LOG_LEVEL=DEBUG
python -m agent_ng.app_ng_modular

Changelog

Version 1.0 (2025-10-10)

  • Initial release of API endpoints
  • Added /ask final answer endpoint
  • Added /ask_stream streaming endpoint
  • Implemented session management
  • Added comprehensive documentation

Support

For issues or questions regarding the API endpoints:

  1. Check this documentation
  2. Review the test script examples
  3. Check the agent logs for error details
  4. Verify the agent is running and accessible

Note: This documentation covers the API endpoints as implemented in the CMW Platform Agent. For Gradio-specific API details, refer to the official Gradio documentation.