CMW Platform Agent API Endpoints Documentation
Date: 2025-10-10
Version: 1.0
Status: Production Ready
Overview
The CMW Platform Agent now exposes two REST API endpoints that allow external applications to interact with the agent programmatically. These endpoints support both single-turn and multi-turn conversations with session persistence.
Base URL
http://localhost:7860
Authentication
No authentication is required for these endpoints. All requests are processed with session isolation.
Endpoints
1. /ask - Final Answer Endpoint
Returns the complete assistant response after processing is finished.
Method: POST
Path: /gradio_api/call/ask
Content-Type: application/json
Request Format
{
"data": ["Your question here", "username", "password", "base_url"],
"session_hash": "optional-session-id"
}
Parameters
- data[0] (string, required): The user's question
- data[1] (string, optional): Username for Comindware Platform authentication
- data[2] (string, optional): Password for Comindware Platform authentication
- data[3] (string, optional): Base URL of the Comindware Platform (e.g., "https://your-platform.com")
- session_hash (string, optional): Session identifier for multi-turn conversations
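Because the fields of `data` are positional, it is easy to misplace a credential. A minimal sketch of a payload builder follows. The helper name `build_payload` is hypothetical (it is not part of the agent), and the rule that the three credential fields must be supplied together is an assumption made for this sketch. Question-only payloads mirror the question-only cURL examples in this document.

```python
import json
from typing import Optional


def build_payload(
    question: str,
    username: Optional[str] = None,
    password: Optional[str] = None,
    base_url: Optional[str] = None,
    session_hash: Optional[str] = None,
) -> dict:
    """Build the JSON body for /ask or /ask_stream (hypothetical helper)."""
    if not question or not question.strip():
        raise ValueError("data[0] (the question) is required")

    credentials = (username, password, base_url)
    if any(value is not None for value in credentials):
        # Assumption: partially filled credentials are a caller mistake.
        if not all(value is not None for value in credentials):
            raise ValueError("username, password and base_url go together")
        data = [question, username, password, base_url]
    else:
        # Question only: the optional fields are simply left off.
        data = [question]

    payload = {"data": data}
    if session_hash is not None:
        payload["session_hash"] = session_hash
    return payload


print(json.dumps(build_payload("Hello, who are you?", session_hash="my-session-123")))
```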
Response Format
Success Response:
{
"event_id": "unique-event-id"
}
Final Result (via GET, delivered as a server-sent event stream):
event: complete
data: ["Complete assistant response"]
Example Usage
cURL:
# Submit question with authentication
curl -X POST http://localhost:7860/gradio_api/call/ask \
-H "Content-Type: application/json" \
-d '{"data": ["Hello, who are you?", "myuser", "mypass", "https://my-platform.com"]}'
# Get result (replace EVENT_ID with actual ID)
curl -N http://localhost:7860/gradio_api/call/ask/EVENT_ID
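The two cURL steps above can be mirrored with the Python standard library. This is a sketch rather than a reference client: the function names `parse_final` and `ask` are hypothetical, and it assumes the GET step answers with `event:`/`data:` lines in the same shape shown for the streaming endpoint, ending in a `complete` event.

```python
import json
import urllib.request

BASE = "http://localhost:7860/gradio_api/call"


def parse_final(sse_text: str) -> list:
    """Return the data array that follows the 'complete' event."""
    event = None
    for line in sse_text.splitlines():
        if line.startswith("event:"):
            event = line.split(":", 1)[1].strip()
        elif line.startswith("data:"):
            body = line.split(":", 1)[1].strip()
            if event == "complete":
                return json.loads(body)
            if event == "error":
                raise RuntimeError("agent returned an error: " + body)
    raise RuntimeError("stream ended without a 'complete' event")


def ask(question: str, timeout: float = 60.0) -> str:
    """Submit a question, then read the result (requires a running agent)."""
    body = json.dumps({"data": [question]}).encode("utf-8")
    request = urllib.request.Request(
        f"{BASE}/ask", data=body, headers={"Content-Type": "application/json"}
    )
    # Step 1: POST the question and read the event_id.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        event_id = json.load(response)["event_id"]
    # Step 2: GET the result stream for that event_id.
    with urllib.request.urlopen(f"{BASE}/ask/{event_id}", timeout=timeout) as response:
        return parse_final(response.read().decode("utf-8"))[0]
```

With the agent running locally, `print(ask("Hello, who are you?"))` would print the assistant's reply.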
Python Client:
from gradio_client import Client
client = Client("http://localhost:7860/")
result = client.predict(
question="Hello, who are you?",
username="myuser",
password="mypass",
base_url="https://my-platform.com",
api_name="/ask"
)
print(result)
Using Environment Variables:
import os
from dotenv import load_dotenv
from gradio_client import Client
# Load from root .env file
load_dotenv()
client = Client("http://localhost:7860/")
result = client.predict(
question="Hello, who are you?",
username=os.getenv("CMW_LOGIN"),
password=os.getenv("CMW_PASSWORD"),
base_url=os.getenv("CMW_BASE_URL"),
api_name="/ask"
)
print(result)
2. /ask_stream - Streaming Endpoint
Returns incremental chunks of the assistant response as it's being generated.
Method: POST
Path: /gradio_api/call/ask_stream
Content-Type: application/json
Request Format
{
"data": ["Your question here", "username", "password", "base_url"],
"session_hash": "optional-session-id"
}
Parameters
- data[0] (string, required): The user's question
- data[1] (string, optional): Username for Comindware Platform authentication
- data[2] (string, optional): Password for Comindware Platform authentication
- data[3] (string, optional): Base URL of the Comindware Platform (e.g., "https://your-platform.com")
- session_hash (string, optional): Session identifier for multi-turn conversations
Response Format
Success Response:
{
"event_id": "unique-event-id"
}
Streaming Results (via GET):
event: generating
data: ["Hello"]
event: generating
data: ["Hello, w"]
event: generating
data: ["Hello, wo"]
event: generating
data: ["Hello, wor"]
event: generating
data: ["Hello, worl"]
event: generating
data: ["Hello, world!"]
event: complete
data: ["Hello, world!"]
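In the sample above, each `generating` event carries the whole response so far rather than only the new characters. A client that appends text to a display therefore needs to work out the difference between consecutive events. A minimal sketch, assuming this cumulative behavior holds in general; `iter_deltas` is a hypothetical helper name:

```python
from typing import Iterable, Iterator


def iter_deltas(snapshots: Iterable[str]) -> Iterator[str]:
    """Yield only the text that is new in each cumulative snapshot."""
    previous = ""
    for current in snapshots:
        if current.startswith(previous):
            delta = current[len(previous):]
        else:
            # Assumption: a snapshot that rewrites earlier text is emitted whole.
            delta = current
        previous = current
        if delta:
            yield delta


snapshots = [
    "Hello",
    "Hello, w",
    "Hello, wo",
    "Hello, wor",
    "Hello, worl",
    "Hello, world!",
    "Hello, world!",  # the final 'complete' event repeats the last snapshot
]
print(list(iter_deltas(snapshots)))  # ['Hello', ', w', 'o', 'r', 'l', 'd!']
```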
Example Usage
cURL:
# Submit question
curl -X POST http://localhost:7860/gradio_api/call/ask_stream \
-H "Content-Type: application/json" \
-d '{"data": ["Stream this please"]}'
# Get streaming result (replace EVENT_ID with actual ID)
curl -N http://localhost:7860/gradio_api/call/ask_stream/EVENT_ID
Python Client:
from gradio_client import Client
client = Client("http://localhost:7860/")
job = client.submit(
question="Stream this please",
api_name="/ask_stream"
)
# Iterate through streaming chunks
for chunk in job:
    print(f"Chunk: {chunk}")
Session Management
Multi-turn Conversations
Both endpoints support session persistence using the session_hash parameter:
# First message in a session
client = Client("http://localhost:7860/")
result1 = client.predict(
question="What is 2+2?",
api_name="/ask",
session_hash="my-session-123"
)
# Follow-up message in the same session
result2 = client.predict(
question="What about 3+3?",
api_name="/ask",
session_hash="my-session-123"
)
Session Behavior
- With session_hash: Messages are part of the same conversation context
- Without session_hash: Each request is treated as a new conversation
- Session isolation: Different session hashes maintain separate conversation histories
Error Handling
Common Error Responses
Connection Error:
{
"error": "Connection refused"
}
Timeout Error:
{
"error": "Request timeout"
}
Invalid Request:
{
"error": "Invalid request format"
}
Error Event (Streaming)
For streaming endpoints, errors are returned as events:
event: error
data: ["Error message here"]
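A client reading the stream line by line can turn an `error` event into an exception instead of treating it as text. The sketch below is one possible shape, not the agent's own code: `AgentStreamError` and `iter_chunks` are hypothetical names, and the handling of event types other than `generating`, `complete`, and `error` (they are skipped) is an assumption.

```python
import json
from typing import Iterable, Iterator


class AgentStreamError(RuntimeError):
    """Raised when the stream carries an 'error' event (hypothetical name)."""


def iter_chunks(lines: Iterable[str]) -> Iterator[str]:
    """Yield response text from SSE lines; raise on an 'error' event."""
    event = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("event:"):
            event = line.split(":", 1)[1].strip()
        elif line.startswith("data:"):
            payload = json.loads(line.split(":", 1)[1])
            if event == "error":
                is_list = isinstance(payload, list) and payload
                raise AgentStreamError(str(payload[0] if is_list else payload))
            if event in ("generating", "complete"):
                yield payload[0]
            if event == "complete":
                return  # nothing meaningful follows the final event


lines = [
    "event: generating",
    'data: ["Hel"]',
    "event: error",
    'data: ["Error message here"]',
]
try:
    for chunk in iter_chunks(lines):
        print("chunk:", chunk)
except AgentStreamError as exc:
    print("stream failed:", exc)
```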
Rate Limiting
- Concurrent requests: Limited by Gradio's queue system (default: 1)
- Rate limits: No explicit rate limiting implemented
- Queue timeout: 30 seconds per request
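With one request processed at a time by default and a 30-second queue timeout, a busy server can reject or time out a call that would succeed moments later. One common client-side response is to retry with a growing delay. A minimal sketch, assuming a hypothetical helper `call_with_retry` that is not part of the agent or of Gradio; the attempt count and delays are illustrative values only.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on failure wait base_delay, 2*base_delay, ... and retry."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

For example, `call_with_retry(lambda: client.predict(question="Hello", api_name="/ask"))` would retry the Python client call shown earlier up to three times.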
Response Times
- Final endpoint (/ask): 2-10 seconds depending on complexity
- Streaming endpoint (/ask_stream): First chunk within 1-2 seconds, then incremental updates
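These figures can be checked against a particular deployment by timing a stream. The sketch below measures time to first chunk and total time for any iterable of chunks; `time_stream` is a hypothetical helper, and the injectable `clock` parameter exists only so the function can be exercised without a server.

```python
import time
from typing import Callable, Iterable, Tuple


def time_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[float, float, int]:
    """Return (seconds to first chunk, total seconds, chunk count)."""
    start = clock()
    first = None
    count = 0
    for _ in chunks:
        if first is None:
            first = clock() - start  # latency of the first chunk only
        count += 1
    total = clock() - start
    # An empty stream has no first chunk; report the total time instead.
    return (first if first is not None else total), total, count
```

With the Python client, `time_stream(client.submit(question="Stream this", api_name="/ask_stream"))` would time a real streaming call.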
Best Practices
1. Use Appropriate Endpoint
- Use /ask for simple queries where you need the complete response
- Use /ask_stream for better user experience with real-time feedback
2. Session Management
- Always use session_hash for multi-turn conversations
- Generate unique session IDs for different users/conversations
- Reuse session_hash within the same conversation thread
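One way to follow these three rules is to keep a small registry that hands out a random ID the first time a conversation is seen and returns the same ID afterwards. This is a sketch under assumptions: `SessionRegistry` is a hypothetical class, the in-memory dictionary would need replacing with persistent storage in a multi-process deployment, and the agent does not require any particular ID format.

```python
import uuid
from typing import Dict, Tuple


class SessionRegistry:
    """Maps (user, conversation) pairs to stable session_hash values."""

    def __init__(self) -> None:
        self._sessions: Dict[Tuple[str, str], str] = {}

    def session_for(self, user: str, conversation: str) -> str:
        key = (user, conversation)
        if key not in self._sessions:
            # A random UUID keeps IDs unique across users and conversations.
            self._sessions[key] = uuid.uuid4().hex
        return self._sessions[key]


registry = SessionRegistry()
first = registry.session_for("alice", "billing-question")
again = registry.session_for("alice", "billing-question")
other = registry.session_for("bob", "billing-question")
print(first == again, first == other)  # True False
```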
3. Error Handling
try:
    result = client.predict(question="Hello", api_name="/ask")
    print(result)
except Exception as e:
    print(f"Error: {e}")
    # Handle error appropriately
4. Streaming Best Practices
# For streaming, always iterate through chunks
job = client.submit(question="Stream this", api_name="/ask_stream")
for chunk in job:
    # Process each chunk
    print(f"Received: {chunk}")
Integration Examples
JavaScript/Node.js
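const fetch = require('node-fetch');
// Final endpoint
async function askQuestion(question) {
  const response = await fetch('http://localhost:7860/gradio_api/call/ask', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ data: [question] })
  });
  const { event_id } = await response.json();
  // Get result: the GET call replies with server-sent events, not plain JSON
  const resultResponse = await fetch(`http://localhost:7860/gradio_api/call/ask/${event_id}`);
  const text = await resultResponse.text();
  // Take the data line that follows "event: complete"
  const match = text.match(/event: complete\s*\ndata: (.*)/);
  if (!match) throw new Error(`No complete event in response: ${text}`);
  return JSON.parse(match[1])[0];
}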
Python with requests
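import json
import requests
def ask_question(question, session_hash=None):
    # Submit question
    payload = {"data": [question]}
    if session_hash:
        payload["session_hash"] = session_hash
    response = requests.post(
        "http://localhost:7860/gradio_api/call/ask",
        headers={"Content-Type": "application/json"},
        json=payload
    )
    response.raise_for_status()
    event_id = response.json()["event_id"]
    # Get result: the GET call replies with server-sent events, not plain JSON
    result_response = requests.get(f"http://localhost:7860/gradio_api/call/ask/{event_id}")
    event = None
    for line in result_response.text.splitlines():
        if line.startswith("event:"):
            event = line.split(":", 1)[1].strip()
        elif line.startswith("data:") and event == "complete":
            return json.loads(line.split(":", 1)[1])[0]
    raise RuntimeError(f"No complete event received (last event: {event})")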
Testing
Test Script
A test script is available at agent_ng/_tests/api_test.py:
# Run tests
python agent_ng/_tests/api_test.py
# With custom URL
BASE_URL=http://your-server:7860/ python agent_ng/_tests/api_test.py
# With session hash
SESSION_HASH=test-session-123 python agent_ng/_tests/api_test.py
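The test script itself is not reproduced here. As an illustration of how a script could honor the `BASE_URL` and `SESSION_HASH` variables used above, the following sketch reads them with defaults. `load_settings` is a hypothetical function, and the default URL is taken from the Base URL section; the real script may behave differently.

```python
import os
from typing import Mapping, Optional, Tuple


def load_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, Optional[str]]:
    """Read BASE_URL and SESSION_HASH, falling back to local defaults."""
    # Normalize to exactly one trailing slash so paths can be appended safely.
    base_url = env.get("BASE_URL", "http://localhost:7860/").rstrip("/") + "/"
    # Treat an empty SESSION_HASH the same as an unset one.
    session_hash = env.get("SESSION_HASH") or None
    return base_url, session_hash


print(load_settings({"BASE_URL": "http://your-server:7860"}))
```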
Manual Testing
Start the agent:
python -m agent_ng.app_ng_modular
Test final endpoint:
curl -X POST http://localhost:7860/gradio_api/call/ask \
-H "Content-Type: application/json" \
-d '{"data": ["Hello"]}'
Test streaming endpoint:
curl -X POST http://localhost:7860/gradio_api/call/ask_stream \
-H "Content-Type: application/json" \
-d '{"data": ["Stream this"]}'
Troubleshooting
Common Issues
"Application is initializing..."
- Wait for the agent to fully initialize
- Check logs for initialization errors
Connection refused
- Ensure the agent is running on the correct port
- Check firewall settings
Timeout errors
- Increase timeout values
- Check server performance
Empty responses
- Verify the question is not empty
- Check agent configuration
Debug Mode
Enable debug logging by setting environment variables:
export GRADIO_DEBUG=1
export LOG_LEVEL=DEBUG
python -m agent_ng.app_ng_modular
Changelog
Version 1.0 (2025-10-10)
- Initial release of API endpoints
- Added /ask final answer endpoint
- Added /ask_stream streaming endpoint
- Implemented session management
- Added comprehensive documentation
Support
For issues or questions regarding the API endpoints:
- Check this documentation
- Review the test script examples
- Check the agent logs for error details
- Verify the agent is running and accessible
Note: This documentation covers the API endpoints as implemented in the CMW Platform Agent. For Gradio-specific API details, refer to the official Gradio documentation.