---
title: AutoStream AI Agent
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
---
# AutoStream Conversational AI Agent

## Project Overview
This project is a production-quality Conversational AI Agent built for AutoStream, a fictional SaaS company. It handles customer inquiries, answers product questions using a Knowledge Base (RAG), and detects high-intent users to seamlessly collect lead information and execute backend lead capture functions.
## Web Interface (Streamlit)

An interactive, modern web interface is included via Streamlit. It can be run locally or hosted directly on HuggingFace Spaces.

### Running the Web Interface Locally

```bash
streamlit run app.py
```

This opens a browser window where you can converse with the AutoStream Assistant directly.
## System Architecture

The system is designed as an agentic workflow using LangGraph, replacing traditional linear chatbots with a stateful, branching graph architecture.
- User Input & State Management: User messages and conversational context are persisted in a shared
AgentStatethat tracks details like intent, history, and collected lead fields. - Intent Classification: Using
gpt-4o-miniwith structured output, the agent categorizes messages (e.g., GREETING, PRICING_QUERY, HIGH_INTENT_LEAD). - Routing: A conditional edge acts as a router, directing the conversation to specialized nodes based on intent.
- Knowledge Retrieval: Product and pricing questions are routed to a RAG pipeline that retrieves context from a FAISS vector store.
- Lead Qualification: High-intent users are routed to a multi-turn lead collection workflow. The agent selectively asks for missing fields (Name, Email, Creator Platform).
- Tool Execution: Once all fields are collected, the agent safely executes a simulated backend lead-capture tool.
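
The sketch below shows roughly how such a graph can be wired with LangGraph. The node names (`classify_intent`, `answer_with_rag`, `process_lead`, `execute_tool`), the routing functions, and the stubbed node bodies are illustrative assumptions based on the description above, not the exact code in this repository.

```python
from typing import Optional, TypedDict

from langgraph.graph import END, StateGraph


class AgentState(TypedDict, total=False):
    conversation_history: list        # see "State Management" below for the full field list
    detected_intent: Optional[str]
    lead_ready: bool
    response: str


# Node stubs: each reads the shared state and returns a partial state update.
def classify_intent(state: AgentState) -> dict:
    return {"detected_intent": "PRICING_QUERY"}   # real node calls gpt-4o-mini with structured output


def answer_with_rag(state: AgentState) -> dict:
    return {"response": "grounded answer from the knowledge base"}


def process_lead(state: AgentState) -> dict:
    return {"response": "Could I get your email?", "lead_ready": False}


def execute_tool(state: AgentState) -> dict:
    return {"response": "Lead captured!"}


def route_by_intent(state: AgentState) -> str:
    """Conditional edge after classification: pick the next node from the intent."""
    if state.get("detected_intent") == "HIGH_INTENT_LEAD":
        return "process_lead"
    return "answer_with_rag"


def route_after_lead(state: AgentState) -> str:
    """Only hand off to the tool node once every lead field has been collected."""
    return "execute_tool" if state.get("lead_ready") else END


builder = StateGraph(AgentState)
builder.add_node("classify_intent", classify_intent)
builder.add_node("answer_with_rag", answer_with_rag)
builder.add_node("process_lead", process_lead)
builder.add_node("execute_tool", execute_tool)

builder.set_entry_point("classify_intent")
builder.add_conditional_edges("classify_intent", route_by_intent)
builder.add_conditional_edges("process_lead", route_after_lead)
builder.add_edge("answer_with_rag", END)
builder.add_edge("execute_tool", END)

agent = builder.compile()
print(agent.invoke({"conversation_history": ["What does the Pro plan cost?"]}))
```

The conditional edges are what make the workflow branch: one compiled graph handles greetings, RAG answers, and lead capture, rather than separate chatbot paths.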
## Running Locally (CLI)

### Prerequisites
- Python 3.9+
- An OpenAI API Key
### Setup

- Clone this repository.
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set your OpenAI API key in the environment or create a `.env` file at the root of the project:

  ```
  OPENAI_API_KEY=your_openai_api_key_here
  ```
### Running the CLI Agent

To interact with the conversational agent via the terminal:

```bash
python main.py
```
## Running the Tests

The project includes a fully automated test suite that runs without API keys, since all LLM and embedding calls are mocked.

```bash
pytest
```
## RAG Pipeline (Retrieval-Augmented Generation)

When the user asks a product or pricing question, the agent utilizes a RAG pipeline:
- The `data/knowledge_base.md` file is loaded and chunked using a `RecursiveCharacterTextSplitter`.
- Chunks are embedded using `OpenAIEmbeddings` and indexed into a local `FAISS` vector database.
- The retriever fetches the top `k` relevant chunks for the user's query and injects them into the RAG generation prompt.
- The LLM generates a well-grounded response strictly based on the retrieved context.
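
A condensed sketch of that flow, assuming the LangChain, OpenAI, and FAISS packages named above (exact import paths vary by LangChain version); the chunk sizes, `k`, and prompt wording here are illustrative choices, not the project's actual values.

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Load and chunk the knowledge base.
with open("data/knowledge_base.md", encoding="utf-8") as f:
    raw_text = f.read()

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(raw_text)

# 2. Embed the chunks and index them in a local FAISS store.
vector_store = FAISS.from_texts(chunks, OpenAIEmbeddings())

# 3. Retrieve the top-k chunks for the user's query.
retriever = vector_store.as_retriever(search_kwargs={"k": 3})
docs = retriever.invoke("How much does the Pro plan cost?")

# 4. Inject the retrieved context into the generation prompt.
context = "\n\n".join(doc.page_content for doc in docs)
prompt = (
    "Answer strictly from the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: How much does the Pro plan cost?"
)
```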
## Lead Capture Workflow

For users expressing a desire to purchase or sign up, the intent classifier triggers HIGH_INTENT_LEAD.

The workflow then shifts to `process_lead`. The system relies on structured extraction to glean fields (Name, Email, Creator Platform) from incoming text. It incrementally prompts the user over several turns until all required fields are collected, effectively pausing the LangGraph execution between inputs.
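
The snippet below sketches how the per-turn extraction and merge could look, assuming a Pydantic schema passed to `with_structured_output`; the schema, prompt, and follow-up question wording are illustrative, not the repository's exact `process_lead` implementation.

```python
from typing import Optional

from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field


class LeadFields(BaseModel):
    user_name: Optional[str] = Field(None, description="The user's name, if stated")
    user_email: Optional[str] = Field(None, description="The user's email, if stated")
    creator_platform: Optional[str] = Field(None, description="Platform they create on")


extractor = ChatOpenAI(model="gpt-4o-mini", temperature=0).with_structured_output(LeadFields)


def update_lead(state: dict, user_message: str) -> dict:
    """Merge any newly mentioned fields into the state, then ask for what is still missing."""
    found = extractor.invoke(f"Extract lead details from: {user_message}")
    for field in ("user_name", "user_email", "creator_platform"):
        value = getattr(found, field)
        if value and not state.get(field):
            state[field] = value

    missing = [f for f in ("user_name", "user_email", "creator_platform") if not state.get(f)]
    state["lead_ready"] = not missing
    if missing:
        state["response"] = f"Could you share your {missing[0].replace('_', ' ')}?"
    return state
```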
## State Management

A TypedDict named `AgentState` tracks the overarching conversation context. This prevents duplicate questions and provides memory. State variables include `conversation_history` (up to 6 turns), the current `detected_intent`, `retrieved_documents`, and incremental lead variables (`user_name`, `user_email`, `creator_platform`). The state flows deterministically through each node, creating predictable transitions.
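
A sketch of that state shape with the fields named above; the comments and the history-trimming helper are illustrative assumptions about how the 6-turn cap might be enforced.

```python
from typing import List, Optional, TypedDict


class AgentState(TypedDict, total=False):
    conversation_history: List[str]   # capped at the last 6 turns
    detected_intent: Optional[str]    # e.g. "PRICING_QUERY", "HIGH_INTENT_LEAD"
    retrieved_documents: List[str]    # chunks returned by the retriever
    user_name: Optional[str]          # lead fields, filled in incrementally
    user_email: Optional[str]
    creator_platform: Optional[str]
    lead_ready: bool
    response: str


def trim_history(state: AgentState, max_turns: int = 6) -> AgentState:
    """Keep only the most recent turns so prompts stay bounded and memory stays relevant."""
    state["conversation_history"] = state.get("conversation_history", [])[-max_turns:]
    return state
```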
## Tool Execution Safety

The mock backend tool (`mock_lead_capture`) is heavily guarded. It executes solely in the `execute_tool` node, which only runs if the router confirms `lead_ready` is True. Furthermore, the node performs a strict validation to ensure `user_name`, `user_email`, and `creator_platform` are all non-null before triggering the function, ensuring no premature or incomplete lead data is dispatched.
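
A minimal sketch of that double guard; the `mock_lead_capture` signature and the exact response strings are assumptions based on the description above.

```python
def mock_lead_capture(name: str, email: str, platform: str) -> dict:
    """Simulated backend call; the project's tool only pretends to record the lead."""
    return {"status": "captured", "name": name, "email": email, "platform": platform}


def execute_tool(state: dict) -> dict:
    """Reached only after the router has confirmed lead_ready is True."""
    required = ("user_name", "user_email", "creator_platform")
    # Second, stricter check: never dispatch a partial or premature lead.
    if not all(state.get(field) for field in required):
        return {"response": "I still need a few details before I can sign you up."}

    result = mock_lead_capture(
        name=state["user_name"],
        email=state["user_email"],
        platform=state["creator_platform"],
    )
    return {
        "response": f"Thanks, {state['user_name']}! Your details have been recorded.",
        "tool_result": result,
    }
```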
## WhatsApp Integration
This agent can easily be deployed on WhatsApp using webhooks and Twilio:
- Twilio API: Set up a Twilio WhatsApp Business API sandbox or account.
- Webhook Endpoint: Create an HTTP endpoint (e.g., via FastAPI or Flask) to receive incoming webhook payloads containing the user's WhatsApp message.
- Agent Backend: The webhook extracts the message text and user identifier (phone number) and invokes the LangGraph agent.
- Session Management: A database (like Redis) can key the `AgentState` to the user's phone number, maintaining continuity and conversational memory across incoming webhooks.
- Response Dispatch: After the graph runs, the final `response` string is dispatched back to the user via a POST request to Twilio's Message API.
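
A minimal sketch of such a webhook, assuming FastAPI, a local Redis instance, and the Twilio Python SDK; the `/whatsapp` route, the Redis keying, the sandbox number, and the `agent_graph` module holding the compiled graph are all illustrative assumptions.

```python
import json
import os

import redis
from fastapi import FastAPI, Form
from twilio.rest import Client

from agent_graph import agent  # hypothetical module exposing the compiled LangGraph agent

app = FastAPI()
twilio = Client(os.environ["TWILIO_ACCOUNT_SID"], os.environ["TWILIO_AUTH_TOKEN"])
sessions = redis.Redis(host="localhost", port=6379, decode_responses=True)


@app.post("/whatsapp")
async def whatsapp_webhook(Body: str = Form(...), From: str = Form(...)):
    # 1. Restore this caller's AgentState, keyed by their WhatsApp number.
    state = json.loads(sessions.get(From) or "{}")
    state.setdefault("conversation_history", []).append(Body)

    # 2. Run the LangGraph agent on the updated state.
    state = agent.invoke(state)

    # 3. Persist the state and send the reply back through Twilio.
    sessions.set(From, json.dumps(state))
    twilio.messages.create(
        from_="whatsapp:+14155238886",  # your Twilio WhatsApp sandbox number
        to=From,                        # Twilio already formats this as "whatsapp:+<number>"
        body=state.get("response", ""),
    )
    return {"status": "ok"}
```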
## Testing Architecture

A rigorous suite of tests sits in the `tests/` directory:
- Mocking: All AI inference (LLMs and embeddings) is mocked using `pytest-mock` and standard dependency injection.
- Deterministic Reliability: By returning controlled mock objects, tests validate the graph structure, logic, state changes, routing, and tool safety independently of live API behavior and latency.
- End-to-End Simulation: `test_agent_e2e.py` walks through a multi-turn conversation step by step, mimicking user turns and validating correct downstream transitions from Greeting -> RAG -> Lead Capture -> Tool Execution.
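
For illustration, a test in this style might look like the following; the module name `graph_nodes`, the `build_intent_llm` factory, and the returned structure are hypothetical stand-ins for whatever the real test suite patches.

```python
# Illustrative only: module and function names are hypothetical stand-ins.
def test_intent_classification_runs_without_api_keys(mocker):
    # Replace the LLM factory with a mock so no OpenAI call is ever made.
    fake_llm = mocker.MagicMock()
    fake_llm.invoke.return_value = {"intent": "HIGH_INTENT_LEAD"}
    mocker.patch("graph_nodes.build_intent_llm", return_value=fake_llm)

    import graph_nodes  # hypothetical module containing the classification node

    state = {"conversation_history": ["I want to sign up for the Pro plan"]}
    result = graph_nodes.classify_intent(state)

    # The node should map the mocked structured output onto the state.
    assert result["detected_intent"] == "HIGH_INTENT_LEAD"
```

Because the mock is deterministic, the assertion exercises only the routing and state-update logic, never the model itself.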