---
title: AutoStream AI Agent
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
---

# AutoStream Conversational AI Agent

## Project Overview
This project is a production-quality Conversational AI Agent built for **AutoStream**, a fictional SaaS company. It handles customer inquiries, answers product questions using a Knowledge Base (RAG), and detects high-intent users to seamlessly collect lead information and execute backend lead capture functions.

## Web Interface (Streamlit)
An interactive web interface is included, built with `Streamlit`. It can be run locally or hosted directly on Hugging Face Spaces.

### Running the Web Interface Locally
```bash
streamlit run app.py
```
This opens a browser window where you can chat with the AutoStream Assistant directly.

## System Architecture
The system is designed as an agentic workflow using **LangGraph**, replacing traditional linear chatbots with a stateful, branching graph architecture.
1. **User Input & State Management**: User messages and conversational context are persisted in a shared `AgentState` that tracks details like intent, history, and collected lead fields.
2. **Intent Classification**: Using `gpt-4o-mini` with structured output, the agent categorizes messages (e.g., GREETING, PRICING_QUERY, HIGH_INTENT_LEAD).
3. **Routing**: A conditional edge acts as a router, directing the conversation to specialized nodes based on intent.
4. **Knowledge Retrieval**: Product and pricing questions are routed to a RAG pipeline that retrieves context from a FAISS vector store.
5. **Lead Qualification**: High-intent users are routed to a multi-turn lead collection workflow. The agent selectively asks for missing fields (Name, Email, Creator Platform).
6. **Tool Execution**: Once all fields are collected, the agent safely executes a simulated backend lead-capture tool.
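The routing in steps 2–3 can be sketched as a plain function. This is a hand-rolled stand-in, not the actual LangGraph conditional edge; `rag_node` and `general_response` are hypothetical node names (only `process_lead` and the intent labels come from this README):

```python
def route(state: dict) -> str:
    """Mimic the conditional edge: map the classified intent to a node name."""
    intent = state.get("detected_intent")
    if intent == "PRICING_QUERY":
        return "rag_node"          # knowledge retrieval (hypothetical node name)
    if intent == "HIGH_INTENT_LEAD":
        return "process_lead"      # lead qualification
    return "general_response"      # greetings and fallbacks (hypothetical)
```

In the real graph, a function like this would be registered with `add_conditional_edges` so LangGraph can branch on its return value.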

## Running Locally (CLI)

### Prerequisites
- Python 3.9+
- An OpenAI API Key

### Setup
1. Clone this repository.
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Set your OpenAI API key in the environment or create a `.env` file at the root of the project:
   ```env
   OPENAI_API_KEY=your_openai_api_key_here
   ```
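One way the key might be validated at startup is a fail-fast check like the following; `require_api_key` is a hypothetical helper, and the project may instead load the `.env` file via `python-dotenv`:

```python
import os

def require_api_key() -> str:
    """Fail fast with a clear message if OPENAI_API_KEY is missing."""
    key = os.getenv("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it or add it to a .env file."
        )
    return key
```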

### Running the CLI Agent
To interact with the conversational agent via the terminal:
```bash
python main.py
```

### Running the Tests
The project includes a fully automated test suite that runs without API keys, since all LLM and embedding calls are mocked.
```bash
pytest
```

## RAG Pipeline (Retrieval-Augmented Generation)
When the user asks a product or pricing question, the agent utilizes a RAG pipeline:
1. The `data/knowledge_base.md` is loaded and chunked using a `RecursiveCharacterTextSplitter`.
2. Chunks are embedded using `OpenAIEmbeddings` and indexed into a local `FAISS` vector database.
3. The retriever fetches the top `k` relevant chunks for the user's query and injects them into the RAG generation prompt.
4. The LLM generates a well-grounded response strictly based on the retrieved context.
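As a toy illustration of steps 1–3, the chunk-and-retrieve flow can be mimicked with word-overlap scoring. The real pipeline uses `RecursiveCharacterTextSplitter`, `OpenAIEmbeddings`, and `FAISS`; both helpers below are hypothetical stand-ins:

```python
def chunk_text(text: str, chunk_size: int = 50) -> list[str]:
    """Naive fixed-size chunking (the real splitter is recursive and overlap-aware)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by word overlap with the query (stand-in for vector search)."""
    q_words = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]
```

Embedding-based retrieval replaces the overlap score with cosine similarity between dense vectors, but the shape of the flow (split, index, top-`k` fetch, inject into the prompt) is the same.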

## Lead Capture Workflow
For users expressing a desire to purchase or sign up, the intent classifier triggers `HIGH_INTENT_LEAD`.
The workflow then shifts to `process_lead`. The system relies on structured extraction to glean fields (Name, Email, Creator Platform) from incoming text. It incrementally prompts the user over several turns until all required fields are collected, effectively pausing the LangGraph execution between inputs.
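The incremental prompting can be sketched as follows. The field names come from this README; `next_lead_prompt` and the prompt wording are hypothetical:

```python
from typing import Optional

REQUIRED_FIELDS = ("user_name", "user_email", "creator_platform")

def next_lead_prompt(state: dict) -> Optional[str]:
    """Return a prompt for the first missing field, or None when all are collected."""
    prompts = {
        "user_name": "Could I get your name?",
        "user_email": "What's the best email to reach you at?",
        "creator_platform": "Which platform do you create on?",
    }
    for field in REQUIRED_FIELDS:
        if not state.get(field):
            return prompts[field]
    return None  # all fields present -> ready for tool execution
```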

## State Management
A `TypedDict` named `AgentState` tracks the overarching conversation context. This prevents duplicate questions and provides memory. State variables include `conversation_history` (up to 6 turns), the currently `detected_intent`, `retrieved_documents`, and incremental lead variables (`user_name`, `user_email`, `creator_platform`). The state flows deterministically through each node, creating predictable transitions.
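The state shape described above might look like this (field names are taken from this section; the exact definition in the codebase may differ, and `trim_history` is a hypothetical helper for the six-turn cap):

```python
from typing import List, Optional, TypedDict

class AgentState(TypedDict, total=False):
    conversation_history: List[str]   # capped at the last 6 turns
    detected_intent: Optional[str]
    retrieved_documents: List[str]
    user_name: Optional[str]
    user_email: Optional[str]
    creator_platform: Optional[str]

def trim_history(state: AgentState, max_turns: int = 6) -> AgentState:
    """Keep only the most recent turns, as described above."""
    state["conversation_history"] = state.get("conversation_history", [])[-max_turns:]
    return state
```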

## Tool Execution Safety
The mock backend tool (`mock_lead_capture`) is heavily guarded. It executes solely in the `execute_tool` node, which only runs if the router confirms `lead_ready` is `True`. Furthermore, the node performs a strict validation to ensure `user_name`, `user_email`, and `creator_platform` are all non-null before triggering the function, ensuring no premature or incomplete lead data is dispatched.
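The guard logic might look like the following sketch. The field names and `mock_lead_capture` come from this README; the `lead_ready` flag check and the return shape are assumptions:

```python
def mock_lead_capture(name: str, email: str, platform: str) -> dict:
    """Stand-in for the project's simulated backend tool."""
    return {"status": "captured", "name": name, "email": email, "platform": platform}

def execute_tool(state: dict):
    """Fire the tool only when the router flagged readiness and all fields are set."""
    required = ("user_name", "user_email", "creator_platform")
    if not state.get("lead_ready"):
        return None  # router has not confirmed readiness
    if any(not state.get(f) for f in required):
        return None  # never dispatch incomplete lead data
    return mock_lead_capture(
        state["user_name"], state["user_email"], state["creator_platform"]
    )
```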

## WhatsApp Integration
This agent can easily be deployed on WhatsApp using webhooks and Twilio:
1. **Twilio API**: Set up a Twilio WhatsApp Business API sandbox or account.
2. **Webhook Endpoint**: Create an HTTP endpoint (e.g., via FastAPI or Flask) to receive incoming webhook payloads containing the user's WhatsApp message.
3. **Agent Backend**: The webhook extracts the message text and user identifier (phone number) and invokes the LangGraph agent.
4. **Session Management**: A database (like Redis) can key the `AgentState` to the user's phone number, maintaining continuity and conversational memory across incoming webhooks.
5. **Response Dispatch**: After the graph runs, the final `response` string is dispatched back to the user via a POST request to Twilio's Message API.
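Steps 2–5 can be sketched framework-agnostically. Here `SESSIONS` stands in for Redis, `run_agent` for the LangGraph invocation, and the returned string for the Twilio dispatch; `From`/`Body` follow Twilio's incoming-message payload, but treat the whole handler as a hypothetical sketch:

```python
SESSIONS: dict = {}  # phone number -> per-user agent state (Redis in production)

def run_agent(state: dict, message: str) -> dict:
    """Placeholder for the LangGraph run; echoes the input for illustration."""
    history = state.get("conversation_history", []) + [message]
    return {**state, "conversation_history": history, "response": f"You said: {message}"}

def handle_webhook(payload: dict) -> str:
    """Extract message + sender, restore state, run the agent, return the reply."""
    phone = payload["From"]     # sender identifier from the webhook payload
    text = payload["Body"]      # message text
    state = SESSIONS.get(phone, {})
    state = run_agent(state, text)
    SESSIONS[phone] = state     # persist for the next incoming webhook
    return state["response"]    # would be POSTed back via Twilio's Message API
```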

## Testing Architecture
A rigorous suite of tests sits in the `tests/` directory:
1. **Mocking**: All AI inference (LLMs and embeddings) is mocked using `pytest-mock` and dependency injection.
2. **Deterministic Reliability**: By returning controlled mock objects, tests validate the graph structure, logic, state changes, routing, and tool safety independently of live API behavior and latencies.
3. **End-to-End Simulation**: `test_agent_e2e.py` walks through a multi-turn conversation step-by-step, mimicking user turns and validating correct downstream transitions from Greeting -> RAG -> Lead Capture -> Tool Execution.
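The mocking pattern can be illustrated with the standard library alone. `classify_intent` is a hypothetical node under test, not the project's actual function; the point is that a `MagicMock` LLM makes the assertion deterministic and key-free:

```python
from unittest.mock import MagicMock

def classify_intent(llm, message: str) -> str:
    """Hypothetical node under test: delegates classification to the injected LLM."""
    return llm.invoke(message)

def test_pricing_intent_is_classified():
    # Replace the LLM with a deterministic mock -- no API key needed.
    fake_llm = MagicMock()
    fake_llm.invoke.return_value = "PRICING_QUERY"
    assert classify_intent(fake_llm, "How much is the Pro plan?") == "PRICING_QUERY"
    fake_llm.invoke.assert_called_once_with("How much is the Pro plan?")
```

`pytest-mock` wraps the same `unittest.mock` machinery in a `mocker` fixture, so the real tests follow this shape with less boilerplate.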