Commit 49f9f52 · Parent(s): initial commit
Files changed:
- .gitattributes +1 -0
- .gitignore +7 -0
- LICENSE +21 -0
- README.md +123 -0
- assets/banner.png +3 -0
- assets/gradio.png +3 -0
- requirements.txt +7 -0
- scripts/agent.py +122 -0
- scripts/app.py +72 -0
.gitattributes
ADDED
@@ -0,0 +1 @@
+*.png filter=lfs diff=lfs merge=lfs -text
.gitignore
ADDED
@@ -0,0 +1,7 @@
+__pycache__/
+*.pyc
+venv/
+.venv/
+.vscode/
+.idea/
+
LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2025 Daniel
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
README.md
ADDED
@@ -0,0 +1,123 @@
+![banner](assets/banner.png)
+[Python](https://www.python.org/) [PyTorch](https://pytorch.org/) [License](LICENSE)
+
+# 🤖 Advanced Customer Service Agent
+
+An intelligent, multi-modal customer service agent built with a Retrieval-Augmented Generation (RAG) pipeline. This agent can understand user sentiment, retrieve relevant information from a knowledge base, and provide empathetic, context-aware responses in both text and voice.
+
+The knowledge base comes from the [MakTek/Customer_support_faqs_dataset](https://huggingface.co/datasets/MakTek/Customer_support_faqs_dataset); a screenshot of the Gradio demo is shown below.
+
+![Gradio demo](assets/gradio.png)
+
+---
+
+## 📋 Table of Contents
+
+- [📖 About The Project](#-about-the-project)
+- [✨ Features](#-features)
+- [🛠️ Tech Stack & Model Architecture](#️-tech-stack--model-architecture)
+  - [Model Selection Rationale](#model-selection-rationale)
+- [📊 Performance Benchmark](#-performance-benchmark)
+- [🔮 Future Improvements](#-future-improvements)
+- [🚀 Getting Started](#-getting-started)
+  - [Prerequisites](#prerequisites)
+  - [Installation & Usage](#installation--usage)
+
+---
+
+## 📖 About The Project
+
+This project is a complete implementation of an advanced AI customer service agent. The core of the agent is a RAG pipeline that allows it to answer user queries based on a predefined knowledge base, ensuring factual and relevant responses. It includes conversation memory to handle follow-up questions and sentiment analysis to adapt its tone, making the interaction feel more natural and empathetic.
+
+## ✨ Features
+
+- **🧠 Conversation Memory**: Remembers previous turns in the conversation to understand context.
+- **😠 Sentiment-Aware**: Detects user sentiment (Positive/Negative) and adjusts its persona to be more helpful or empathetic.
+- **📚 Retrieval-Augmented Generation (RAG)**: Retrieves relevant information from a vector database to provide accurate, knowledge-based answers.
+- **🔊 Text-to-Speech**: Can read its responses aloud for a complete voice-enabled experience.
+- **🌐 Interactive UI**: Built with Gradio for an easy-to-use web interface.
+
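The conversation-memory feature comes down to flattening prior turns into plain text that is prepended to the prompt. A minimal, dependency-free sketch (it mirrors the `history_string` construction in `scripts/agent.py`):

```python
# Conversation memory: prior turns are flattened into a plain-text
# transcript that gets prepended to the LLM prompt (mirrors the
# history_string construction in scripts/agent.py).
def format_history(history):
    return "".join(
        f"User: {turn['user']}\nAssistant: {turn['assistant']}\n"
        for turn in history
    )

history = [
    {'user': "My package never arrived!",
     'assistant': "I'm sorry to hear that. Could you share your order number?"},
]
print(format_history(history))
```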
+---
+
+## 🛠️ Tech Stack & Model Architecture
+
+The agent is built on a modern RAG architecture using the Hugging Face ecosystem.
+
+1. **User Query**: The user asks a question.
+2. **Sentiment Analysis**: The query's sentiment is analyzed.
+3. **Embedding & Retrieval**: The query is converted into a vector embedding. This embedding is used to search a FAISS vector database to find the most relevant documents from the knowledge base.
+4. **Prompt Engineering**: A detailed prompt is constructed containing the agent's persona (based on sentiment), the conversation history, the retrieved documents (context), and the user's current query.
+5. **LLM Response Generation**: The complete prompt is sent to the LLM, which generates a context-aware and tonally appropriate response.
+6. **Text-to-Speech**: The final text response can be converted to audio.
+
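To make the embedding, retrieval, and prompt-assembly steps concrete, here is a toy sketch: a bag-of-words embedding stands in for `all-MiniLM-L6-v2`, and brute-force L2 search stands in for FAISS's `IndexFlatL2` (the real pipeline in `scripts/agent.py` has the same shape; vocabulary and documents here are illustrative only):

```python
import re
import numpy as np

# Toy stand-in for the sentence-transformer: a fixed-vocabulary
# bag-of-words embedding (the real agent uses all-MiniLM-L6-v2).
VOCAB = ["order", "status", "payment", "refund", "shipping"]

def embed(text):
    words = re.findall(r"[a-z]+", text.lower())
    return np.array([float(words.count(w)) for w in VOCAB])

knowledge_base = [
    "To check your order status, log in and open the 'My Orders' page.",
    "You can update your payment method in the 'Billing' section.",
]
doc_vecs = np.stack([embed(doc) for doc in knowledge_base])

# Retrieval: embed the query and take the nearest document by L2
# distance (FAISS IndexFlatL2 performs this same search, just faster).
query = "Where can I see my order status?"
q_vec = embed(query)
nearest = int(np.argmin(((doc_vecs - q_vec) ** 2).sum(axis=1)))
context = knowledge_base[nearest]

# Prompt engineering: persona + context + question in one prompt.
prompt = (
    "You are a helpful customer support agent.\n"
    f"### Context:\n{context}\n### Question:\n{query}\n### Answer:\n"
)
print(context)
```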
+### Model Selection Rationale
+
+| Component | Model | Reason for Choice |
+| :--- | :--- | :--- |
+| **Embedding** | `sentence-transformers/all-MiniLM-L6-v2` | A very lightweight and fast model that provides excellent performance for semantic retrieval. It's ideal for creating knowledge base embeddings without requiring massive computational resources. |
+| **Response Generation** | `google/flan-t5-large` | We chose this model after benchmarking it against the smaller `flan-t5-base`. While slower, `flan-t5-large` is significantly better at following complex instructions, such as adopting an empathetic persona. This was crucial for handling negative user sentiment effectively. |
+| **Sentiment Analysis** | `distilbert-base-uncased-finetuned-sst-2-english` | A small, fast, and accurate sentiment classifier. Its efficiency ensures that adding sentiment awareness doesn't create a bottleneck in the response pipeline. |
+| **Text-to-Speech** | `gTTS` (Google Text-to-Speech) | Chosen for its simplicity and reliability. It's very easy to implement and works consistently across different environments, making it perfect for this project. |
+
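The sentiment classifier's label drives a simple persona switch in the generation prompt. Stripped of the models, the logic (as in `scripts/agent.py`) is just:

```python
def choose_persona(sentiment_label):
    # A NEGATIVE classification from the sentiment model flips the
    # system prompt to an empathetic persona (as in scripts/agent.py);
    # anything else gets the default helpful persona.
    if sentiment_label == 'NEGATIVE':
        return "You are an empathetic customer support agent."
    return "You are a helpful customer support agent."

print(choose_persona('NEGATIVE'))
print(choose_persona('POSITIVE'))
```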
+---
+
+## 📊 Performance Benchmark
+
+A key decision in this project was selecting the right LLM for response generation. We tested two models in a Google Colab CPU environment to measure the trade-off between response time and quality.
+
+| Model | Average Response Time (Colab CPU) | Response Quality |
+| :--- | :--- | :--- |
+| `google/flan-t5-base` | ~4 seconds | Fast, but often ignored persona instructions and provided blunt, unhelpful answers to negative queries. |
+| `google/flan-t5-large` | ~20 seconds | Significantly slower, but consistently followed the empathetic persona instructions, leading to much higher-quality, more appropriate responses. |
+
+**Conclusion**: We chose `flan-t5-large` because the improvement in response quality and instruction-following was critical for the agent's primary function, justifying the longer response time for a portfolio demonstration.
+
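The timings above come from wrapping the pipeline call in wall-clock measurements (as `scripts/agent.py` does with `time.time()`). A small harness along those lines; the `dummy` workload is a placeholder, not part of the project:

```python
import time

def average_response_time(fn, queries, runs=3):
    """Average wall-clock seconds per call over all queries and runs."""
    start = time.perf_counter()
    calls = 0
    for _ in range(runs):
        for query in queries:
            fn(query)
            calls += 1
    return (time.perf_counter() - start) / calls

# Placeholder workload; to benchmark the real agent, substitute e.g.
# lambda q: agent.get_rag_response(q, []).
dummy = lambda q: q.upper()
print(f"{average_response_time(dummy, ['hello', 'refund?']):.6f} s/call")
```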
+---
+
+## 🔮 Future Improvements
+
+While this project is a fully functional proof-of-concept, there are several ways it could be enhanced for a production environment:
+
+- **📈 Scale the LLM**: For even higher quality responses and more nuanced conversations, we could upgrade to a much larger model (e.g., Llama 3, Mistral Large). This would require a more powerful GPU for inference to maintain an acceptable response time.
+
+- **🎯 Customize the Knowledge Base**: Instead of a generic FAQ dataset [(MakTek/Customer_support_faqs_dataset)](https://huggingface.co/datasets/MakTek/Customer_support_faqs_dataset), the agent could be provided with a company's internal documentation, product manuals, or past support tickets. This would make it a highly specialized and valuable internal tool.
+
+- **⚙️ Fine-Tune the Embedding Model**: For a highly specific domain (e.g., medical or legal support), the `all-MiniLM-L6-v2` embedding model could be fine-tuned on domain-specific text to improve the accuracy of the document retrieval step.
+
+- **🗣️ Higher-Quality TTS**: While `gTTS` is reliable, we could integrate a more advanced, natural-sounding TTS model (like those from Coqui AI or Microsoft) for a more polished user experience.
+
+- **🎤 Add Speech-to-Text (STT)**: Re-integrate a robust STT model (like `openai/whisper`) to create a full voice-to-voice conversation flow, allowing users to speak their queries directly to the agent.
+
+- **🐳 Dockerize for Deployment**: The application could be containerized using Docker, making it easy to deploy consistently across different environments, from local machines to cloud servers.
+
+---
+
+## 🚀 Getting Started
+
+Follow these steps to get the agent running locally.
+
+### Prerequisites
+
+You need to have Python 3.8+ installed on your system.
+
+### Installation & Usage
+
+1. **Clone the repository (or download the files):**
+
+   ```sh
+   git clone <your-repo-url>
+   cd <your-repo-directory>
+   ```
+
+2. **Install the dependencies:**
+   ```sh
+   pip install -r requirements.txt
+   ```
+
+3. **Run the terminal-based demo (optional):**
+   To see the core agent logic in action, run the `agent.py` script.
+   ```sh
+   python scripts/agent.py
+   ```
assets/banner.png
ADDED
Git LFS Details

assets/gradio.png
ADDED
Git LFS Details
requirements.txt
ADDED
@@ -0,0 +1,7 @@
+datasets==4.0.0
+transformers==4.56.1
+sentence-transformers==5.1.0
+faiss-cpu==1.12.0
+torch==2.8.0
+gTTS==2.5.4
+gradio==5.44.1
scripts/agent.py
ADDED
@@ -0,0 +1,122 @@
+import faiss
+import numpy as np
+import torch
+import time
+from datasets import load_dataset
+from sentence_transformers import SentenceTransformer
+from transformers import pipeline
+
+class CustomerServiceAgent:
+    """
+    Encapsulates all the functionality of the AI customer service agent,
+    including model loading, knowledge base preparation, and response generation.
+    """
+    def __init__(self):
+        """
+        Initializes the agent by loading all necessary models and building the
+        retrieval-augmented generation (RAG) knowledge base.
+        """
+        print("Initializing Customer Service Agent...")
+        self._load_models()
+        self._build_knowledge_base()
+        print("\nAgent is ready.")
+
+    def _load_models(self):
+        """
+        Loads all the machine learning models required for the agent to function.
+        """
+        print("\n[1/4] Loading all models...")
+        device = 0 if torch.cuda.is_available() else -1
+
+        self.embedding_model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
+        self.llm_pipeline = pipeline("text2text-generation", model='google/flan-t5-large', device=device)
+        self.sentiment_classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english", device=device)
+
+        print("All models loaded successfully.")
+
+    def _build_knowledge_base(self):
+        """
+        Prepares the knowledge base for the RAG system by loading FAQs,
+        creating vector embeddings, and storing them in a FAISS index.
+        """
+        print("\n[2/4] Preparing Knowledge Base...")
+        try:
+            dataset = load_dataset("MakTek/Customer_support_faqs_dataset", split="train")
+            self.knowledge_base = [item for item in dataset['answer'] if item and item.strip()]
+            print(f"Successfully loaded {len(self.knowledge_base)} documents.")
+        except Exception as e:
+            print(f"Failed to load dataset. Using a fallback. Error: {e}")
+            self.knowledge_base = [
+                "You can update your payment method by going to the 'Billing' section in your account settings.",
+                "To check your order status, please log in to your account and navigate to the 'My Orders' page.",
+                "I am very sorry to hear your package has not arrived. Please provide your order number so I can investigate.",
+            ]
+
+        print("\n[3/4] Creating embeddings for the knowledge base...")
+        embeddings = self.embedding_model.encode(self.knowledge_base, show_progress_bar=True)
+
+        print("\n[4/4] Setting up FAISS vector index...")
+        self.index = faiss.IndexFlatL2(embeddings.shape[1])
+        self.index.add(np.array(embeddings))
+        print("FAISS retriever is ready.")
+
+    def get_rag_response(self, query, history, k=3):
+        """
+        Generates a response using the Retrieval-Augmented Generation (RAG) pipeline.
+        """
+        print(f"\nProcessing query: '{query}'")
+
+        sentiment = self.sentiment_classifier(query)[0]['label']
+        print(f"Detected Sentiment: {sentiment}")
+
+        history_string = "".join([f"User: {turn['user']}\nAssistant: {turn['assistant']}\n" for turn in history])
+
+        query_embedding = self.embedding_model.encode([query])
+        _, indices = self.index.search(np.array(query_embedding), k)
+        context = "\n\n".join([self.knowledge_base[i] for i in indices[0]])
+
+        persona = "You are an empathetic customer support agent." if sentiment == 'NEGATIVE' else "You are a helpful customer support agent."
+
+        prompt = f"""
+{persona}
+Based on history and context, answer the user's question.
+
+### History:
+{history_string}
+### Context:
+{context}
+### Question:
+{query}
+### Answer:
+"""
+
+        start_time = time.time()
+        response = self.llm_pipeline(prompt, max_new_tokens=100, num_beams=5, early_stopping=True)[0]['generated_text']
+        print(f"LLM Response Time: {time.time() - start_time:.2f} seconds")
+
+        return response.strip()
+
+# --- Terminal-based Demo ---
+if __name__ == "__main__":
+    agent = CustomerServiceAgent()
+    conversation_history = []
+
+    print("\n--- Starting Terminal Demo ---")
+
+    # First query
+    query1 = "This is so frustrating, my package never arrived!"
+    response1 = agent.get_rag_response(query1, conversation_history)
+    conversation_history.append({'user': query1, 'assistant': response1})
+
+    print(f"\nUser: {query1}")
+    print(f"Agent: {response1}")
+
+    # Follow-up query to test memory
+    query2 = "Okay, what do you need from me to find it?"
+    response2 = agent.get_rag_response(query2, conversation_history)
+    conversation_history.append({'user': query2, 'assistant': response2})
+
+    print(f"\nUser: {query2}")
+    print(f"Agent: {response2}")
+
+    print("\n--- Demo Complete ---")
scripts/app.py
ADDED
@@ -0,0 +1,72 @@
+import gradio as gr
+from gtts import gTTS
+from agent import CustomerServiceAgent
+
+# --- Gradio UI Functions ---
+
+def generate_audio_response(text):
+    """
+    Converts the agent's text response into an audio file using gTTS.
+    """
+    if not text:
+        return None
+    output_path = "assistant_response.mp3"
+    tts = gTTS(text=text, lang='en')
+    tts.save(output_path)
+    return output_path
+
+def respond(text_query, history_state):
+    """
+    The main interaction function called by the Gradio interface.
+    """
+    if not text_query or not text_query.strip():
+        formatted_history = "\n".join([f"**You:** {turn['user']}\n**Agent:** {turn['assistant']}" for turn in history_state])
+        return "", history_state, formatted_history
+
+    query = text_query.strip()
+    assistant_response_text = agent.get_rag_response(query, history_state)
+
+    new_history = history_state + [{'user': query, 'assistant': assistant_response_text}]
+    formatted_history = "\n".join([f"**You:** {turn['user']}\n**Agent:** {turn['assistant']}" for turn in new_history])
+
+    return assistant_response_text, new_history, formatted_history
+
+# --- Launch the Gradio Web Interface ---
+print("Launching Gradio Interface...")
+
+# Instantiate the agent from agent.py
+agent = CustomerServiceAgent()
+
+# Define the UI layout
+with gr.Blocks(theme=gr.themes.Soft(), title="Customer Service Agent") as app:
+    gr.Markdown("# Advanced Customer Service Agent")
+    gr.Markdown("Type your query below and press Submit.")
+
+    history_state = gr.State([])
+
+    with gr.Row():
+        with gr.Column(scale=2):
+            text_input = gr.Textbox(label="Your Question", lines=4, placeholder="Type your question here...")
+            text_submit_btn = gr.Button("Submit")
+
+            with gr.Accordion("Agent's Response", open=True):
+                agent_response_text = gr.Textbox(label="Response Text", interactive=False, lines=4)
+                with gr.Row():
+                    read_aloud_btn = gr.Button("Read Response Aloud")
+                    audio_output = gr.Audio(label="Agent's Voice", autoplay=False)
+
+        with gr.Column(scale=3):
+            history_display = gr.Markdown("Conversation history will appear here.", label="Conversation")
+
+    text_submit_btn.click(
+        fn=respond,
+        inputs=[text_input, history_state],
+        outputs=[agent_response_text, history_state, history_display]
+    ).then(lambda: "", outputs=[text_input])
+
+    read_aloud_btn.click(
+        fn=generate_audio_response,
+        inputs=[agent_response_text],
+        outputs=[audio_output]
+    )
+
+app.launch(debug=True, share=True)