DanielKiani committed
Commit 49f9f52 · 0 Parent(s)

initial commit

Files changed (9)
  1. .gitattributes +1 -0
  2. .gitignore +7 -0
  3. LICENSE +21 -0
  4. README.md +123 -0
  5. assets/banner.png +3 -0
  6. assets/gradio.png +3 -0
  7. requirements.txt +7 -0
  8. scripts/agent.py +122 -0
  9. scripts/app.py +72 -0
.gitattributes ADDED
@@ -0,0 +1 @@
+ *.png filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,7 @@
+ __pycache__/
+ *.pyc
+ venv/
+ .venv/
+ .vscode/
+ .idea/
+
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 Daniel
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md ADDED
@@ -0,0 +1,123 @@
+ ![Banner](assets/banner.png)
+ [![Python](https://img.shields.io/badge/Python-3.12.11-blue?logo=python)](https://www.python.org/) [![PyTorch](https://img.shields.io/badge/PyTorch-2.8-EE4C2C?logo=pytorch)](https://pytorch.org/) ![Made with ML](https://img.shields.io/badge/Made%20with-ML-blueviolet?logo=openai) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
+
+ # 🤖 Advanced Customer Service Agent
+
+ An intelligent, multi-modal customer service agent built with a Retrieval-Augmented Generation (RAG) pipeline. The agent can understand user sentiment, retrieve relevant information from a knowledge base, and provide empathetic, context-aware responses in both text and voice.
+
+ The Gradio demo can be found [here](https://huggingface.co/datasets/MakTek/Customer_support_faqs_dataset).
+
+ ![Gradio](assets/gradio.png)
+
+ ---
+
+ ## 📋 Table of Contents
+
+ - [📖 About The Project](#-about-the-project)
+ - [✨ Features](#-features)
+ - [🛠️ Tech Stack & Model Architecture](#️-tech-stack--model-architecture)
+   - [Model Selection Rationale](#model-selection-rationale)
+ - [📊 Performance Benchmark](#-performance-benchmark)
+ - [🔮 Future Improvements](#-future-improvements)
+ - [🚀 Getting Started](#-getting-started)
+   - [Prerequisites](#prerequisites)
+   - [Installation & Usage](#installation--usage)
+
+ ---
+
+ ## 📖 About The Project
+
+ This project is a complete implementation of an advanced AI customer service agent. Its core is a RAG pipeline that answers user queries from a predefined knowledge base, ensuring factual and relevant responses. Conversation memory handles follow-up questions, and sentiment analysis adapts the agent's tone, making interactions feel more natural and empathetic.
+
+ ---
+
+ ## ✨ Features
+
+ - **🧠 Conversation Memory**: Remembers previous turns in the conversation to understand context.
+ - **😠 Sentiment-Aware**: Detects user sentiment (Positive/Negative) and adjusts its persona to be more helpful or empathetic.
+ - **📚 Retrieval-Augmented Generation (RAG)**: Retrieves relevant information from a vector database to provide accurate, knowledge-based answers.
+ - **🔊 Text-to-Speech**: Can read its responses aloud for a complete voice-enabled experience.
+ - **🌐 Interactive UI**: Built with Gradio for an easy-to-use web interface.
+
+ ---
+
+ ## 🛠️ Tech Stack & Model Architecture
+
+ The agent is built on a modern RAG architecture using the Hugging Face ecosystem.
+
+ 1. **User Query**: The user asks a question.
+ 2. **Sentiment Analysis**: The query's sentiment is analyzed.
+ 3. **Embedding & Retrieval**: The query is converted into a vector embedding, which is used to search a FAISS vector index for the most relevant documents in the knowledge base.
+ 4. **Prompt Engineering**: A detailed prompt is constructed containing the agent's persona (based on sentiment), the conversation history, the retrieved documents (context), and the user's current query.
+ 5. **LLM Response Generation**: The complete prompt is sent to the LLM, which generates a context-aware and tonally appropriate response.
+ 6. **Text-to-Speech**: The final text response can be converted to audio.
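Steps 2 and 4 above reduce to plain string assembly. The sketch below mirrors the prompt construction in `scripts/agent.py` (the persona strings and `###` section markers are copied from that file); it is illustrative only, with the sentiment label and retrieved documents passed in as arguments rather than computed by the models:

```python
def build_prompt(query, sentiment, history, context_docs):
    """Assemble the RAG prompt from persona, history, retrieved context, and query."""
    # Persona switches on the sentiment label, as in agent.py.
    persona = (
        "You are an empathetic customer support agent."
        if sentiment == "NEGATIVE"
        else "You are a helpful customer support agent."
    )
    # Flatten prior turns into a "User:/Assistant:" transcript.
    history_string = "".join(
        f"User: {turn['user']}\nAssistant: {turn['assistant']}\n" for turn in history
    )
    # Retrieved documents are joined with blank lines.
    context = "\n\n".join(context_docs)
    return (
        f"{persona}\n"
        "Based on history and context, answer the user's question.\n\n"
        f"### History:\n{history_string}"
        f"### Context:\n{context}\n"
        f"### Question:\n{query}\n"
        "### Answer:"
    )
```

The resulting string is what gets sent to `flan-t5-large` in step 5.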
+
+ ### Model Selection Rationale
+
+ | Component | Model | Reason for Choice |
+ | :--- | :--- | :--- |
+ | **Embedding** | `sentence-transformers/all-MiniLM-L6-v2` | A lightweight, fast model with excellent semantic-retrieval performance. Ideal for creating knowledge-base embeddings without massive computational resources. |
+ | **Response Generation** | `google/flan-t5-large` | Chosen after benchmarking against the smaller `flan-t5-base`. While slower, `flan-t5-large` is significantly better at following complex instructions, such as adopting an empathetic persona. This was crucial for handling negative user sentiment effectively. |
+ | **Sentiment Analysis** | `distilbert-base-uncased-finetuned-sst-2-english` | A small, fast, and accurate sentiment classifier. Its efficiency ensures that sentiment awareness doesn't bottleneck the response pipeline. |
+ | **Text-to-Speech** | `gTTS` (Google Text-to-Speech) | Chosen for its simplicity and reliability. Easy to implement and consistent across environments. |
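For intuition about the retrieval step: FAISS's `IndexFlatL2` performs an exact L2 nearest-neighbor search over the stored embeddings. A minimal NumPy stand-in (toy 4-dimensional vectors instead of real MiniLM embeddings) shows the same idea without downloading any model:

```python
import numpy as np

def l2_search(index_vectors, query_vector, k=3):
    """Exact L2 nearest-neighbor search, the operation faiss.IndexFlatL2 performs."""
    dists = np.linalg.norm(index_vectors - query_vector, axis=1)
    top = np.argsort(dists)[:k]  # indices of the k closest vectors
    return dists[top], top

# Toy "embeddings" standing in for MiniLM sentence vectors.
docs = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0],
                 [0.9, 0.1, 0.0, 0.0]])
dists, ids = l2_search(docs, np.array([1.0, 0.0, 0.0, 0.0]), k=2)
# ids → [0, 2]: the first and third documents are nearest to the query.
```

In the real pipeline the returned indices are used to look up the matching answers in the knowledge base.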
+
+ ---
+
+ ## 📊 Performance Benchmark
+
+ A key decision in this project was selecting the right LLM for response generation. We tested two models in a Google Colab CPU environment to measure the trade-off between response time and quality.
+
+ | Model | Average Response Time (Colab CPU) | Response Quality |
+ | :--- | :--- | :--- |
+ | `google/flan-t5-base` | ~4 seconds | Fast, but often ignored persona instructions and gave blunt, unhelpful answers to negative queries. |
+ | `google/flan-t5-large` | ~20 seconds | Significantly slower, but consistently followed the empathetic persona instructions, producing much higher-quality, more appropriate responses. |
+
+ **Conclusion**: We chose `flan-t5-large` because the improvement in response quality and instruction-following was critical to the agent's primary function, justifying the longer response time for a portfolio demonstration.
+
+ ---
+
+ ## 🔮 Future Improvements
+
+ While this project is a fully functional proof of concept, it could be enhanced for a production environment in several ways:
+
+ - **📈 Scale the LLM**: For higher-quality responses and more nuanced conversations, upgrade to a much larger model (e.g., Llama 3, Mistral Large). This would require a more powerful GPU for inference to maintain acceptable response times.
+ - **🎯 Customize the Knowledge Base**: Instead of a generic FAQ dataset ([MakTek/Customer_support_faqs_dataset](https://huggingface.co/datasets/MakTek/Customer_support_faqs_dataset)), feed the agent a company's internal documentation, product manuals, or past support tickets, turning it into a highly specialized and valuable internal tool.
+ - **⚙️ Fine-Tune the Embedding Model**: For a highly specific domain (e.g., medical or legal support), fine-tune `all-MiniLM-L6-v2` on domain-specific text to improve the accuracy of the document retrieval step.
+ - **🗣️ Higher-Quality TTS**: While `gTTS` is reliable, a more advanced, natural-sounding TTS model (such as those from Coqui AI or Microsoft) would give a more polished user experience.
+ - **🎤 Add Speech-to-Text (STT)**: Re-integrate a robust STT model (like `openai/whisper`) to create a full voice-to-voice conversation flow, allowing users to speak their queries directly to the agent.
+ - **🐳 Dockerize for Deployment**: Containerize the application with Docker for consistent deployment across environments, from local machines to cloud servers.
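The Dockerize idea above could start from a sketch like this one. The base image, exposed port, and entrypoint are assumptions, not part of the repo; note also that the pinned `torch==2.8.0+cu126` wheel in `requirements.txt` comes from the PyTorch CUDA package index, not plain PyPI, so the install line would need adjusting:

```dockerfile
FROM python:3.12-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Gradio's default port.
EXPOSE 7860

CMD ["python", "scripts/app.py"]
```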
+
+ ---
+
+ ## 🚀 Getting Started
+
+ Follow these steps to get the agent running locally.
+
+ ### Prerequisites
+
+ You need Python 3.8+ installed on your system.
+
+ ### Installation & Usage
+
+ 1. **Clone the repository (or download the files):**
+
+    ```sh
+    git clone <your-repo-url>
+    cd <your-repo-directory>
+    ```
+
+ 2. **Install the dependencies:**
+
+    ```sh
+    pip install -r requirements.txt
+    ```
+
+ 3. **Run the terminal-based demo (optional):**
+    To see the core agent logic in action, run the `agent.py` script.
+
+    ```sh
+    python scripts/agent.py
+    ```
+
+ 4. **Launch the web interface:** run `python scripts/app.py` to start the Gradio app.
assets/banner.png ADDED

Git LFS Details

  • SHA256: 2a23a1870e5c7a4332c9605d2eedaf331503970f90c0927ae76b397ab88a05b1
  • Pointer size: 132 Bytes
  • Size of remote file: 1.03 MB
assets/gradio.png ADDED

Git LFS Details

  • SHA256: 7abd571224e3b7ab563cba596d854cb0a86140340e72d63036ba0db29e20ba3b
  • Pointer size: 130 Bytes
  • Size of remote file: 48.4 kB
requirements.txt ADDED
@@ -0,0 +1,7 @@
+ datasets==4.0.0
+ transformers==4.56.1
+ sentence-transformers==5.1.0
+ faiss-cpu==1.12.0
+ torch==2.8.0+cu126
+ gTTS==2.5.4
+ gradio==5.44.1
scripts/agent.py ADDED
@@ -0,0 +1,122 @@
+ import faiss
+ import numpy as np
+ import torch
+ import time
+ from datasets import load_dataset
+ from sentence_transformers import SentenceTransformer
+ from transformers import pipeline
+
+ class CustomerServiceAgent:
+     """
+     Encapsulates all the functionality of the AI customer service agent,
+     including model loading, knowledge base preparation, and response generation.
+     """
+     def __init__(self):
+         """
+         Initializes the agent by loading all necessary models and building the
+         retrieval-augmented generation (RAG) knowledge base.
+         """
+         print("Initializing Customer Service Agent...")
+         self._load_models()
+         self._build_knowledge_base()
+         print("\nAgent is ready.")
+
+     def _load_models(self):
+         """
+         Loads all the machine learning models required for the agent to function.
+         """
+         print("\n[1/4] Loading all models...")
+         device = 0 if torch.cuda.is_available() else -1
+
+         self.embedding_model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
+         self.llm_pipeline = pipeline("text2text-generation", model='google/flan-t5-large', device=device)
+         self.sentiment_classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english", device=device)
+
+         print("All models loaded successfully.")
+
+     def _build_knowledge_base(self):
+         """
+         Prepares the knowledge base for the RAG system by loading FAQs,
+         creating vector embeddings, and storing them in a FAISS index.
+         """
+         print("\n[2/4] Preparing Knowledge Base...")
+         try:
+             dataset = load_dataset("MakTek/Customer_support_faqs_dataset", split="train")
+             self.knowledge_base = [item for item in dataset['answer'] if item and item.strip()]
+             print(f"Successfully loaded {len(self.knowledge_base)} documents.")
+         except Exception as e:
+             print(f"Failed to load dataset. Using a fallback. Error: {e}")
+             self.knowledge_base = [
+                 "You can update your payment method by going to the 'Billing' section in your account settings.",
+                 "To check your order status, please log in to your account and navigate to the 'My Orders' page.",
+                 "I am very sorry to hear your package has not arrived. Please provide your order number so I can investigate.",
+             ]
+
+         print("\n[3/4] Creating embeddings for the knowledge base...")
+         embeddings = self.embedding_model.encode(self.knowledge_base, show_progress_bar=True)
+
+         print("\n[4/4] Setting up FAISS vector index...")
+         self.index = faiss.IndexFlatL2(embeddings.shape[1])
+         self.index.add(np.array(embeddings))
+         print("FAISS retriever is ready.")
+
+     def get_rag_response(self, query, history, k=3):
+         """
+         Generates a response using the Retrieval-Augmented Generation (RAG) pipeline.
+         """
+         print(f"\nProcessing query: '{query}'")
+
+         sentiment = self.sentiment_classifier(query)[0]['label']
+         print(f"Detected Sentiment: {sentiment}")
+
+         history_string = "".join([f"User: {turn['user']}\nAssistant: {turn['assistant']}\n" for turn in history])
+
+         query_embedding = self.embedding_model.encode([query])
+         _, indices = self.index.search(np.array(query_embedding), k)
+         context = "\n\n".join([self.knowledge_base[i] for i in indices[0]])
+
+         persona = "You are an empathetic customer support agent." if sentiment == 'NEGATIVE' else "You are a helpful customer support agent."
+
+         prompt = f"""
+ {persona}
+ Based on history and context, answer the user's question.
+
+ ### History:
+ {history_string}
+ ### Context:
+ {context}
+ ### Question:
+ {query}
+ ### Answer:
+ """
+
+         start_time = time.time()
+         response = self.llm_pipeline(prompt, max_new_tokens=100, num_beams=5, early_stopping=True)[0]['generated_text']
+         print(f"LLM Response Time: {time.time() - start_time:.2f} seconds")
+
+         return response.strip()
+
+ # --- Terminal-based Demo ---
+ if __name__ == "__main__":
+     agent = CustomerServiceAgent()
+     conversation_history = []
+
+     print("\n--- Starting Terminal Demo ---")
+
+     # First query
+     query1 = "This is so frustrating, my package never arrived!"
+     response1 = agent.get_rag_response(query1, conversation_history)
+     conversation_history.append({'user': query1, 'assistant': response1})
+
+     print(f"\nUser: {query1}")
+     print(f"Agent: {response1}")
+
+     # Follow-up query to test memory
+     query2 = "Okay, what do you need from me to find it?"
+     response2 = agent.get_rag_response(query2, conversation_history)
+     conversation_history.append({'user': query2, 'assistant': response2})
+
+     print(f"\nUser: {query2}")
+     print(f"Agent: {response2}")
+
+     print("\n--- Demo Complete ---")
scripts/app.py ADDED
@@ -0,0 +1,73 @@
+ import gradio as gr
+ from gtts import gTTS
+
+ # The agent class lives in scripts/agent.py alongside this file.
+ from agent import CustomerServiceAgent
+
+ # --- Gradio UI Functions ---
+
+ def generate_audio_response(text):
+     """
+     Converts the agent's text response into an audio file using gTTS.
+     """
+     if not text:
+         return None
+     output_path = "assistant_response.mp3"
+     tts = gTTS(text=text, lang='en')
+     tts.save(output_path)
+     return output_path
+
+ def respond(text_query, history_state):
+     """
+     The main interaction function called by the Gradio interface.
+     """
+     if not text_query or not text_query.strip():
+         formatted_history = "\n".join([f"**You:** {turn['user']}\n**Agent:** {turn['assistant']}" for turn in history_state])
+         return "", history_state, formatted_history
+
+     query = text_query.strip()
+     assistant_response_text = agent.get_rag_response(query, history_state)
+
+     new_history = history_state + [{'user': query, 'assistant': assistant_response_text}]
+     formatted_history = "\n".join([f"**You:** {turn['user']}\n**Agent:** {turn['assistant']}" for turn in new_history])
+
+     return assistant_response_text, new_history, formatted_history
+
+ # --- Launch the Gradio Web Interface ---
+ print("Launching Gradio Interface...")
+
+ # Instantiate the agent from agent.py
+ agent = CustomerServiceAgent()
+
+ # Define the UI layout
+ with gr.Blocks(theme=gr.themes.Soft(), title="Customer Service Agent") as app:
+     gr.Markdown("# Advanced Customer Service Agent")
+     gr.Markdown("Type your query below and press Submit.")
+
+     history_state = gr.State([])
+
+     with gr.Row():
+         with gr.Column(scale=2):
+             text_input = gr.Textbox(label="Your Question", lines=4, placeholder="Type your question here...")
+             text_submit_btn = gr.Button("Submit")
+
+             with gr.Accordion("Agent's Response", open=True):
+                 agent_response_text = gr.Textbox(label="Response Text", interactive=False, lines=4)
+                 with gr.Row():
+                     read_aloud_btn = gr.Button("Read Response Aloud")
+                     audio_output = gr.Audio(label="Agent's Voice", autoplay=False)
+
+         with gr.Column(scale=3):
+             history_display = gr.Markdown("Conversation history will appear here.", label="Conversation")
+
+     text_submit_btn.click(
+         fn=respond,
+         inputs=[text_input, history_state],
+         outputs=[agent_response_text, history_state, history_display]
+     ).then(lambda: "", outputs=[text_input])
+
+     read_aloud_btn.click(
+         fn=generate_audio_response,
+         inputs=[agent_response_text],
+         outputs=[audio_output]
+     )
+
+ app.launch(debug=True, share=True)