DanielKiani committed
Commit 49f9f52 · 0 Parent(s)

initial commit

Files changed (9)
  1. .gitattributes +1 -0
  2. .gitignore +7 -0
  3. LICENSE +21 -0
  4. README.md +123 -0
  5. assets/banner.png +3 -0
  6. assets/gradio.png +3 -0
  7. requirements.txt +7 -0
  8. scripts/agent.py +122 -0
  9. scripts/app.py +72 -0
.gitattributes ADDED
@@ -0,0 +1 @@
+ *.png filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,7 @@
+ __pycache__/
+ *.pyc
+ venv/
+ .venv/
+ .vscode/
+ .idea/
+
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 Daniel
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md ADDED
@@ -0,0 +1,123 @@
+ ![Banner](assets/banner.png)
+ [![Python](https://img.shields.io/badge/Python-3.12.11-blue?logo=python)](https://www.python.org/) [![PyTorch](https://img.shields.io/badge/PyTorch-2.8-EE4C2C?logo=pytorch)](https://pytorch.org/) ![Made with ML](https://img.shields.io/badge/Made%20with-ML-blueviolet?logo=openai) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
+
+ # 🤖 Advanced Customer Service Agent
+
+ An intelligent, multi-modal customer service agent built with a Retrieval-Augmented Generation (RAG) pipeline. The agent can understand user sentiment, retrieve relevant information from a knowledge base, and provide empathetic, context-aware responses in both text and voice.
+
+ The Gradio demo can be found [here](https://huggingface.co/datasets/MakTek/Customer_support_faqs_dataset).
+
+ ![Gradio](assets/gradio.png)
+
+ ---
+
+ ## 📋 Table of Contents
+
+ - [📖 About The Project](#-about-the-project)
+ - [✨ Features](#-features)
+ - [🛠️ Tech Stack & Model Architecture](#️-tech-stack--model-architecture)
+   - [Model Selection Rationale](#model-selection-rationale)
+ - [📊 Performance Benchmark](#-performance-benchmark)
+ - [🔮 Future Improvements](#-future-improvements)
+ - [🚀 Getting Started](#-getting-started)
+   - [Prerequisites](#prerequisites)
+   - [Installation & Usage](#installation--usage)
+
+ ---
+
+ ## 📖 About The Project
+
+ This project is a complete implementation of an advanced AI customer service agent. Its core is a RAG pipeline that answers user queries from a predefined knowledge base, ensuring factual and relevant responses. Conversation memory handles follow-up questions, and sentiment analysis adapts the agent's tone, making interactions feel more natural and empathetic.
+
+ ---
+
+ ## ✨ Features
+
+ - **🧠 Conversation Memory**: Remembers previous turns in the conversation to understand context.
+ - **😠 Sentiment-Aware**: Detects user sentiment (Positive/Negative) and adjusts its persona to be more helpful or empathetic.
+ - **📚 Retrieval-Augmented Generation (RAG)**: Retrieves relevant information from a vector database to provide accurate, knowledge-based answers.
+ - **🔊 Text-to-Speech**: Can read its responses aloud for a complete voice-enabled experience.
+ - **🌐 Interactive UI**: Built with Gradio for an easy-to-use web interface.
+
+ ---
+
+ ## 🛠️ Tech Stack & Model Architecture
+
+ The agent is built on a modern RAG architecture using the Hugging Face ecosystem.
+
+ 1. **User Query**: The user asks a question.
+ 2. **Sentiment Analysis**: The query's sentiment is analyzed.
+ 3. **Embedding & Retrieval**: The query is converted into a vector embedding, which is used to search a FAISS vector index for the most relevant documents in the knowledge base.
+ 4. **Prompt Engineering**: A detailed prompt is constructed containing the agent's persona (based on sentiment), the conversation history, the retrieved documents (context), and the user's current query.
+ 5. **LLM Response Generation**: The complete prompt is sent to the LLM, which generates a context-aware and tonally appropriate response.
+ 6. **Text-to-Speech**: The final text response can be converted to audio.
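Steps 2 and 4 above reduce to plain string assembly. The sketch below mirrors the prompt construction in `scripts/agent.py` (the persona strings and `###` section markers are copied from that file); it is illustrative only, with the sentiment label and retrieved documents passed in as arguments rather than computed by the models:

```python
def build_prompt(query, sentiment, history, context_docs):
    """Assemble the RAG prompt from persona, history, retrieved context, and query."""
    # Persona switches on the sentiment label, as in agent.py.
    persona = (
        "You are an empathetic customer support agent."
        if sentiment == "NEGATIVE"
        else "You are a helpful customer support agent."
    )
    # Flatten prior turns into a "User:/Assistant:" transcript.
    history_string = "".join(
        f"User: {turn['user']}\nAssistant: {turn['assistant']}\n" for turn in history
    )
    # Retrieved documents are joined with blank lines.
    context = "\n\n".join(context_docs)
    return (
        f"{persona}\n"
        "Based on history and context, answer the user's question.\n\n"
        f"### History:\n{history_string}"
        f"### Context:\n{context}\n"
        f"### Question:\n{query}\n"
        "### Answer:"
    )
```

The resulting string is what gets sent to `flan-t5-large` in step 5.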
+
+ ### Model Selection Rationale
+
+ | Component | Model | Reason for Choice |
+ | :--- | :--- | :--- |
+ | **Embedding** | `sentence-transformers/all-MiniLM-L6-v2` | A lightweight, fast model with excellent semantic-retrieval performance. Ideal for creating knowledge-base embeddings without massive computational resources. |
+ | **Response Generation** | `google/flan-t5-large` | Chosen after benchmarking against the smaller `flan-t5-base`. While slower, `flan-t5-large` is significantly better at following complex instructions, such as adopting an empathetic persona. This was crucial for handling negative user sentiment effectively. |
+ | **Sentiment Analysis** | `distilbert-base-uncased-finetuned-sst-2-english` | A small, fast, and accurate sentiment classifier. Its efficiency ensures that sentiment awareness doesn't bottleneck the response pipeline. |
+ | **Text-to-Speech** | `gTTS` (Google Text-to-Speech) | Chosen for its simplicity and reliability. Easy to implement and consistent across environments. |
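For intuition about the retrieval step: FAISS's `IndexFlatL2` performs an exact L2 nearest-neighbor search over the stored embeddings. A minimal NumPy stand-in (toy 4-dimensional vectors instead of real MiniLM embeddings) shows the same idea without downloading any model:

```python
import numpy as np

def l2_search(index_vectors, query_vector, k=3):
    """Exact L2 nearest-neighbor search, the operation faiss.IndexFlatL2 performs."""
    dists = np.linalg.norm(index_vectors - query_vector, axis=1)
    top = np.argsort(dists)[:k]  # indices of the k closest vectors
    return dists[top], top

# Toy "embeddings" standing in for MiniLM sentence vectors.
docs = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0],
                 [0.9, 0.1, 0.0, 0.0]])
dists, ids = l2_search(docs, np.array([1.0, 0.0, 0.0, 0.0]), k=2)
# ids → [0, 2]: the first and third documents are nearest to the query.
```

In the real pipeline the returned indices are used to look up the matching answers in the knowledge base.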
+
+ ---
+
+ ## 📊 Performance Benchmark
+
+ A key decision in this project was selecting the right LLM for response generation. We tested two models in a Google Colab CPU environment to measure the trade-off between response time and quality.
+
+ | Model | Average Response Time (Colab CPU) | Response Quality |
+ | :--- | :--- | :--- |
+ | `google/flan-t5-base` | ~4 seconds | Fast, but often ignored persona instructions and gave blunt, unhelpful answers to negative queries. |
+ | `google/flan-t5-large` | ~20 seconds | Significantly slower, but consistently followed the empathetic persona instructions, producing much higher-quality, more appropriate responses. |
+
+ **Conclusion**: We chose `flan-t5-large` because the improvement in response quality and instruction-following was critical to the agent's primary function, justifying the longer response time for a portfolio demonstration.
+
+ ---
+
+ ## 🔮 Future Improvements
+
+ While this project is a fully functional proof of concept, it could be enhanced for a production environment in several ways:
+
+ - **📈 Scale the LLM**: For higher-quality responses and more nuanced conversations, upgrade to a much larger model (e.g., Llama 3, Mistral Large). This would require a more powerful GPU for inference to maintain acceptable response times.
+ - **🎯 Customize the Knowledge Base**: Instead of a generic FAQ dataset ([MakTek/Customer_support_faqs_dataset](https://huggingface.co/datasets/MakTek/Customer_support_faqs_dataset)), feed the agent a company's internal documentation, product manuals, or past support tickets, turning it into a highly specialized and valuable internal tool.
+ - **⚙️ Fine-Tune the Embedding Model**: For a highly specific domain (e.g., medical or legal support), fine-tune `all-MiniLM-L6-v2` on domain-specific text to improve the accuracy of the document retrieval step.
+ - **🗣️ Higher-Quality TTS**: While `gTTS` is reliable, a more advanced, natural-sounding TTS model (such as those from Coqui AI or Microsoft) would give a more polished user experience.
+ - **🎤 Add Speech-to-Text (STT)**: Re-integrate a robust STT model (like `openai/whisper`) to create a full voice-to-voice conversation flow, allowing users to speak their queries directly to the agent.
+ - **🐳 Dockerize for Deployment**: Containerize the application with Docker for consistent deployment across environments, from local machines to cloud servers.
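The Dockerize idea above could start from a sketch like this one. The base image, exposed port, and entrypoint are assumptions, not part of the repo; note also that the pinned `torch==2.8.0+cu126` wheel in `requirements.txt` comes from the PyTorch CUDA package index, not plain PyPI, so the install line would need adjusting:

```dockerfile
FROM python:3.12-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Gradio's default port.
EXPOSE 7860

CMD ["python", "scripts/app.py"]
```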
+
+ ---
+
+ ## 🚀 Getting Started
+
+ Follow these steps to get the agent running locally.
+
+ ### Prerequisites
+
+ You need Python 3.8+ installed on your system.
+
+ ### Installation & Usage
+
+ 1. **Clone the repository (or download the files):**
+
+    ```sh
+    git clone <your-repo-url>
+    cd <your-repo-directory>
+    ```
+
+ 2. **Install the dependencies:**
+
+    ```sh
+    pip install -r requirements.txt
+    ```
+
+ 3. **Run the terminal-based demo (optional):**
+    To see the core agent logic in action, run the `agent.py` script.
+
+    ```sh
+    python scripts/agent.py
+    ```
+
+ 4. **Launch the web interface:** run `python scripts/app.py` to start the Gradio app.
assets/banner.png ADDED

Git LFS Details

  • SHA256: 2a23a1870e5c7a4332c9605d2eedaf331503970f90c0927ae76b397ab88a05b1
  • Pointer size: 132 Bytes
  • Size of remote file: 1.03 MB
assets/gradio.png ADDED

Git LFS Details

  • SHA256: 7abd571224e3b7ab563cba596d854cb0a86140340e72d63036ba0db29e20ba3b
  • Pointer size: 130 Bytes
  • Size of remote file: 48.4 kB
requirements.txt ADDED
@@ -0,0 +1,7 @@
+ datasets==4.0.0
+ transformers==4.56.1
+ sentence-transformers==5.1.0
+ faiss-cpu==1.12.0
+ torch==2.8.0+cu126
+ gTTS==2.5.4
+ gradio==5.44.1
scripts/agent.py ADDED
@@ -0,0 +1,122 @@
+ import faiss
+ import numpy as np
+ import torch
+ import time
+ from datasets import load_dataset
+ from sentence_transformers import SentenceTransformer
+ from transformers import pipeline
+
+ class CustomerServiceAgent:
+     """
+     Encapsulates all the functionality of the AI customer service agent,
+     including model loading, knowledge base preparation, and response generation.
+     """
+     def __init__(self):
+         """
+         Initializes the agent by loading all necessary models and building the
+         retrieval-augmented generation (RAG) knowledge base.
+         """
+         print("Initializing Customer Service Agent...")
+         self._load_models()
+         self._build_knowledge_base()
+         print("\nAgent is ready.")
+
+     def _load_models(self):
+         """
+         Loads all the machine learning models required for the agent to function.
+         """
+         print("\n[1/4] Loading all models...")
+         device = 0 if torch.cuda.is_available() else -1
+
+         self.embedding_model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
+         self.llm_pipeline = pipeline("text2text-generation", model='google/flan-t5-large', device=device)
+         self.sentiment_classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english", device=device)
+
+         print("All models loaded successfully.")
+
+     def _build_knowledge_base(self):
+         """
+         Prepares the knowledge base for the RAG system by loading FAQs,
+         creating vector embeddings, and storing them in a FAISS index.
+         """
+         print("\n[2/4] Preparing Knowledge Base...")
+         try:
+             dataset = load_dataset("MakTek/Customer_support_faqs_dataset", split="train")
+             self.knowledge_base = [item for item in dataset['answer'] if item and item.strip()]
+             print(f"Successfully loaded {len(self.knowledge_base)} documents.")
+         except Exception as e:
+             print(f"Failed to load dataset. Using a fallback. Error: {e}")
+             self.knowledge_base = [
+                 "You can update your payment method by going to the 'Billing' section in your account settings.",
+                 "To check your order status, please log in to your account and navigate to the 'My Orders' page.",
+                 "I am very sorry to hear your package has not arrived. Please provide your order number so I can investigate.",
+             ]
+
+         print("\n[3/4] Creating embeddings for the knowledge base...")
+         embeddings = self.embedding_model.encode(self.knowledge_base, show_progress_bar=True)
+
+         print("\n[4/4] Setting up FAISS vector index...")
+         self.index = faiss.IndexFlatL2(embeddings.shape[1])
+         self.index.add(np.array(embeddings))
+         print("FAISS retriever is ready.")
+
+     def get_rag_response(self, query, history, k=3):
+         """
+         Generates a response using the Retrieval-Augmented Generation (RAG) pipeline.
+         """
+         print(f"\nProcessing query: '{query}'")
+
+         sentiment = self.sentiment_classifier(query)[0]['label']
+         print(f"Detected Sentiment: {sentiment}")
+
+         history_string = "".join([f"User: {turn['user']}\nAssistant: {turn['assistant']}\n" for turn in history])
+
+         query_embedding = self.embedding_model.encode([query])
+         _, indices = self.index.search(np.array(query_embedding), k)
+         context = "\n\n".join([self.knowledge_base[i] for i in indices[0]])
+
+         persona = "You are an empathetic customer support agent." if sentiment == 'NEGATIVE' else "You are a helpful customer support agent."
+
+         prompt = f"""
+ {persona}
+ Based on history and context, answer the user's question.
+
+ ### History:
+ {history_string}
+ ### Context:
+ {context}
+ ### Question:
+ {query}
+ ### Answer:
+ """
+
+         start_time = time.time()
+         response = self.llm_pipeline(prompt, max_new_tokens=100, num_beams=5, early_stopping=True)[0]['generated_text']
+         print(f"LLM Response Time: {time.time() - start_time:.2f} seconds")
+
+         return response.strip()
+
+ # --- Terminal-based Demo ---
+ if __name__ == "__main__":
+     agent = CustomerServiceAgent()
+     conversation_history = []
+
+     print("\n--- Starting Terminal Demo ---")
+
+     # First query
+     query1 = "This is so frustrating, my package never arrived!"
+     response1 = agent.get_rag_response(query1, conversation_history)
+     conversation_history.append({'user': query1, 'assistant': response1})
+
+     print(f"\nUser: {query1}")
+     print(f"Agent: {response1}")
+
+     # Follow-up query to test memory
+     query2 = "Okay, what do you need from me to find it?"
+     response2 = agent.get_rag_response(query2, conversation_history)
+     conversation_history.append({'user': query2, 'assistant': response2})
+
+     print(f"\nUser: {query2}")
+     print(f"Agent: {response2}")
+
+     print("\n--- Demo Complete ---")
scripts/app.py ADDED
@@ -0,0 +1,73 @@
+ import gradio as gr
+ from gtts import gTTS
+
+ # The agent class lives in scripts/agent.py alongside this file.
+ from agent import CustomerServiceAgent
+
+ # --- Gradio UI Functions ---
+
+ def generate_audio_response(text):
+     """
+     Converts the agent's text response into an audio file using gTTS.
+     """
+     if not text:
+         return None
+     output_path = "assistant_response.mp3"
+     tts = gTTS(text=text, lang='en')
+     tts.save(output_path)
+     return output_path
+
+ def respond(text_query, history_state):
+     """
+     The main interaction function called by the Gradio interface.
+     """
+     if not text_query or not text_query.strip():
+         formatted_history = "\n".join([f"**You:** {turn['user']}\n**Agent:** {turn['assistant']}" for turn in history_state])
+         return "", history_state, formatted_history
+
+     query = text_query.strip()
+     assistant_response_text = agent.get_rag_response(query, history_state)
+
+     new_history = history_state + [{'user': query, 'assistant': assistant_response_text}]
+     formatted_history = "\n".join([f"**You:** {turn['user']}\n**Agent:** {turn['assistant']}" for turn in new_history])
+
+     return assistant_response_text, new_history, formatted_history
+
+ # --- Launch the Gradio Web Interface ---
+ print("Launching Gradio Interface...")
+
+ # Instantiate the agent from agent.py
+ agent = CustomerServiceAgent()
+
+ # Define the UI layout
+ with gr.Blocks(theme=gr.themes.Soft(), title="Customer Service Agent") as app:
+     gr.Markdown("# Advanced Customer Service Agent")
+     gr.Markdown("Type your query below and press Submit.")
+
+     history_state = gr.State([])
+
+     with gr.Row():
+         with gr.Column(scale=2):
+             text_input = gr.Textbox(label="Your Question", lines=4, placeholder="Type your question here...")
+             text_submit_btn = gr.Button("Submit")
+
+             with gr.Accordion("Agent's Response", open=True):
+                 agent_response_text = gr.Textbox(label="Response Text", interactive=False, lines=4)
+                 with gr.Row():
+                     read_aloud_btn = gr.Button("Read Response Aloud")
+                     audio_output = gr.Audio(label="Agent's Voice", autoplay=False)
+
+         with gr.Column(scale=3):
+             history_display = gr.Markdown("Conversation history will appear here.", label="Conversation")
+
+     text_submit_btn.click(
+         fn=respond,
+         inputs=[text_input, history_state],
+         outputs=[agent_response_text, history_state, history_display]
+     ).then(lambda: "", outputs=[text_input])
+
+     read_aloud_btn.click(
+         fn=generate_audio_response,
+         inputs=[agent_response_text],
+         outputs=[audio_output]
+     )
+
+ app.launch(debug=True, share=True)