ducnguyen1978 committed on
Commit
a3a19b5
·
verified ·
1 Parent(s): 9184c3a

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,15 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ document/8[[:space:]]bài[[:space:]]học[[:space:]]kinh[[:space:]]nghiệm[[:space:]]tại[[:space:]]Hồ[[:space:]]Nam[[:space:]]Trung[[:space:]]Quốc.pdf filter=lfs diff=lfs merge=lfs -text
+ document/Bài[[:space:]]Tham[[:space:]]Luận_Chuyen[[:space:]]doi[[:space:]]so.pdf filter=lfs diff=lfs merge=lfs -text
+ document/chuyendoiso.pdf filter=lfs diff=lfs merge=lfs -text
+ document/Chuyển[[:space:]]đổi[[:space:]]số[[:space:]]VieON.pdf filter=lfs diff=lfs merge=lfs -text
+ document/CHUYỂN[[:space:]]ĐỔI[[:space:]]SỐ[[:space:]]ĐÀI[[:space:]]TRUYỀN[[:space:]]HÌNH[[:space:]]TP[[:space:]]HCM_ver1.pdf filter=lfs diff=lfs merge=lfs -text
+ document/digitizedbrains.pdf filter=lfs diff=lfs merge=lfs -text
+ document/My[[:space:]]Changsha[[:space:]]App.pdf filter=lfs diff=lfs merge=lfs -text
+ document/Phần[[:space:]]trình[[:space:]]bày[[:space:]]trong[[:space:]]Diễn[[:space:]]Đàn[[:space:]]Chuyển[[:space:]]Đổi[[:space:]]Số_Nguyễn[[:space:]]Tấn[[:space:]]Đức.pdf filter=lfs diff=lfs merge=lfs -text
+ document/Phụ[[:space:]]luc[[:space:]]1_[[:space:]]SỬ[[:space:]]DỤNG[[:space:]]API[[:space:]]MỞ[[:space:]]ĐỂ[[:space:]]CẢI[[:space:]]THIỆN[[:space:]]DỊCH[[:space:]]VỤ[[:space:]]VÀ[[:space:]]MỞ[[:space:]]RỘNG[[:space:]]HỆ[[:space:]]SINH[[:space:]]THÁI.pdf filter=lfs diff=lfs merge=lfs -text
+ document/Thời[[:space:]]cuộc[[:space:]]chuyển[[:space:]]đổi[[:space:]]số[[:space:]]của[[:space:]]ngành[[:space:]][[:space:]]truyền[[:space:]]thông.pdf filter=lfs diff=lfs merge=lfs -text
+ document/Tong[[:space:]]hop[[:space:]]cac[[:space:]]tai[[:space:]]lieu[[:space:]]Chuyen[[:space:]]doi[[:space:]]so.pdf filter=lfs diff=lfs merge=lfs -text
+ document/Xay[[:space:]]dung[[:space:]]mo[[:space:]]hinh[[:space:]]chuyen[[:space:]]doi[[:space:]]so[[:space:]]HTV_NGUYEN[[:space:]]TAN[[:space:]]DUC.pdf filter=lfs diff=lfs merge=lfs -text
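The added `.gitattributes` entries escape every space in a tracked path as `[[:space:]]`, which is how Git LFS writes patterns for file names that contain spaces. A minimal sketch of that transformation follows; the helper name `lfs_attribute_line` is hypothetical, and it assumes spaces are the only characters that need escaping in these particular paths.

```python
def lfs_attribute_line(path: str) -> str:
    """Build a .gitattributes line that tracks a single file with Git LFS.

    Assumption: only spaces need escaping for the paths in this commit;
    other special characters are not handled by this sketch.
    """
    pattern = path.replace(" ", "[[:space:]]")
    return f"{pattern} filter=lfs diff=lfs merge=lfs -text"


# Reproduces the entry for one of the PDFs added in this commit.
print(lfs_attribute_line("document/My Changsha App.pdf"))
```

In practice `git lfs track "<path>"` writes these lines itself, so the helper is only useful for understanding or auditing the file.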
README.md CHANGED
@@ -1,12 +1,87 @@
  ---
- title: Digitizedgemini
- emoji: 🚀
- colorFrom: red
- colorTo: blue
+ title: digitizedgemini
+ app_file: app_gemini.py
  sdk: gradio
- sdk_version: 5.42.0
- app_file: app.py
- pinned: false
+ sdk_version: 5.33.1
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # DigitizedBrains RAG Chatbot
+
+ An intelligent chatbot powered by Google's Gemini AI and Retrieval-Augmented Generation (RAG) technology.
+
+ ## Features
+
+ - **RAG-based Knowledge Retrieval**: Uses a comprehensive document knowledge base for accurate responses
+ - **Multi-document Search**: Searches and ranks relevant documents by keyword score
+ - **Professional AI Representative**: Represents Duc Nguyen and the DigitizedBrains company
+ - **Lead Capture**: Automatically detects and records user contact information
+ - **Unknown Question Tracking**: Logs questions that could not be answered so the knowledge base can be improved
+
+ ## Knowledge Base
+
+ The chatbot has access to extensive documentation about:
+ - Digital transformation strategies
+ - Business intelligence solutions
+ - AI implementation approaches
+ - Company services and expertise
+ - Industry best practices
+
+ ## Technology Stack
+
+ - **Google Gemini AI**: Advanced language model for natural conversation
+ - **Gradio**: User-friendly web interface
+ - **RAG Architecture**: Retrieval-Augmented Generation for knowledge-based responses
+ - **Python**: Backend implementation
+
+ ## Usage
+
+ Simply start a conversation with the chatbot about:
+ - Digital transformation consulting
+ - AI and business intelligence solutions
+ - Company services and capabilities
+ - Industry expertise and case studies
+
+ The chatbot will provide informed responses based on its comprehensive knowledge base.
+
+ ## Environment Variables Required
+
+ To run this application, you need to set:
+ - `GEMINI_API_KEY`: Your Google Gemini API key
+ - `PUSHOVER_TOKEN`: (Optional) For notifications
+ - `PUSHOVER_USER`: (Optional) For notifications
+
+ ## Configuration
+
+ This Space is configured to run with Gradio and uses the Gemini AI model for intelligent responses.
+
+
+ Bash(cd Agents_Gemini && huggingface-cli login)
+ ⎿ Running in the background (down arrow to manage)
+
+ ● BashOutput(Reading shell output)
+ ⎿ To log in, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
+ Token can be pasted using 'Right-Click'.
+
+ You need to:
+
+ 1. Go to https://huggingface.co/settings/tokens to create a token
+ 2. Paste the token into the terminal
+ 3. After logging in, you can create the Space with this command:
+
+ huggingface-cli repo create digitizedbrains-rag-chatbot --type=space --space_sdk=gradio
+
+
+
+ 4. Upload the files:
+ git init
+ git add .
+ git commit -m "Initial commit"
+ git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/digitizedbrains-rag-chatbot
+ git push origin main
+
+ Important notes:
+ - You need to set GEMINI_API_KEY in the Space Settings on Hugging Face
+ - The document/ directory must be uploaded as well for RAG to work
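The README lists one required variable (`GEMINI_API_KEY`) and two optional ones (`PUSHOVER_TOKEN`, `PUSHOVER_USER`). A startup check along the following lines could surface a missing key before the first chat request fails. This is a sketch only: `check_environment` is a hypothetical helper that does not appear in the commit.

```python
import os

REQUIRED_VARS = ["GEMINI_API_KEY"]
OPTIONAL_VARS = ["PUSHOVER_TOKEN", "PUSHOVER_USER"]


def check_environment(env=None):
    """Raise if a required variable is unset; return the unset optional ones."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return [name for name in OPTIONAL_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real process environment.
print(check_environment({"GEMINI_API_KEY": "dummy-key"}))
```

On a Space, these values would be set as secrets in the Space Settings rather than committed in a `.env` file.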
__pycache__/app_gemini.cpython-312.pyc ADDED
Binary file (9.96 kB)
 
__pycache__/app_gemini.cpython-313.pyc ADDED
Binary file (14.1 kB)
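The two `__pycache__/*.pyc` entries above are compiled bytecode that was swept up by the folder upload. The commit message says the upload used `huggingface_hub`, so one way to keep such files out, sketched here under assumptions, is the `ignore_patterns` argument of `HfApi.upload_folder`. The pattern list, the `is_ignored` preview helper, and the `upload_space` wrapper are illustrative and not part of the commit. `is_ignored` only approximates the library's own filtering.

```python
from fnmatch import fnmatch

# Assumed patterns: compiled bytecode plus a local .env file holding secrets.
IGNORE_PATTERNS = ["__pycache__/*", "*.pyc", ".env"]


def is_ignored(path: str) -> bool:
    """Local preview of which relative paths the patterns would skip."""
    return any(fnmatch(path, pattern) for pattern in IGNORE_PATTERNS)


def upload_space(folder: str, repo_id: str):
    """Upload a folder to a Space while skipping the ignored patterns."""
    # Imported lazily so the preview helper works without the library installed.
    from huggingface_hub import HfApi

    return HfApi().upload_folder(
        folder_path=folder,
        repo_id=repo_id,
        repo_type="space",
        ignore_patterns=IGNORE_PATTERNS,
    )


print(is_ignored("__pycache__/app_gemini.cpython-312.pyc"))
```

A `.gitignore` with the same entries would cover the `git push` route described in the README.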
 
app.py ADDED
@@ -0,0 +1,223 @@
+ import google.generativeai as genai
+ import json
+ import os
+ import requests
+ import gradio as gr
+ import re
+ import glob
+ from collections import defaultdict
+
+ # Configure Gemini API - Use environment variables for security
+ genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
+
+ def push(text):
+     try:
+         requests.post(
+             "https://api.pushover.net/1/messages.json",
+             data={
+                 "token": os.getenv("PUSHOVER_TOKEN"),
+                 "user": os.getenv("PUSHOVER_USER"),
+                 "message": text,
+             }
+         )
+     except requests.RequestException:  # network failure: fall back to logging
+         print(f"Push notification: {text}")
+
+ def record_user_details(email, name="Name not provided", notes="not provided"):
+     push(f"Recording {name} with email {email} and notes {notes}")
+     return {"recorded": "ok"}
+
+ def record_unknown_question(question):
+     push(f"Recording {question}")
+     return {"recorded": "ok"}
+
+ record_user_details_json = {
+     "name": "record_user_details",
+     "description": "Use this tool to record that a user is interested in being in touch and provided an email address",
+     "parameters": {
+         "type": "object",
+         "properties": {
+             "email": {
+                 "type": "string",
+                 "description": "The email address of this user"
+             },
+             "name": {
+                 "type": "string",
+                 "description": "The user's name, if they provided it"
+             },
+             "notes": {
+                 "type": "string",
+                 "description": "Any additional information about the conversation that's worth recording to give context"
+             }
+         },
+         "required": ["email"],
+         "additionalProperties": False
+     }
+ }
+
+ record_unknown_question_json = {
+     "name": "record_unknown_question",
+     "description": "Always use this tool to record any question that couldn't be answered as you didn't know the answer",
+     "parameters": {
+         "type": "object",
+         "properties": {
+             "question": {
+                 "type": "string",
+                 "description": "The question that couldn't be answered"
+             }
+         },
+         "required": ["question"],
+         "additionalProperties": False
+     }
+ }
+
+ tools = [record_user_details_json, record_unknown_question_json]  # defined, but never passed to the model in this file
+
+ class Me:
+
+     def __init__(self):
+         self.model = genai.GenerativeModel("gemini-1.5-flash")
+         self.owner_name = "Duc Nguyen"
+         self.chatbot_name = "DigitizedBrains"
+
+         # RAG Knowledge Base - Load text documents only (fast loading)
+         self.knowledge_base = self.load_text_documents()
+         print(f"Loaded {len(self.knowledge_base)} text documents into RAG knowledge base")
+
+         # Core information
+         self.linkedin = self.knowledge_base.get('linkedin_profile.txt', '[LinkedIn profile not found]')
+         self.summary = self.knowledge_base.get('summary.txt', '[Summary not found]')
+         self.digitizedbrains_info = self.knowledge_base.get('digitizedbrains_profile.txt', '[DigitizedBrains profile not found]')
+
+     def load_text_documents(self):
+         """Load only text documents for fast startup"""
+         knowledge_base = {}
+         document_dir = "document/"
+
+         # Load all text files (fast)
+         for txt_file in glob.glob(os.path.join(document_dir, "*.txt")):
+             filename = os.path.basename(txt_file)
+             try:
+                 with open(txt_file, "r", encoding="utf-8") as f:
+                     content = f.read()
+                 knowledge_base[filename] = content
+                 print(f"Loaded: {filename} ({len(content)} chars)")
+             except Exception as e:
+                 print(f"Failed: {filename}: {e}")
+
+         return knowledge_base
+
+     def search_relevant_content(self, query):
+         """Simple RAG retrieval based on keyword matching"""
+         query_lower = query.lower()
+         relevant_docs = []
+
+         # Score documents based on relevance
+         doc_scores = defaultdict(int)
+         for filename, content in self.knowledge_base.items():
+             content_lower = content.lower()
+
+             # Direct query match (highest score)
+             if query_lower in content_lower:
+                 doc_scores[filename] += 10
+
+             # Word-by-word matching
+             query_words = query_lower.split()
+             for word in query_words:
+                 if len(word) > 2 and word in content_lower:
+                     doc_scores[filename] += 2
+
+         # Return top relevant documents
+         sorted_docs = sorted(doc_scores.items(), key=lambda x: x[1], reverse=True)
+
+         # Get top 3 most relevant documents
+         for filename, score in sorted_docs[:3]:
+             if score > 0:
+                 relevant_docs.append({
+                     'filename': filename,
+                     'content': self.knowledge_base[filename],
+                     'score': score
+                 })
+
+         return relevant_docs
+
+     def system_prompt(self, relevant_docs=None):
+         system_prompt = (f"You are {self.chatbot_name}, an AI representative for {self.owner_name}. "
+             f"You represent both {self.owner_name} personally and {self.chatbot_name} company. "
+             f"\n\nYou have access to a comprehensive knowledge base with {len(self.knowledge_base)} documents. "
+             "Be professional, engaging, and use the knowledge base to provide accurate responses. "
+             "\n\nIf you don't know something, use record_unknown_question tool. "
+             "If users provide emails, use record_user_details tool.")
+
+         # Add core information (truncated for context limit)
+         system_prompt += "\n\n## Core Information:"
+         system_prompt += f"\n### {self.owner_name}'s Summary:\n{self.summary[:800]}..."
+         system_prompt += f"\n\n### {self.chatbot_name} Business:\n{self.digitizedbrains_info[:800]}..."
+
+         # Add relevant documents
+         if relevant_docs:
+             system_prompt += "\n\n## Relevant Documents:"
+             for doc in relevant_docs:
+                 system_prompt += f"\n\n### {doc['filename']} (Score: {doc['score']}):\n"
+                 content = doc['content'][:1500] + "..." if len(doc['content']) > 1500 else doc['content']
+                 system_prompt += content
+
+         return system_prompt
+
+     def chat(self, message, history):
+         # RAG Retrieval
+         relevant_docs = self.search_relevant_content(message)
+         print(f"\nQuery: {message[:50]}...")
+         print(f"Found {len(relevant_docs)} relevant documents:")
+         for doc in relevant_docs:
+             print(f" - {doc['filename']} (score: {doc['score']})")
+
+         # Generate response
+         prompt = self.system_prompt(relevant_docs) + "\n\n"
+
+         # Add conversation history
+         for h in history:
+             prompt += f"{h['role'].capitalize()}: {h['content']}\n"
+         prompt += f"User: {message}\nAssistant:"
+
+         try:
+             response = self.model.generate_content(prompt)
+             reply = response.text
+         except Exception as e:  # Vietnamese reply: "Sorry, I hit an error while processing your question. Please try again."
+             reply = f"Xin lỗi, tôi gặp lỗi khi xử lý câu hỏi của bạn. Vui lòng thử lại. Error: {str(e)}"
+
+         # Email detection
+         email_match = re.search(r'[\w\.-]+@[\w\.-]+', message)
+         if email_match:
+             email = email_match.group(0)
+             record_user_details(email, "Website Contact", f"RAG chat: {message[:100]}")
+
+         # Unknown question detection ("không biết" is Vietnamese for "don't know")
+         if "I don't know" in reply or "không biết" in reply.lower():
+             record_unknown_question(message)
+
+         return reply
+
+ # Initialize the chatbot
+ print("Starting RAG-Enhanced DigitizedBrains Chatbot...")
+ me = Me()
+ print("\n" + "="*60)
+ print("RAG-ENHANCED DIGITIZEDBRAINS CHATBOT READY!")
+ print("="*60)
+ print("Features:")
+ print(" - RAG-based knowledge retrieval")
+ print(" - Multi-document search")
+ print(" - Intelligent response generation")
+ print(" - Lead capture & unknown question tracking")
+ print("="*60)
+
+ # Launch Gradio interface
+ iface = gr.ChatInterface(
+     me.chat,
+     type="messages",
+     title="DigitizedBrains RAG Chatbot",
+     description="AI-powered chatbot with comprehensive knowledge base about Duc Nguyen and DigitizedBrains services."
+ )
+
+ if __name__ == "__main__":
+     iface.launch(share=False, server_name="0.0.0.0")
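The retrieval step in `search_relevant_content` is plain keyword scoring: 10 points when the whole query appears in a document, plus 2 points for every query word longer than two characters that appears anywhere in it. The standalone sketch below isolates that scoring so it can be tried without Gemini or Gradio. The function name `score_documents` and the three sample documents are made up for illustration.

```python
from collections import defaultdict


def score_documents(query, knowledge_base, top_k=3):
    """Rank documents with the same substring-based scoring as the diff."""
    query_lower = query.lower()
    scores = defaultdict(int)
    for filename, content in knowledge_base.items():
        content_lower = content.lower()
        # Whole-query match carries the highest weight.
        if query_lower in content_lower:
            scores[filename] += 10
        # Each sufficiently long query word adds a smaller bonus.
        for word in query_lower.split():
            if len(word) > 2 and word in content_lower:
                scores[filename] += 2
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, score) for name, score in ranked[:top_k] if score > 0]


# Hypothetical sample documents, not files from the commit.
sample_kb = {
    "summary.txt": "Duc Nguyen works on digital transformation for broadcasting.",
    "services.txt": "DigitizedBrains offers AI agents and automation services.",
    "notes.txt": "Unrelated text about gardening.",
}
print(score_documents("digital transformation", sample_kb))
```

Because the test is `word in content_lower`, matches are substrings rather than whole words, so a short query word can score against unrelated longer words. Tokenizing both sides would be one way to tighten this.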
app_gemini.py ADDED
@@ -0,0 +1,271 @@
+ from dotenv import load_dotenv
+ import google.generativeai as genai
+ import json
+ import os
+ import requests
+ from pypdf import PdfReader
+ import gradio as gr
+ import re
+ import glob
+ from collections import defaultdict
+
+
+ load_dotenv(override=True)
+
+ def push(text):
+     requests.post(
+         "https://api.pushover.net/1/messages.json",
+         data={
+             "token": os.getenv("PUSHOVER_TOKEN"),
+             "user": os.getenv("PUSHOVER_USER"),
+             "message": text,
+         }
+     )
+
+
+ def record_user_details(email, name="Name not provided", notes="not provided"):
+     push(f"Recording {name} with email {email} and notes {notes}")
+     return {"recorded": "ok"}
+
+ def record_unknown_question(question):
+     push(f"Recording {question}")
+     return {"recorded": "ok"}
+
+ record_user_details_json = {
+     "name": "record_user_details",
+     "description": "Use this tool to record that a user is interested in being in touch and provided an email address",
+     "parameters": {
+         "type": "object",
+         "properties": {
+             "email": {
+                 "type": "string",
+                 "description": "The email address of this user"
+             },
+             "name": {
+                 "type": "string",
+                 "description": "The user's name, if they provided it"
+             },
+             "notes": {
+                 "type": "string",
+                 "description": "Any additional information about the conversation that's worth recording to give context"
+             }
+         },
+         "required": ["email"],
+         "additionalProperties": False
+     }
+ }
+
+ record_unknown_question_json = {
+     "name": "record_unknown_question",
+     "description": "Always use this tool to record any question that couldn't be answered as you didn't know the answer",
+     "parameters": {
+         "type": "object",
+         "properties": {
+             "question": {
+                 "type": "string",
+                 "description": "The question that couldn't be answered"
+             }
+         },
+         "required": ["question"],
+         "additionalProperties": False
+     }
+ }
+
+ tools = [record_user_details_json, record_unknown_question_json]  # defined, but never passed to the model in this file
+
+
+ class Me:
+
+     def __init__(self):
+         genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
+         self.model = genai.GenerativeModel("gemini-2.0-flash")
+         self.owner_name = "Duc Nguyen"  # Owner of the website and of DigitizedBrains
+         self.chatbot_name = "DigitizedBrains"  # Persona the chatbot speaks as
+
+         # RAG Knowledge Base - Load all documents
+         self.knowledge_base = self.load_all_documents()
+         print(f"Loaded {len(self.knowledge_base)} documents into RAG knowledge base")
+
+         # Core information (backwards compatibility)
+         self.linkedin = self.knowledge_base.get('linkedin_profile.txt', '[LinkedIn profile not found]')
+         self.summary = self.knowledge_base.get('summary.txt', '[Summary not found]')
+         self.digitizedbrains_info = self.knowledge_base.get('digitizedbrains_profile.txt', '[DigitizedBrains profile not found]')
+
+     def load_all_documents(self):
+         """Load all .txt and .pdf documents from the document folder into the knowledge base"""
+         knowledge_base = {}
+         document_dir = "document/"
+
+         # Load all text files
+         for txt_file in glob.glob(os.path.join(document_dir, "*.txt")):
+             filename = os.path.basename(txt_file)
+             try:
+                 with open(txt_file, "r", encoding="utf-8") as f:
+                     content = f.read()
+                 knowledge_base[filename] = content
+                 # Safe filename encoding for print
+                 safe_filename = filename.encode('ascii', errors='replace').decode('ascii')
+                 print(f"Loaded text document: {safe_filename} ({len(content)} chars)")
+             except Exception as e:
+                 safe_filename = filename.encode('ascii', errors='replace').decode('ascii')
+                 print(f"Warning: Could not load {safe_filename}: text loading error")
+
+         # Load all PDF files
+         for pdf_file in glob.glob(os.path.join(document_dir, "*.pdf")):
+             filename = os.path.basename(pdf_file)
+             try:
+                 reader = PdfReader(pdf_file)
+                 pdf_content = ""
+                 for page in reader.pages:
+                     text = page.extract_text()
+                     if text:
+                         pdf_content += text + "\n"
+                 knowledge_base[filename] = pdf_content
+                 # Safe filename encoding for print
+                 safe_filename = filename.encode('utf-8', errors='replace').decode('utf-8')
+                 print(f"Loaded PDF document: {safe_filename} ({len(pdf_content)} chars)")
+             except Exception as e:
+                 # Handle encoding issues in error messages
+                 safe_filename = filename.encode('ascii', errors='replace').decode('ascii')
+                 print(f"Warning: Could not load PDF {safe_filename}: PDF loading error")
+
+         return knowledge_base
+
+     def search_relevant_content(self, query):
+         """Simple RAG retrieval - find most relevant documents based on keyword matching"""
+         query_lower = query.lower()
+         relevant_docs = []
+
+         # Keywords for different document types
+         keywords = {
+             'personal': ['duc nguyen', 'linkedin', 'career', 'experience', 'education', 'background', 'profile'],
+             'business': ['digitizedbrains', 'company', 'services', 'solutions', 'automation', 'ai agent'],
+             'digital_transformation': ['chuyển đổi số', 'digital transformation', 'technology', 'broadcasting', 'htv'],
+             'experience': ['kinh nghiệm', 'experience', 'học', 'tham luận', 'diễn đàn'],
+             'hunan_broadcasting': ['hồ nam', 'hunan', 'truyền hình', 'broadcasting', 'television', 'đài', 'tập đoàn', 'ngụy văn bân', 'mango', 'bài học', 'lesson', 'kinh nghiệm']
+         }
+
+         # Score documents based on keyword relevance
+         doc_scores = defaultdict(int)
+         for filename, content in self.knowledge_base.items():
+             content_lower = content.lower()
+
+             # Direct query match
+             if query_lower in content_lower:
+                 doc_scores[filename] += 10
+
+             # Keyword category matching
+             for category, category_keywords in keywords.items():
+                 for keyword in category_keywords:
+                     if keyword in query_lower and keyword in content_lower:
+                         doc_scores[filename] += 5
+
+             # Additional scoring for query words
+             query_words = query_lower.split()
+             for word in query_words:
+                 if len(word) > 2 and word in content_lower:
+                     doc_scores[filename] += 2
+
+         # Return top relevant documents
+         sorted_docs = sorted(doc_scores.items(), key=lambda x: x[1], reverse=True)
+
+         # Get top 5 most relevant documents
+         for filename, score in sorted_docs[:5]:
+             if score > 0:
+                 relevant_docs.append({
+                     'filename': filename,
+                     'content': self.knowledge_base[filename],
+                     'score': score
+                 })
+
+         return relevant_docs
+
+
+     def handle_tool_call(self, tool_calls):  # expects OpenAI-style tool_call objects; chat() below never calls it
+         results = []
+         for tool_call in tool_calls:
+             tool_name = tool_call.function.name
+             arguments = json.loads(tool_call.function.arguments)
+             print(f"Tool called: {tool_name}", flush=True)
+             tool = globals().get(tool_name)
+             result = tool(**arguments) if tool else {}
+             results.append({"role": "tool","content": json.dumps(result),"tool_call_id": tool_call.id})
+         return results
+
+     def system_prompt(self, relevant_docs=None):
+         system_prompt = (f"You are {self.chatbot_name}, an AI representative acting on behalf of {self.owner_name}. "
+             f"You are answering questions on {self.owner_name}'s website, representing both {self.owner_name} personally and the {self.chatbot_name} company/brand. "
+             "\n\nYour responsibilities include: "
+             f"1. Representing {self.owner_name}'s career, background, skills and experience using his comprehensive knowledge base "
+             f"2. Representing {self.chatbot_name} as a digital transformation and AI solutions company "
+             "3. Answering questions about digital transformation, broadcasting, and technology expertise "
+             "4. Using the extensive document knowledge base to provide detailed, accurate responses "
+             f"\n\nYou have access to a comprehensive RAG knowledge base with {len(self.knowledge_base)} documents including: "
+             f"- Personal information about {self.owner_name} (career, LinkedIn, education, experience) "
+             f"- Business information about {self.chatbot_name} (services, solutions, capabilities) "
+             "- Digital transformation expertise and case studies "
+             "- Broadcasting and media technology knowledge "
+             "- Academic papers and industry presentations "
+             "\n\nBe professional and engaging, using the knowledge base to provide comprehensive answers. "
+             f"When discussing {self.owner_name}, speak about him in first person as his representative. "
+             f"When discussing {self.chatbot_name}, represent the company's capabilities and services. "
+             "\n\nIf you don't know the answer to any question, use your record_unknown_question tool to record it. "
+             "Only ask for contact information if the user specifically expresses interest in getting in touch or requests services. Do not proactively push for contact details or add unnecessary calls-to-action about API services.")
+
+         # Add core information
+         system_prompt += "\n\n## Core Information:"
+         system_prompt += f"\n### {self.owner_name}'s Summary:\n{self.summary[:2000]}..."
+         system_prompt += f"\n\n### {self.chatbot_name} Business Profile:\n{self.digitizedbrains_info[:2000]}..."
+
+         # Add relevant documents if provided
+         if relevant_docs:
+             system_prompt += "\n\n## Relevant Knowledge Base Documents:"
+             for doc in relevant_docs:
+                 system_prompt += f"\n\n### Document: {doc['filename']} (Relevance Score: {doc['score']})\n"
+                 # Truncate content to avoid context limit
+                 content = doc['content'][:3000] + "..." if len(doc['content']) > 3000 else doc['content']
+                 system_prompt += content
+
+         system_prompt += (f"\n\nWith this comprehensive RAG knowledge base, please provide detailed and accurate responses as {self.chatbot_name}, "
+             f"representing both {self.owner_name} personally and the {self.chatbot_name} business professionally.")
+         return system_prompt
+
+     def chat(self, message, history):
+         # RAG Retrieval - Find relevant documents for the user's question
+         relevant_docs = self.search_relevant_content(message)
+         try:
+             safe_message = message[:100].encode('ascii', errors='replace').decode('ascii')
+             print(f"Found {len(relevant_docs)} relevant documents for query: {safe_message}...")
+         except Exception:
+             print(f"Found {len(relevant_docs)} relevant documents for user query")
+
+         # Generate prompt with relevant context
+         prompt = self.system_prompt(relevant_docs) + "\n\n"
+
+         # Add conversation history
+         for h in history:
+             prompt += f"{h['role'].capitalize()}: {h['content']}\n"
+         prompt += f"User: {message}\nAssistant:"
+
+         # Generate response
+         response = self.model.generate_content(prompt)
+         reply = response.text
+
+         # Look for an email address in the user's message
+         email_match = re.search(r'[\w\.-]+@[\w\.-]+', message)
+         if email_match:
+             email = email_match.group(0)
+             name = "Contact from website"  # or extract the name if desired
+             notes = f"User provided email via {self.chatbot_name} chat with RAG knowledge base"
+             record_user_details(email, name, notes)
+
+         # If Gemini replies that it does not know, record the question ("Tôi không biết" = "I don't know")
+         if "I don't know" in reply or "I'm not sure" in reply or "Tôi không biết" in reply:
+             record_unknown_question(message)
+
+         return reply
+
+
+ if __name__ == "__main__":
+     me = Me()
+     gr.ChatInterface(me.chat, type="messages").launch()
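In `app_gemini.py` the system prompt tells the model to use `record_unknown_question` and `record_user_details`, but `chat()` calls `generate_content(prompt)` without registering any tools, and `handle_tool_call` reads OpenAI-style `tool_call.function` objects. One possible way to let Gemini actually invoke the two functions is the SDK's automatic function calling, sketched below. It assumes the installed `google-generativeai` version supports passing Python callables as `tools` together with `enable_automatic_function_calling`. `build_chat` is a hypothetical helper, and the `print` calls stand in for the Pushover notification.

```python
def record_user_details(email: str, name: str = "Name not provided", notes: str = "not provided") -> dict:
    """Record that a user wants to be contacted and provided an email address."""
    print(f"Recording {name} with email {email} and notes {notes}")
    return {"recorded": "ok"}


def record_unknown_question(question: str) -> dict:
    """Record a question that could not be answered."""
    print(f"Recording {question}")
    return {"recorded": "ok"}


def build_chat(api_key: str, system_text: str):
    """Create a chat session in which Gemini may call the two functions above."""
    # Imported lazily so the helpers above can be exercised offline.
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(
        "gemini-2.0-flash",
        tools=[record_user_details, record_unknown_question],
        system_instruction=system_text,
    )
    # With this flag the SDK runs requested functions and feeds results back.
    return model.start_chat(enable_automatic_function_calling=True)
```

A call such as `build_chat(os.environ["GEMINI_API_KEY"], prompt).send_message(text).text` would then replace the string checks for "I don't know" in the reply. Whether that trade-off suits this Space is a design decision the commit does not settle.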
app_rag_simple.py ADDED
@@ -0,0 +1,226 @@
+ from dotenv import load_dotenv
+ import google.generativeai as genai
+ import json
+ import os
+ import requests
+ import gradio as gr
+ import re
+ import glob
+ from collections import defaultdict
+
+
+ load_dotenv(override=True)
+
+ def push(text):
+     try:
+         requests.post(
+             "https://api.pushover.net/1/messages.json",
+             data={
+                 "token": os.getenv("PUSHOVER_TOKEN"),
+                 "user": os.getenv("PUSHOVER_USER"),
+                 "message": text,
+             }
+         )
+     except requests.RequestException:  # network failure: fall back to logging
+         print(f"Push notification: {text}")
+
+
+ def record_user_details(email, name="Name not provided", notes="not provided"):
+     push(f"Recording {name} with email {email} and notes {notes}")
+     return {"recorded": "ok"}
+
+ def record_unknown_question(question):
+     push(f"Recording {question}")
+     return {"recorded": "ok"}
+
+ record_user_details_json = {
+     "name": "record_user_details",
+     "description": "Use this tool to record that a user is interested in being in touch and provided an email address",
+     "parameters": {
+         "type": "object",
+         "properties": {
+             "email": {
+                 "type": "string",
+                 "description": "The email address of this user"
+             },
+             "name": {
+                 "type": "string",
+                 "description": "The user's name, if they provided it"
+             },
+             "notes": {
+                 "type": "string",
+                 "description": "Any additional information about the conversation that's worth recording to give context"
+             }
+         },
+         "required": ["email"],
+         "additionalProperties": False
+     }
+ }
+
+ record_unknown_question_json = {
+     "name": "record_unknown_question",
+     "description": "Always use this tool to record any question that couldn't be answered as you didn't know the answer",
+     "parameters": {
+         "type": "object",
+         "properties": {
+             "question": {
+                 "type": "string",
+                 "description": "The question that couldn't be answered"
+             }
+         },
+         "required": ["question"],
+         "additionalProperties": False
+     }
+ }
+
+ tools = [record_user_details_json, record_unknown_question_json]
+
+
+ class Me:
+
+     def __init__(self):
+         genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
+         self.model = genai.GenerativeModel("gemini-1.5-flash")
+         self.owner_name = "Duc Nguyen"
+         self.chatbot_name = "DigitizedBrains"
+
+         # RAG Knowledge Base - Load text documents only (fast loading)
+         self.knowledge_base = self.load_text_documents()
+         print(f"Loaded {len(self.knowledge_base)} text documents into RAG knowledge base")
+
+         # Core information
+         self.linkedin = self.knowledge_base.get('linkedin_profile.txt', '[LinkedIn profile not found]')
+         self.summary = self.knowledge_base.get('summary.txt', '[Summary not found]')
+         self.digitizedbrains_info = self.knowledge_base.get('digitizedbrains_profile.txt', '[DigitizedBrains profile not found]')
+
+     def load_text_documents(self):
+         """Load only text documents for fast startup"""
+         knowledge_base = {}
+         document_dir = "document/"
+
+         # Load all text files (fast)
+         for txt_file in glob.glob(os.path.join(document_dir, "*.txt")):
+             filename = os.path.basename(txt_file)
+             try:
+                 with open(txt_file, "r", encoding="utf-8") as f:
+                     content = f.read()
+                 knowledge_base[filename] = content
+                 print(f"Loaded: {filename} ({len(content)} chars)")
+             except Exception as e:
+                 print(f"Failed: {filename}: {e}")
+
+         return knowledge_base
+
+     def search_relevant_content(self, query):
+         """Simple RAG retrieval based on keyword matching"""
+         query_lower = query.lower()
+         relevant_docs = []
+
+         # Score documents based on relevance
+         doc_scores = defaultdict(int)
+         for filename, content in self.knowledge_base.items():
+             content_lower = content.lower()
+
+             # Direct query match (highest score)
+             if query_lower in content_lower:
+                 doc_scores[filename] += 10
+
+             # Word-by-word matching
+             query_words = query_lower.split()
+             for word in query_words:
+                 if len(word) > 2 and word in content_lower:
+                     doc_scores[filename] += 2
+
+         # Return top relevant documents
+         sorted_docs = sorted(doc_scores.items(), key=lambda x: x[1], reverse=True)
+
+         # Get top 3 most relevant documents
+         for filename, score in sorted_docs[:3]:
+             if score > 0:
+                 relevant_docs.append({
+                     'filename': filename,
+                     'content': self.knowledge_base[filename],
+                     'score': score
+                 })
+
+         return relevant_docs
+
+     def system_prompt(self, relevant_docs=None):
+         system_prompt = (f"You are {self.chatbot_name}, an AI representative for {self.owner_name}. "
+             f"You represent both {self.owner_name} personally and {self.chatbot_name} company. "
+             f"\n\nYou have access to a comprehensive knowledge base with {len(self.knowledge_base)} documents. "
+             "Be professional, engaging, and use the knowledge base to provide accurate responses. "
+             "\n\nIf you don't know something, use record_unknown_question tool. "
+             "If users provide emails, use record_user_details tool.")
+
+         # Add core information (truncated for context limit)
+         system_prompt += "\n\n## Core Information:"
+         system_prompt += f"\n### {self.owner_name}'s Summary:\n{self.summary[:800]}..."
+         system_prompt += f"\n\n### {self.chatbot_name} Business:\n{self.digitizedbrains_info[:800]}..."
+
+         # Add relevant documents
+         if relevant_docs:
+             system_prompt += "\n\n## Relevant Documents:"
+             for doc in relevant_docs:
+                 system_prompt += f"\n\n### {doc['filename']} (Score: {doc['score']}):\n"
+                 content = doc['content'][:1500] + "..." if len(doc['content']) > 1500 else doc['content']
+                 system_prompt += content
+
+         return system_prompt
+
171
+ def chat(self, message, history):
172
+ # RAG Retrieval
173
+ relevant_docs = self.search_relevant_content(message)
174
+ print(f"\nQuery: {message[:50]}...")
175
+ print(f"Found {len(relevant_docs)} relevant documents:")
176
+ for doc in relevant_docs:
177
+ print(f" - {doc['filename']} (score: {doc['score']})")
178
+
179
+ # Generate response
180
+ prompt = self.system_prompt(relevant_docs) + "\n\n"
181
+
182
+ # Add conversation history
183
+ for h in history:
184
+ prompt += f"{h['role'].capitalize()}: {h['content']}\n"
185
+ prompt += f"User: {message}\nAssistant:"
186
+
187
+ try:
188
+ response = self.model.generate_content(prompt)
189
+ reply = response.text
190
+ except Exception as e:
191
+ reply = f"Xin lỗi, tôi gặp lỗi khi xử lý câu hỏi của bạn. Vui lòng thử lại. Error: {str(e)}"
192
+
193
+ # Email detection
194
+ email_match = re.search(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+', message)
195
+ if email_match:
196
+ email = email_match.group(0)
197
+ record_user_details(email, "Website Contact", f"RAG chat: {message[:100]}")
198
+
199
+ # Unknown question detection
200
+ if "i don't know" in reply.lower() or "không biết" in reply.lower():
201
+ record_unknown_question(message)
202
+
203
+ return reply
204
+
205
+
206
+ if __name__ == "__main__":
207
+ print("Starting RAG-Enhanced DigitizedBrains Chatbot...")
208
+ me = Me()
209
+ print("\n" + "="*60)
210
+ print("RAG-ENHANCED DIGITIZEDBRAINS CHATBOT READY!")
211
+ print("="*60)
212
+ print("Features:")
213
+ print(" - RAG-based knowledge retrieval")
214
+ print(" - Multi-document search")
215
+ print(" - Intelligent response generation")
216
+ print(" - Lead capture & unknown question tracking")
217
+ print("="*60)
218
+
219
+ # Launch Gradio interface
220
+ iface = gr.ChatInterface(
221
+ me.chat,
222
+ type="messages",
223
+ title="DigitizedBrains RAG Chatbot",
224
+ description="AI-powered chatbot with comprehensive knowledge base about Duc Nguyen and DigitizedBrains services."
225
+ )
226
+ iface.launch(share=False, server_name="0.0.0.0")
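Review note: the scoring in `search_relevant_content` can be tried in isolation. The following is a minimal sketch, assuming a plain dict in place of `self.knowledge_base`; the function name `score_documents`, the `top_k` parameter, and the sample documents are hypothetical and not part of this commit.

```python
from collections import defaultdict


def score_documents(query, knowledge_base, top_k=3):
    """Rank documents with the same weights as the diff: 10 for a full-query
    match, 2 for each query word (longer than 2 characters) found as a substring."""
    query_lower = query.lower()
    query_words = [w for w in query_lower.split() if len(w) > 2]
    scores = defaultdict(int)
    for filename, content in knowledge_base.items():
        content_lower = content.lower()
        if query_lower in content_lower:
            scores[filename] += 10  # whole query appears verbatim
        for word in query_words:
            if word in content_lower:
                scores[filename] += 2  # substring match per query word
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, score) for name, score in ranked[:top_k] if score > 0]


# Hypothetical sample data, for illustration only
sample_kb = {
    "a.txt": "Digital transformation in broadcasting",
    "b.txt": "Payroll processing and compliance",
}
ranking = score_documents("digital transformation", sample_kb)
print(ranking)
```

Because matching is by substring, a short query word can also match inside longer words, so the scores are a rough signal rather than a precise relevance measure.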
app_rag_test.py ADDED
@@ -0,0 +1,265 @@
1
+ from dotenv import load_dotenv
2
+ import google.generativeai as genai
3
+ import json
4
+ import os
5
+ import requests
6
+ from pypdf import PdfReader
7
+ import gradio as gr
8
+ import re
9
+ import glob
10
+ from collections import defaultdict
11
+
12
+
13
+ load_dotenv(override=True)
14
+
15
+ def push(text):
16
+ requests.post(
17
+ "https://api.pushover.net/1/messages.json",
18
+ data={
19
+ "token": os.getenv("PUSHOVER_TOKEN"),
20
+ "user": os.getenv("PUSHOVER_USER"),
21
+ "message": text,
22
+ }
23
+ )
24
+
25
+
26
+ def record_user_details(email, name="Name not provided", notes="not provided"):
27
+ push(f"Recording {name} with email {email} and notes {notes}")
28
+ return {"recorded": "ok"}
29
+
30
+ def record_unknown_question(question):
31
+ push(f"Recording {question}")
32
+ return {"recorded": "ok"}
33
+
34
+ record_user_details_json = {
35
+ "name": "record_user_details",
36
+ "description": "Use this tool to record that a user is interested in being in touch and provided an email address",
37
+ "parameters": {
38
+ "type": "object",
39
+ "properties": {
40
+ "email": {
41
+ "type": "string",
42
+ "description": "The email address of this user"
43
+ },
44
+ "name": {
45
+ "type": "string",
46
+ "description": "The user's name, if they provided it"
47
+ },
48
+ "notes": {
49
+ "type": "string",
50
+ "description": "Any additional information about the conversation that's worth recording to give context"
51
+ }
52
+ },
53
+ "required": ["email"],
54
+ "additionalProperties": False
55
+ }
56
+ }
57
+
58
+ record_unknown_question_json = {
59
+ "name": "record_unknown_question",
60
+ "description": "Always use this tool to record any question that couldn't be answered as you didn't know the answer",
61
+ "parameters": {
62
+ "type": "object",
63
+ "properties": {
64
+ "question": {
65
+ "type": "string",
66
+ "description": "The question that couldn't be answered"
67
+ }
68
+ },
69
+ "required": ["question"],
70
+ "additionalProperties": False
71
+ }
72
+ }
73
+
74
+ tools = [record_user_details_json, record_unknown_question_json]
75
+
76
+
77
+ class Me:
78
+
79
+ def __init__(self):
80
+ genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
81
+ self.model = genai.GenerativeModel("gemini-1.5-flash")
82
+ self.owner_name = "Duc Nguyen" # Owner of the website and DigitizedBrains
83
+ self.chatbot_name = "DigitizedBrains" # Persona the chatbot represents
84
+
85
+ # RAG knowledge base: load all text files plus a few essential PDFs (kept small for faster startup)
86
+ self.knowledge_base = self.load_documents_optimized()
87
+ print(f"Loaded {len(self.knowledge_base)} documents into RAG knowledge base")
88
+
89
+ # Core information (backwards compatibility)
90
+ self.linkedin = self.knowledge_base.get('linkedin_profile.txt', '[LinkedIn profile not found]')
91
+ self.summary = self.knowledge_base.get('summary.txt', '[Summary not found]')
92
+ self.digitizedbrains_info = self.knowledge_base.get('digitizedbrains_profile.txt', '[DigitizedBrains profile not found]')
93
+
94
+ def load_documents_optimized(self):
95
+ """Load all text files and the first pages of a few essential PDFs for faster startup"""
96
+ knowledge_base = {}
97
+ document_dir = "document/"
98
+
99
+ # Load all text files (fast)
100
+ for txt_file in glob.glob(os.path.join(document_dir, "*.txt")):
101
+ filename = os.path.basename(txt_file)
102
+ try:
103
+ with open(txt_file, "r", encoding="utf-8") as f:
104
+ content = f.read()
105
+ knowledge_base[filename] = content
106
+ print(f"Loaded text document: {filename} ({len(content)} chars)")
107
+ except Exception as e:
108
+ print(f"Warning: Could not load {filename}: {e}")
109
+
110
+ # Load only essential PDF files for demo
111
+ essential_pdfs = ['chuyendoiso.pdf', 'digitizedbrains.pdf', 'linkedin.pdf']
112
+ for pdf_name in essential_pdfs:
113
+ pdf_path = os.path.join(document_dir, pdf_name)
114
+ if os.path.exists(pdf_path):
115
+ try:
116
+ reader = PdfReader(pdf_path)
117
+ pdf_content = ""
118
+ for page in reader.pages[:5]: # Only first 5 pages for speed
119
+ text = page.extract_text()
120
+ if text:
121
+ pdf_content += text + "\n"
122
+ knowledge_base[pdf_name] = pdf_content
123
+ print(f"Loaded PDF document: {pdf_name} ({len(pdf_content)} chars)")
124
+ except Exception as e:
125
+ print(f"Warning: Could not load PDF {pdf_name}: {e}")
126
+
127
+ return knowledge_base
128
+
129
+ def search_relevant_content(self, query):
130
+ """Simple RAG retrieval - find most relevant documents based on keyword matching"""
131
+ query_lower = query.lower()
132
+ relevant_docs = []
133
+
134
+ # Keywords for different document types
135
+ keywords = {
136
+ 'personal': ['duc nguyen', 'linkedin', 'career', 'experience', 'education', 'background', 'profile'],
137
+ 'business': ['digitizedbrains', 'company', 'services', 'solutions', 'automation', 'ai agent'],
138
+ 'digital_transformation': ['chuyển đổi số', 'digital transformation', 'technology', 'broadcasting', 'htv'],
139
+ 'experience': ['kinh nghiệm', 'experience', 'học', 'tham luận', 'diễn đàn']
140
+ }
141
+
142
+ # Score documents based on keyword relevance
143
+ doc_scores = defaultdict(int)
144
+ for filename, content in self.knowledge_base.items():
145
+ content_lower = content.lower()
146
+
147
+ # Direct query match
148
+ if query_lower in content_lower:
149
+ doc_scores[filename] += 10
150
+
151
+ # Keyword category matching
152
+ for category, category_keywords in keywords.items():
153
+ for keyword in category_keywords:
154
+ if keyword in query_lower and keyword in content_lower:
155
+ doc_scores[filename] += 5
156
+
157
+ # Additional scoring for query words
158
+ query_words = query_lower.split()
159
+ for word in query_words:
160
+ if len(word) > 2 and word in content_lower:
161
+ doc_scores[filename] += 2
162
+
163
+ # Return top relevant documents
164
+ sorted_docs = sorted(doc_scores.items(), key=lambda x: x[1], reverse=True)
165
+
166
+ # Get top 5 most relevant documents
167
+ for filename, score in sorted_docs[:5]:
168
+ if score > 0:
169
+ relevant_docs.append({
170
+ 'filename': filename,
171
+ 'content': self.knowledge_base[filename],
172
+ 'score': score
173
+ })
174
+
175
+ return relevant_docs
176
+
177
+ def system_prompt(self, relevant_docs=None):
178
+ system_prompt = f"You are {self.chatbot_name}, an AI representative acting on behalf of {self.owner_name}. \
179
+ You are answering questions on {self.owner_name}'s website, representing both {self.owner_name} personally and the {self.chatbot_name} company/brand. \
180
+ \n\nYour responsibilities include: \
181
+ 1. Representing {self.owner_name}'s career, background, skills and experience using his comprehensive knowledge base \
182
+ 2. Representing {self.chatbot_name} as a digital transformation and AI solutions company \
183
+ 3. Answering questions about digital transformation, broadcasting, and technology expertise \
184
+ 4. Using the extensive document knowledge base to provide detailed, accurate responses \
185
+ \n\nYou have access to a comprehensive RAG knowledge base with {len(self.knowledge_base)} documents including: \
186
+ - Personal information about {self.owner_name} (career, LinkedIn, education, experience) \
187
+ - Business information about {self.chatbot_name} (services, solutions, capabilities) \
188
+ - Digital transformation expertise and case studies \
189
+ - Broadcasting and media technology knowledge \
190
+ - Academic papers and industry presentations \
191
+ \n\nBe professional and engaging, using the knowledge base to provide comprehensive answers. \
192
+ When discussing {self.owner_name}, speak about him in first person as his representative. \
193
+ When discussing {self.chatbot_name}, represent the company's capabilities and services. \
194
+ \n\nIf you don't know the answer to any question, use your record_unknown_question tool to record it. \
195
+ If the user shows interest in services or wants to connect, try to get their email using your record_user_details tool."
196
+
197
+ # Add core information
198
+ system_prompt += f"\n\n## Core Information:"
199
+ system_prompt += f"\n### {self.owner_name}'s Summary:\n{self.summary[:1000]}..."
200
+ system_prompt += f"\n\n### {self.chatbot_name} Business Profile:\n{self.digitizedbrains_info[:1000]}..."
201
+
202
+ # Add relevant documents if provided
203
+ if relevant_docs:
204
+ system_prompt += f"\n\n## Relevant Knowledge Base Documents:"
205
+ for doc in relevant_docs:
206
+ system_prompt += f"\n\n### Document: {doc['filename']} (Relevance Score: {doc['score']})\n"
207
+ # Truncate content to avoid context limit
208
+ content = doc['content'][:2000] + "..." if len(doc['content']) > 2000 else doc['content']
209
+ system_prompt += content
210
+
211
+ system_prompt += f"\n\nWith this comprehensive RAG knowledge base, please provide detailed and accurate responses as {self.chatbot_name}, \
212
+ representing both {self.owner_name} personally and the {self.chatbot_name} business professionally."
213
+ return system_prompt
214
+
215
+ def handle_tool_call(self, tool_calls):
216
+ results = []
217
+ for tool_call in tool_calls:
218
+ tool_name = tool_call.function.name
219
+ arguments = json.loads(tool_call.function.arguments)
220
+ print(f"Tool called: {tool_name}", flush=True)
221
+ tool = globals().get(tool_name)
222
+ result = tool(**arguments) if tool else {}
223
+ results.append({"role": "tool","content": json.dumps(result),"tool_call_id": tool_call.id})
224
+ return results
225
+
+    def chat(self, message, history):
+        # RAG retrieval: find relevant documents for the user's question
+        relevant_docs = self.search_relevant_content(message)
+        print(f"Found {len(relevant_docs)} relevant documents for query: {message[:100]}...")
+        for doc in relevant_docs[:3]:
+            print(f"  - {doc['filename']} (score: {doc['score']})")
+
+        # Generate prompt with relevant context
+        prompt = self.system_prompt(relevant_docs) + "\n\n"
+
+        # Add conversation history
+        for h in history:
+            prompt += f"{h['role'].capitalize()}: {h['content']}\n"
+        prompt += f"User: {message}\nAssistant:"
+
+        # Generate response; the call or response.text can raise, e.g. when no text is returned
+        try:
+            response = self.model.generate_content(prompt)
+            reply = response.text
+        except Exception as e:
+            reply = f"Sorry, I ran into an error while processing your question. Please try again. Error: {e}"
+
+        # Look for an email address in the user's message
+        email_match = re.search(r'[\w\.-]+@[\w\.-]+', message)
+        if email_match:
+            email = email_match.group(0)
+            name = "Contact from website"  # or extract the name if desired
+            notes = f"User provided email via {self.chatbot_name} chat with RAG knowledge base"
+            record_user_details(email, name, notes)
+
+        # If Gemini replies that it does not know, record the question
+        if "I don't know" in reply or "I'm not sure" in reply or "Tôi không biết" in reply:
+            record_unknown_question(message)
+
+        return reply
258
+
259
+
260
+ if __name__ == "__main__":
261
+ me = Me()
262
+ print("\n" + "="*50)
263
+ print("RAG-ENHANCED DIGITIZEDBRAINS CHATBOT READY!")
264
+ print("="*50)
265
+ gr.ChatInterface(me.chat, type="messages").launch()
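Review note: the pattern `[\w\.-]+@[\w\.-]+` used in `chat` also accepts strings such as `a@b` and keeps a trailing period from the end of a sentence. A stricter variant is sketched below, assuming that requiring at least one dot-separated domain label is acceptable for lead capture; `EMAIL_RE` and `extract_email` are hypothetical names, and the pattern is a pragmatic filter rather than a full address validator.

```python
import re

# Local part, "@", then a domain with at least one ".label" suffix.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_email(message):
    """Return the first email-like substring in message, or None if there is none."""
    match = EMAIL_RE.search(message)
    return match.group(0) if match else None


print(extract_email("Reach me at duc@example.com."))
print(extract_email("a@b"))
```

The address `duc@example.com` is a placeholder; the second call shows that a bare `a@b` is no longer treated as a contact.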
document/8 bài học kinh nghiệm tại Hồ Nam Trung Quốc.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:66752b2bba3fda91201b7589d9583d152785c04f78cb330bf45b9d7ba80993c0
3
+ size 459099
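Review note: the PDF entries in this commit are Git LFS pointer files, not the PDFs themselves. Each pointer is three `key value` lines (`version`, `oid`, `size`). A minimal sketch of reading one follows, assuming the pointer text is already in memory; `parse_lfs_pointer` is a hypothetical helper, not part of this repository.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:66752b2bba3fda91201b7589d9583d152785c04f78cb330bf45b9d7ba80993c0
size 459099
"""
info = parse_lfs_pointer(pointer_text)
print(info["algo"], info["size"])
```

If the app runs from a checkout where LFS objects were not fetched, `PdfReader` would be handed this small text file instead of a PDF, so checking for the `version https://git-lfs.github.com/spec/v1` first line is one way to detect that case.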
document/Bài Tham Luận_Chuyen doi so.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8a463b4330915f63b9c643f43ad6ea6ffd37dbbd259acba1812a0e84c1185236
3
+ size 190744
document/CHUYỂN ĐỔI SỐ ĐÀI TRUYỀN HÌNH TP HCM_ver1.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d30ec0e63038192ec4d018415cec5d7f6314f53669a5c23563edf6deeca75c1e
3
+ size 1694083
document/Chuyển đổi số VieON.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:09c2d2b389519fa50965335378fdbd55c54875e90333c3e0a02ebd5c6178e10c
3
+ size 431240
document/My Changsha App.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:650ec6056a2704146b156828432ccd67f7200344fcadbc96d99e318c83d76ebe
3
+ size 258660
document/Phần trình bày trong Diễn Đàn Chuyển Đổi Số_Nguyễn Tấn Đức.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8e19a116d5f2f1a98c8919d77054b9747f86096a5a300782bd5a6713da5b7719
3
+ size 323153
document/Phụ luc 1_ SỬ DỤNG API MỞ ĐỂ CẢI THIỆN DỊCH VỤ VÀ MỞ RỘNG HỆ SINH THÁI.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4943f09adccd1bc30b75cf3c6f7f336954aa39f199af0501dca6db674fe9bc8a
3
+ size 195399
document/Summary-ThanhHoa.txt ADDED
@@ -0,0 +1,32 @@
+
+ Below is a summary of information about Dr. Nguyễn Thanh Hòa, Deputy Director of the Ho Chi Minh City Digital Transformation Center (HCMC-DXCENTER), suitable for presentation in a Word file:
+ Dr. Nguyễn Thanh Hòa
+ Deputy Director of the Ho Chi Minh City Digital Transformation Center (HCMC-DXCENTER)
+ Key information
+ • Position: Deputy Director of the HCMC Digital Transformation Center[1].
+ • Specialization: Journalism and communications; digital transformation expert.
+ • The HCMC Digital Transformation Center is a public service unit in the information and communications field, directly under the Ho Chi Minh City People's Committee[1].
+ • Mr. Nguyễn Thanh Hòa regularly takes part in seminars and panel discussions on digital transformation in HCMC, emphasizing the "transformation" element (changing processes, operating models, and business models) alongside investment in digital technology infrastructure[2].
+ • The role of the HCMC Digital Transformation Center: implementing the e-Government Architecture, operating the data center, managing the data transmission network, and coordinating with departments and agencies to build a smart city, digital government, digital economy, and digital society.
+ • The Center focuses on improving the effectiveness of electronic administration and on operating the digital economy and society, particularly in 2024 and the following years, in order to digitize the activities of government and businesses and to serve residents.
+ Professional viewpoint
+ "In the digital transformation process, the 'digital' part (technology infrastructure) usually gets the attention, while the 'transformation' part (applying new models and new forms of business) has not received adequate attention, so this work needs to be stepped up."
+ - Dr. Nguyễn Thanh Hòa, Deputy Director of the HCMC Digital Transformation Center[2]
+ Notable activities
+ • Participates in and presents at city-level economic forums and conferences on science, technology, and digital transformation.
+ • Representing the Center, highlights the role of data and digital platforms; contributes comments on the Law on Digital Technology Industry and advocates the use of blockchain in local intellectual property management.
+ • Actively takes part in developing and disseminating digital solutions that support new management and business activities for the city.
+
+
+ 1. https://congluan.vn/bao-chi-dia-phuong-van-loay-hoay-lua-chon-mo-hinh-kinh-doanh-10295255.html
+ 2. https://www.sggp.org.vn/can-quan-tam-dung-muc-phan-chuyen-doi-trong-chuyen-doi-so-post800017.html
+ 3. https://chuyendoiso.hochiminhcity.gov.vn/danh-sách-chuyên-gia1
+
+ Dr. Nguyễn Thanh Hòa
+ Deputy Director of the HCMC Digital Transformation Center
+
+ Dr. Nguyễn Thanh Hòa is a leading expert in digital transformation and communications in Vietnam. With more than 10 years of research, teaching, and management in digital transformation, he has played an important role in shaping strategy and implementing large-scale digital transformation projects for organizations, businesses, and media agencies in HCMC.
+
+ Before taking up the position of Deputy Director of the HCMC Digital Transformation Center, Dr. Hòa had more than 17 years of experience working and managing at major press and television organizations, where he led many innovation initiatives, applying AI and big data to content production, optimizing operating processes, and improving the user experience.
+
+ He is the author of many research works and scientific articles on digital transformation, and a respected speaker at domestic and international conferences and forums on technology, media, and digital governance. Dr. Hòa consistently takes the lead in connecting global knowledge with Vietnamese practice, contributing to the sustainable development of the national digital ecosystem.
document/Thời cuộc chuyển đổi số của ngành truyền thông.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5408f19bafb746f4ad5ffa130d9d94bbe7db460c6b69937c6d992d63b0d7abc8
3
+ size 324582
document/Tong hop cac tai lieu Chuyen doi so.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:19b799a703f2ea14814b24d240bec4c6bba2ec63aae5040e7971edcabde90009
3
+ size 1838422
document/Xay dung mo hinh chuyen doi so HTV_NGUYEN TAN DUC.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2579e35c0e95fb3cabcdfc57d567da311b54dc2111c636169d9eadf255a5507a
3
+ size 881077
document/chuyendoiso.pdf ADDED
@@ -0,0 +1,3 @@
 
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:26005bc4c9c64781fc85dd11052c63fad5e3c8edd42cb887f82744299727fb70
3
+ size 284236
document/digitizedbrains.pdf ADDED
@@ -0,0 +1,3 @@
 
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7fd7f0cf77f9260d2fb5b7c82f4cfcb9ea1234d65d0fe7139358ff083941fecb
3
+ size 1432192
document/digitizedbrains_profile.txt ADDED
@@ -0,0 +1,353 @@
1
+ DIGITIZEDBRAINS COMPANY PROFILE
2
+
3
+ ==================================================
4
+ EXECUTIVE SUMMARY
5
+ ==================================================
6
+
7
+ DigitizedBrains is Vietnam's leading AI Agent and digital transformation consultancy, founded in 2023 by Duc Nguyen, a veteran technology leader with 24+ years of experience in broadcasting and media technology. The company specializes in intelligent process automation, AI-driven business solutions, and comprehensive digital transformation services for Vietnamese enterprises.
8
+
9
+ Headquarters: Ho Chi Minh City, Vietnam
10
+ Founded: 2024
11
+ Founder & CEO: Duc Nguyen
12
+ Industry Focus: AI Agents, Digital Transformation, Process Automation
13
+ Mission: Empowering Vietnamese enterprises with intelligent digital solutions for operational excellence and sustainable growth
14
+
15
+ ==================================================
16
+ COMPANY OVERVIEW
17
+ ==================================================
18
+
19
+ DigitizedBrains represents the convergence of artificial intelligence and business intelligence, providing Vietnamese companies with the tools and expertise needed to thrive in the digital economy. Our name reflects our core philosophy: combining human intelligence with digital capabilities to create "digitized brains" that enhance decision-making, automate processes, and drive innovation.
20
+
21
+ As Vietnam's digital transformation accelerates, DigitizedBrains serves as a strategic partner for enterprises seeking to transition from intuition-based operations to data-driven, automated business processes. We bridge the gap between traditional Vietnamese business practices and cutting-edge AI technologies.
22
+
23
+ ==================================================
24
+ CORE SERVICES & SOLUTIONS
25
+ ==================================================
26
+
27
+ AI AGENT SOLUTIONS
28
+ -------------------
29
+
30
+ Process Automation (RPA):
31
+ • 24/7 automated robot systems for repetitive tasks
32
+ • Data entry automation with 99% accuracy
33
+ • Document processing and workflow management
34
+ • Custom automation solutions for specific business needs
35
+
36
+ Intelligent Workflow Management:
37
+ • AI-powered decision-making engines
38
+ • Automatic process routing and optimization
39
+ • Real-time workflow monitoring and adjustment
40
+ • Exception handling and escalation management
41
+
42
+ Document Processing Automation:
43
+ • Invoice processing with OCR and AI extraction
44
+ • Contract management and compliance checking
45
+ • Automated document classification and routing
46
+ • Digital signature integration and workflow
47
+
48
+ HR Process Automation:
49
+ • Recruitment and candidate screening automation
50
+ • Training program management and tracking
51
+ • Payroll processing and compliance management
52
+ • Performance evaluation and reporting systems
53
+
54
+ Financial Process Automation:
55
+ • Accounting automation and reconciliation
56
+ • Expense management and approval workflows
57
+ • Financial reporting and dashboard generation
58
+ • Compliance monitoring and audit trail management
59
+
60
+ Customer Service Automation:
61
+ • AI chatbots with natural language processing
62
+ • Intelligent ticket routing and prioritization
63
+ • Automated response generation and escalation
64
+ • Customer satisfaction tracking and analysis
65
+
66
+ DIGITAL TRANSFORMATION SERVICES
67
+ --------------------------------
68
+
69
+ Business Process Digitization:
70
+ • Process analysis and optimization consulting
71
+ • Digital workflow design and implementation
72
+ • Legacy system modernization
73
+ • Change management and training programs
74
+
75
+ System Integration:
76
+ • ERP, CRM, and business system connectivity
77
+ • Data synchronization across platforms
78
+ • API development and management
79
+ • Cloud migration and hybrid solutions
80
+
81
+ Digital Ecosystem Development:
82
+ • End-to-end digital infrastructure planning
83
+ • Scalable architecture design
84
+ • Security and compliance framework implementation
85
+ • Performance monitoring and optimization
86
+
87
+ DATA ANALYTICS & BUSINESS INTELLIGENCE
88
+ ---------------------------------------
89
+
90
+ Data Integration & Standardization:
91
+ • Multi-source data consolidation (ERP, CRM, IoT)
92
+ • Data cleansing and quality management
93
+ • Master data management solutions
94
+ • Real-time data pipeline development
95
+
96
+ Analysis & Visualization:
97
+ • Multi-dimensional data analysis
98
+ • Interactive dashboard development
99
+ • Trend identification and pattern recognition
100
+ • Risk detection and mitigation strategies
101
+
102
+ AI/ML & Smart Forecasting:
103
+ • Predictive analytics model development
104
+ • Machine learning algorithm implementation
105
+ • Demand forecasting and capacity planning
106
+ • Intelligent decision support systems
107
+
108
+ Real-time Process Monitoring:
109
+ • Performance optimization dashboards
110
+ • Automated alert and notification systems
111
+ • Process bottleneck identification
112
+ • Quick issue resolution protocols
113
+
114
+ ==================================================
115
+ INDUSTRY EXPERTISE
116
+ ==================================================
117
+
118
+ MANUFACTURING
119
+ -------------
120
+ • Smart factory solutions with IoT integration
121
+ • Predictive maintenance system implementation
122
+ • Quality control automation and monitoring
123
+ • Production planning and scheduling optimization
124
+ • Supply chain visibility and optimization
125
+ • Equipment performance monitoring and analysis
126
+
127
+ FINANCIAL SERVICES
128
+ ------------------
129
+ • Loan processing automation and approval workflows
130
+ • KYC (Know Your Customer) compliance automation
131
+ • Financial reporting and regulatory compliance
132
+ • Risk management and fraud detection systems
133
+ • Customer onboarding and account management
134
+ • Payment processing and reconciliation
135
+
136
+ RETAIL & E-COMMERCE
137
+ -------------------
138
+ • Order processing and fulfillment automation
139
+ • Inventory optimization and demand forecasting
140
+ • Customer service chatbots and support systems
141
+ • Dynamic pricing and promotional management
142
+ • Customer behavior analysis and segmentation
143
+ • Multi-channel sales integration
144
+
145
+ HEALTHCARE
146
+ ----------
147
+ • Patient appointment scheduling and management
148
+ • Medical record automation and digitization
149
+ • Payment processing and insurance management
150
+ • Compliance monitoring and reporting
151
+ • Telemedicine platform integration
152
+ • Clinical workflow optimization
153
+
154
+ ==================================================
155
+ TECHNOLOGY STACK & CAPABILITIES
156
+ ==================================================
157
+
158
+ ARTIFICIAL INTELLIGENCE
159
+ • Machine Learning algorithms and model development
160
+ • Natural Language Processing for Vietnamese and English
161
+ • Computer Vision for document and image processing
162
+ • Predictive Analytics and forecasting models
163
+ • Deep Learning for complex pattern recognition
164
+
165
+ AUTOMATION TECHNOLOGIES
166
+ • Robotic Process Automation (RPA) platforms
167
+ • Workflow automation engines
168
+ • Business rule management systems
169
+ • Integration platforms and API management
170
+ • Cloud-based automation infrastructure
171
+
172
+ ANALYTICS & VISUALIZATION
173
+ • Business Intelligence platforms
174
+ • Data visualization and dashboard tools
175
+ • Statistical analysis and modeling
176
+ • Real-time monitoring and alerting systems
177
+ • Custom reporting and analytics solutions
178
+
179
+ CLOUD & INFRASTRUCTURE
180
+ • Scalable cloud infrastructure management
181
+ • Hybrid cloud solutions and migration
182
+ • Security and compliance frameworks
183
+ • Database management and optimization
184
+ • System monitoring and performance tuning
185
+
186
+ ==================================================
187
+ BUSINESS VALUE PROPOSITIONS
188
+ ==================================================
189
+
190
+ OPERATIONAL EFFICIENCY
191
+ • Up to 90% reduction in processing time
192
+ • 99% task execution accuracy
193
+ • Elimination of human errors in repetitive processes
194
+ • 24/7 operational continuity
195
+ • Improved resource allocation and utilization
196
+
197
+ COST OPTIMIZATION
198
+ • 30-80% reduction in operational costs
199
+ • Decreased manual labor requirements
200
+ • Reduced error-related costs and rework
201
+ • Improved energy efficiency and resource usage
202
+ • Optimized vendor and supplier management
203
+
204
+ BUSINESS SCALABILITY
205
+ • Operations scaling without proportional workforce increase
206
+ • Rapid business expansion capabilities
207
+ • Flexible process adaptation to market changes
208
+ • Enhanced capacity for handling business growth
209
+ • Improved agility in competitive markets
210
+
211
+ REGULATORY COMPLIANCE
212
+ • Consistent adherence to Vietnamese business regulations
213
+ • International standards compliance (ISO, SOX, GDPR)
214
+ • Automated monitoring and reporting systems
215
+ • Audit trail management and documentation
216
+ • Risk mitigation and control frameworks
217
+
218
+ COMPETITIVE ADVANTAGE
219
+ • Data-driven decision making capabilities
220
+ • Faster time-to-market for products and services
221
+ • Enhanced customer experience and satisfaction
222
+ • Improved innovation capacity and R&D efficiency
223
+ • Better market responsiveness and adaptability
224
+
225
+ ==================================================
226
+ MARKET STATISTICS & RESULTS
227
+ ==================================================
228
+
229
+ VIETNAM AUTOMATION MARKET (2024-2025)
230
+ • 45% of Vietnamese businesses have adopted partial process automation
231
+ • 280% average ROI within 8-12 months post-automation implementation
232
+ • 65% average operational cost reduction through process automation
233
+ • 85% of Vietnamese businesses plan to expand automation in the next 2 years
234
+ • 320% growth in Business Intelligence market adoption over past 3 years
235
+
236
+ CLIENT SUCCESS METRICS
237
+ • Average 280% ROI achieved within 8-12 months
238
+ • 90% average reduction in process completion time
239
+ • 99% task execution accuracy across implemented solutions
240
+ • 30-80% operational cost reduction for clients
241
+ • 95% client satisfaction and retention rate
242
+
243
+ ==================================================
244
+ CLIENT SUCCESS STORIES
245
+ ==================================================
246
+
247
+ MANUFACTURING CLIENT
248
+ Challenge: Manual quality control processes causing delays and inconsistencies
249
+ Solution: AI-powered quality control automation with computer vision
250
+ Results: 95% reduction in inspection time, 99.8% accuracy improvement
251
+
252
+ FINANCIAL SERVICES CLIENT
253
+ Challenge: Time-intensive loan approval processes affecting customer satisfaction
254
+ Solution: Automated loan processing with AI risk assessment
255
+ Results: 80% faster approval times, 60% reduction in processing costs
256
+
257
+ HEALTHCARE CLIENT
258
+ Challenge: Manual patient appointment scheduling leading to conflicts and errors
259
+ Solution: Intelligent appointment management with patient preference optimization
260
+ Results: 70% reduction in scheduling conflicts, 40% improvement in patient satisfaction
261
+
262
+ RETAIL CLIENT
263
+ Challenge: Inventory management inefficiencies and stockout issues
264
+ Solution: AI-powered demand forecasting and automated inventory optimization
265
+ Results: 50% reduction in stockouts, 25% decrease in inventory carrying costs
266
+
267
+ ==================================================
268
+ COMPANY MISSION & VISION
269
+ ==================================================
270
+
271
+ MISSION
272
+ To empower Vietnamese enterprises with intelligent digital solutions that drive operational excellence, cost efficiency, and sustainable growth in the digital economy through AI-powered automation and data-driven decision making.
273
+
274
+ VISION
275
+ To be Vietnam's leading provider of AI Agent solutions and digital transformation services, helping businesses transition from intuition-based to data-driven operations while maintaining competitive advantage in Industry 4.0.
276
+
277
+ CORE VALUES
278
+ • Innovation: Pioneering AI and automation technologies for Vietnamese market needs
279
+ • Excellence: Delivering superior results through intelligent process optimization
280
+ • Partnership: Building long-term relationships with clients for sustainable success
281
+ • Expertise: Deep technical knowledge combined with practical business understanding
282
+ • Growth: Enabling scalable business expansion through digital transformation
283
+ • Integrity: Maintaining the highest standards of professional ethics and transparency
284
+
285
+ ==================================================
286
+ COMPETITIVE ADVANTAGES
287
+ ==================================================
288
+
289
+ LOCAL EXPERTISE
290
+ • Deep understanding of Vietnamese business culture and practices
291
+ • Regulatory compliance expertise for the Vietnamese market
292
+ • Vietnamese language AI and automation capabilities
293
+ • Local support and service delivery
294
+
295
+ PROVEN LEADERSHIP
296
+ • Founder with 24+ years of technology leadership experience
297
+ • Track record in large-scale system implementations
298
+ • Broadcasting and media technology expertise
299
+ • Public policy and regulatory knowledge
300
+
301
+ COMPREHENSIVE SOLUTIONS
302
+ • End-to-end digital transformation services
303
+ • Custom AI development and implementation
304
+ • Industry-specific solution expertise
305
+ • Integrated technology stack and platforms
306
+
307
+ RAPID IMPLEMENTATION
308
+ • Proven methodologies for quick deployment
309
+ • Pre-built industry-specific templates
310
+ • Agile development and iterative improvement
311
+ • Comprehensive training and change management
312
+
313
+ ==================================================
314
+ PARTNERSHIPS & CERTIFICATIONS
315
+ ==================================================
316
+
317
+ TECHNOLOGY PARTNERSHIPS
318
+ • Leading cloud infrastructure providers
319
+ • Enterprise software and platform vendors
320
+ • AI and machine learning technology providers
321
+ • System integration and consulting partners
322
+
323
+ INDUSTRY CERTIFICATIONS
324
+ • ISO 27001 Information Security Management
325
+ • GDPR Compliance and Data Protection
326
+ • Vietnam Digital Transformation Standards
327
+ • International Quality Management Standards
328
+
329
+ ==================================================
330
+ CONTACT INFORMATION
331
+ ==================================================
332
+
333
+ Company: DigitizedBrains
334
+ Founder & CEO: Duc Nguyen
335
+ Email: ai.agent.tailieu@gmail.com
336
+ Website: [Under Development - Comprehensive Digital Platform]
337
+ LinkedIn: www.linkedin.com/in/ducnguyen-68b9a8370
338
+ Location: Ho Chi Minh City, Vietnam
339
+
340
+ Business Hours: Monday - Friday, 8:00 AM - 6:00 PM (Vietnam Time)
341
+ Emergency Support: 24/7 for critical automation systems
342
+
343
+ ==================================================
344
+ NEXT STEPS FOR POTENTIAL CLIENTS
345
+ ==================================================
346
+
347
+ 1. CONSULTATION: Free initial consultation to assess automation opportunities
348
+ 2. ANALYSIS: Comprehensive business process analysis and ROI projection
349
+ 3. PROOF OF CONCEPT: Small-scale implementation to demonstrate value
350
+ 4. IMPLEMENTATION: Full-scale deployment with training and support
351
+ 5. OPTIMIZATION: Continuous improvement and expansion of automation capabilities
352
+
353
+ Ready to transform your business with intelligent automation? Contact DigitizedBrains today to begin your digital transformation journey.
document/linkedin.pdf ADDED
Binary file (42.8 kB).
document/linkedin_profile.txt ADDED
@@ -0,0 +1,173 @@
1
+ LINKEDIN PROFILE - DUC NGUYEN
2
+
3
+ ==================================================
4
+ PROFESSIONAL HEADLINE
5
+ ==================================================
6
+ Deputy Director, Broadcasting Transmission Center | Digital Transformation Leader | AI Agents Specialist
7
+ Ho Chi Minh City Television (HTV) | 24+ Years in Media Technology & Broadcasting
8
+
9
+ ==================================================
10
+ ABOUT / SUMMARY
11
+ ==================================================
12
+ Experienced technology leader and entrepreneur with 24+ years in broadcasting, media technology, and digital transformation. Currently serving as Deputy Director at Broadcasting Transmission Center, Ho Chi Minh City Television (HTV), while founding and leading DigitizedBrains - a pioneering AI Agent and digital transformation consultancy.
13
+
14
+ Combining deep technical expertise in electronics, telecommunications, and broadcasting with advanced business acumen (Master's in Economics), I specialize in helping Vietnamese enterprises navigate digital transformation through intelligent automation, AI agents, and data-driven decision making.
15
+
16
+ Core Expertise: AI Agents Development, Process Automation (RPA), Digital Transformation Strategy, Broadcasting Technology, Technical Investment Planning, Public Policy Development
17
+
18
+ ==================================================
19
+ CURRENT POSITIONS
20
+ ==================================================
21
+
22
+ Founder & CEO | DigitizedBrains
23
+ 2023 - Present | Ho Chi Minh City, Vietnam
24
+ • Leading Vietnamese AI Agent and digital transformation consultancy
25
+ • Developing intelligent automation solutions for Manufacturing, Financial Services, Healthcare, and Retail sectors
26
+ • Implementing RPA, ML, NLP, and Business Intelligence solutions for enterprise clients
27
+ • Achieved 280% average ROI for clients within 8-12 months of automation implementation
28
+ • Specializing in process automation with 90% processing time reduction and 99% task execution accuracy
29
+
30
+ Deputy Director | Broadcasting Transmission Center, Ho Chi Minh City Television (HTV)
31
+ 2018 - Present | Ho Chi Minh City, Vietnam
32
+ • Overseeing technical transmission operations and infrastructure for a major Vietnamese broadcaster
33
+ • Leading digital transformation initiatives in broadcasting and media technology
34
+ • Managing technical investment planning and large-scale system implementations
35
+ • Developing AI implementation strategies for media operations
36
+ • Responsible for technical infrastructure modernization and operational excellence
37
+
38
+ ==================================================
39
+ PROFESSIONAL EXPERIENCE
40
+ ==================================================
41
+
42
+ Technical Transmission Management | Ho Chi Minh City Television (HTV)
43
+ 2018 - 2024 (6 years)
44
+ • Managing comprehensive technical transmission operations
45
+ • Implementing digital transformation in broadcasting infrastructure
46
+ • Leading AI and automation projects in media operations
47
+ • Ensuring 24/7 operational continuity and service excellence
48
+
49
+ Electrical Systems Manager | Ho Chi Minh City Television (HTV)
50
+ 2006 - 2018 (12 years)
51
+ • Managed electrical systems across all HTV broadcasting operations
52
+ • Supervised technical infrastructure maintenance and upgrades
53
+ • Implemented energy efficiency and cost optimization programs
54
+ • Led cross-functional teams in technical project execution
55
+
56
+ Electronics Engineer & Technical Investment Planning Specialist | Ho Chi Minh City Television (HTV)
57
+ 2000 - 2006 (6 years)
58
+ • Electronics and telecommunications engineering for broadcasting systems
59
+ • Technical investment planning and feasibility analysis
60
+ • System design and implementation oversight
61
+ • Technology evaluation and vendor management
62
+
63
+ ==================================================
64
+ EDUCATION & CERTIFICATIONS
65
+ ==================================================
66
+
67
+ Master of Economics | University of Economics Ho Chi Minh City
68
+ Economics, Business Strategy, and Management
69
+ Graduated with focus on Digital Economy and Technology Policy
70
+
71
+ Bachelor of Economics | University of Economics Ho Chi Minh City
72
+ Economics and Business Administration
73
+ Foundation in economic theory, business operations, and policy development
74
+
75
+ Bachelor of Engineering | Electronics & Telecommunications Engineering
76
+ Technical foundation in electronics, telecommunications, and broadcasting technology
77
+
78
+ Specializations & Continuous Learning:
79
+ • AI Agents and Machine Learning Applications
80
+ • Digital Transformation Strategy and Implementation
81
+ • Process Automation and RPA Development
82
+ • Public Policy in Media & Technology
83
+ • Advanced Broadcasting and Transmission Technology
84
+
85
+ ==================================================
86
+ TECHNICAL SKILLS & EXPERTISE
87
+ ==================================================
88
+
89
+ AI & Automation Technologies:
90
+ • Artificial Intelligence (AI) and Machine Learning (ML)
91
+ • Natural Language Processing (NLP) and Computer Vision
92
+ • Robotic Process Automation (RPA)
93
+ • Intelligent Workflow Management
94
+ • Predictive Analytics and Business Intelligence
95
+
96
+ Broadcasting & Media Technology:
97
+ • Digital Broadcasting Systems and Infrastructure
98
+ • Transmission Technology and Network Management
99
+ • Media Production and Distribution Systems
100
+ • Technical Investment Planning and Project Management
101
+
102
+ Digital Transformation:
103
+ • Business Process Analysis and Optimization
104
+ • System Integration and Data Synchronization
105
+ • Digital Ecosystem Development
106
+ • Technology Infrastructure Modernization
107
+ • Change Management and Training
108
+
109
+ Business & Leadership:
110
+ • Strategic Planning and Business Development
111
+ • Team Leadership and Cross-functional Collaboration
112
+ • Public Policy Development and Implementation
113
+ • Stakeholder Management and Client Relations
114
+ • Financial Analysis and Investment Planning
115
+
116
+ ==================================================
117
+ INDUSTRY EXPERTISE
118
+ ==================================================
119
+
120
+ Media & Broadcasting: 24+ years in television broadcasting, transmission technology, and media operations
121
+
122
+ Manufacturing: Smart factory solutions, predictive maintenance, quality control automation, supply chain optimization
123
+
124
+ Financial Services: Loan processing automation, KYC compliance, financial reporting, risk management
125
+
126
+ Healthcare: Patient appointment systems, medical record automation, payment processes, compliance monitoring
127
+
128
+ Retail & E-commerce: Order processing, inventory optimization, customer service automation, dynamic pricing
129
+
130
+ ==================================================
131
+ ACHIEVEMENTS & RESULTS
132
+ ==================================================
133
+
134
+ Business Impact:
135
+ • Founded DigitizedBrains, achieving 280% average ROI for automation clients
136
+ • Reduced operational costs by 30-80% through intelligent process optimization
137
+ • Achieved 90% processing time reduction with 99% task execution accuracy
138
+ • Successfully automated processes for 45+ Vietnamese enterprises
139
+
140
+ Leadership Excellence:
141
+ • 24+ years of progressive leadership in broadcasting and technology
142
+ • Led digital transformation initiatives across multiple business sectors
143
+ • Developed and implemented public policies for media and technology adoption
144
+ • Built and managed high-performing technical and business teams
145
+
146
+ Technical Innovation:
147
+ • Pioneer in AI Agent implementation for Vietnamese enterprises
148
+ • Developed scalable automation solutions across multiple industries
149
+ • Created integrated digital ecosystems for business process optimization
150
+ • Established best practices for RPA and ML implementation in the Vietnamese market
151
+
152
+ ==================================================
153
+ LANGUAGES
154
+ ==================================================
155
+ • Vietnamese (Native)
156
+ • English (Professional working proficiency)
157
+
158
+ ==================================================
159
+ CONTACT INFORMATION
160
+ ==================================================
161
+ Email: ai.agent.tailieu@gmail.com
162
+ LinkedIn: www.linkedin.com/in/ducnguyen-68b9a8370
163
+ Location: Ho Chi Minh City, Vietnam
164
+
165
+ ==================================================
166
+ PROFESSIONAL INTERESTS
167
+ ==================================================
168
+ • AI Agents and Intelligent Automation
169
+ • Digital Transformation in Emerging Markets
170
+ • Public Policy Development for Technology Sectors
171
+ • Sustainable Business Growth through Innovation
172
+ • Industry 4.0 Implementation in Vietnam
173
+ • Media Technology and Broadcasting Innovation
document/summary.txt ADDED
@@ -0,0 +1,144 @@
1
+ DUC NGUYEN - FOUNDER & CEO, DIGITIZEDBRAINS
2
+ RAG-Enhanced Professional Summary
3
+
4
+ ==================================================
5
+ EXECUTIVE PROFILE
6
+ ==================================================
7
+
8
+ Duc Nguyen is a visionary technology leader and entrepreneur who combines 24+ years of broadcasting and media technology expertise with cutting-edge AI and digital transformation capabilities. As the Founder & CEO of DigitizedBrains and Deputy Director of Broadcasting Transmission Center at Ho Chi Minh City Television (HTV), he uniquely bridges traditional Vietnamese business practices with Industry 4.0 innovations.
9
+
10
+ ==================================================
11
+ DUAL LEADERSHIP ROLES
12
+ ==================================================
13
+
14
+ FOUNDER & CEO - DIGITIZEDBRAINS (2023-Present)
15
+ Leading Vietnam's premier AI Agent and digital transformation consultancy, achieving 280% average ROI for clients through intelligent process automation, machine learning, and data-driven business solutions.
16
+
17
+ DEPUTY DIRECTOR - HTV BROADCASTING TRANSMISSION CENTER (2018-Present)
18
+ Overseeing technical transmission operations for a major Vietnamese broadcaster while pioneering digital transformation initiatives in media technology and AI implementation strategies.
19
+
20
+ ==================================================
21
+ CORE EXPERTISE & SPECIALIZATIONS
22
+ ==================================================
23
+
24
+ AI & AUTOMATION TECHNOLOGIES
25
+ • AI Agents development and implementation
26
+ • Robotic Process Automation (RPA) with 99% accuracy
27
+ • Machine Learning and Predictive Analytics
28
+ • Natural Language Processing for the Vietnamese market
29
+ • Computer Vision and document processing automation
30
+
31
+ DIGITAL TRANSFORMATION LEADERSHIP
32
+ • Business process digitization and optimization
33
+ • System integration across ERP, CRM, IoT platforms
34
+ • Digital ecosystem development and connectivity
35
+ • Change management and organizational transformation
36
+ • Technology infrastructure modernization
37
+
38
+ BROADCASTING & MEDIA TECHNOLOGY
39
+ • Advanced broadcasting systems and transmission technology
40
+ • Large-scale technical infrastructure management
41
+ • Technical investment planning and implementation
42
+ • Media production and distribution systems
43
+ • 24/7 operational continuity and service excellence
44
+
45
+ BUSINESS & STRATEGIC PLANNING
46
+ • Economics and business strategy (Master's degree)
47
+ • Technical investment analysis and ROI optimization
48
+ • Public policy development for media and technology
49
+ • Cross-functional team leadership and management
50
+ • Stakeholder relationship management and client success
51
+
52
+ ==================================================
53
+ EDUCATIONAL FOUNDATION
54
+ ==================================================
55
+
56
+ MASTER OF ECONOMICS | University of Economics Ho Chi Minh City
57
+ Advanced studies in economics, business strategy, digital economy, and technology policy development.
58
+
59
+ BACHELOR OF ECONOMICS | University of Economics Ho Chi Minh City
60
+ Foundation in economic theory, business administration, and policy formulation.
61
+
62
+ BACHELOR OF ENGINEERING | Electronics & Telecommunications
63
+ Technical expertise in electronics, telecommunications, and broadcasting technology systems.
64
+
65
+ CONTINUOUS SPECIALIZATIONS
66
+ • AI Agents and Machine Learning Applications
67
+ • Digital Transformation Strategy and Implementation
68
+ • Process Automation and RPA Development
69
+ • Public Policy in Media & Technology Sectors
70
+ • Advanced Broadcasting and Transmission Technology
71
+
72
+ ==================================================
73
+ PROFESSIONAL JOURNEY & ACHIEVEMENTS
74
+ ==================================================
75
+
76
+ 24+ YEARS PROGRESSIVE TECHNOLOGY LEADERSHIP
77
+ • 6 years: Electronics Engineer & Technical Investment Planning Specialist
78
+ • 12 years: Electrical Systems Management across HTV operations
79
+ • 6 years: Technical Transmission Management (current Deputy Director role)
80
+ • 1+ years: DigitizedBrains founder leading AI transformation initiatives
81
+
82
+ KEY ACCOMPLISHMENTS
83
+ • Founded DigitizedBrains achieving 280% average client ROI within 8-12 months
84
+ • Implemented automation solutions reducing operational costs by 30-80%
85
+ • Achieved 90% processing time reduction with 99% task execution accuracy
86
+ • Successfully automated processes for 45+ Vietnamese enterprises
87
+ • Led digital transformation in broadcasting industry at national scale
88
+ • Developed public policies for technology adoption in media sector
89
+
90
+ ==================================================
91
+ INDUSTRY IMPACT & MARKET LEADERSHIP
92
+ ==================================================
93
+
94
+ VIETNAM MARKET EXPERTISE
95
+ Deep understanding of Vietnamese business culture, regulatory environment, and market dynamics across manufacturing, financial services, healthcare, retail, and media industries.
96
+
97
+ PROVEN CLIENT SUCCESS
98
+ • Manufacturing: 95% reduction in inspection time with AI quality control
99
+ • Financial Services: 80% faster loan approval with automated processing
100
+ • Healthcare: 70% reduction in scheduling conflicts through intelligent automation
101
+ • Retail: 50% reduction in stockouts via AI-powered demand forecasting
102
+
103
+ THOUGHT LEADERSHIP
104
+ Pioneering AI Agent implementation for Vietnamese enterprises while maintaining cultural alignment and regulatory compliance for sustainable business transformation.
105
+
106
+ ==================================================
107
+ TECHNICAL CAPABILITIES & INNOVATION
108
+ ==================================================
109
+
110
+ AI & MACHINE LEARNING
111
+ Advanced implementation of artificial intelligence, machine learning algorithms, predictive analytics, and intelligent decision support systems for business optimization.
112
+
113
+ PROCESS AUTOMATION
114
+ Expert development and deployment of RPA solutions, workflow automation, document processing, and intelligent business process management.
115
+
116
+ SYSTEM INTEGRATION
117
+ Comprehensive experience in ERP, CRM, IoT integration, API development, cloud migration, and hybrid infrastructure solutions.
118
+
119
+ DATA ANALYTICS & BUSINESS INTELLIGENCE
120
+ Multi-dimensional data analysis, real-time monitoring, dashboard development, and data-driven decision making framework implementation.
121
+
122
+ ==================================================
123
+ MISSION & VISION ALIGNMENT
124
+ ==================================================
125
+
126
+ PERSONAL MISSION
127
+ To bridge the gap between traditional Vietnamese business practices and cutting-edge AI technologies, enabling sustainable digital transformation that preserves cultural values while driving innovation and growth.
128
+
129
+ DIGITIZEDBRAINS VISION
130
+ To establish Vietnam as a regional leader in AI Agent implementation and digital transformation, empowering enterprises to compete globally while maintaining local market strength.
131
+
132
+ INDUSTRY TRANSFORMATION GOAL
133
+ To accelerate Vietnam's transition to Industry 4.0 through practical AI solutions that deliver measurable ROI and sustainable competitive advantage.
134
+
135
+ ==================================================
136
+ CONTACT & ENGAGEMENT
137
+ ==================================================
138
+
139
+ Email: ai.agent.tailieu@gmail.com
140
+ LinkedIn: www.linkedin.com/in/ducnguyen-68b9a8370
141
+ Location: Ho Chi Minh City, Vietnam
142
+ Company: DigitizedBrains - AI Agents & Digital Transformation
143
+
144
+ Available for: Strategic consulting, digital transformation projects, AI implementation, technology leadership, and public-private partnerships in Vietnam's digital economy development.
requirements.txt ADDED
@@ -0,0 +1,5 @@
1
+ python-dotenv
2
+ google-generativeai
3
+ gradio
4
+ requests
5
+ pypdf