Upload folder using huggingface_hub
- .gitattributes +12 -0
- README.md +83 -8
- __pycache__/app_gemini.cpython-312.pyc +0 -0
- __pycache__/app_gemini.cpython-313.pyc +0 -0
- app.py +223 -0
- app_gemini.py +271 -0
- app_rag_simple.py +226 -0
- app_rag_test.py +265 -0
- document/8 bài học kinh nghiệm tại Hồ Nam Trung Quốc.pdf +3 -0
- document/Bài Tham Luận_Chuyen doi so.pdf +3 -0
- document/CHUYỂN ĐỔI SỐ ĐÀI TRUYỀN HÌNH TP HCM_ver1.pdf +3 -0
- document/Chuyển đổi số VieON.pdf +3 -0
- document/My Changsha App.pdf +3 -0
- document/Phần trình bày trong Diễn Đàn Chuyển Đổi Số_Nguyễn Tấn Đức.pdf +3 -0
- document/Phụ luc 1_ SỬ DỤNG API MỞ ĐỂ CẢI THIỆN DỊCH VỤ VÀ MỞ RỘNG HỆ SINH THÁI.pdf +3 -0
- document/Summary-ThanhHoa.txt +32 -0
- document/Thời cuộc chuyển đổi số của ngành truyền thông.pdf +3 -0
- document/Tong hop cac tai lieu Chuyen doi so.pdf +3 -0
- document/Xay dung mo hinh chuyen doi so HTV_NGUYEN TAN DUC.pdf +3 -0
- document/chuyendoiso.pdf +3 -0
- document/digitizedbrains.pdf +3 -0
- document/digitizedbrains_profile.txt +353 -0
- document/linkedin.pdf +0 -0
- document/linkedin_profile.txt +173 -0
- document/summary.txt +144 -0
- requirements.txt +5 -0
.gitattributes
CHANGED
@@ -33,3 +33,15 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+document/8[[:space:]]bài[[:space:]]học[[:space:]]kinh[[:space:]]nghiệm[[:space:]]tại[[:space:]]Hồ[[:space:]]Nam[[:space:]]Trung[[:space:]]Quốc.pdf filter=lfs diff=lfs merge=lfs -text
+document/Bài[[:space:]]Tham[[:space:]]Luận_Chuyen[[:space:]]doi[[:space:]]so.pdf filter=lfs diff=lfs merge=lfs -text
+document/chuyendoiso.pdf filter=lfs diff=lfs merge=lfs -text
+document/Chuyển[[:space:]]đổi[[:space:]]số[[:space:]]VieON.pdf filter=lfs diff=lfs merge=lfs -text
+document/CHUYỂN[[:space:]]ĐỔI[[:space:]]SỐ[[:space:]]ĐÀI[[:space:]]TRUYỀN[[:space:]]HÌNH[[:space:]]TP[[:space:]]HCM_ver1.pdf filter=lfs diff=lfs merge=lfs -text
+document/digitizedbrains.pdf filter=lfs diff=lfs merge=lfs -text
+document/My[[:space:]]Changsha[[:space:]]App.pdf filter=lfs diff=lfs merge=lfs -text
+document/Phần[[:space:]]trình[[:space:]]bày[[:space:]]trong[[:space:]]Diễn[[:space:]]Đàn[[:space:]]Chuyển[[:space:]]Đổi[[:space:]]Số_Nguyễn[[:space:]]Tấn[[:space:]]Đức.pdf filter=lfs diff=lfs merge=lfs -text
+document/Phụ[[:space:]]luc[[:space:]]1_[[:space:]]SỬ[[:space:]]DỤNG[[:space:]]API[[:space:]]MỞ[[:space:]]ĐỂ[[:space:]]CẢI[[:space:]]THIỆN[[:space:]]DỊCH[[:space:]]VỤ[[:space:]]VÀ[[:space:]]MỞ[[:space:]]RỘNG[[:space:]]HỆ[[:space:]]SINH[[:space:]]THÁI.pdf filter=lfs diff=lfs merge=lfs -text
+document/Thời[[:space:]]cuộc[[:space:]]chuyển[[:space:]]đổi[[:space:]]số[[:space:]]của[[:space:]]ngành[[:space:]][[:space:]]truyền[[:space:]]thông.pdf filter=lfs diff=lfs merge=lfs -text
+document/Tong[[:space:]]hop[[:space:]]cac[[:space:]]tai[[:space:]]lieu[[:space:]]Chuyen[[:space:]]doi[[:space:]]so.pdf filter=lfs diff=lfs merge=lfs -text
+document/Xay[[:space:]]dung[[:space:]]mo[[:space:]]hinh[[:space:]]chuyen[[:space:]]doi[[:space:]]so[[:space:]]HTV_NGUYEN[[:space:]]TAN[[:space:]]DUC.pdf filter=lfs diff=lfs merge=lfs -text
|
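The added `.gitattributes` entries escape every space in a path as `[[:space:]]`, which is the form `git lfs track` writes for paths containing spaces. A minimal sketch of that escaping follows; the helper name `lfs_attribute_line` is hypothetical, and it handles only spaces, not other characters that may need escaping.

```python
# Attribute suffix used by every LFS-tracked entry in the diff above.
LFS_ATTRS = "filter=lfs diff=lfs merge=lfs -text"


def lfs_attribute_line(path: str) -> str:
    """Return a .gitattributes line for `path`, escaping spaces as [[:space:]].

    Hypothetical helper for illustration; only the space character is handled.
    """
    escaped = path.replace(" ", "[[:space:]]")
    return f"{escaped} {LFS_ATTRS}"


print(lfs_attribute_line("document/My Changsha App.pdf"))
```

A path with two consecutive spaces yields two consecutive `[[:space:]]` tokens, which matches the `ngành[[:space:]][[:space:]]truyền` entry in the hunk.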
README.md
CHANGED
@@ -1,12 +1,87 @@
 ---
-title:
-
-colorFrom: red
-colorTo: blue
+title: digitizedgemini
+app_file: app_gemini.py
 sdk: gradio
-sdk_version: 5.
-app_file: app.py
-pinned: false
+sdk_version: 5.33.1
 ---
 
-
+# DigitizedBrains RAG Chatbot
+
+An intelligent chatbot powered by Google's Gemini AI and Retrieval-Augmented Generation (RAG) technology.
+
+## Features
+
+- **RAG-based Knowledge Retrieval**: Uses comprehensive document knowledge base for accurate responses
+- **Multi-document Search**: Intelligently searches and ranks relevant documents
+- **Professional AI Representative**: Represents Duc Nguyen and DigitizedBrains company
+- **Lead Capture**: Automatically detects and records user contact information
+- **Unknown Question Tracking**: Logs questions that couldn't be answered for improvement
+
+## Knowledge Base
+
+The chatbot has access to extensive documentation about:
+- Digital transformation strategies
+- Business intelligence solutions
+- AI implementation approaches
+- Company services and expertise
+- Industry best practices
+
+## Technology Stack
+
+- **Google Gemini AI**: Advanced language model for natural conversation
+- **Gradio**: User-friendly web interface
+- **RAG Architecture**: Retrieval-Augmented Generation for knowledge-based responses
+- **Python**: Backend implementation
+
+## Usage
+
+Simply start a conversation with the chatbot about:
+- Digital transformation consulting
+- AI and business intelligence solutions
+- Company services and capabilities
+- Industry expertise and case studies
+
+The chatbot will provide informed responses based on its comprehensive knowledge base.
+
+## Environment Variables Required
+
+To run this application, you need to set:
+- `GEMINI_API_KEY`: Your Google Gemini API key
+- `PUSHOVER_TOKEN`: (Optional) For notifications
+- `PUSHOVER_USER`: (Optional) For notifications
+
+## Configuration
+
+This Space is configured to run with Gradio and uses the Gemini AI model for intelligent responses.
+
+
+Bash(cd Agents_Gemini && huggingface-cli login)
+⎿ Running in the background (down arrow to manage)
+
+● BashOutput(Reading shell output)
+⎿ To log in, `huggingface_hub` requires a token generated from htt
+ps://huggingface.co/settings/tokens .
+Token can be pasted using 'Right-Click'.
+
+You need to:
+
+1. Go to https://huggingface.co/settings/tokens to create a token
+2. Paste the token into the terminal
+3. After logging in, you can create the Space with the command:
+
+huggingface-cli repo create digitizedbrains-rag-chatbot --type=space
+--space_sdk=gradio
+
+
+
+4. Upload files:
+git init
+git add .
+git commit -m "Initial commit"
+git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/digi
+tizedbrains-rag-chatbot
+git push origin main
+
+Important notes:
+- You need to set GEMINI_API_KEY in the Hugging Face Space Settings
+- The document/ folder must be uploaded as well for RAG to work
|
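The README lists one required variable (`GEMINI_API_KEY`) and two optional ones for Pushover notifications. A startup check along the following lines could report which of them are unset before the app launches. This is a sketch only: `check_environment` and the `REQUIRED`/`OPTIONAL` tuples are hypothetical names, not part of the committed code.

```python
from typing import Mapping

# Variable names taken from the README's "Environment Variables Required" section.
REQUIRED = ("GEMINI_API_KEY",)
OPTIONAL = ("PUSHOVER_TOKEN", "PUSHOVER_USER")


def check_environment(env: Mapping[str, str]) -> dict:
    """Report which required and optional variables are missing or empty."""
    return {
        "missing_required": [key for key in REQUIRED if not env.get(key)],
        "missing_optional": [key for key in OPTIONAL if not env.get(key)],
    }


# Example with an explicit mapping; in the app one would pass os.environ instead.
print(check_environment({"GEMINI_API_KEY": "example-key"}))
```

Passing the environment as a mapping rather than reading `os.environ` inside the function keeps the check easy to exercise without touching real secrets.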
__pycache__/app_gemini.cpython-312.pyc
ADDED
Binary file (9.96 kB)
__pycache__/app_gemini.cpython-313.pyc
ADDED
Binary file (14.1 kB)
app.py
ADDED
@@ -0,0 +1,223 @@
+import google.generativeai as genai
+import json
+import os
+import requests
+import gradio as gr
+import re
+import glob
+from collections import defaultdict
+
+# Configure Gemini API - Use environment variables for security
+genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
+
+def push(text):
+    try:
+        requests.post(
+            "https://api.pushover.net/1/messages.json",
+            data={
+                "token": os.getenv("PUSHOVER_TOKEN"),
+                "user": os.getenv("PUSHOVER_USER"),
+                "message": text,
+            }
+        )
+    except Exception:
+        print(f"Push notification: {text}")
+
+def record_user_details(email, name="Name not provided", notes="not provided"):
+    push(f"Recording {name} with email {email} and notes {notes}")
+    return {"recorded": "ok"}
+
+def record_unknown_question(question):
+    push(f"Recording {question}")
+    return {"recorded": "ok"}
+
+record_user_details_json = {
+    "name": "record_user_details",
+    "description": "Use this tool to record that a user is interested in being in touch and provided an email address",
+    "parameters": {
+        "type": "object",
+        "properties": {
+            "email": {
+                "type": "string",
+                "description": "The email address of this user"
+            },
+            "name": {
+                "type": "string",
+                "description": "The user's name, if they provided it"
+            },
+            "notes": {
+                "type": "string",
+                "description": "Any additional information about the conversation that's worth recording to give context"
+            }
+        },
+        "required": ["email"],
+        "additionalProperties": False
+    }
+}
+
+record_unknown_question_json = {
+    "name": "record_unknown_question",
+    "description": "Always use this tool to record any question that couldn't be answered as you didn't know the answer",
+    "parameters": {
+        "type": "object",
+        "properties": {
+            "question": {
+                "type": "string",
+                "description": "The question that couldn't be answered"
+            }
+        },
+        "required": ["question"],
+        "additionalProperties": False
+    }
+}
+
+tools = [record_user_details_json, record_unknown_question_json]
+
+class Me:
+
+    def __init__(self):
+        self.model = genai.GenerativeModel("gemini-1.5-flash")
+        self.owner_name = "Duc Nguyen"
+        self.chatbot_name = "DigitizedBrains"
+
+        # RAG Knowledge Base - Load text documents only (fast loading)
+        self.knowledge_base = self.load_text_documents()
+        print(f"Loaded {len(self.knowledge_base)} text documents into RAG knowledge base")
+
+        # Core information
+        self.linkedin = self.knowledge_base.get('linkedin_profile.txt', '[LinkedIn profile not found]')
+        self.summary = self.knowledge_base.get('summary.txt', '[Summary not found]')
+        self.digitizedbrains_info = self.knowledge_base.get('digitizedbrains_profile.txt', '[DigitizedBrains profile not found]')
+
+    def load_text_documents(self):
+        """Load only text documents for fast startup"""
+        knowledge_base = {}
+        document_dir = "document/"
+
+        # Load all text files (fast)
+        for txt_file in glob.glob(os.path.join(document_dir, "*.txt")):
+            filename = os.path.basename(txt_file)
+            try:
+                with open(txt_file, "r", encoding="utf-8") as f:
+                    content = f.read()
+                    knowledge_base[filename] = content
+                    print(f"Loaded: {filename} ({len(content)} chars)")
+            except Exception as e:
+                print(f"Failed: {filename}: {e}")
+
+        return knowledge_base
+
+    def search_relevant_content(self, query):
+        """Simple RAG retrieval based on keyword matching"""
+        query_lower = query.lower()
+        relevant_docs = []
+
+        # Score documents based on relevance
+        doc_scores = defaultdict(int)
+        for filename, content in self.knowledge_base.items():
+            content_lower = content.lower()
+
+            # Direct query match (highest score)
+            if query_lower in content_lower:
+                doc_scores[filename] += 10
+
+            # Word-by-word matching
+            query_words = query_lower.split()
+            for word in query_words:
+                if len(word) > 2 and word in content_lower:
+                    doc_scores[filename] += 2
+
+        # Return top relevant documents
+        sorted_docs = sorted(doc_scores.items(), key=lambda x: x[1], reverse=True)
+
+        # Get top 3 most relevant documents
+        for filename, score in sorted_docs[:3]:
+            if score > 0:
+                relevant_docs.append({
+                    'filename': filename,
+                    'content': self.knowledge_base[filename],
+                    'score': score
+                })
+
+        return relevant_docs
+
+    def system_prompt(self, relevant_docs=None):
+        system_prompt = f"You are {self.chatbot_name}, an AI representative for {self.owner_name}. \
+You represent both {self.owner_name} personally and {self.chatbot_name} company. \
+\n\nYou have access to a comprehensive knowledge base with {len(self.knowledge_base)} documents. \
+Be professional, engaging, and use the knowledge base to provide accurate responses. \
+\n\nIf you don't know something, use record_unknown_question tool. \
+If users provide emails, use record_user_details tool."
+
+        # Add core information (truncated for context limit)
+        system_prompt += f"\n\n## Core Information:"
+        system_prompt += f"\n### {self.owner_name}'s Summary:\n{self.summary[:800]}..."
+        system_prompt += f"\n\n### {self.chatbot_name} Business:\n{self.digitizedbrains_info[:800]}..."
+
+        # Add relevant documents
+        if relevant_docs:
+            system_prompt += f"\n\n## Relevant Documents:"
+            for doc in relevant_docs:
+                system_prompt += f"\n\n### {doc['filename']} (Score: {doc['score']}):\n"
+                content = doc['content'][:1500] + "..." if len(doc['content']) > 1500 else doc['content']
+                system_prompt += content
+
+        return system_prompt
+
+    def chat(self, message, history):
+        # RAG Retrieval
+        relevant_docs = self.search_relevant_content(message)
+        print(f"\nQuery: {message[:50]}...")
+        print(f"Found {len(relevant_docs)} relevant documents:")
+        for doc in relevant_docs:
+            print(f" - {doc['filename']} (score: {doc['score']})")
+
+        # Generate response
+        prompt = self.system_prompt(relevant_docs) + "\n\n"
+
+        # Add conversation history
+        for h in history:
+            prompt += f"{h['role'].capitalize()}: {h['content']}\n"
+        prompt += f"User: {message}\nAssistant:"
+
+        try:
+            response = self.model.generate_content(prompt)
+            reply = response.text
+        except Exception as e:
+            reply = f"Xin lỗi, tôi gặp lỗi khi xử lý câu hỏi của bạn. Vui lòng thử lại. Error: {str(e)}"  # Vietnamese: "Sorry, I hit an error processing your question. Please try again."
+
+        # Email detection
+        email_match = re.search(r'[\w\.-]+@[\w\.-]+', message)
+        if email_match:
+            email = email_match.group(0)
+            record_user_details(email, "Website Contact", f"RAG chat: {message[:100]}")
+
+        # Unknown question detection ("không biết" is Vietnamese for "don't know")
+        if "I don't know" in reply or "không biết" in reply.lower():
+            record_unknown_question(message)
+
+        return reply
+
+# Initialize the chatbot
+print("Starting RAG-Enhanced DigitizedBrains Chatbot...")
+me = Me()
+print("\n" + "="*60)
+print("RAG-ENHANCED DIGITIZEDBRAINS CHATBOT READY!")
+print("="*60)
+print("Features:")
+print(" - RAG-based knowledge retrieval")
+print(" - Multi-document search")
+print(" - Intelligent response generation")
+print(" - Lead capture & unknown question tracking")
+print("="*60)
+
+# Launch Gradio interface
+iface = gr.ChatInterface(
+    me.chat,
+    type="messages",
+    title="DigitizedBrains RAG Chatbot",
+    description="AI-powered chatbot with comprehensive knowledge base about Duc Nguyen and DigitizedBrains services."
+)
+
+if __name__ == "__main__":
+    iface.launch(share=False, server_name="0.0.0.0")
|
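The retrieval step in `search_relevant_content` is plain keyword scoring: +10 when the whole query appears in a document, +2 for each query word longer than two characters that appears as a substring, then the top three documents with a positive score. The same idea can be sketched as a standalone function that needs no model or files; `rank_documents` and the toy `kb` dictionary below are illustrative names and data, not part of the commit.

```python
from collections import defaultdict


def rank_documents(query, knowledge_base, top_k=3):
    """Score documents by keyword overlap; return up to top_k (name, score) pairs."""
    query_lower = query.lower()
    # Skip very short words, mirroring the len(word) > 2 filter in the diff.
    words = [w for w in query_lower.split() if len(w) > 2]
    scores = defaultdict(int)
    for filename, content in knowledge_base.items():
        content_lower = content.lower()
        if query_lower in content_lower:  # whole-query match
            scores[filename] += 10
        # Substring match per query word (so "art" would also match "smart").
        scores[filename] += 2 * sum(1 for w in words if w in content_lower)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, score) for name, score in ranked[:top_k] if score > 0]


# Toy knowledge base, invented for the example.
kb = {
    "summary.txt": "Duc Nguyen works on digital transformation for broadcasters.",
    "profile.txt": "DigitizedBrains offers AI agents and automation services.",
    "notes.txt": "Unrelated meeting notes.",
}
print(rank_documents("digital transformation", kb))
```

Because matching is by substring rather than by token, short or common words can inflate scores. Splitting documents into tokens first would be one way to tighten this, at the cost of missing multi-word phrases.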
app_gemini.py
ADDED
@@ -0,0 +1,271 @@
+from dotenv import load_dotenv
+import google.generativeai as genai
+import json
+import os
+import requests
+from pypdf import PdfReader
+import gradio as gr
+import re
+import glob
+from collections import defaultdict
+
+
+load_dotenv(override=True)
+
+def push(text):
+    requests.post(
+        "https://api.pushover.net/1/messages.json",
+        data={
+            "token": os.getenv("PUSHOVER_TOKEN"),
+            "user": os.getenv("PUSHOVER_USER"),
+            "message": text,
+        }
+    )
+
+
+def record_user_details(email, name="Name not provided", notes="not provided"):
+    push(f"Recording {name} with email {email} and notes {notes}")
+    return {"recorded": "ok"}
+
+def record_unknown_question(question):
+    push(f"Recording {question}")
+    return {"recorded": "ok"}
+
+record_user_details_json = {
+    "name": "record_user_details",
+    "description": "Use this tool to record that a user is interested in being in touch and provided an email address",
+    "parameters": {
+        "type": "object",
+        "properties": {
+            "email": {
+                "type": "string",
+                "description": "The email address of this user"
+            },
+            "name": {
+                "type": "string",
+                "description": "The user's name, if they provided it"
+            },
+            "notes": {
+                "type": "string",
+                "description": "Any additional information about the conversation that's worth recording to give context"
+            }
+        },
+        "required": ["email"],
+        "additionalProperties": False
+    }
+}
+
+record_unknown_question_json = {
+    "name": "record_unknown_question",
+    "description": "Always use this tool to record any question that couldn't be answered as you didn't know the answer",
+    "parameters": {
+        "type": "object",
+        "properties": {
+            "question": {
+                "type": "string",
+                "description": "The question that couldn't be answered"
+            }
+        },
+        "required": ["question"],
+        "additionalProperties": False
+    }
+}
+
+tools = [record_user_details_json, record_unknown_question_json]
+
+
+class Me:
+
+    def __init__(self):
+        genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
+        self.model = genai.GenerativeModel("gemini-2.0-flash")
+        self.owner_name = "Duc Nguyen"  # Owner of the website and DigitizedBrains
+        self.chatbot_name = "DigitizedBrains"  # The persona the chatbot represents
+
+        # RAG Knowledge Base - Load all documents
+        self.knowledge_base = self.load_all_documents()
+        print(f"Loaded {len(self.knowledge_base)} documents into RAG knowledge base")
+
+        # Core information (backwards compatibility)
+        self.linkedin = self.knowledge_base.get('linkedin_profile.txt', '[LinkedIn profile not found]')
+        self.summary = self.knowledge_base.get('summary.txt', '[Summary not found]')
+        self.digitizedbrains_info = self.knowledge_base.get('digitizedbrains_profile.txt', '[DigitizedBrains profile not found]')
+
+    def load_all_documents(self):
+        """Load all documents from the document folder using RAG technique"""
+        knowledge_base = {}
+        document_dir = "document/"
+
+        # Load all text files
+        for txt_file in glob.glob(os.path.join(document_dir, "*.txt")):
+            filename = os.path.basename(txt_file)
+            try:
+                with open(txt_file, "r", encoding="utf-8") as f:
+                    content = f.read()
+                    knowledge_base[filename] = content
+                    # Safe filename encoding for print
+                    safe_filename = filename.encode('ascii', errors='replace').decode('ascii')
+                    print(f"Loaded text document: {safe_filename} ({len(content)} chars)")
+            except Exception as e:
+                safe_filename = filename.encode('ascii', errors='replace').decode('ascii')
+                print(f"Warning: Could not load {safe_filename}: text loading error")
+
+        # Load all PDF files
+        for pdf_file in glob.glob(os.path.join(document_dir, "*.pdf")):
+            filename = os.path.basename(pdf_file)
+            try:
+                reader = PdfReader(pdf_file)
+                pdf_content = ""
+                for page in reader.pages:
+                    text = page.extract_text()
+                    if text:
+                        pdf_content += text + "\n"
+                knowledge_base[filename] = pdf_content
+                # Safe filename encoding for print
+                safe_filename = filename.encode('utf-8', errors='replace').decode('utf-8')
+                print(f"Loaded PDF document: {safe_filename} ({len(pdf_content)} chars)")
+            except Exception as e:
+                # Handle encoding issues in error messages
+                safe_filename = filename.encode('ascii', errors='replace').decode('ascii')
+                print(f"Warning: Could not load PDF {safe_filename}: PDF loading error")
+
+        return knowledge_base
+
+    def search_relevant_content(self, query):
+        """Simple RAG retrieval - find most relevant documents based on keyword matching"""
+        query_lower = query.lower()
+        relevant_docs = []
+
+        # Keywords for different document types
+        keywords = {
+            'personal': ['duc nguyen', 'linkedin', 'career', 'experience', 'education', 'background', 'profile'],
+            'business': ['digitizedbrains', 'company', 'services', 'solutions', 'automation', 'ai agent'],
+            'digital_transformation': ['chuyển đổi số', 'digital transformation', 'technology', 'broadcasting', 'htv'],
+            'experience': ['kinh nghiệm', 'experience', 'học', 'tham luận', 'diễn đàn'],
+            'hunan_broadcasting': ['hồ nam', 'hunan', 'truyền hình', 'broadcasting', 'television', 'đài', 'tập đoàn', 'ngụy văn bân', 'mango', 'bài học', 'lesson', 'kinh nghiệm']
+        }
+
+        # Score documents based on keyword relevance
+        doc_scores = defaultdict(int)
+        for filename, content in self.knowledge_base.items():
+            content_lower = content.lower()
+
+            # Direct query match
+            if query_lower in content_lower:
+                doc_scores[filename] += 10
+
+            # Keyword category matching
+            for category, category_keywords in keywords.items():
+                for keyword in category_keywords:
+                    if keyword in query_lower and keyword in content_lower:
+                        doc_scores[filename] += 5
+
+            # Additional scoring for query words
+            query_words = query_lower.split()
+            for word in query_words:
+                if len(word) > 2 and word in content_lower:
+                    doc_scores[filename] += 2
+
+        # Return top relevant documents
+        sorted_docs = sorted(doc_scores.items(), key=lambda x: x[1], reverse=True)
+
+        # Get top 5 most relevant documents
+        for filename, score in sorted_docs[:5]:
+            if score > 0:
+                relevant_docs.append({
+                    'filename': filename,
+                    'content': self.knowledge_base[filename],
+                    'score': score
+                })
+
+        return relevant_docs
+
+
+    def handle_tool_call(self, tool_calls):
+        results = []
+        for tool_call in tool_calls:
+            tool_name = tool_call.function.name
+            arguments = json.loads(tool_call.function.arguments)
+            print(f"Tool called: {tool_name}", flush=True)
+            tool = globals().get(tool_name)
+            result = tool(**arguments) if tool else {}
+            results.append({"role": "tool","content": json.dumps(result),"tool_call_id": tool_call.id})
+        return results
+
+    def system_prompt(self, relevant_docs=None):
+        system_prompt = f"You are {self.chatbot_name}, an AI representative acting on behalf of {self.owner_name}. \
+You are answering questions on {self.owner_name}'s website, representing both {self.owner_name} personally and the {self.chatbot_name} company/brand. \
+\n\nYour responsibilities include: \
+1. Representing {self.owner_name}'s career, background, skills and experience using his comprehensive knowledge base \
+2. Representing {self.chatbot_name} as a digital transformation and AI solutions company \
+3. Answering questions about digital transformation, broadcasting, and technology expertise \
+4. Using the extensive document knowledge base to provide detailed, accurate responses \
+\n\nYou have access to a comprehensive RAG knowledge base with {len(self.knowledge_base)} documents including: \
+- Personal information about {self.owner_name} (career, LinkedIn, education, experience) \
+- Business information about {self.chatbot_name} (services, solutions, capabilities) \
+- Digital transformation expertise and case studies \
+- Broadcasting and media technology knowledge \
+- Academic papers and industry presentations \
+\n\nBe professional and engaging, using the knowledge base to provide comprehensive answers. \
+When discussing {self.owner_name}, speak about him in first person as his representative. \
+When discussing {self.chatbot_name}, represent the company's capabilities and services. \
+\n\nIf you don't know the answer to any question, use your record_unknown_question tool to record it. \
+Only ask for contact information if the user specifically expresses interest in getting in touch or requests services. Do not proactively push for contact details or add unnecessary calls-to-action about API services."
+
+        # Add core information
+        system_prompt += f"\n\n## Core Information:"
+        system_prompt += f"\n### {self.owner_name}'s Summary:\n{self.summary[:2000]}..."
+        system_prompt += f"\n\n### {self.chatbot_name} Business Profile:\n{self.digitizedbrains_info[:2000]}..."
+
+        # Add relevant documents if provided
+        if relevant_docs:
+            system_prompt += f"\n\n## Relevant Knowledge Base Documents:"
+            for doc in relevant_docs:
+                system_prompt += f"\n\n### Document: {doc['filename']} (Relevance Score: {doc['score']})\n"
+                # Truncate content to avoid context limit
+                content = doc['content'][:3000] + "..." if len(doc['content']) > 3000 else doc['content']
+                system_prompt += content
+
+        system_prompt += f"\n\nWith this comprehensive RAG knowledge base, please provide detailed and accurate responses as {self.chatbot_name}, \
+representing both {self.owner_name} personally and the {self.chatbot_name} business professionally."
+        return system_prompt
+
+    def chat(self, message, history):
+        # RAG Retrieval - Find relevant documents for the user's question
+        relevant_docs = self.search_relevant_content(message)
+        try:
+            safe_message = message[:100].encode('ascii', errors='replace').decode('ascii')
+            print(f"Found {len(relevant_docs)} relevant documents for query: {safe_message}...")
+        except Exception:
+            print(f"Found {len(relevant_docs)} relevant documents for user query")
+
+        # Generate prompt with relevant context
+        prompt = self.system_prompt(relevant_docs) + "\n\n"
+
+        # Add conversation history
+        for h in history:
+            prompt += f"{h['role'].capitalize()}: {h['content']}\n"
+        prompt += f"User: {message}\nAssistant:"
+
+        # Generate response
+        response = self.model.generate_content(prompt)
+        reply = response.text
+
+        # Look for an email address in the user's message
+        email_match = re.search(r'[\w\.-]+@[\w\.-]+', message)
+        if email_match:
+            email = email_match.group(0)
+            name = "Contact from website"  # or extract the name if desired
+            notes = f"User provided email via {self.chatbot_name} chat with RAG knowledge base"
+            record_user_details(email, name, notes)
+
+        # If Gemini replies that it doesn't know, record the question ("Tôi không biết" = "I don't know")
+        if "I don't know" in reply or "I'm not sure" in reply or "Tôi không biết" in reply:
+            record_unknown_question(message)
+
+        return reply
+
+
+if __name__ == "__main__":
+    me = Me()
+    gr.ChatInterface(me.chat, type="messages").launch()
|
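The `chat` method in this file flags unanswered questions by checking the reply for a few fixed, case-sensitive phrases. A minimal sketch of a case-insensitive variant is shown here; `UNKNOWN_MARKERS` and `looks_unknown` are hypothetical names introduced for illustration and are not part of the committed file.

```python
import unicodedata

# Hypothetical marker list, taken from the phrases checked in chat().
UNKNOWN_MARKERS = ("I don't know", "I'm not sure", "Tôi không biết")


def _normalize(text: str) -> str:
    """Normalize Unicode composition and case so comparisons are stable."""
    return unicodedata.normalize("NFC", text).casefold()


def looks_unknown(reply: str) -> bool:
    """Return True if the reply contains any marker, ignoring case."""
    normalized_reply = _normalize(reply)
    return any(_normalize(marker) in normalized_reply for marker in UNKNOWN_MARKERS)
```

Phrase matching of this kind remains a heuristic: a reply can decline to answer without using any of the listed phrases.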
app_rag_simple.py
ADDED
|
@@ -0,0 +1,226 @@
from dotenv import load_dotenv
import google.generativeai as genai
import json
import os
import requests
import gradio as gr
import re
import glob
from collections import defaultdict


load_dotenv(override=True)

def push(text):
    """Send a Pushover notification; fall back to printing if the request fails."""
    try:
        requests.post(
            "https://api.pushover.net/1/messages.json",
            data={
                "token": os.getenv("PUSHOVER_TOKEN"),
                "user": os.getenv("PUSHOVER_USER"),
                "message": text,
            }
        )
    except Exception:
        print(f"Push notification: {text}")


def record_user_details(email, name="Name not provided", notes="not provided"):
    push(f"Recording {name} with email {email} and notes {notes}")
    return {"recorded": "ok"}

def record_unknown_question(question):
    push(f"Recording {question}")
    return {"recorded": "ok"}

record_user_details_json = {
    "name": "record_user_details",
    "description": "Use this tool to record that a user is interested in being in touch and provided an email address",
    "parameters": {
        "type": "object",
        "properties": {
            "email": {
                "type": "string",
                "description": "The email address of this user"
            },
            "name": {
                "type": "string",
                "description": "The user's name, if they provided it"
            },
            "notes": {
                "type": "string",
                "description": "Any additional information about the conversation that's worth recording to give context"
            }
        },
        "required": ["email"],
        "additionalProperties": False
    }
}

record_unknown_question_json = {
    "name": "record_unknown_question",
    "description": "Always use this tool to record any question that couldn't be answered as you didn't know the answer",
    "parameters": {
        "type": "object",
        "properties": {
            "question": {
                "type": "string",
                "description": "The question that couldn't be answered"
            }
        },
        "required": ["question"],
        "additionalProperties": False
    }
}

# Note: these schemas are not passed to the Gemini model in this file;
# chat() calls the record_* functions directly.
tools = [record_user_details_json, record_unknown_question_json]


class Me:

    def __init__(self):
        genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
        self.model = genai.GenerativeModel("gemini-1.5-flash")
        self.owner_name = "Duc Nguyen"
        self.chatbot_name = "DigitizedBrains"

        # RAG knowledge base: load text documents only (fast loading)
        self.knowledge_base = self.load_text_documents()
        print(f"Loaded {len(self.knowledge_base)} text documents into RAG knowledge base")

        # Core information
        self.linkedin = self.knowledge_base.get('linkedin_profile.txt', '[LinkedIn profile not found]')
        self.summary = self.knowledge_base.get('summary.txt', '[Summary not found]')
        self.digitizedbrains_info = self.knowledge_base.get('digitizedbrains_profile.txt', '[DigitizedBrains profile not found]')

    def load_text_documents(self):
        """Load only text documents for fast startup"""
        knowledge_base = {}
        document_dir = "document/"

        # Load all text files (fast)
        for txt_file in glob.glob(os.path.join(document_dir, "*.txt")):
            filename = os.path.basename(txt_file)
            try:
                with open(txt_file, "r", encoding="utf-8") as f:
                    content = f.read()
                knowledge_base[filename] = content
                print(f"Loaded: {filename} ({len(content)} chars)")
            except Exception as e:
                print(f"Failed: {filename} ({e})")

        return knowledge_base

    def search_relevant_content(self, query):
        """Simple RAG retrieval based on keyword matching"""
        query_lower = query.lower()
        query_words = query_lower.split()
        relevant_docs = []

        # Score documents based on relevance
        doc_scores = defaultdict(int)
        for filename, content in self.knowledge_base.items():
            content_lower = content.lower()

            # Direct query match (highest score)
            if query_lower in content_lower:
                doc_scores[filename] += 10

            # Word-by-word matching (substring match, words longer than 2 characters)
            for word in query_words:
                if len(word) > 2 and word in content_lower:
                    doc_scores[filename] += 2

        # Sort documents by score, highest first
        sorted_docs = sorted(doc_scores.items(), key=lambda x: x[1], reverse=True)

        # Keep the top 3 documents with a positive score
        for filename, score in sorted_docs[:3]:
            if score > 0:
                relevant_docs.append({
                    'filename': filename,
                    'content': self.knowledge_base[filename],
                    'score': score
                })

        return relevant_docs

    def system_prompt(self, relevant_docs=None):
        system_prompt = f"You are {self.chatbot_name}, an AI representative for {self.owner_name}. \
You represent both {self.owner_name} personally and {self.chatbot_name} company. \
\n\nYou have access to a comprehensive knowledge base with {len(self.knowledge_base)} documents. \
Be professional, engaging, and use the knowledge base to provide accurate responses. \
\n\nIf you don't know something, use record_unknown_question tool. \
If users provide emails, use record_user_details tool."

        # Add core information (truncated to respect the context limit)
        system_prompt += "\n\n## Core Information:"
        system_prompt += f"\n### {self.owner_name}'s Summary:\n{self.summary[:800]}..."
        system_prompt += f"\n\n### {self.chatbot_name} Business:\n{self.digitizedbrains_info[:800]}..."

        # Add relevant documents
        if relevant_docs:
            system_prompt += "\n\n## Relevant Documents:"
            for doc in relevant_docs:
                system_prompt += f"\n\n### {doc['filename']} (Score: {doc['score']}):\n"
                content = doc['content'][:1500] + "..." if len(doc['content']) > 1500 else doc['content']
                system_prompt += content

        return system_prompt

    def chat(self, message, history):
        # RAG retrieval
        relevant_docs = self.search_relevant_content(message)
        print(f"\nQuery: {message[:50]}...")
        print(f"Found {len(relevant_docs)} relevant documents:")
        for doc in relevant_docs:
            print(f"  - {doc['filename']} (score: {doc['score']})")

        # Generate response
        prompt = self.system_prompt(relevant_docs) + "\n\n"

        # Add conversation history
        for h in history:
            prompt += f"{h['role'].capitalize()}: {h['content']}\n"
        prompt += f"User: {message}\nAssistant:"

        try:
            response = self.model.generate_content(prompt)
            reply = response.text
        except Exception as e:
            # Vietnamese user-facing message: "Sorry, I hit an error while processing your question. Please try again."
            reply = f"Xin lỗi, tôi gặp lỗi khi xử lý câu hỏi của bạn. Vui lòng thử lại. Error: {str(e)}"

        # Email detection
        email_match = re.search(r'[\w\.-]+@[\w\.-]+', message)
        if email_match:
            email = email_match.group(0)
            record_user_details(email, "Website Contact", f"RAG chat: {message[:100]}")

        # Unknown question detection ("không biết" is Vietnamese for "don't know")
        if "I don't know" in reply or "không biết" in reply.lower():
            record_unknown_question(message)

        return reply


if __name__ == "__main__":
    print("Starting RAG-Enhanced DigitizedBrains Chatbot...")
    me = Me()
    print("\n" + "="*60)
    print("RAG-ENHANCED DIGITIZEDBRAINS CHATBOT READY!")
    print("="*60)
    print("Features:")
    print("  - RAG-based knowledge retrieval")
    print("  - Multi-document search")
    print("  - Intelligent response generation")
    print("  - Lead capture & unknown question tracking")
    print("="*60)

    # Launch Gradio interface
    iface = gr.ChatInterface(
        me.chat,
        type="messages",
        title="DigitizedBrains RAG Chatbot",
        description="AI-powered chatbot with comprehensive knowledge base about Duc Nguyen and DigitizedBrains services."
    )
    iface.launch(share=False, server_name="0.0.0.0")
|
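`search_relevant_content` in app_rag_simple.py ranks documents by substring hits: 10 points when the whole query appears in a document and 2 points for each query word longer than two characters. The sketch below restates that scoring as a standalone function so it can be exercised without a Gemini API key. The name `score_documents` and the `top_k` parameter are assumptions made for illustration; the weights and the top-3 cutoff come from the file.

```python
def score_documents(query, knowledge_base, top_k=3):
    """Rank documents by keyword overlap with the query.

    knowledge_base maps filename -> document text. Returns a list of
    (filename, score) pairs, highest score first, positive scores only.
    """
    query_lower = query.lower()
    # Ignore very short words, as the original method does.
    words = [w for w in query_lower.split() if len(w) > 2]

    scores = {}
    for filename, content in knowledge_base.items():
        content_lower = content.lower()
        # Whole-query match carries the largest weight.
        score = 10 if query_lower in content_lower else 0
        # Each query word found anywhere in the text adds a smaller weight.
        score += 2 * sum(1 for w in words if w in content_lower)
        if score > 0:
            scores[filename] = score

    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

Because the check is a substring test, a query word such as "art" would also match "smart"; splitting the document into tokens first would be one way to tighten this, at the cost of handling multi-word Vietnamese terms separately.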
app_rag_test.py
ADDED
|
@@ -0,0 +1,265 @@
from dotenv import load_dotenv
import google.generativeai as genai
import json
import os
import requests
from pypdf import PdfReader
import gradio as gr
import re
import glob
from collections import defaultdict


load_dotenv(override=True)

def push(text):
    requests.post(
        "https://api.pushover.net/1/messages.json",
        data={
            "token": os.getenv("PUSHOVER_TOKEN"),
            "user": os.getenv("PUSHOVER_USER"),
            "message": text,
        }
    )


def record_user_details(email, name="Name not provided", notes="not provided"):
    push(f"Recording {name} with email {email} and notes {notes}")
    return {"recorded": "ok"}

def record_unknown_question(question):
    push(f"Recording {question}")
    return {"recorded": "ok"}

record_user_details_json = {
    "name": "record_user_details",
    "description": "Use this tool to record that a user is interested in being in touch and provided an email address",
    "parameters": {
        "type": "object",
        "properties": {
            "email": {
                "type": "string",
                "description": "The email address of this user"
            },
            "name": {
                "type": "string",
                "description": "The user's name, if they provided it"
            },
            "notes": {
                "type": "string",
                "description": "Any additional information about the conversation that's worth recording to give context"
            }
        },
        "required": ["email"],
        "additionalProperties": False
    }
}

record_unknown_question_json = {
    "name": "record_unknown_question",
    "description": "Always use this tool to record any question that couldn't be answered as you didn't know the answer",
    "parameters": {
        "type": "object",
        "properties": {
            "question": {
                "type": "string",
                "description": "The question that couldn't be answered"
            }
        },
        "required": ["question"],
        "additionalProperties": False
    }
}

# Note: these schemas are not passed to the Gemini model in this file;
# chat() calls the record_* functions directly.
tools = [record_user_details_json, record_unknown_question_json]


class Me:

    def __init__(self):
        genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
        self.model = genai.GenerativeModel("gemini-1.5-flash")
        self.owner_name = "Duc Nguyen"  # Owner of the website and of DigitizedBrains
        self.chatbot_name = "DigitizedBrains"  # Persona the chatbot speaks as

        # RAG knowledge base: all text files plus the first pages of a few essential PDFs
        self.knowledge_base = self.load_documents_optimized()
        print(f"Loaded {len(self.knowledge_base)} documents into RAG knowledge base")

        # Core information (backwards compatibility)
        self.linkedin = self.knowledge_base.get('linkedin_profile.txt', '[LinkedIn profile not found]')
        self.summary = self.knowledge_base.get('summary.txt', '[Summary not found]')
        self.digitizedbrains_info = self.knowledge_base.get('digitizedbrains_profile.txt', '[DigitizedBrains profile not found]')

    def load_documents_optimized(self):
        """Load all text files and a few essential PDFs (first 5 pages each) for faster startup"""
        knowledge_base = {}
        document_dir = "document/"

        # Load all text files (fast)
        for txt_file in glob.glob(os.path.join(document_dir, "*.txt")):
            filename = os.path.basename(txt_file)
            try:
                with open(txt_file, "r", encoding="utf-8") as f:
                    content = f.read()
                knowledge_base[filename] = content
                print(f"Loaded text document: {filename} ({len(content)} chars)")
            except Exception as e:
                print(f"Warning: Could not load {filename}: {e}")

        # Load only essential PDF files for demo
        essential_pdfs = ['chuyendoiso.pdf', 'digitizedbrains.pdf', 'linkedin.pdf']
        for pdf_name in essential_pdfs:
            pdf_path = os.path.join(document_dir, pdf_name)
            if os.path.exists(pdf_path):
                try:
                    reader = PdfReader(pdf_path)
                    pdf_content = ""
                    for page in reader.pages[:5]:  # Only first 5 pages for speed
                        text = page.extract_text()
                        if text:
                            pdf_content += text + "\n"
                    knowledge_base[pdf_name] = pdf_content
                    print(f"Loaded PDF document: {pdf_name} ({len(pdf_content)} chars)")
                except Exception as e:
                    print(f"Warning: Could not load PDF {pdf_name}: {e}")

        return knowledge_base

    def search_relevant_content(self, query):
        """Simple RAG retrieval - find most relevant documents based on keyword matching"""
        query_lower = query.lower()
        query_words = query_lower.split()
        relevant_docs = []

        # Keywords for different document types (English and Vietnamese)
        keywords = {
            'personal': ['duc nguyen', 'linkedin', 'career', 'experience', 'education', 'background', 'profile'],
            'business': ['digitizedbrains', 'company', 'services', 'solutions', 'automation', 'ai agent'],
            'digital_transformation': ['chuyển đổi số', 'digital transformation', 'technology', 'broadcasting', 'htv'],
            'experience': ['kinh nghiệm', 'experience', 'học', 'tham luận', 'diễn đàn']
        }

        # Score documents based on keyword relevance
        doc_scores = defaultdict(int)
        for filename, content in self.knowledge_base.items():
            content_lower = content.lower()

            # Direct query match
            if query_lower in content_lower:
                doc_scores[filename] += 10

            # Keyword category matching: the keyword must appear in both query and document
            for category, category_keywords in keywords.items():
                for keyword in category_keywords:
                    if keyword in query_lower and keyword in content_lower:
                        doc_scores[filename] += 5

            # Additional scoring for query words
            for word in query_words:
                if len(word) > 2 and word in content_lower:
                    doc_scores[filename] += 2

        # Sort documents by score, highest first
        sorted_docs = sorted(doc_scores.items(), key=lambda x: x[1], reverse=True)

        # Keep the top 5 documents with a positive score
        for filename, score in sorted_docs[:5]:
            if score > 0:
                relevant_docs.append({
                    'filename': filename,
                    'content': self.knowledge_base[filename],
                    'score': score
                })

        return relevant_docs

    def system_prompt(self, relevant_docs=None):
        system_prompt = f"You are {self.chatbot_name}, an AI representative acting on behalf of {self.owner_name}. \
You are answering questions on {self.owner_name}'s website, representing both {self.owner_name} personally and the {self.chatbot_name} company/brand. \
\n\nYour responsibilities include: \
1. Representing {self.owner_name}'s career, background, skills and experience using his comprehensive knowledge base \
2. Representing {self.chatbot_name} as a digital transformation and AI solutions company \
3. Answering questions about digital transformation, broadcasting, and technology expertise \
4. Using the extensive document knowledge base to provide detailed, accurate responses \
\n\nYou have access to a comprehensive RAG knowledge base with {len(self.knowledge_base)} documents including: \
- Personal information about {self.owner_name} (career, LinkedIn, education, experience) \
- Business information about {self.chatbot_name} (services, solutions, capabilities) \
- Digital transformation expertise and case studies \
- Broadcasting and media technology knowledge \
- Academic papers and industry presentations \
\n\nBe professional and engaging, using the knowledge base to provide comprehensive answers. \
When discussing {self.owner_name}, speak about him in first person as his representative. \
When discussing {self.chatbot_name}, represent the company's capabilities and services. \
\n\nIf you don't know the answer to any question, use your record_unknown_question tool to record it. \
If the user shows interest in services or wants to connect, try to get their email using your record_user_details tool."

        # Add core information
        system_prompt += "\n\n## Core Information:"
        system_prompt += f"\n### {self.owner_name}'s Summary:\n{self.summary[:1000]}..."
        system_prompt += f"\n\n### {self.chatbot_name} Business Profile:\n{self.digitizedbrains_info[:1000]}..."

        # Add relevant documents if provided
        if relevant_docs:
            system_prompt += "\n\n## Relevant Knowledge Base Documents:"
            for doc in relevant_docs:
                system_prompt += f"\n\n### Document: {doc['filename']} (Relevance Score: {doc['score']})\n"
                # Truncate content to avoid exceeding the context limit
                content = doc['content'][:2000] + "..." if len(doc['content']) > 2000 else doc['content']
                system_prompt += content

        system_prompt += f"\n\nWith this comprehensive RAG knowledge base, please provide detailed and accurate responses as {self.chatbot_name}, \
representing both {self.owner_name} personally and the {self.chatbot_name} business professionally."
        return system_prompt

    def handle_tool_call(self, tool_calls):
        # Note: expects OpenAI-style tool call objects (tool_call.function.name,
        # tool_call.function.arguments); chat() below never calls this method.
        results = []
        for tool_call in tool_calls:
            tool_name = tool_call.function.name
            arguments = json.loads(tool_call.function.arguments)
            print(f"Tool called: {tool_name}", flush=True)
            tool = globals().get(tool_name)
            result = tool(**arguments) if tool else {}
            results.append({"role": "tool","content": json.dumps(result),"tool_call_id": tool_call.id})
        return results

    def chat(self, message, history):
        # RAG retrieval: find relevant documents for the user's question
        relevant_docs = self.search_relevant_content(message)
        print(f"Found {len(relevant_docs)} relevant documents for query: {message[:100]}...")
        for doc in relevant_docs[:3]:
            print(f"  - {doc['filename']} (score: {doc['score']})")

        # Generate prompt with relevant context
        prompt = self.system_prompt(relevant_docs) + "\n\n"

        # Add conversation history
        for h in history:
            prompt += f"{h['role'].capitalize()}: {h['content']}\n"
        prompt += f"User: {message}\nAssistant:"

        # Generate response
        response = self.model.generate_content(prompt)
        reply = response.text

        # Look for an email address in the user's message
        email_match = re.search(r'[\w\.-]+@[\w\.-]+', message)
        if email_match:
            email = email_match.group(0)
            name = "Contact from website"  # or extract the name if desired
            notes = f"User provided email via {self.chatbot_name} chat with RAG knowledge base"
            record_user_details(email, name, notes)

        # If Gemini replies that it does not know, record the question
        if "I don't know" in reply or "I'm not sure" in reply or "Tôi không biết" in reply:
            record_unknown_question(message)

        return reply


if __name__ == "__main__":
    me = Me()
    print("\n" + "="*50)
    print("RAG-ENHANCED DIGITIZEDBRAINS CHATBOT READY!")
    print("="*50)
    gr.ChatInterface(me.chat, type="messages").launch()
|
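The `chat` method in these files captures leads with the pattern `[\w\.-]+@[\w\.-]+`. Because `.` is allowed anywhere after the `@`, that pattern also picks up trailing sentence punctuation (for example the final period in "duc@example.com.") and accepts strings with no dot in the domain. The sketch below shows one stricter alternative; `EMAIL_RE` and `extract_email` are hypothetical names, and the pattern is a pragmatic approximation rather than a full address validator.

```python
import re

# Hypothetical, stricter pattern: the domain needs at least one dot-separated
# label after the first, and a match cannot end on a bare ".".
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_email(message):
    """Return the first email-like substring in message, or None."""
    match = EMAIL_RE.search(message)
    return match.group(0) if match else None
```

A caller could pass the result straight to `record_user_details` when it is not `None`, which keeps the lead-capture step unchanged apart from the cleaner address.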
document/8 bài học kinh nghiệm tại Hồ Nam Trung Quốc.pdf
ADDED
|
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:66752b2bba3fda91201b7589d9583d152785c04f78cb330bf45b9d7ba80993c0
size 459099
|
document/Bài Tham Luận_Chuyen doi so.pdf
ADDED
|
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8a463b4330915f63b9c643f43ad6ea6ffd37dbbd259acba1812a0e84c1185236
size 190744
|
document/CHUYỂN ĐỔI SỐ ĐÀI TRUYỀN HÌNH TP HCM_ver1.pdf
ADDED
|
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d30ec0e63038192ec4d018415cec5d7f6314f53669a5c23563edf6deeca75c1e
size 1694083
|
document/Chuyển đổi số VieON.pdf
ADDED
|
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:09c2d2b389519fa50965335378fdbd55c54875e90333c3e0a02ebd5c6178e10c
size 431240
|
document/My Changsha App.pdf
ADDED
|
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:650ec6056a2704146b156828432ccd67f7200344fcadbc96d99e318c83d76ebe
size 258660
|
document/Phần trình bày trong Diễn Đàn Chuyển Đổi Số_Nguyễn Tấn Đức.pdf
ADDED
|
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8e19a116d5f2f1a98c8919d77054b9747f86096a5a300782bd5a6713da5b7719
size 323153
|
document/Phụ luc 1_ SỬ DỤNG API MỞ ĐỂ CẢI THIỆN DỊCH VỤ VÀ MỞ RỘNG HỆ SINH THÁI.pdf
ADDED
|
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4943f09adccd1bc30b75cf3c6f7f336954aa39f199af0501dca6db674fe9bc8a
size 195399
|
document/Summary-ThanhHoa.txt
ADDED
|
@@ -0,0 +1,32 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
|
| 2 |
+
Dưới đây là thông tin tổng hợp về Tiến sĩ Nguyễn Thanh Hòa – Phó Giám đốc Trung tâm Chuyển đổi số Thành phố Hồ Chí Minh (HCMC-DXCENTER) phù hợp để trình bày trong một file Word:
|
| 3 |
+
Tiến sĩ Nguyễn Thanh Hòa
|
| 4 |
+
Phó Giám đốc Trung tâm Chuyển đổi số Thành phố Hồ Chí Minh (HCMC-DXCENTER)
|
| 5 |
+
Thông tin chính
|
| 6 |
+
• Chức vụ: Phó Giám đốc Trung tâm Chuyển đổi số TP.HCM[1].
|
| 7 |
+
• Chuyên ngành: Báo chí truyền thông, chuyên gia chuyển đổi số.
|
| 8 |
+
• Trung tâm Chuyển đổi số TP.HCM là đơn vị sự nghiệp công lập thuộc lĩnh vực thông tin truyền thông trực thuộc Ủy ban nhân dân TP.HCM[1].
|
| 9 |
+
• Ông Nguyễn Thanh Hòa thường xuyên tham gia các hội thảo, tọa đàm về chuyển đổi số tại TP.HCM, nhấn mạnh yếu tố "chuyển đổi" (thay đổi quy trình, mô hình vận hành, mô hình kinh doanh) song song với đầu tư hạ tầng công nghệ số[2].
|
| 10 |
+
• Trung tâm Chuyển đổi số TP.HCM có vai trò: Triển khai Kiến trúc Chính quyền điện tử, vận hành trung tâm dữ liệu, quản lý mạng truyền số liệu, phối hợp với các sở ngành xây dựng thành phố thông minh, chính quyền số, kinh tế số và xã hội số.
|
| 11 |
+
• Trung tâm tập trung đẩy mạnh hiệu quả hành chính điện tử, vận hành nền kinh tế và xã hội số, đặc biệt trong giai đoạn năm 2024 và những năm tiếp theo nhằm số hóa hoạt động của chính quyền, doanh nghiệp và phục vụ người dân.
|
| 12 |
+
Quan điểm chuyên môn
|
| 13 |
+
“Trong quá trình chuyển đổi số, phần 'số' (hạ tầng công nghệ) thường được chú trọng trong khi phần 'chuyển đổi' (ứng dụng mô hình mới, hình thức kinh doanh mới) lại chưa được quan tâm đúng mức nên cần đẩy mạnh công tác này.”
|
| 14 |
+
— TS. Nguyễn Thanh Hòa, Phó Giám đốc Trung tâm Chuyển đổi số TP.HCM[2]
|
| 15 |
+
Một số hoạt động nổi bật
|
| 16 |
+
• Tham gia, trình bày tại các diễn đàn kinh tế, hội thảo khoa học – công nghệ và chuyển đổi số cấp thành phố.
|
| 17 |
+
• Đại diện Trung tâm nêu rõ vai trò của dữ liệu, nền tảng số; tham gia góp ý về luật Công nghiệp Công nghệ Số và vận động ứng dụng blockchain trong quản lý tài sản trí tuệ tại địa phương.
|
| 18 |
+
• Tích cực tham gia vào việc xây dựng và phổ biến các giải pháp số phục vụ hoạt động quản lý, kinh doanh mới cho thành phố.
|
| 19 |
+
⁂
|
| 20 |
+
|
| 21 |
+
1. https://congluan.vn/bao-chi-dia-phuong-van-loay-hoay-lua-chon-mo-hinh-kinh-doanh-10295255.html
|
| 22 |
+
2. https://www.sggp.org.vn/can-quan-tam-dung-muc-phan-chuyen-doi-trong-chuyen-doi-so-post800017.html
|
| 23 |
+
3. https://chuyendoiso.hochiminhcity.gov.vn/danh-sách-chuyên-gia1
|
| 24 |
+
|
| 25 |
+
TS. Nguyễn Thanh Hòa
|
| 26 |
+
Phó Giám đốc Trung tâm Chuyển đổi số TP.HCM
|
| 27 |
+
|
| 28 |
+
TS. Nguyễn Thanh Hòa là chuyên gia hàng đầu trong lĩnh vực chuyển đổi số và truyền thông tại Việt Nam. Với hơn 10 năm nghiên cứu, giảng dạy và quản lý về chuyển đổi số, ông đã đóng vai trò quan trọng trong việc xây dựng chiến lược và triển khai các dự án chuyển đổi số quy mô lớn cho các tổ chức, doanh nghiệp và cơ quan truyền thông tại TP.HCM.
|
| 29 |
+
|
| 30 |
+
Trước khi đảm nhận vị trí Phó Giám đốc Trung tâm Chuyển đổi số TP.HCM, TS. Hòa có hơn 17 năm kinh nghiệm làm việc và quản lý tại các đơn vị báo chí, truyền hình lớn, nơi ông đã dẫn dắt nhiều sáng kiến đổi mới sáng tạo, ứng dụng công nghệ AI và dữ liệu lớn vào sản xuất nội dung, tối ưu hóa quy trình vận hành và nâng cao trải nghiệm người dùng.
|
| 31 |
+
|
| 32 |
+
He is the author of many research works and scientific papers on digital transformation, and a respected speaker at domestic and international conferences and forums on technology, media, and digital governance. Dr. Hòa has consistently pioneered connecting global knowledge with Vietnamese practice, contributing to the sustainable development of the national digital ecosystem.
|
document/Thời cuộc chuyển đổi số của ngành truyền thông.pdf
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:5408f19bafb746f4ad5ffa130d9d94bbe7db460c6b69937c6d992d63b0d7abc8
|
| 3 |
+
size 324582
|
document/Tong hop cac tai lieu Chuyen doi so.pdf
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:19b799a703f2ea14814b24d240bec4c6bba2ec63aae5040e7971edcabde90009
|
| 3 |
+
size 1838422
|
document/Xay dung mo hinh chuyen doi so HTV_NGUYEN TAN DUC.pdf
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:2579e35c0e95fb3cabcdfc57d567da311b54dc2111c636169d9eadf255a5507a
|
| 3 |
+
size 881077
|
document/chuyendoiso.pdf
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:26005bc4c9c64781fc85dd11052c63fad5e3c8edd42cb887f82744299727fb70
|
| 3 |
+
size 284236
|
document/digitizedbrains.pdf
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:7fd7f0cf77f9260d2fb5b7c82f4cfcb9ea1234d65d0fe7139358ff083941fecb
|
| 3 |
+
size 1432192
|
document/digitizedbrains_profile.txt
ADDED
|
@@ -0,0 +1,353 @@
|
| 1 |
+
DIGITIZEDBRAINS COMPANY PROFILE
|
| 2 |
+
|
| 3 |
+
==================================================
|
| 4 |
+
EXECUTIVE SUMMARY
|
| 5 |
+
==================================================
|
| 6 |
+
|
| 7 |
+
DigitizedBrains is Vietnam's leading AI Agent and digital transformation consultancy, founded in 2023 by Duc Nguyen, a veteran technology leader with 24+ years of experience in broadcasting and media technology. The company specializes in intelligent process automation, AI-driven business solutions, and comprehensive digital transformation services for Vietnamese enterprises.
|
| 8 |
+
|
| 9 |
+
Headquarters: Ho Chi Minh City, Vietnam
|
| 10 |
+
Founded: 2023
|
| 11 |
+
Founder & CEO: Duc Nguyen
|
| 12 |
+
Industry Focus: AI Agents, Digital Transformation, Process Automation
|
| 13 |
+
Mission: Empowering Vietnamese enterprises with intelligent digital solutions for operational excellence and sustainable growth
|
| 14 |
+
|
| 15 |
+
==================================================
|
| 16 |
+
COMPANY OVERVIEW
|
| 17 |
+
==================================================
|
| 18 |
+
|
| 19 |
+
DigitizedBrains represents the convergence of artificial intelligence and business intelligence, providing Vietnamese companies with the tools and expertise needed to thrive in the digital economy. Our name reflects our core philosophy: combining human intelligence with digital capabilities to create "digitized brains" that enhance decision-making, automate processes, and drive innovation.
|
| 20 |
+
|
| 21 |
+
As Vietnam's digital transformation accelerates, DigitizedBrains serves as a strategic partner for enterprises seeking to transition from intuition-based operations to data-driven, automated business processes. We bridge the gap between traditional Vietnamese business practices and cutting-edge AI technologies.
|
| 22 |
+
|
| 23 |
+
==================================================
|
| 24 |
+
CORE SERVICES & SOLUTIONS
|
| 25 |
+
==================================================
|
| 26 |
+
|
| 27 |
+
AI AGENT SOLUTIONS
|
| 28 |
+
-------------------
|
| 29 |
+
|
| 30 |
+
Process Automation (RPA):
|
| 31 |
+
• 24/7 automated robot systems for repetitive tasks
|
| 32 |
+
• Data entry automation with 99% accuracy
|
| 33 |
+
• Document processing and workflow management
|
| 34 |
+
• Custom automation solutions for specific business needs
|
| 35 |
+
|
| 36 |
+
Intelligent Workflow Management:
|
| 37 |
+
• AI-powered decision-making engines
|
| 38 |
+
• Automatic process routing and optimization
|
| 39 |
+
• Real-time workflow monitoring and adjustment
|
| 40 |
+
• Exception handling and escalation management
|
| 41 |
+
|
| 42 |
+
Document Processing Automation:
|
| 43 |
+
• Invoice processing with OCR and AI extraction
|
| 44 |
+
• Contract management and compliance checking
|
| 45 |
+
• Automated document classification and routing
|
| 46 |
+
• Digital signature integration and workflow
|
| 47 |
+
|
| 48 |
+
HR Process Automation:
|
| 49 |
+
• Recruitment and candidate screening automation
|
| 50 |
+
• Training program management and tracking
|
| 51 |
+
• Payroll processing and compliance management
|
| 52 |
+
• Performance evaluation and reporting systems
|
| 53 |
+
|
| 54 |
+
Financial Process Automation:
|
| 55 |
+
• Accounting automation and reconciliation
|
| 56 |
+
• Expense management and approval workflows
|
| 57 |
+
• Financial reporting and dashboard generation
|
| 58 |
+
• Compliance monitoring and audit trail management
|
| 59 |
+
|
| 60 |
+
Customer Service Automation:
|
| 61 |
+
• AI chatbots with natural language processing
|
| 62 |
+
• Intelligent ticket routing and prioritization
|
| 63 |
+
• Automated response generation and escalation
|
| 64 |
+
• Customer satisfaction tracking and analysis
|
| 65 |
+
|
| 66 |
+
DIGITAL TRANSFORMATION SERVICES
|
| 67 |
+
--------------------------------
|
| 68 |
+
|
| 69 |
+
Business Process Digitization:
|
| 70 |
+
• Process analysis and optimization consulting
|
| 71 |
+
• Digital workflow design and implementation
|
| 72 |
+
• Legacy system modernization
|
| 73 |
+
• Change management and training programs
|
| 74 |
+
|
| 75 |
+
System Integration:
|
| 76 |
+
• ERP, CRM, and business system connectivity
|
| 77 |
+
• Data synchronization across platforms
|
| 78 |
+
• API development and management
|
| 79 |
+
• Cloud migration and hybrid solutions
|
| 80 |
+
|
| 81 |
+
Digital Ecosystem Development:
|
| 82 |
+
• End-to-end digital infrastructure planning
|
| 83 |
+
• Scalable architecture design
|
| 84 |
+
• Security and compliance framework implementation
|
| 85 |
+
• Performance monitoring and optimization
|
| 86 |
+
|
| 87 |
+
DATA ANALYTICS & BUSINESS INTELLIGENCE
|
| 88 |
+
---------------------------------------
|
| 89 |
+
|
| 90 |
+
Data Integration & Standardization:
|
| 91 |
+
• Multi-source data consolidation (ERP, CRM, IoT)
|
| 92 |
+
• Data cleansing and quality management
|
| 93 |
+
• Master data management solutions
|
| 94 |
+
• Real-time data pipeline development
|
| 95 |
+
|
| 96 |
+
Analysis & Visualization:
|
| 97 |
+
• Multi-dimensional data analysis
|
| 98 |
+
• Interactive dashboard development
|
| 99 |
+
• Trend identification and pattern recognition
|
| 100 |
+
• Risk detection and mitigation strategies
|
| 101 |
+
|
| 102 |
+
AI/ML & Smart Forecasting:
|
| 103 |
+
• Predictive analytics model development
|
| 104 |
+
• Machine learning algorithm implementation
|
| 105 |
+
• Demand forecasting and capacity planning
|
| 106 |
+
• Intelligent decision support systems
|
| 107 |
+
|
| 108 |
+
Real-time Process Monitoring:
|
| 109 |
+
• Performance optimization dashboards
|
| 110 |
+
• Automated alert and notification systems
|
| 111 |
+
• Process bottleneck identification
|
| 112 |
+
• Quick issue resolution protocols
|
| 113 |
+
|
| 114 |
+
==================================================
|
| 115 |
+
INDUSTRY EXPERTISE
|
| 116 |
+
==================================================
|
| 117 |
+
|
| 118 |
+
MANUFACTURING
|
| 119 |
+
-------------
|
| 120 |
+
• Smart factory solutions with IoT integration
|
| 121 |
+
• Predictive maintenance system implementation
|
| 122 |
+
• Quality control automation and monitoring
|
| 123 |
+
• Production planning and scheduling optimization
|
| 124 |
+
• Supply chain visibility and optimization
|
| 125 |
+
• Equipment performance monitoring and analysis
|
| 126 |
+
|
| 127 |
+
FINANCIAL SERVICES
|
| 128 |
+
------------------
|
| 129 |
+
• Loan processing automation and approval workflows
|
| 130 |
+
• KYC (Know Your Customer) compliance automation
|
| 131 |
+
• Financial reporting and regulatory compliance
|
| 132 |
+
• Risk management and fraud detection systems
|
| 133 |
+
• Customer onboarding and account management
|
| 134 |
+
• Payment processing and reconciliation
|
| 135 |
+
|
| 136 |
+
RETAIL & E-COMMERCE
|
| 137 |
+
-------------------
|
| 138 |
+
• Order processing and fulfillment automation
|
| 139 |
+
• Inventory optimization and demand forecasting
|
| 140 |
+
• Customer service chatbots and support systems
|
| 141 |
+
• Dynamic pricing and promotional management
|
| 142 |
+
• Customer behavior analysis and segmentation
|
| 143 |
+
• Multi-channel sales integration
|
| 144 |
+
|
| 145 |
+
HEALTHCARE
|
| 146 |
+
----------
|
| 147 |
+
• Patient appointment scheduling and management
|
| 148 |
+
• Medical record automation and digitization
|
| 149 |
+
• Payment processing and insurance management
|
| 150 |
+
• Compliance monitoring and reporting
|
| 151 |
+
• Telemedicine platform integration
|
| 152 |
+
• Clinical workflow optimization
|
| 153 |
+
|
| 154 |
+
==================================================
|
| 155 |
+
TECHNOLOGY STACK & CAPABILITIES
|
| 156 |
+
==================================================
|
| 157 |
+
|
| 158 |
+
ARTIFICIAL INTELLIGENCE
|
| 159 |
+
• Machine Learning algorithms and model development
|
| 160 |
+
• Natural Language Processing for Vietnamese and English
|
| 161 |
+
• Computer Vision for document and image processing
|
| 162 |
+
• Predictive Analytics and forecasting models
|
| 163 |
+
• Deep Learning for complex pattern recognition
|
| 164 |
+
|
| 165 |
+
AUTOMATION TECHNOLOGIES
|
| 166 |
+
• Robotic Process Automation (RPA) platforms
|
| 167 |
+
• Workflow automation engines
|
| 168 |
+
• Business rule management systems
|
| 169 |
+
• Integration platforms and API management
|
| 170 |
+
• Cloud-based automation infrastructure
|
| 171 |
+
|
| 172 |
+
ANALYTICS & VISUALIZATION
|
| 173 |
+
• Business Intelligence platforms
|
| 174 |
+
• Data visualization and dashboard tools
|
| 175 |
+
• Statistical analysis and modeling
|
| 176 |
+
• Real-time monitoring and alerting systems
|
| 177 |
+
• Custom reporting and analytics solutions
|
| 178 |
+
|
| 179 |
+
CLOUD & INFRASTRUCTURE
|
| 180 |
+
• Scalable cloud infrastructure management
|
| 181 |
+
• Hybrid cloud solutions and migration
|
| 182 |
+
• Security and compliance frameworks
|
| 183 |
+
• Database management and optimization
|
| 184 |
+
• System monitoring and performance tuning
|
| 185 |
+
|
| 186 |
+
==================================================
|
| 187 |
+
BUSINESS VALUE PROPOSITIONS
|
| 188 |
+
==================================================
|
| 189 |
+
|
| 190 |
+
OPERATIONAL EFFICIENCY
|
| 191 |
+
• Up to 90% reduction in processing time
|
| 192 |
+
• 99% task execution accuracy
|
| 193 |
+
• Elimination of human errors in repetitive processes
|
| 194 |
+
• 24/7 operational continuity
|
| 195 |
+
• Improved resource allocation and utilization
|
| 196 |
+
|
| 197 |
+
COST OPTIMIZATION
|
| 198 |
+
• 30-80% reduction in operational costs
|
| 199 |
+
• Decreased manual labor requirements
|
| 200 |
+
• Reduced error-related costs and rework
|
| 201 |
+
• Improved energy efficiency and resource usage
|
| 202 |
+
• Optimized vendor and supplier management
|
| 203 |
+
|
| 204 |
+
BUSINESS SCALABILITY
|
| 205 |
+
• Operations scaling without proportional workforce increase
|
| 206 |
+
• Rapid business expansion capabilities
|
| 207 |
+
• Flexible process adaptation to market changes
|
| 208 |
+
• Enhanced capacity for handling business growth
|
| 209 |
+
• Improved agility in competitive markets
|
| 210 |
+
|
| 211 |
+
REGULATORY COMPLIANCE
|
| 212 |
+
• Consistent adherence to Vietnamese business regulations
|
| 213 |
+
• International standards compliance (ISO, SOX, GDPR)
|
| 214 |
+
• Automated monitoring and reporting systems
|
| 215 |
+
• Audit trail management and documentation
|
| 216 |
+
• Risk mitigation and control frameworks
|
| 217 |
+
|
| 218 |
+
COMPETITIVE ADVANTAGE
|
| 219 |
+
• Data-driven decision-making capabilities
|
| 220 |
+
• Faster time-to-market for products and services
|
| 221 |
+
• Enhanced customer experience and satisfaction
|
| 222 |
+
• Improved innovation capacity and R&D efficiency
|
| 223 |
+
• Better market responsiveness and adaptability
|
| 224 |
+
|
| 225 |
+
==================================================
|
| 226 |
+
MARKET STATISTICS & RESULTS
|
| 227 |
+
==================================================
|
| 228 |
+
|
| 229 |
+
VIETNAM AUTOMATION MARKET (2024-2025)
|
| 230 |
+
• 45% of Vietnamese businesses have adopted partial process automation
|
| 231 |
+
• 280% average ROI within 8-12 months post-automation implementation
|
| 232 |
+
• 65% average operational cost reduction through process automation
|
| 233 |
+
• 85% of Vietnamese businesses plan to expand automation in the next 2 years
|
| 234 |
+
• 320% growth in Business Intelligence market adoption over past 3 years
|
| 235 |
+
|
| 236 |
+
CLIENT SUCCESS METRICS
|
| 237 |
+
• Average 280% ROI achieved within 8-12 months
|
| 238 |
+
• 90% average reduction in process completion time
|
| 239 |
+
• 99% task execution accuracy across implemented solutions
|
| 240 |
+
• 30-80% operational cost reduction for clients
|
| 241 |
+
• 95% client satisfaction and retention rate
|
| 242 |
+
|
| 243 |
+
==================================================
|
| 244 |
+
CLIENT SUCCESS STORIES
|
| 245 |
+
==================================================
|
| 246 |
+
|
| 247 |
+
MANUFACTURING CLIENT
|
| 248 |
+
Challenge: Manual quality control processes causing delays and inconsistencies
|
| 249 |
+
Solution: AI-powered quality control automation with computer vision
|
| 250 |
+
Results: 95% reduction in inspection time, 99.8% accuracy improvement
|
| 251 |
+
|
| 252 |
+
FINANCIAL SERVICES CLIENT
|
| 253 |
+
Challenge: Time-intensive loan approval processes affecting customer satisfaction
|
| 254 |
+
Solution: Automated loan processing with AI risk assessment
|
| 255 |
+
Results: 80% faster approval times, 60% reduction in processing costs
|
| 256 |
+
|
| 257 |
+
HEALTHCARE CLIENT
|
| 258 |
+
Challenge: Manual patient appointment scheduling leading to conflicts and errors
|
| 259 |
+
Solution: Intelligent appointment management with patient preference optimization
|
| 260 |
+
Results: 70% reduction in scheduling conflicts, 40% improvement in patient satisfaction
|
| 261 |
+
|
| 262 |
+
RETAIL CLIENT
|
| 263 |
+
Challenge: Inventory management inefficiencies and stockout issues
|
| 264 |
+
Solution: AI-powered demand forecasting and automated inventory optimization
|
| 265 |
+
Results: 50% reduction in stockouts, 25% decrease in inventory carrying costs
|
| 266 |
+
|
| 267 |
+
==================================================
|
| 268 |
+
COMPANY MISSION & VISION
|
| 269 |
+
==================================================
|
| 270 |
+
|
| 271 |
+
MISSION
|
| 272 |
+
To empower Vietnamese enterprises with intelligent digital solutions that drive operational excellence, cost efficiency, and sustainable growth in the digital economy through AI-powered automation and data-driven decision making.
|
| 273 |
+
|
| 274 |
+
VISION
|
| 275 |
+
To be Vietnam's leading provider of AI Agent solutions and digital transformation services, helping businesses transition from intuition-based to data-driven operations while maintaining competitive advantage in Industry 4.0.
|
| 276 |
+
|
| 277 |
+
CORE VALUES
|
| 278 |
+
• Innovation: Pioneering AI and automation technologies for Vietnamese market needs
|
| 279 |
+
• Excellence: Delivering superior results through intelligent process optimization
|
| 280 |
+
• Partnership: Building long-term relationships with clients for sustainable success
|
| 281 |
+
• Expertise: Deep technical knowledge combined with practical business understanding
|
| 282 |
+
• Growth: Enabling scalable business expansion through digital transformation
|
| 283 |
+
• Integrity: Maintaining highest standards of professional ethics and transparency
|
| 284 |
+
|
| 285 |
+
==================================================
|
| 286 |
+
COMPETITIVE ADVANTAGES
|
| 287 |
+
==================================================
|
| 288 |
+
|
| 289 |
+
LOCAL EXPERTISE
|
| 290 |
+
• Deep understanding of Vietnamese business culture and practices
|
| 291 |
+
• Regulatory compliance expertise for the Vietnamese market
|
| 292 |
+
• Vietnamese language AI and automation capabilities
|
| 293 |
+
• Local support and service delivery
|
| 294 |
+
|
| 295 |
+
PROVEN LEADERSHIP
|
| 296 |
+
• Founder with 24+ years of technology leadership experience
|
| 297 |
+
• Track record in large-scale system implementations
|
| 298 |
+
• Broadcasting and media technology expertise
|
| 299 |
+
• Public policy and regulatory knowledge
|
| 300 |
+
|
| 301 |
+
COMPREHENSIVE SOLUTIONS
|
| 302 |
+
• End-to-end digital transformation services
|
| 303 |
+
• Custom AI development and implementation
|
| 304 |
+
• Industry-specific solution expertise
|
| 305 |
+
• Integrated technology stack and platforms
|
| 306 |
+
|
| 307 |
+
RAPID IMPLEMENTATION
|
| 308 |
+
• Proven methodologies for quick deployment
|
| 309 |
+
• Pre-built industry-specific templates
|
| 310 |
+
• Agile development and iterative improvement
|
| 311 |
+
• Comprehensive training and change management
|
| 312 |
+
|
| 313 |
+
==================================================
|
| 314 |
+
PARTNERSHIPS & CERTIFICATIONS
|
| 315 |
+
==================================================
|
| 316 |
+
|
| 317 |
+
TECHNOLOGY PARTNERSHIPS
|
| 318 |
+
• Leading cloud infrastructure providers
|
| 319 |
+
• Enterprise software and platform vendors
|
| 320 |
+
• AI and machine learning technology providers
|
| 321 |
+
• System integration and consulting partners
|
| 322 |
+
|
| 323 |
+
INDUSTRY CERTIFICATIONS
|
| 324 |
+
• ISO 27001 Information Security Management
|
| 325 |
+
• GDPR Compliance and Data Protection
|
| 326 |
+
• Vietnam Digital Transformation Standards
|
| 327 |
+
• International Quality Management Standards
|
| 328 |
+
|
| 329 |
+
==================================================
|
| 330 |
+
CONTACT INFORMATION
|
| 331 |
+
==================================================
|
| 332 |
+
|
| 333 |
+
Company: DigitizedBrains
|
| 334 |
+
Founder & CEO: Duc Nguyen
|
| 335 |
+
Email: ai.agent.tailieu@gmail.com
|
| 336 |
+
Website: [Under Development - Comprehensive Digital Platform]
|
| 337 |
+
LinkedIn: www.linkedin.com/in/ducnguyen-68b9a8370
|
| 338 |
+
Location: Ho Chi Minh City, Vietnam
|
| 339 |
+
|
| 340 |
+
Business Hours: Monday - Friday, 8:00 AM - 6:00 PM (Vietnam Time)
|
| 341 |
+
Emergency Support: 24/7 for critical automation systems
|
| 342 |
+
|
| 343 |
+
==================================================
|
| 344 |
+
NEXT STEPS FOR POTENTIAL CLIENTS
|
| 345 |
+
==================================================
|
| 346 |
+
|
| 347 |
+
1. CONSULTATION: Free initial consultation to assess automation opportunities
|
| 348 |
+
2. ANALYSIS: Comprehensive business process analysis and ROI projection
|
| 349 |
+
3. PROOF OF CONCEPT: Small-scale implementation to demonstrate value
|
| 350 |
+
4. IMPLEMENTATION: Full-scale deployment with training and support
|
| 351 |
+
5. OPTIMIZATION: Continuous improvement and expansion of automation capabilities
|
| 352 |
+
|
| 353 |
+
Ready to transform your business with intelligent automation? Contact DigitizedBrains today to begin your digital transformation journey.
|
document/linkedin.pdf
ADDED
|
Binary file (42.8 kB).
|
|
|
document/linkedin_profile.txt
ADDED
|
@@ -0,0 +1,173 @@
|
| 1 |
+
LINKEDIN PROFILE - DUC NGUYEN
|
| 2 |
+
|
| 3 |
+
==================================================
|
| 4 |
+
PROFESSIONAL HEADLINE
|
| 5 |
+
==================================================
|
| 6 |
+
Deputy Director, Broadcasting Transmission Center | Digital Transformation Leader | AI Agents Specialist
|
| 7 |
+
Ho Chi Minh City Television (HTV) | 24+ Years in Media Technology & Broadcasting
|
| 8 |
+
|
| 9 |
+
==================================================
|
| 10 |
+
ABOUT / SUMMARY
|
| 11 |
+
==================================================
|
| 12 |
+
Experienced technology leader and entrepreneur with 24+ years in broadcasting, media technology, and digital transformation. Currently serving as Deputy Director at Broadcasting Transmission Center, Ho Chi Minh City Television (HTV), while founding and leading DigitizedBrains - a pioneering AI Agent and digital transformation consultancy.
|
| 13 |
+
|
| 14 |
+
Combining deep technical expertise in electronics, telecommunications, and broadcasting with advanced business acumen (Master of Economics), I specialize in helping Vietnamese enterprises navigate digital transformation through intelligent automation, AI agents, and data-driven decision making.
|
| 15 |
+
|
| 16 |
+
Core Expertise: AI Agents Development, Process Automation (RPA), Digital Transformation Strategy, Broadcasting Technology, Technical Investment Planning, Public Policy Development
|
| 17 |
+
|
| 18 |
+
==================================================
|
| 19 |
+
CURRENT POSITIONS
|
| 20 |
+
==================================================
|
| 21 |
+
|
| 22 |
+
Founder & CEO | DigitizedBrains
|
| 23 |
+
2023 - Present | Ho Chi Minh City, Vietnam
|
| 24 |
+
• Leading Vietnamese AI Agent and digital transformation consultancy
|
| 25 |
+
• Developing intelligent automation solutions for Manufacturing, Financial Services, Healthcare, and Retail sectors
|
| 26 |
+
• Implementing RPA, ML, NLP, and Business Intelligence solutions for enterprise clients
|
| 27 |
+
• Achieved 280% average ROI for clients within 8-12 months of automation implementation
|
| 28 |
+
• Specializing in process automation with 90% processing time reduction and 99% task execution accuracy
|
| 29 |
+
|
| 30 |
+
Deputy Director | Broadcasting Transmission Center, Ho Chi Minh City Television (HTV)
|
| 31 |
+
2018 - Present | Ho Chi Minh City, Vietnam
|
| 32 |
+
• Overseeing technical transmission operations and infrastructure for a major Vietnamese broadcaster
|
| 33 |
+
• Leading digital transformation initiatives in broadcasting and media technology
|
| 34 |
+
• Managing technical investment planning and large-scale system implementations
|
| 35 |
+
• Developing AI implementation strategies for media operations
|
| 36 |
+
• Responsible for technical infrastructure modernization and operational excellence
|
| 37 |
+
|
| 38 |
+
==================================================
|
| 39 |
+
PROFESSIONAL EXPERIENCE
|
| 40 |
+
==================================================
|
| 41 |
+
|
| 42 |
+
Technical Transmission Management | Ho Chi Minh City Television (HTV)
|
| 43 |
+
2018 - 2024 (6 years)
|
| 44 |
+
• Managing comprehensive technical transmission operations
|
| 45 |
+
• Implementing digital transformation in broadcasting infrastructure
|
| 46 |
+
• Leading AI and automation projects in media operations
|
| 47 |
+
• Ensuring 24/7 operational continuity and service excellence
|
| 48 |
+
|
| 49 |
+
Electrical Systems Manager | Ho Chi Minh City Television (HTV)
|
| 50 |
+
2006 - 2018 (12 years)
|
| 51 |
+
• Managed electrical systems across all HTV broadcasting operations
|
| 52 |
+
• Supervised technical infrastructure maintenance and upgrades
|
| 53 |
+
• Implemented energy efficiency and cost optimization programs
|
| 54 |
+
• Led cross-functional teams in technical project execution
|
| 55 |
+
|
| 56 |
+
Electronics Engineer & Technical Investment Planning Specialist | Ho Chi Minh City Television (HTV)
|
| 57 |
+
2000 - 2006 (6 years)
|
| 58 |
+
• Electronics and telecommunications engineering for broadcasting systems
|
| 59 |
+
• Technical investment planning and feasibility analysis
|
| 60 |
+
• System design and implementation oversight
|
| 61 |
+
• Technology evaluation and vendor management
|
| 62 |
+
|
| 63 |
+
==================================================
|
| 64 |
+
EDUCATION & CERTIFICATIONS
|
| 65 |
+
==================================================
|
| 66 |
+
|
| 67 |
+
Master of Economics | University of Economics Ho Chi Minh City
|
| 68 |
+
Economics, Business Strategy, and Management
|
| 69 |
+
Graduated with focus on Digital Economy and Technology Policy
|
| 70 |
+
|
| 71 |
+
Bachelor of Economics | University of Economics Ho Chi Minh City
|
| 72 |
+
Economics and Business Administration
|
| 73 |
+
Foundation in economic theory, business operations, and policy development
|
| 74 |
+
|
| 75 |
+
Bachelor of Engineering | Electronics & Telecommunications Engineering
|
| 76 |
+
Technical foundation in electronics, telecommunications, and broadcasting technology
|
| 77 |
+
|
| 78 |
+
Specializations & Continuous Learning:
|
| 79 |
+
• AI Agents and Machine Learning Applications
|
| 80 |
+
• Digital Transformation Strategy and Implementation
|
| 81 |
+
• Process Automation and RPA Development
|
| 82 |
+
• Public Policy in Media & Technology
|
| 83 |
+
• Advanced Broadcasting and Transmission Technology
|
| 84 |
+
|
| 85 |
+
==================================================
|
| 86 |
+
TECHNICAL SKILLS & EXPERTISE
|
| 87 |
+
==================================================
|
| 88 |
+
|
| 89 |
+
AI & Automation Technologies:
|
| 90 |
+
• Artificial Intelligence (AI) and Machine Learning (ML)
|
| 91 |
+
• Natural Language Processing (NLP) and Computer Vision
|
| 92 |
+
• Robotic Process Automation (RPA)
|
| 93 |
+
• Intelligent Workflow Management
|
| 94 |
+
• Predictive Analytics and Business Intelligence
|
| 95 |
+
|
| 96 |
+
Broadcasting & Media Technology:
|
| 97 |
+
• Digital Broadcasting Systems and Infrastructure
|
| 98 |
+
• Transmission Technology and Network Management
|
| 99 |
+
• Media Production and Distribution Systems
|
| 100 |
+
• Technical Investment Planning and Project Management
|
| 101 |
+
|
| 102 |
+
Digital Transformation:
|
| 103 |
+
• Business Process Analysis and Optimization
|
| 104 |
+
• System Integration and Data Synchronization
|
| 105 |
+
• Digital Ecosystem Development
|
| 106 |
+
• Technology Infrastructure Modernization
|
| 107 |
+
• Change Management and Training
|
| 108 |
+
|
| 109 |
+
Business & Leadership:
|
| 110 |
+
• Strategic Planning and Business Development
|
| 111 |
+
• Team Leadership and Cross-functional Collaboration
|
| 112 |
+
• Public Policy Development and Implementation
|
| 113 |
+
• Stakeholder Management and Client Relations
|
| 114 |
+
• Financial Analysis and Investment Planning
|
| 115 |
+
|
| 116 |
+
==================================================
|
| 117 |
+
INDUSTRY EXPERTISE
|
| 118 |
+
==================================================
|
| 119 |
+
|
| 120 |
+
Media & Broadcasting: 24+ years in television broadcasting, transmission technology, and media operations
|
| 121 |
+
|
| 122 |
+
Manufacturing: Smart factory solutions, predictive maintenance, quality control automation, supply chain optimization
|
| 123 |
+
|
| 124 |
+
Financial Services: Loan processing automation, KYC compliance, financial reporting, risk management
|
| 125 |
+
|
| 126 |
+
Healthcare: Patient appointment systems, medical record automation, payment processes, compliance monitoring
|
| 127 |
+
|
| 128 |
+
Retail & E-commerce: Order processing, inventory optimization, customer service automation, dynamic pricing
|
| 129 |
+
|
| 130 |
+
==================================================
|
| 131 |
+
ACHIEVEMENTS & RESULTS
|
| 132 |
+
==================================================
|
| 133 |
+
|
| 134 |
+
Business Impact:
|
| 135 |
+
• Founded DigitizedBrains, achieving 280% average ROI for automation clients
|
| 136 |
+
• Reduced operational costs by 30-80% through intelligent process optimization
|
| 137 |
+
• Achieved 90% processing time reduction with 99% task execution accuracy
|
| 138 |
+
• Successfully automated processes for 45+ Vietnamese enterprises
|
| 139 |
+
|
| 140 |
+
Leadership Excellence:
|
| 141 |
+
• 24+ years of progressive leadership in broadcasting and technology
|
| 142 |
+
• Led digital transformation initiatives across multiple business sectors
|
| 143 |
+
• Developed and implemented public policies for media and technology adoption
|
| 144 |
+
• Built and managed high-performing technical and business teams
|
| 145 |
+
|
| 146 |
+
Technical Innovation:
|
| 147 |
+
• Pioneer in AI Agent implementation for Vietnamese enterprises
|
| 148 |
+
• Developed scalable automation solutions across multiple industries
|
| 149 |
+
• Created integrated digital ecosystems for business process optimization
|
| 150 |
+
• Established best practices for RPA and ML implementation in the Vietnamese market
|
| 151 |
+
|
| 152 |
+
==================================================
|
| 153 |
+
LANGUAGES
|
| 154 |
+
==================================================
|
| 155 |
+
• Vietnamese (Native)
|
| 156 |
+
• English (Professional working proficiency)
|
| 157 |
+
|
| 158 |
+
==================================================
|
| 159 |
+
CONTACT INFORMATION
|
| 160 |
+
==================================================
|
| 161 |
+
Email: ai.agent.tailieu@gmail.com
|
| 162 |
+
LinkedIn: www.linkedin.com/in/ducnguyen-68b9a8370
|
| 163 |
+
Location: Ho Chi Minh City, Vietnam
|
| 164 |
+
|
| 165 |
+
==================================================
|
| 166 |
+
PROFESSIONAL INTERESTS
|
| 167 |
+
==================================================
|
| 168 |
+
• AI Agents and Intelligent Automation
|
| 169 |
+
• Digital Transformation in Emerging Markets
|
| 170 |
+
• Public Policy Development for Technology Sectors
|
| 171 |
+
• Sustainable Business Growth through Innovation
|
| 172 |
+
• Industry 4.0 Implementation in Vietnam
|
| 173 |
+
• Media Technology and Broadcasting Innovation
|
document/summary.txt
ADDED
|
@@ -0,0 +1,144 @@
|
| 1 |
+
DUC NGUYEN - FOUNDER & CEO, DIGITIZEDBRAINS
|
| 2 |
+
RAG-Enhanced Professional Summary
|
| 3 |
+
|
| 4 |
+
==================================================
|
| 5 |
+
EXECUTIVE PROFILE
|
| 6 |
+
==================================================
|
| 7 |
+
|
| 8 |
+
Duc Nguyen is a visionary technology leader and entrepreneur who combines 24+ years of broadcasting and media technology expertise with cutting-edge AI and digital transformation capabilities. As the Founder & CEO of DigitizedBrains and Deputy Director of Broadcasting Transmission Center at Ho Chi Minh City Television (HTV), he uniquely bridges traditional Vietnamese business practices with Industry 4.0 innovations.
|
| 9 |
+
|
| 10 |
+
==================================================
|
| 11 |
+
DUAL LEADERSHIP ROLES
|
| 12 |
+
==================================================
|
| 13 |
+
|
| 14 |
+
FOUNDER & CEO - DIGITIZEDBRAINS (2023-Present)
|
| 15 |
+
Leading Vietnam's premier AI Agent and digital transformation consultancy, achieving 280% average ROI for clients through intelligent process automation, machine learning, and data-driven business solutions.
|
| 16 |
+
|
| 17 |
+
DEPUTY DIRECTOR - HTV BROADCASTING TRANSMISSION CENTER (2018-Present)
|
| 18 |
+
Overseeing technical transmission operations for a major Vietnamese broadcaster while pioneering digital transformation initiatives in media technology and AI implementation strategies.
|
| 19 |
+
|
| 20 |
+
==================================================
|
| 21 |
+
CORE EXPERTISE & SPECIALIZATIONS
|
| 22 |
+
==================================================
|
| 23 |
+
|
| 24 |
+
AI & AUTOMATION TECHNOLOGIES
|
| 25 |
+
• AI Agents development and implementation
|
| 26 |
+
• Robotic Process Automation (RPA) with 99% accuracy
|
| 27 |
+
• Machine Learning and Predictive Analytics
|
| 28 |
+
• Natural Language Processing for the Vietnamese market
|
| 29 |
+
• Computer Vision and document processing automation
|
| 30 |
+
|
| 31 |
+
DIGITAL TRANSFORMATION LEADERSHIP
|
| 32 |
+
• Business process digitization and optimization
|
| 33 |
+
• System integration across ERP, CRM, IoT platforms
|
| 34 |
+
• Digital ecosystem development and connectivity
|
| 35 |
+
• Change management and organizational transformation
|
| 36 |
+
• Technology infrastructure modernization
|
| 37 |
+
|
| 38 |
+
BROADCASTING & MEDIA TECHNOLOGY
|
| 39 |
+
• Advanced broadcasting systems and transmission technology
|
| 40 |
+
• Large-scale technical infrastructure management
|
| 41 |
+
• Technical investment planning and implementation
|
| 42 |
+
• Media production and distribution systems
|
| 43 |
+
• 24/7 operational continuity and service excellence
|
| 44 |
+
|
| 45 |
+
BUSINESS & STRATEGIC PLANNING
|
| 46 |
+
• Economics and business strategy (Master's degree)
|
| 47 |
+
• Technical investment analysis and ROI optimization
|
| 48 |
+
• Public policy development for media and technology
|
| 49 |
+
• Cross-functional team leadership and management
|
| 50 |
+
• Stakeholder relationship management and client success
|
| 51 |
+
|
| 52 |
+
==================================================
|
| 53 |
+
EDUCATIONAL FOUNDATION
|
| 54 |
+
==================================================
|
| 55 |
+
|
| 56 |
+
MASTER OF ECONOMICS | University of Economics Ho Chi Minh City
|
| 57 |
+
Advanced studies in economics, business strategy, digital economy, and technology policy development.
|
| 58 |
+
|
| 59 |
+
BACHELOR OF ECONOMICS | University of Economics Ho Chi Minh City
|
| 60 |
+
Foundation in economic theory, business administration, and policy formulation.
|
| 61 |
+
|
| 62 |
+
BACHELOR OF ENGINEERING | Electronics & Telecommunications
|
| 63 |
+
Technical expertise in electronics, telecommunications, and broadcasting technology systems.
|
| 64 |
+
|
| 65 |
+
CONTINUOUS SPECIALIZATIONS
|
| 66 |
+
• AI Agents and Machine Learning Applications
|
| 67 |
+
• Digital Transformation Strategy and Implementation
|
| 68 |
+
• Process Automation and RPA Development
|
| 69 |
+
• Public Policy in Media & Technology Sectors
|
| 70 |
+
• Advanced Broadcasting and Transmission Technology
|
| 71 |
+
|
| 72 |
+
==================================================
|
| 73 |
+
PROFESSIONAL JOURNEY & ACHIEVEMENTS
|
| 74 |
+
==================================================
|
| 75 |
+
|
| 76 |
+
24+ YEARS PROGRESSIVE TECHNOLOGY LEADERSHIP
|
| 77 |
+
• 6 years: Electronics Engineer & Technical Investment Planning Specialist
|
| 78 |
+
• 12 years: Electrical Systems Management across HTV operations
|
| 79 |
+
• 6 years: Technical Transmission Management (current Deputy Director role)
|
| 80 |
+
• 1+ years: DigitizedBrains founder leading AI transformation initiatives
|
| 81 |
+
|
| 82 |
+
KEY ACCOMPLISHMENTS
|
| 83 |
+
• Founded DigitizedBrains, achieving 280% average client ROI within 8-12 months
|
| 84 |
+
• Implemented automation solutions reducing operational costs by 30-80%
|
| 85 |
+
• Achieved 90% processing time reduction with 99% task execution accuracy
|
| 86 |
+
• Successfully automated processes for 45+ Vietnamese enterprises
|
| 87 |
+
• Led digital transformation in the broadcasting industry at national scale
|
| 88 |
+
• Developed public policies for technology adoption in the media sector
|
| 89 |
+
|
| 90 |
+
==================================================
|
| 91 |
+
INDUSTRY IMPACT & MARKET LEADERSHIP
|
| 92 |
+
==================================================
|
| 93 |
+
|
| 94 |
+
VIETNAM MARKET EXPERTISE
|
| 95 |
+
Deep understanding of Vietnamese business culture, regulatory environment, and market dynamics across manufacturing, financial services, healthcare, retail, and media industries.
|
| 96 |
+
|
| 97 |
+
PROVEN CLIENT SUCCESS
|
| 98 |
+
• Manufacturing: 95% reduction in inspection time with AI quality control
|
| 99 |
+
• Financial Services: 80% faster loan approval with automated processing
|
| 100 |
+
• Healthcare: 70% reduction in scheduling conflicts through intelligent automation
|
| 101 |
+
• Retail: 50% reduction in stockouts via AI-powered demand forecasting
|
| 102 |
+
|
| 103 |
+
THOUGHT LEADERSHIP
|
| 104 |
+
Pioneering AI Agent implementation for Vietnamese enterprises while maintaining cultural alignment and regulatory compliance for sustainable business transformation.
|
| 105 |
+
|
| 106 |
+
==================================================
|
| 107 |
+
TECHNICAL CAPABILITIES & INNOVATION
|
| 108 |
+
==================================================
|
| 109 |
+
|
| 110 |
+
AI & MACHINE LEARNING
|
| 111 |
+
Advanced implementation of artificial intelligence, machine learning algorithms, predictive analytics, and intelligent decision support systems for business optimization.
|
| 112 |
+
|
| 113 |
+
PROCESS AUTOMATION
|
| 114 |
+
Expert development and deployment of RPA solutions, workflow automation, document processing, and intelligent business process management.
|
| 115 |
+
|
| 116 |
+
SYSTEM INTEGRATION
|
| 117 |
+
Comprehensive experience in ERP, CRM, IoT integration, API development, cloud migration, and hybrid infrastructure solutions.
|
| 118 |
+
|
| 119 |
+
DATA ANALYTICS & BUSINESS INTELLIGENCE
|
| 120 |
+
Multi-dimensional data analysis, real-time monitoring, dashboard development, and implementation of data-driven decision-making frameworks.
|
| 121 |
+
|
| 122 |
+
==================================================
|
| 123 |
+
MISSION & VISION ALIGNMENT
|
| 124 |
+
==================================================
|
| 125 |
+
|
| 126 |
+
PERSONAL MISSION
|
| 127 |
+
To bridge the gap between traditional Vietnamese business practices and cutting-edge AI technologies, enabling sustainable digital transformation that preserves cultural values while driving innovation and growth.
|
| 128 |
+
|
| 129 |
+
DIGITIZEDBRAINS VISION
|
| 130 |
+
To establish Vietnam as a regional leader in AI Agent implementation and digital transformation, empowering enterprises to compete globally while maintaining local market strength.
|
| 131 |
+
|
| 132 |
+
INDUSTRY TRANSFORMATION GOAL
|
| 133 |
+
To accelerate Vietnam's transition to Industry 4.0 through practical AI solutions that deliver measurable ROI and sustainable competitive advantage.
|
| 134 |
+
|
| 135 |
+
==================================================
|
| 136 |
+
CONTACT & ENGAGEMENT
|
| 137 |
+
==================================================
|
| 138 |
+
|
| 139 |
+
Email: ai.agent.tailieu@gmail.com
|
| 140 |
+
LinkedIn: www.linkedin.com/in/ducnguyen-68b9a8370
|
| 141 |
+
Location: Ho Chi Minh City, Vietnam
|
| 142 |
+
Company: DigitizedBrains - AI Agents & Digital Transformation
|
| 143 |
+
|
| 144 |
+
Available for: Strategic consulting, digital transformation projects, AI implementation, technology leadership, and public-private partnerships in Vietnam's digital economy development.
|
requirements.txt
ADDED
|
@@ -0,0 +1,5 @@
|
| 1 |
+
python-dotenv
|
| 2 |
+
google-generativeai
|
| 3 |
+
gradio
|
| 4 |
+
requests
|
| 5 |
+
pypdf
|