Project A: Intelligent Retail Automation Agent (Level 3)
Project A is an autonomous, on-premise AI agent designed for the Vietnamese Retail Industry. Unlike standard chatbots, it functions as a Digital Employee capable of:
Consulting: Providing business advice using context-aware memory and RAG.
Managing: Interfacing with SaaS data (Sales, Inventory, CRM).
Building: Autonomously generating technical automation workflows (Make.com/native JSON) from natural-language requests.
Built to run efficiently on a single L4 GPU (24GB VRAM) using a Unified 8-bit Model Architecture.
🏗️ Architecture Overview
The system utilizes a Single-Model, Multi-Persona architecture. Instead of loading multiple models, we use one high-performance model (Qwen-2.5-Coder-14B-Instruct) and dynamically swap system prompts to alter its behavior (Manager, Coder, Researcher).
System Logic Flow (Visualization)
```mermaid
graph TD
    User[User Input] --> Main["Main Orchestrator (main.py)"]

    subgraph Context_Layer
        Mem[(SQLite Memory)]
        SaaS[SaaS API Mock]
        Tools[Retail Tools]
    end

    Main -->|Fetch History & Profile| Mem
    Main -->|Health Check| Tools
    Main -->|Send Context| Manager

    subgraph The_Brain_Qwen_14B
        Manager{Manager Agent}
        Coder[Coder Agent]
        Researcher[Researcher Agent]
    end

    Manager -->|Analyze Intent| Router{Router Decision}

    %% Branch 1: General Business
    Router -->|Intent: GENERAL / DATA| SaaS
    SaaS -->|Return Sales/Stock| Manager
    Manager -->|Consult & Advise| Output[Final Response]

    %% Branch 2: Marketing
    Router -->|Intent: MARKETING| Manager
    Manager -->|Write Copy| Output

    %% Branch 3: Technical Automation
    Router -->|Intent: TECHNICAL| Coder
    Coder -->|Generate JSON Blueprint| Validator[Syntax Check]
    Validator -->|Valid JSON| Integrations[Integration Manager]
    Integrations -->|Save to DB| Workflows[(Workflows Table)]
    Workflows --> Output
```

📂 File Structure & Components

Apart from the entry point (main.py) and requirements.txt at the root, the code lives entirely within the src/ directory for portability.

```
ProjectA/
├── main.py                  # Entry point. Handles the event loop, login, and UI output.
├── requirements.txt         # Python dependencies.
├── src/
│   ├── data/                # Storage
│   │   ├── project_a.db     # SQLite DB (chat history, users, stores, sales, workflows).
│   │   ├── docs/            # RAG documents (PDF/TXT) for the knowledge base.
│   │   └── blueprints/      # JSON reference samples for the Coder to learn from.
│   │
│   ├── core/                # The Nervous System
│   │   ├── config.py        # Settings (model ID, paths, quantization params).
│   │   ├── engine.py        # Model loader. Enforces the Singleton pattern (loads Qwen once).
│   │   ├── memory.py        # Database manager. Handles context injection & history.
│   │   ├── context.py       # Login logic. Handles multi-store ambiguity resolution.
│   │   ├── saas_api.py      # Mock API simulating KiotViet/Sapo (sales, inventory).
│   │   ├── tools.py         # Deterministic utilities (math, lunar calendar, health check).
│   │   ├── integrations.py  # Deployment handler (saves JSON to DB / mock social post).
│   │   └── prompts.py       # Prompt library. Contains persona definitions & few-shot examples.
│   │
│   └── agents/              # The Personas (all powered by Qwen-14B)
│       ├── base.py          # Abstract wrapper for LLM inference.
│       ├── manager.py       # THE BRAIN. Routes tasks, handles chat, intent classification.
│       ├── coder.py         # THE BUILDER. Generates strict JSON automation blueprints.
│       └── researcher.py    # THE EYES. Performs web searches (DuckDuckGo).
```

🧩 Detailed Component Roles

1. The Core Engines

engine.py: Loads Qwen-2.5-Coder-14B-Instruct in 8-bit quantization. This fits the model into ~16GB VRAM, leaving ~8GB buffer for long context windows (Chat History + RAG).
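A minimal sketch of the singleton behavior engine.py enforces. The `lru_cache` wrapper and deferred imports are one reasonable way to do it, not necessarily the file's exact implementation:

```python
from functools import lru_cache

@lru_cache(maxsize=1)  # process-wide singleton: the body runs at most once
def get_engine(model_id: str = "Qwen/Qwen2.5-Coder-14B-Instruct"):
    # Heavy imports are deferred so importing this module stays cheap.
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant = BitsAndBytesConfig(load_in_8bit=True)  # ~1 byte/param, so ~14-16 GB for a 14B model
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant,
        device_map="auto",  # let accelerate place layers on the L4 GPU
    )
    return tokenizer, model
```

Every agent then calls `get_engine()`; only the first call pays the loading cost.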

memory.py: Not just a logger. It actively seeds "Mock Data" (sales, users) so the system feels alive immediately, and it formats the last N turns of conversation for the Manager to ensure continuity.
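The history-formatting step might look like this; the `chat_history` table layout is an assumption, since the README does not spell out the schema:

```python
import sqlite3

def format_history(conn: sqlite3.Connection, user_id: int, n: int = 5) -> str:
    """Return the last n turns, oldest first, ready for prompt injection."""
    rows = conn.execute(
        "SELECT role, content FROM chat_history "
        "WHERE user_id = ? ORDER BY id DESC LIMIT ?",
        (user_id, n),
    ).fetchall()
    # Query newest-first for the LIMIT, then reverse for chronological order.
    return "\n".join(f"{role}: {content}" for role, content in reversed(rows))

# Demo on an in-memory DB (production would open src/data/project_a.db)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chat_history (id INTEGER PRIMARY KEY, user_id INT, role TEXT, content TEXT)"
)
conn.executemany(
    "INSERT INTO chat_history (user_id, role, content) VALUES (?, ?, ?)",
    [(1, "user", "Doanh thu?"), (1, "assistant", "3.700.000 VND")],
)
print(format_history(conn, 1))
```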

saas_api.py: Acts as the bridge to your business data. Currently returns mock data, but designed to be replaced with requests.get() to your real Backend API.
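The mock-to-real swap could look like the sketch below; the base URL and the response shape are placeholders, not the actual KiotViet/Sapo API:

```python
class SaasAPI:
    """Mock of a KiotViet/Sapo-style backend. Each method body is the one
    place to swap in a real HTTP call later."""

    BASE_URL = "https://example.invalid/api"  # placeholder backend URL

    def get_sales_report(self, store_id: int) -> dict:
        # Real version would be something like:
        #   import requests
        #   return requests.get(f"{self.BASE_URL}/stores/{store_id}/sales").json()
        return {"store_id": store_id, "revenue_vnd": 3_700_000, "orders": 12}
```

Because callers only see the method signature, replacing mock bodies with `requests.get()` calls requires no changes elsewhere in the agent.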

2. The Agents

Manager (manager.py):

Smart Routing: Distinguishes "I want to automate" (vague -> ask a clarifying question) from "Automate email on new order" (specific -> call the Coder).

Contextual Glue: Injects Store Name, Industry, and Time into the system prompt so answers are always relevant.
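In the real system the LLM itself produces the intent label; the deterministic stand-in below only illustrates the routing contract the Manager is prompted to follow. The keyword lists are invented for the sketch:

```python
def route(user_msg: str) -> str:
    """Toy classifier mirroring the Manager's routing contract.
    Production asks the Qwen model for this label instead of using keywords."""
    msg = user_msg.lower()
    has_trigger = any(k in msg for k in ("khi", "when", "on new"))
    has_action = any(k in msg for k in ("gửi", "send", "email", "post"))
    if "tự động" in msg or "automate" in msg:
        # A specific automation request names both a trigger and an action;
        # otherwise the Manager should ask a clarifying question first.
        return "CODER" if (has_trigger and has_action) else "ASK_CLARIFY"
    if any(k in msg for k in ("doanh thu", "revenue", "tồn kho", "stock")):
        return "DATA_INTERNAL"
    return "GENERAL"
```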

Coder (coder.py):

Registry Aware: Uses a library of "Golden Templates" (in prompts.py or JSON) to ensure generated Make.com blueprints use the correct internal IDs and parameter names.

Researcher (researcher.py):

Summarizes web search results into Vietnamese business insights.

3. The Deployment Layer

integrations.py:

Instead of just printing code, it saves the generated Workflow JSON into the workflows table in SQLite.

Simulates the "Save & Activate" flow of a real SaaS platform.
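Persisting a validated blueprint then reduces to a single INSERT; the `workflows` column names below are a guess at the minimum needed, not the exact schema:

```python
import json
import sqlite3

def save_workflow(conn: sqlite3.Connection, name: str, blueprint: dict) -> int:
    """Insert the blueprint and return its row ID (what the agent echoes
    back to the user as 'Workflow ID: ...')."""
    cur = conn.execute(
        "INSERT INTO workflows (name, blueprint_json, active) VALUES (?, ?, 1)",
        (name, json.dumps(blueprint)),
    )
    conn.commit()
    return cur.lastrowid

# Demo with an in-memory DB standing in for project_a.db
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE workflows (id INTEGER PRIMARY KEY, name TEXT, blueprint_json TEXT, active INT)"
)
wf_id = save_workflow(conn, "thanks-mail", {"trigger": "order.created", "modules": []})
```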

🚀 Deployment Instructions

1. Hardware Requirements

GPU: NVIDIA GPU with 24GB VRAM minimum (Recommended: L4, A10G, RTX 3090/4090).

RAM: 16GB System RAM.

Disk: 50GB free space (Model weights are large).

2. Environment Setup

It is recommended to use conda or a virtual environment.

```bash
# 1. Create environment
conda create -n project_a python=3.10
conda activate project_a

# 2. Install dependencies
pip install -r requirements.txt
```
requirements.txt content (note: sqlite3 ships with Python's standard library and must not be listed as a pip dependency):

```
torch
transformers
accelerate
bitsandbytes
duckduckgo-search
sentence-transformers
lunardate
protobuf
```
3. Running the Agent
The project is self-contained. The first run will automatically:
Download the Qwen-14B model weights (a multi-gigabyte download; the 50GB disk requirement above allows for it).
Initialize the SQLite database.
Seed mock data (User: Nguyen Van A).

```bash
python main.py
```
🧪 Testing the Capabilities
Once the system shows ✅ Ready, try these scenarios:
Scenario A: Business Intelligence (Data + Context)
Input: "Hôm nay doanh thu thế nào?" (How is revenue today?)
Logic: Manager detects DATA_INTERNAL -> Calls saas_api.get_sales_report -> Formats response.
Output: "Doanh thu hôm nay của BabyWorld là 3.700.000 VND..." (Today's revenue for BabyWorld is 3,700,000 VND...)
Scenario B: Contextual Advice (Lunar Tool + Profile)
Input: "Hôm nay là ngày bao nhiêu âm? Có nên khuyến mãi không?" (What is today's lunar date? Should we run a promotion?)
Logic: Manager calls RetailTools.get_lunar_date -> Checks Profile (Baby Store) -> Suggests advice.
Output: "Hôm nay là 15 Âm lịch... Nên chạy chương trình nhẹ nhàng..." (Today is the 15th of the lunar month... A gentle campaign is advisable...)
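The advice in Scenario B splits into a library lookup plus a pure decision rule. The lunardate call is shown in a comment, and the 1st/15th rule is an illustrative assumption about what RetailTools encodes:

```python
def promo_advice(lunar_day: int) -> str:
    """Illustrative rule: the 1st and 15th of the lunar month are
    tradition-sensitive shopping days in Vietnam, so keep promotions gentle."""
    # Obtaining lunar_day with the lunardate package (listed in requirements.txt):
    #   from lunardate import LunarDate
    #   lunar_day = LunarDate.fromSolarDate(2025, 1, 14).day
    if lunar_day in (1, 15):
        return "light promotion"
    return "regular campaign"
```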
Scenario C: Automation Building (The "Meta-Agent")
Input: "Tự động gửi email cảm ơn khi có đơn hàng mới." (Automatically send a thank-you email when there is a new order.)
Logic: Manager detects TECHNICAL (Specific) -> Coder generates JSON -> Integrations saves to DB.
Output: "Đã thiết kế xong. Workflow ID: 5. ✅ ĐÃ LƯU THÀNH CÔNG." (Design complete. Workflow ID: 5. ✅ SAVED SUCCESSFULLY.)
🔮 Future Roadmap (Beyond Phase 22)
Real API Hookup: Replace saas_api.py methods with real SQL queries to your Postgres/MySQL production DB.

Frontend Integration: Connect this Python backend to your Website via FastAPI.

User types in the web chat -> FastAPI forwards the message to main.py -> the agent returns text/JSON.

Vision Support: Upgrade to Qwen-VL to allow users to upload photos of invoices or products for auto-entry.