Project A: Intelligent Retail Automation Agent (Level 3)
Project A is an autonomous, on-premise AI agent designed for the Vietnamese Retail Industry. Unlike standard chatbots, it functions as a Digital Employee capable of:
Consulting: Providing business advice using context-aware memory and RAG.
Managing: Interfacing with SaaS data (Sales, Inventory, CRM).
Building: Autonomous generation of technical automation workflows (Make.com/Native JSON) based on natural language requests.
Built to run efficiently on a single L4 GPU (24GB VRAM) using a Unified 8-bit Model Architecture.
🏗️ Architecture Overview
The system utilizes a Single-Model, Multi-Persona architecture. Instead of loading multiple models, we use one high-performance model (Qwen-2.5-Coder-14B-Instruct) and dynamically swap system prompts to alter its behavior (Manager, Coder, Researcher).
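The prompt-swap mechanism can be sketched as below. The persona texts and the `build_messages` helper are illustrative stand-ins, not the actual contents of `prompts.py`; only the idea (one model, system prompt changed per call) comes from the architecture above.

```python
# Hypothetical persona definitions -- the real ones live in src/core/prompts.py.
PERSONAS = {
    "manager": "You are the Manager: route tasks and consult on retail business.",
    "coder": "You are the Coder: output strict JSON automation blueprints only.",
    "researcher": "You are the Researcher: summarize web results as business insights.",
}

def build_messages(persona, user_input, history=None):
    """One model, many behaviors: only the system prompt differs per call."""
    messages = [{"role": "system", "content": PERSONAS[persona]}]
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_input})
    return messages

# The same Qwen instance serves every persona:
msgs = build_messages("coder", "Automate a thank-you email on new orders.")
```

Because the weights are loaded once, switching personas costs nothing but a different system message in the chat template.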
System Logic Flow (Visualization)
```mermaid
graph TD
    User[User Input] --> Main["Main Orchestrator (main.py)"]

    subgraph Context_Layer
        Mem[(SQLite Memory)]
        SaaS[SaaS API Mock]
        Tools[Retail Tools]
    end

    Main -->|Fetch History & Profile| Mem
    Main -->|Health Check| Tools
    Main -->|Send Context| Manager

    subgraph The_Brain_Qwen_14B
        Manager{Manager Agent}
        Coder[Coder Agent]
        Researcher[Researcher Agent]
    end

    Manager -->|Analyze Intent| Router{Router Decision}

    %% Branch 1: General Business
    Router -->|Intent: GENERAL / DATA| SaaS
    SaaS -->|Return Sales/Stock| Manager
    Manager -->|Consult & Advise| Output[Final Response]

    %% Branch 2: Marketing
    Router -->|Intent: MARKETING| Manager
    Manager -->|Write Copy| Output

    %% Branch 3: Technical Automation
    Router -->|Intent: TECHNICAL| Coder
    Coder -->|Generate JSON Blueprint| Validator[Syntax Check]
    Validator -->|Valid JSON| Integrations[Integration Manager]
    Integrations -->|Save to DB| Workflows[(Workflows Table)]
    Workflows --> Output
```
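Branch 3's Syntax Check step amounts to a `json.loads` guard before anything touches the database. A minimal sketch, assuming the validator also strips the markdown fences LLMs tend to wrap JSON in (the `validate_blueprint` name and fence-stripping heuristic are illustrative, not the project's actual code):

```python
import json

def validate_blueprint(raw: str):
    """Syntax check from branch 3: parse the Coder's output, or reject it.

    LLMs often wrap JSON in markdown fences, so strip those first.
    Returns the parsed dict, or None if the output is not valid JSON.
    """
    text = raw.strip()
    if text.startswith("```"):
        text = text.strip("`")
        if text.startswith("json"):  # drop an optional language tag
            text = text[4:]
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

Only a non-`None` result is handed to the Integration Manager; on `None`, the Coder can be re-prompted with the parse error.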
📂 File Structure & Components
The project is contained entirely within the src/ directory for portability.
```text
ProjectA/
├── main.py                  # Entry point. Handles the Event Loop, Login, and UI output.
├── requirements.txt         # Python dependencies.
├── src/
│   ├── data/                # Storage
│   │   ├── project_a.db     # SQLite DB (Chat History, Users, Stores, Sales, Workflows).
│   │   ├── docs/            # RAG Documents (PDF/TXT) for knowledge base.
│   │   └── blueprints/      # JSON Reference samples for the Coder to learn from.
│   │
│   ├── core/                # The Nervous System
│   │   ├── config.py        # Settings (Model ID, Paths, Quantization Params).
│   │   ├── engine.py        # Model Loader. Enforces Singleton pattern (loads Qwen once).
│   │   ├── memory.py        # Database Manager. Handles Context injection & History.
│   │   ├── context.py       # Login Logic. Handles Multi-store ambiguity resolution.
│   │   ├── saas_api.py      # Mock API simulating KiotViet/Sapo (Sales, Inventory).
│   │   ├── tools.py         # Deterministic Utilities (Math, Lunar Calendar, Health Check).
│   │   ├── integrations.py  # Deployment Handler (Saves JSON to DB / Mock Social Post).
│   │   └── prompts.py       # Prompt Library. Persona definitions & Few-Shot examples.
│   │
│   └── agents/              # The Personas (All powered by Qwen-14B)
│       ├── base.py          # Abstract wrapper for LLM inference.
│       ├── manager.py       # THE BRAIN. Routes tasks, handles chat, intent classification.
│       ├── coder.py         # THE BUILDER. Generates strict JSON automation blueprints.
│       └── researcher.py    # THE EYES. Performs web searches (DuckDuckGo).
```
🧩 Detailed Component Roles
1. The Core Engines
engine.py: Loads Qwen-2.5-Coder-14B-Instruct in 8-bit quantization. This fits the model into ~16GB VRAM, leaving ~8GB buffer for long context windows (Chat History + RAG).
memory.py: Not just a logger. It actively seeds "Mock Data" (Sales, Users) so the system feels alive immediately, and it formats the last N turns of conversation for the Manager to ensure continuity.
saas_api.py: Acts as the bridge to your business data. Currently returns mock data, but designed to be replaced with requests.get() to your real Backend API.
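The last-N-turns behavior described for memory.py can be sketched with the stdlib `sqlite3` module. The `chat_history` table and column names here are assumptions for illustration; the real schema lives in `project_a.db`:

```python
import sqlite3

def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) chat history table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chat_history "
        "(id INTEGER PRIMARY KEY, role TEXT, content TEXT)"
    )

def log_turn(conn: sqlite3.Connection, role: str, content: str) -> None:
    conn.execute(
        "INSERT INTO chat_history (role, content) VALUES (?, ?)", (role, content)
    )

def last_n_turns(conn: sqlite3.Connection, n: int = 5):
    """Fetch the most recent n turns, returned in chronological order
    so they can be spliced directly into the Manager's prompt."""
    rows = conn.execute(
        "SELECT role, content FROM chat_history ORDER BY id DESC LIMIT ?", (n,)
    ).fetchall()
    return rows[::-1]
```

Selecting `DESC LIMIT n` and then reversing keeps the query cheap even as the history table grows.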
2. The Agents
Manager (manager.py):
Smart Routing: Distinguishes a vague request ("I want to automate" -> ask a clarifying question) from a specific one ("Automate email on new order" -> call the Coder).
Contextual Glue: Injects Store Name, Industry, and Time into the system prompt so answers are always relevant.
Coder (coder.py):
Registry Aware: Uses a library of "Golden Templates" (in prompts.py or JSON) to ensure generated Make.com blueprints use the correct internal IDs and parameter names.
Researcher (researcher.py):
Summarizes web search results into Vietnamese business insights.
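The Manager's routing contract can be sketched as a small dispatch function. The intent labels follow the diagram and test scenarios (GENERAL, DATA_INTERNAL, MARKETING, TECHNICAL); the `route` function and its return strings are hypothetical, since in the real system the classification itself is done by the LLM:

```python
VALID_INTENTS = {"GENERAL", "DATA_INTERNAL", "MARKETING", "TECHNICAL"}

def route(intent: str, is_specific: bool) -> str:
    """Map the Manager's intent classification to the next component.

    The vague-vs-specific check mirrors the Smart Routing rule above:
    only a specific technical request reaches the Coder.
    """
    if intent not in VALID_INTENTS:
        return "manager:clarify"   # unknown label -> ask the user
    if intent == "TECHNICAL":
        return "coder" if is_specific else "manager:clarify"
    if intent == "DATA_INTERNAL":
        return "saas_api"
    return "manager"               # GENERAL / MARKETING handled in-persona
```

Keeping the dispatch deterministic (plain Python, not the LLM) makes the branch taken for each classification easy to test.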
3. The Deployment Layer
integrations.py:
Instead of just printing code, it saves the generated Workflow JSON into the workflows table in SQLite.
Simulates the "Save & Activate" flow of a real SaaS platform.
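A minimal sketch of the "Save & Activate" step, assuming a simple `workflows` schema (the actual columns in `project_a.db` may differ):

```python
import json
import sqlite3

def save_workflow(conn: sqlite3.Connection, name: str, blueprint: dict) -> int:
    """Persist a validated blueprint and return its Workflow ID.

    The schema here is an assumption: id, name, the blueprint as a JSON
    string, and an 'active' flag to mimic a SaaS activate toggle.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS workflows "
        "(id INTEGER PRIMARY KEY, name TEXT, blueprint TEXT, active INTEGER DEFAULT 1)"
    )
    cur = conn.execute(
        "INSERT INTO workflows (name, blueprint) VALUES (?, ?)",
        (name, json.dumps(blueprint, ensure_ascii=False)),
    )
    return cur.lastrowid
```

The returned rowid is what the agent echoes back to the user as "Workflow ID: ..." in the chat response.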
🚀 Deployment Instructions
1. Hardware Requirements
GPU: NVIDIA GPU with 24GB VRAM minimum (Recommended: L4, A10g, RTX 3090/4090).
RAM: 16GB System RAM.
Disk: 50GB free space (Model weights are large).
2. Environment Setup
It is recommended to use conda or a virtual environment.
```bash
# 1. Create Environment
conda create -n project_a python=3.10
conda activate project_a

# 2. Install Dependencies
pip install -r requirements.txt
```
requirements.txt content (note: sqlite3 ships with Python's standard library and must not be listed as a pip dependency):

```text
torch
transformers
accelerate
bitsandbytes
duckduckgo-search
sentence-transformers
lunardate
protobuf
```
3. Running the Agent
The project is self-contained. The first run will automatically:
Download the Qwen-2.5-Coder-14B-Instruct weights (roughly 28GB in bf16; the 50GB disk requirement above accounts for this).
Initialize the SQLite Database.
Seed mock data (User: Nguyen Van A).
```bash
python src/main.py
```
🧪 Testing the Capabilities
Once the system shows ✅ Ready, try these scenarios:
Scenario A: Business Intelligence (Data + Context)
Input: "Hôm nay doanh thu thế nào?" (How is revenue today?)
Logic: Manager detects DATA_INTERNAL -> Calls saas_api.get_sales_report -> Formats response.
Output: "Doanh thu hôm nay của BabyWorld là 3.700.000 VND..." (Today's revenue for BabyWorld is 3,700,000 VND...)
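The data path in Scenario A can be mocked as below. The function name matches `saas_api.get_sales_report` from the logic trace, but the field names and the `orders` figure are illustrative assumptions; only the revenue mirrors the seeded mock data:

```python
def get_sales_report(store_id: int) -> dict:
    """Mock stand-in for a real SaaS call.

    In production, replace the body with something like
    requests.get(f"{API_BASE}/stores/{store_id}/sales").json(),
    keeping the same return shape so the Manager's formatting is unchanged.
    """
    return {
        "store_id": store_id,
        "revenue_vnd": 3_700_000,  # mirrors the seeded mock data
        "orders": 12,              # illustrative figure
        "currency": "VND",
    }
```

Keeping the mock and the real API behind the same function signature is what makes the later "Real API Hookup" step a drop-in replacement.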
Scenario B: Contextual Advice (Lunar Tool + Profile)
Input: "Hôm nay là ngày bao nhiêu âm? Có nên khuyến mãi không?" (What is today's lunar date? Should I run a promotion?)
Logic: Manager calls RetailTools.get_lunar_date -> Checks Profile (Baby Store) -> Suggests advice.
Output: "Hôm nay là 15 Âm lịch... Nên chạy chương trình nhẹ nhàng..." (Today is the 15th of the lunar month... a light promotion is advisable...)
Scenario C: Automation Building (The "Meta-Agent")
Input: "Tự động gửi email cảm ơn khi có đơn hàng mới." (Automatically send a thank-you email when there is a new order.)
Logic: Manager detects TECHNICAL (Specific) -> Coder generates JSON -> Integrations saves to DB.
Output: "Đã thiết kế xong. Workflow ID: 5. ✅ ĐÃ LƯU THÀNH CÔNG." (Design complete. Workflow ID: 5. ✅ SAVED SUCCESSFULLY.)
🔮 Future Roadmap (Beyond Phase 22)
Real API Hookup: Replace saas_api.py methods with real SQL queries to your Postgres/MySQL production DB.
Frontend Integration: Connect this Python backend to your Website via FastAPI.
User types in the web chat -> FastAPI forwards to main.py -> the agent returns text/JSON.
Vision Support: Upgrade to Qwen-VL to allow users to upload photos of invoices or products for auto-entry. |