Project A: Intelligent Retail Automation Agent (Level 3) Project A is an autonomous, on-premise AI agent designed for the Vietnamese Retail Industry. Unlike standard chatbots, it functions as a Digital Employee capable of: Consulting: Providing business advice using context-aware memory and RAG. Managing: Interfacing with SaaS data (Sales, Inventory, CRM). Building: Autonomous generation of technical automation workflows (Make.com/Native JSON) based on natural language requests. Built to run efficiently on a single L4 GPU (24GB VRAM) using a Unified 8-bit Model Architecture. 🏗️ Architecture Overview The system utilizes a Single-Model, Multi-Persona architecture. Instead of loading multiple models, we use one high-performance model (Qwen-2.5-Coder-14B-Instruct) and dynamically swap system prompts to alter its behavior (Manager, Coder, Researcher). System Logic Flow (Visualization) code Mermaid graph TD User[User Input] --> Main[Main Orchestrator (main.py)] subgraph Context_Layer Mem[(SQLite Memory)] SaaS[SaaS API Mock] Tools[Retail Tools] end Main -->|Fetch History & Profile| Mem Main -->|Health Check| Tools Main -->|Send Context| Manager subgraph The_Brain_Qwen_14B Manager{Manager Agent} Coder[Coder Agent] Researcher[Researcher Agent] end Manager -->|Analyze Intent| Router{Router Decision} %% Branch 1: General Business Router -->|Intent: GENERAL / DATA| SaaS SaaS -->|Return Sales/Stock| Manager Manager -->|Consult & Advise| Output[Final Response] %% Branch 2: Marketing Router -->|Intent: MARKETING| Manager Manager -->|Write Copy| Output %% Branch 3: Technical Automation Router -->|Intent: TECHNICAL| Coder Coder -->|Generate JSON Blueprint| Validator[Syntax Check] Validator -->|Valid JSON| Integrations[Integration Manager] Integrations -->|Save to DB| Workflows[(Workflows Table)] Workflows --> Output 📂 File Structure & Components The project is contained entirely within the src/ directory for portability. code Text ProjectA/ ├── main.py # Entry point. Handles the Event Loop, Login, and UI output. ├── requirements.txt # Python dependencies. ├── src/ │ ├── data/ # Storage │ │ ├── project_a.db # SQLite DB (Chat History, Users, Stores, Sales, Workflows). │ │ ├── docs/ # RAG Documents (PDF/TXT) for knowledge base. │ │ └── blueprints/ # JSON Reference samples for the Coder to learn from. │ │ │ ├── core/ # The Nervous System │ │ ├── config.py # Settings (Model ID, Paths, Quantization Params). │ │ ├── engine.py # Model Loader. Enforces Singleton pattern (loads Qwen once). │ │ ├── memory.py # Database Manager. Handles Context injection & History. │ │ ├── context.py # Login Logic. Handles Multi-store ambiguity resolution. │ │ ├── saas_api.py # Mock API simulating KiotViet/Sapo (Sales, Inventory). │ │ ├── tools.py # Deterministic Utilities (Math, Lunar Calendar, Health Check). │ │ ├── integrations.py # Deployment Handler (Saves JSON to DB / Mock Social Post). │ │ └── prompts.py # Prompt Library. Contains Persona definitions & Few-Shot examples. │ │ │ └── agents/ # The Personas (All powered by Qwen-14B) │ ├── base.py # Abstract wrapper for LLM inference. │ ├── manager.py # THE BRAIN. Routes tasks, handles chat, intent classification. │ ├── coder.py # THE BUILDER. Generates strict JSON automation blueprints. │ └── researcher.py # THE EYES. Performs web searches (DuckDuckGo). 🧩 Detailed Component Roles 1. The Core Engines engine.py: Loads Qwen-2.5-Coder-14B-Instruct in 8-bit quantization. This fits the model into ~16GB VRAM, leaving ~8GB buffer for long context windows (Chat History + RAG). memory.py: Not just a logger. It actively seeds "Mock Data" (Sales, Users) so the system feels alive immediately. It formats the last N N turns of conversation for the Manager to ensure continuity. saas_api.py: Acts as the bridge to your business data. Currently returns mock data, but designed to be replaced with requests.get() to your real Backend API. 2. The Agents Manager (manager.py): Smart Routing: Distinguishes between "I want to automate" (Vague -> Ask Question) vs "Automate email on new order" (Specific -> Call Coder). Contextual Glue: Injects Store Name, Industry, and Time into the system prompt so answers are always relevant. Coder (coder.py): Registry Aware: Uses a library of "Golden Templates" (in prompts.py or JSON) to ensure generated Make.com blueprints use the correct internal IDs and parameter names. Researcher (researcher.py): Summarizes web search results into Vietnamese business insights. 3. The Deployment Layer integrations.py: Instead of just printing code, it saves the generated Workflow JSON into the workflows table in SQLite. Simulates the "Save & Activate" flow of a real SaaS platform. 🚀 Deployment Instructions 1. Hardware Requirements GPU: NVIDIA GPU with 24GB VRAM minimum (Recommended: L4, A10g, RTX 3090/4090). RAM: 16GB System RAM. Disk: 50GB free space (Model weights are large). 2. Environment Setup It is recommended to use conda or a virtual environment. code Bash # 1. Create Environment conda create -n project_a python=3.10 conda activate project_a # 2. Install Dependencies pip install -r requirements.txt requirements.txt content: code Text torch transformers accelerate bitsandbytes duckduckgo-search sentence-transformers sqlite3 lunardate protobuf 3. Running the Agent The project is self-contained. The first run will automatically: Download the Qwen-14B model (approx. 9-10GB). Initialize the SQLite Database. Seed mock data (User: Nguyen Van A). code Bash python src/main.py 🧪 Testing the Capabilities Once the system shows ✅ Ready, try these scenarios: Scenario A: Business Intelligence (Data + Context) Input: "Hôm nay doanh thu thế nào?" (How is revenue today?) Logic: Manager detects DATA_INTERNAL -> Calls saas_api.get_sales_report -> Formats response. Output: "Doanh thu hôm nay của BabyWorld là 3.700.000 VND..." Scenario B: Contextual Advice (Lunar Tool + Profile) Input: "Hôm nay là ngày bao nhiêu âm? Có nên khuyến mãi không?" Logic: Manager calls RetailTools.get_lunar_date -> Checks Profile (Baby Store) -> Suggests advice. Output: "Hôm nay là 15 Âm lịch... Nên chạy chương trình nhẹ nhàng..." Scenario C: Automation Building (The "Meta-Agent") Input: "Tự động gửi email cảm ơn khi có đơn hàng mới." Logic: Manager detects TECHNICAL (Specific) -> Coder generates JSON -> Integrations saves to DB. Output: "Đã thiết kế xong. Workflow ID: 5. ✅ ĐÃ LƯU THÀNH CÔNG." 🔮 Future Roadmap (Beyond Phase 22) Real API Hookup: Replace saas_api.py methods with real SQL queries to your Postgres/MySQL production DB. Frontend Integration: Connect this Python backend to your Website via FastAPI. User types in Web Chat -> FastAPI sends to main.py -> Agent Returns Text/JSON. Vision Support: Upgrade to Qwen-VL to allow users to upload photos of invoices or products for auto-entry.