Multi_Agent_Model / src /README.md
sonthaiha
Fresh Deployment with LFS
1804a7a
Project A: Intelligent Retail Automation Agent (Level 3)
Project A is an autonomous, on-premise AI agent designed for the Vietnamese Retail Industry. Unlike standard chatbots, it functions as a Digital Employee capable of:
Consulting: Providing business advice using context-aware memory and RAG.
Managing: Interfacing with SaaS data (Sales, Inventory, CRM).
Building: Autonomous generation of technical automation workflows (Make.com/Native JSON) based on natural language requests.
Built to run efficiently on a single L4 GPU (24GB VRAM) using a Unified 8-bit Model Architecture.
๐Ÿ—๏ธ Architecture Overview
The system utilizes a Single-Model, Multi-Persona architecture. Instead of loading multiple models, we use one high-performance model (Qwen-2.5-Coder-14B-Instruct) and dynamically swap system prompts to alter its behavior (Manager, Coder, Researcher).
System Logic Flow (Visualization)
code
Mermaid
graph TD
User[User Input] --> Main[Main Orchestrator (main.py)]
subgraph Context_Layer
Mem[(SQLite Memory)]
SaaS[SaaS API Mock]
Tools[Retail Tools]
end
Main -->|Fetch History & Profile| Mem
Main -->|Health Check| Tools
Main -->|Send Context| Manager
subgraph The_Brain_Qwen_14B
Manager{Manager Agent}
Coder[Coder Agent]
Researcher[Researcher Agent]
end
Manager -->|Analyze Intent| Router{Router Decision}
%% Branch 1: General Business
Router -->|Intent: GENERAL / DATA| SaaS
SaaS -->|Return Sales/Stock| Manager
Manager -->|Consult & Advise| Output[Final Response]
%% Branch 2: Marketing
Router -->|Intent: MARKETING| Manager
Manager -->|Write Copy| Output
%% Branch 3: Technical Automation
Router -->|Intent: TECHNICAL| Coder
Coder -->|Generate JSON Blueprint| Validator[Syntax Check]
Validator -->|Valid JSON| Integrations[Integration Manager]
Integrations -->|Save to DB| Workflows[(Workflows Table)]
Workflows --> Output
๐Ÿ“‚ File Structure & Components
The project is contained entirely within the src/ directory for portability.
code
Text
ProjectA/
โ”œโ”€โ”€ main.py # Entry point. Handles the Event Loop, Login, and UI output.
โ”œโ”€โ”€ requirements.txt # Python dependencies.
โ”œโ”€โ”€ src/
โ”‚ โ”œโ”€โ”€ data/ # Storage
โ”‚ โ”‚ โ”œโ”€โ”€ project_a.db # SQLite DB (Chat History, Users, Stores, Sales, Workflows).
โ”‚ โ”‚ โ”œโ”€โ”€ docs/ # RAG Documents (PDF/TXT) for knowledge base.
โ”‚ โ”‚ โ””โ”€โ”€ blueprints/ # JSON Reference samples for the Coder to learn from.
โ”‚ โ”‚
โ”‚ โ”œโ”€โ”€ core/ # The Nervous System
โ”‚ โ”‚ โ”œโ”€โ”€ config.py # Settings (Model ID, Paths, Quantization Params).
โ”‚ โ”‚ โ”œโ”€โ”€ engine.py # Model Loader. Enforces Singleton pattern (loads Qwen once).
โ”‚ โ”‚ โ”œโ”€โ”€ memory.py # Database Manager. Handles Context injection & History.
โ”‚ โ”‚ โ”œโ”€โ”€ context.py # Login Logic. Handles Multi-store ambiguity resolution.
โ”‚ โ”‚ โ”œโ”€โ”€ saas_api.py # Mock API simulating KiotViet/Sapo (Sales, Inventory).
โ”‚ โ”‚ โ”œโ”€โ”€ tools.py # Deterministic Utilities (Math, Lunar Calendar, Health Check).
โ”‚ โ”‚ โ”œโ”€โ”€ integrations.py # Deployment Handler (Saves JSON to DB / Mock Social Post).
โ”‚ โ”‚ โ””โ”€โ”€ prompts.py # Prompt Library. Contains Persona definitions & Few-Shot examples.
โ”‚ โ”‚
โ”‚ โ””โ”€โ”€ agents/ # The Personas (All powered by Qwen-14B)
โ”‚ โ”œโ”€โ”€ base.py # Abstract wrapper for LLM inference.
โ”‚ โ”œโ”€โ”€ manager.py # THE BRAIN. Routes tasks, handles chat, intent classification.
โ”‚ โ”œโ”€โ”€ coder.py # THE BUILDER. Generates strict JSON automation blueprints.
โ”‚ โ””โ”€โ”€ researcher.py # THE EYES. Performs web searches (DuckDuckGo).
๐Ÿงฉ Detailed Component Roles
1. The Core Engines
engine.py: Loads Qwen-2.5-Coder-14B-Instruct in 8-bit quantization. This fits the model into ~16GB VRAM, leaving ~8GB buffer for long context windows (Chat History + RAG).
memory.py: Not just a logger. It actively seeds "Mock Data" (Sales, Users) so the system feels alive immediately. It formats the last
N
N
turns of conversation for the Manager to ensure continuity.
saas_api.py: Acts as the bridge to your business data. Currently returns mock data, but designed to be replaced with requests.get() to your real Backend API.
2. The Agents
Manager (manager.py):
Smart Routing: Distinguishes between "I want to automate" (Vague -> Ask Question) vs "Automate email on new order" (Specific -> Call Coder).
Contextual Glue: Injects Store Name, Industry, and Time into the system prompt so answers are always relevant.
Coder (coder.py):
Registry Aware: Uses a library of "Golden Templates" (in prompts.py or JSON) to ensure generated Make.com blueprints use the correct internal IDs and parameter names.
Researcher (researcher.py):
Summarizes web search results into Vietnamese business insights.
3. The Deployment Layer
integrations.py:
Instead of just printing code, it saves the generated Workflow JSON into the workflows table in SQLite.
Simulates the "Save & Activate" flow of a real SaaS platform.
๐Ÿš€ Deployment Instructions
1. Hardware Requirements
GPU: NVIDIA GPU with 24GB VRAM minimum (Recommended: L4, A10g, RTX 3090/4090).
RAM: 16GB System RAM.
Disk: 50GB free space (Model weights are large).
2. Environment Setup
It is recommended to use conda or a virtual environment.
code
Bash
# 1. Create Environment
conda create -n project_a python=3.10
conda activate project_a
# 2. Install Dependencies
pip install -r requirements.txt
requirements.txt content:
code
Text
torch
transformers
accelerate
bitsandbytes
duckduckgo-search
sentence-transformers
sqlite3
lunardate
protobuf
3. Running the Agent
The project is self-contained. The first run will automatically:
Download the Qwen-14B model (approx. 9-10GB).
Initialize the SQLite Database.
Seed mock data (User: Nguyen Van A).
code
Bash
python src/main.py
๐Ÿงช Testing the Capabilities
Once the system shows โœ… Ready, try these scenarios:
Scenario A: Business Intelligence (Data + Context)
Input: "Hรดm nay doanh thu thแบฟ nร o?" (How is revenue today?)
Logic: Manager detects DATA_INTERNAL -> Calls saas_api.get_sales_report -> Formats response.
Output: "Doanh thu hรดm nay cแปงa BabyWorld lร  3.700.000 VND..."
Scenario B: Contextual Advice (Lunar Tool + Profile)
Input: "Hรดm nay lร  ngร y bao nhiรชu รขm? Cรณ nรชn khuyแบฟn mรฃi khรดng?"
Logic: Manager calls RetailTools.get_lunar_date -> Checks Profile (Baby Store) -> Suggests advice.
Output: "Hรดm nay lร  15 ร‚m lแป‹ch... Nรชn chแบกy chฦฐฦกng trรฌnh nhแบน nhร ng..."
Scenario C: Automation Building (The "Meta-Agent")
Input: "Tแปฑ ฤ‘แป™ng gแปญi email cแบฃm ฦกn khi cรณ ฤ‘ฦกn hร ng mแป›i."
Logic: Manager detects TECHNICAL (Specific) -> Coder generates JSON -> Integrations saves to DB.
Output: "ฤรฃ thiแบฟt kแบฟ xong. Workflow ID: 5. โœ… ฤรƒ LฦฏU THร€NH Cร”NG."
๐Ÿ”ฎ Future Roadmap (Beyond Phase 22)
Real API Hookup: Replace saas_api.py methods with real SQL queries to your Postgres/MySQL production DB.
Frontend Integration: Connect this Python backend to your Website via FastAPI.
User types in Web Chat -> FastAPI sends to main.py -> Agent Returns Text/JSON.
Vision Support: Upgrade to Qwen-VL to allow users to upload photos of invoices or products for auto-entry.