Why We Built AISA?
A Reference Architecture for Agentic AI Systems
Agentic AI systems are rapidly moving from research prototypes to real-world deployments. Large language models can now reason over multiple steps, call tools, retrieve information, and interact with complex environments. However, as these systems grow more autonomous, they also become harder to design, evaluate, and control.
In practice, many agentic systems today are built in an ad hoc manner. Prompt logic, tool execution, orchestration, memory, and evaluation are often intertwined in ways that make failures difficult to diagnose and systems difficult to scale. When something goes wrong, it is rarely clear whether the root cause lies in the model, the prompt, the tool interface, or the surrounding infrastructure.
We built AISA (Agentic AI Systems Architecture) to address this gap.
The Problem: Agentic AI Is a Systems Problem
As autonomy increases, agentic AI stops being a model-level problem and becomes a systems-level problem.
Traditional machine learning systems typically:
- operate on fixed inputs,
- produce a single prediction,
- and can be evaluated with static metrics.
Agentic systems are different. They:
- act over long horizons,
- make intermediate decisions,
- invoke external tools,
- maintain state and memory,
- and produce side effects.
Failures in these systems rarely come from a single bad model output. Instead, they emerge from interactions between components over time. A small retrieval error early in a task can cascade into incorrect reasoning, unsafe tool use, or inconsistent state later on.
Without a clear architectural structure, these failures are difficult to analyze, reproduce, or prevent.
What Is AISA?
AISA (Agentic AI Systems Architecture) is a layered, implementation-neutral reference architecture for agentic AI systems.
It is not:
- a framework,
- a library,
- or a new model.
Instead, AISA provides:
- explicit separation of concerns,
- clearly defined responsibilities,
- and a shared vocabulary for discussing agent design, evaluation, and governance.
AISA is designed to be:
- Model-agnostic (any LLM),
- Framework-agnostic (LangChain, AutoGen, custom stacks),
- Deployment-agnostic (research prototypes or production systems).
AISA Architecture Overview
At a high level, AISA decomposes an agentic system into the following layers:
1. LLM Foundation Layer: the language model itself and how it is prompted, constrained, and grounded.
2. Tool & Environment Layer: the controlled execution boundary between the agent and external systems.
3. Cognitive Agent Layer: planning, reasoning, memory, reflection, and decision-making.
4. Agentic Infrastructure Layer: orchestration, state propagation, coordination, and observability.
5. Evaluation & Feedback Layer: trajectory-level evaluation, monitoring, and error analysis.
6. Development & Deployment Layer: versioning, testing, releases, and reproducibility.
7. Governance, Ethics & Policy Layer: permissions, safety rules, auditability, and human oversight.
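One way to make these boundaries concrete is to express each runtime layer as an interface. The sketch below is purely illustrative: AISA is implementation-neutral, so every class name and method signature here is our own assumption, not part of the architecture itself. It covers only the five runtime layers; development and governance concerns live outside the code path.

```python
# Illustrative interfaces for AISA's runtime layers.
# All names and signatures are assumptions for this sketch.
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class LLMFoundation(Protocol):
    """LLM Foundation Layer: prompting, constraints, grounding."""
    def complete(self, prompt: str) -> str: ...


@runtime_checkable
class ToolEnvironment(Protocol):
    """Tool & Environment Layer: controlled execution boundary."""
    def execute(self, tool_name: str, args: dict[str, Any]) -> Any: ...


@runtime_checkable
class CognitiveAgent(Protocol):
    """Cognitive Agent Layer: planning, reasoning, memory, reflection."""
    def propose_action(self, context: str) -> dict[str, Any]: ...


@runtime_checkable
class Infrastructure(Protocol):
    """Agentic Infrastructure Layer: orchestration, state, observability."""
    def record_step(self, step: dict[str, Any]) -> None: ...


@runtime_checkable
class Evaluator(Protocol):
    """Evaluation & Feedback Layer: trajectory-level evaluation."""
    def score_trajectory(self, steps: list[dict[str, Any]]) -> float: ...
```

Structural typing (`Protocol`) fits the intent here: any existing stack can satisfy a layer's contract without inheriting from anything, which keeps the architecture framework-agnostic.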
Each layer has a clear role. No layer is responsible for everything.
This separation is intentional.
Why Layering Matters
Many existing agent implementations mix:
- Reasoning logic with tool permissions,
- Prompt content with execution control,
- Evaluation with runtime logic.
This makes systems fragile.
By separating concerns:
- Failures can be localized,
- Responsibilities are clearer,
- Evaluation becomes more meaningful,
- and systems scale more safely.
AISA does not require every system to implement every layer fully.
Instead, it provides a conceptual scaffold that grows with system complexity.
End-to-End Agent Flow in AISA
Using AISA, an agent operates in a structured loop:
1. Context Assembly: relevant instructions, memory, and retrieved knowledge are gathered within explicit budgets.
2. Reasoning & Action Proposal: the cognitive agent plans next steps and proposes actions using the LLM.
3. Controlled Tool Execution: proposed actions are validated and executed through the tool layer.
4. State Update: results update memory and state through infrastructure mechanisms.
5. Evaluation & Feedback: behavior is monitored across the full trajectory, not just individual outputs.
This loop makes long-horizon behavior observable and analyzable.
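The loop can be sketched end to end in a few dozen lines. Everything below is a toy stand-in chosen only to make the control flow concrete: the keyword-based decision rule, the single `search` tool, and the allow-list are all assumptions, not prescribed by AISA.

```python
# Minimal sketch of the AISA agent loop. All component
# implementations are toy stand-ins for illustration.
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AgentState:
    memory: list[str] = field(default_factory=list)
    trajectory: list[dict[str, Any]] = field(default_factory=list)


def assemble_context(state: AgentState, budget: int = 3) -> str:
    # 1. Context Assembly: gather memory within an explicit budget.
    return "\n".join(state.memory[-budget:])


def propose_action(context: str) -> dict[str, Any]:
    # 2. Reasoning & Action Proposal: a real system would call the LLM here.
    if "result" in context:
        return {"type": "answer", "text": "done"}
    return {"type": "tool", "name": "search", "args": {"q": "query"}}


def execute_tool(action: dict[str, Any]) -> str:
    # 3. Controlled Tool Execution: validate against an allow-list first.
    allowed = {"search"}
    if action["name"] not in allowed:
        raise PermissionError(f"tool {action['name']!r} not permitted")
    return f"result for {action['args']['q']}"


def evaluate(state: AgentState) -> bool:
    # 5. Evaluation & Feedback: judge the whole trajectory, not one output.
    return bool(state.trajectory) and state.trajectory[-1]["type"] == "answer"


def run_agent(max_steps: int = 5) -> AgentState:
    state = AgentState(memory=["task: find the answer"])
    for _ in range(max_steps):
        context = assemble_context(state)
        action = propose_action(context)
        state.trajectory.append(action)  # 4. State Update + observability
        if action["type"] == "answer":
            break
        state.memory.append(execute_tool(action))
    return state
```

Because every step is appended to `state.trajectory`, the full history is available for trajectory-level evaluation afterward, which is exactly what makes long-horizon behavior observable rather than implicit.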
A Simple Example: RAG as an Agentic System
Consider a retrieval-augmented generation (RAG) assistant.
Using AISA:
- Retrieval and embeddings live in the Tool & Environment Layer.
- Prompting and generation live in the LLM Foundation Layer.
- Decisions about whether to retrieve more context or answer live in the Cognitive Agent Layer.
- Session state, caching, and tracing live in the Infrastructure Layer.
- Answer correctness and citation coverage are handled by the Evaluation Layer.
This decomposition makes it clear:
- where failures originate,
- how to improve the system,
- and how to scale it safely.
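The RAG decomposition can be made concrete with a short sketch. The retriever, the decision rule, and the prompt template below are hypothetical stand-ins (AISA does not prescribe any of them); the point is only that each function maps cleanly to one layer.

```python
# Hypothetical RAG components, each tagged with its AISA layer.

def retrieve(query: str, store: dict[str, str]) -> list[str]:
    # Tool & Environment Layer: retrieval behind the tool boundary.
    # Toy keyword match standing in for embedding search.
    return [text for key, text in store.items() if key in query.lower()]


def decide(passages: list[str]) -> str:
    # Cognitive Agent Layer: answer now, or retrieve more context?
    return "answer" if passages else "retrieve_more"


def build_prompt(query: str, passages: list[str]) -> str:
    # LLM Foundation Layer: prompting and grounding.
    context = "\n".join(passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

A failure now has an address: empty retrieval results point at the tool layer, a bad answer despite good passages points at prompting or the model, and a premature "answer" decision points at the cognitive layer.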
The same architecture extends naturally to enterprise agents and multi-agent systems without changing the core abstraction.
Who Is AISA For?
AISA is designed to be useful for multiple audiences:
- Researchers studying LLM-based agents, long-horizon reasoning, and evaluation.
- Engineers building tool-augmented or autonomous AI systems.
- Practitioners deploying agentic systems that must be reliable, auditable, and governed.
You do not need to “adopt” AISA to benefit from it.
Even using it as a thinking tool can clarify design decisions and failure modes.
Relationship to Our Research Paper
The ideas in this article are formalized in our paper:
AISA: A Unified Architecture for Agentic AI Systems
📄 Extended version (Zenodo): https://doi.org/10.5281/zenodo.18161880
A condensed 8-page version of this work is currently under peer review at ACL.
Our Goal
Our goal with AISA is not to prescribe a single “correct” way to build agents.
Instead, we aim to provide a shared architectural language that helps the community reason more clearly about agentic AI systems as they grow in complexity and impact.
We hope this architecture helps:
- reduce ad hoc design,
- improve evaluation practices,
- and support more responsible, scalable agentic AI.
Status: Active development · License: CC BY 4.0


