File size: 7,301 Bytes
8da024f
 
a90c302
8da024f
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
a90c302
 
 
 
 
ac752cb
8da024f
 
 
 
 
 
 
 
 
 
ac752cb
a90c302
8da024f
a90c302
 
ac752cb
8da024f
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
a90c302
 
8da024f
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
# System Registry & Wiring Architecture
**Status**: Active / Canonical
**Last Updated**: 2025-12-06

This document serves as the **Source of Truth** for the architectural wiring of the agent framework. It defines the strict rules for decorators, protocol markers, and the tool registry to prevent regression and ensure correct system behavior.

---

## 1. Decorator Registry

The agent framework relies on a strict decorator stack to inject functionality into `ChatClient` implementations. The **order of application** is critical for correct behavior.

### Standard Stack (Bottom-Up Order)

| Order | Decorator | Purpose | Source | Critical Notes |
|:--|:---|:---|:---|:---|
| **1 (Inner)** | `@use_chat_middleware` | Handles request/response middleware processing (e.g. logging, filtering). | `agent_framework._middleware` | Must be closest to the class. |
| **2** | `@use_observability` | Injects tracing and metrics (OpenTelemetry/logging). | `agent_framework.observability` | Wraps the middleware-enhanced client. |
| **3 (Outer)** | `@use_function_invocation` | **CRITICAL**: Intercepts `FunctionCallContent` in responses, **executes the Python function**, and recursively calls the LLM with the result. | `agent_framework._tools` | **MUST NOT** be used if `__function_invoking_chat_client__ = True` is set (see Markers). |

### Correct Usage Example

```python
@use_function_invocation  # <--- 3. Handles tool execution loop
@use_observability        # <--- 2. Adds tracing
@use_chat_middleware      # <--- 1. Adds middleware support
class MyChatClient(BaseChatClient):
    ...
```

---

## 2. Protocol Markers

Special class attributes (dunder methods/variables) that control framework behavior.

| Marker | Value | Purpose | Set By | Read By | Impact of Misuse |
|:---|:---|:---|:---|:---|:---|
| `__function_invoking_chat_client__` | `bool` | Signals that this client **natively handles** the tool execution loop internally. | `ChatClient` Class Body | `@use_function_invocation` | **CRITICAL BUG**: If set to `True` but the client *doesn't* execute tools, tool calls will be generated by the LLM but **never executed**. The agent will hang or hallucinate results. |

### Wiring Rules
*   **Default Clients (OpenAI/HuggingFace):** Should generally **NOT** set this marker. Rely on `@use_function_invocation` to handle execution.
*   **Special Clients:** Only set to `True` if you are implementing a custom loop that executes tools and feeds results back without the framework's help.

### Setting Responsibility
*   **Default:** Do not set `__function_invoking_chat_client__` in the class body. The `@use_function_invocation` decorator sets it automatically after wrapping.
*   **Custom Loop:** Only set to `True` if you have implemented a custom tool execution loop that does not rely on the framework's decorator.

---

## 3. Tool Inventory

### 3.1 AI Functions (Agent-Callable Tools)

These are the `@ai_function` decorated functions that agents can invoke. The framework executes these via `@use_function_invocation`.

| Function Name | File Path | Description |
|:---|:---|:---|
| `search_pubmed` | `src/agents/tools.py:20` | Searches PubMed for biomedical literature |
| `search_clinical_trials` | `src/agents/tools.py:80` | Searches ClinicalTrials.gov for clinical studies |
| `search_preprints` | `src/agents/tools.py:120` | Searches Europe PMC for preprints and papers |
| `get_bibliography` | `src/agents/tools.py:160` | Returns collected references for final report |
| `search_web` | `src/agents/retrieval_agent.py:17` | Searches web using DuckDuckGo |
| ~~`execute_python_code`~~ | ~~`src/agents/code_executor_agent.py`~~ | REMOVED in PR #130 (Modal deleted) |

### 3.2 Tool Classes (Internal Wrappers)

These are **internal implementation wrappers** used by the AI Functions. They are NOT directly callable by agents.

| Class | File Path | Used By |
|:---|:---|:---|
| `PubMedTool` | `src/tools/pubmed.py` | `search_pubmed` |
| `ClinicalTrialsTool` | `src/tools/clinicaltrials.py` | `search_clinical_trials` |
| `EuropePMCTool` | `src/tools/europepmc.py` | `search_preprints` |
| `OpenAlexTool` | `src/tools/openalex.py` | OpenAlex search (used in SearchHandler) |
| `WebSearchTool` | `src/tools/web_search.py` | `search_web` (DuckDuckGo) |
| `SearchHandler` | `src/tools/search_handler.py` | Orchestrates parallel searches |
| `RateLimiter` | `src/tools/rate_limiter.py` | Rate limiting via `limits` library |
| `BaseTool` | `src/tools/base.py` | Abstract base class for tools |
| ~~`ModalCodeExecutor`~~ | ~~`src/tools/code_execution.py`~~ | REMOVED in PR #130 |

---

## 4. Client Implementation Guide

When adding a new LLM provider, follow this strict pattern:

### A. The "Native Execution" Fallacy
Do not assume that because an API supports "function calling" (parsing JSON), the client supports "function execution" (running Python code).
*   **Function Calling:** LLM -> JSON (Client responsibility)
*   **Function Execution:** JSON -> Python Result -> LLM (Framework responsibility via `@use_function_invocation`)

### B. Reference Implementation

```python
from agent_framework import BaseChatClient
from agent_framework._tools import use_function_invocation
from agent_framework.observability import use_observability
from agent_framework._middleware import use_chat_middleware

# 1. Apply decorators in this EXACT order
@use_function_invocation
@use_observability
@use_chat_middleware
class NewProviderChatClient(BaseChatClient):
    
    # 2. DO NOT set this unless you know what you are doing
    # __function_invoking_chat_client__ = True  <-- DELETE THIS
    
    async def _inner_get_response(self, ...):
        # 3. Parse API response -> FunctionCallContent
        # 4. Return ChatResponse with contents=[FunctionCallContent(...)]
        pass
        
    async def _inner_get_streaming_response(self, ...):
        # 5. Yield FunctionCallContent when tool calls are detected
        pass
```

---

## 5. Known Issues & Gotchas

*   **~~P1 Bug - Premature Marker Setting~~ (FIXED):** The `HuggingFaceChatClient` previously set `__function_invoking_chat_client__ = True` in the class body, which caused `@use_function_invocation` to skip wrapping. **Resolution:** Marker removed; decorator now sets it correctly.
*   **HuggingFace Provider Routing:** Large models (70B+) may be routed to third-party inference providers (Novita, Hyperbolic) instead of native HF infrastructure. See `CLAUDE.md` for current model recommendations.
*   **Model Hallucination:** If tool execution fails (due to incorrect wiring), models like Qwen2.5-7B will often **hallucinate** fake tool results as text. Always verify `AgentRunResponse` contains actual `FunctionResultContent`.

---

## 6. Verification Checklist

When adding or modifying a ChatClient:

- [ ] Decorators applied in correct order: `@use_function_invocation` β†’ `@use_observability` β†’ `@use_chat_middleware`
- [ ] `__function_invoking_chat_client__` is NOT set in class body (unless implementing custom execution loop)
- [ ] Verify `@use_function_invocation` decorator actually wraps methods (check `__wrapped__` attribute at runtime)
- [ ] Tool calls parsed into `FunctionCallContent` objects
- [ ] Streaming yields `FunctionCallContent` at end of stream
- [ ] Run `make check` to verify all tests pass