shekkari21 committed · Commit 577d875 · 1 Parent(s): 9e2b74c

updated files

misc/tutorials/NEXT_STEPS.md ADDED
@@ -0,0 +1,442 @@
+ # Next Steps: Complementary Skills & Learning Path
+
+ After building this agent framework from scratch, here's what to learn next to become a complete Agentic AI Engineer.
+
+ ---
+
+ ## Why This Matters
+
+ You've built the fundamentals. But in the real world:
+ - Agents need to retrieve knowledge (RAG)
+ - Complex tasks need multiple agents
+ - Production systems need observability
+ - Safety is non-negotiable
+
+ This guide helps you add value beyond "I built an agent framework."
+
+ ---
+
+ ## Priority 1: RAG (Retrieval-Augmented Generation)
+
+ You have basic embeddings, but production RAG goes much deeper.
+
+ ### Key Concepts
+
+ | Concept | Description | Why It Matters |
+ |---------|-------------|----------------|
+ | **Chunking Strategies** | Fixed-size, semantic, recursive splitting | Strongly affects retrieval quality |
+ | **Hybrid Search** | Combine vector + keyword (BM25) | Often beats vector-only retrieval |
+ | **Re-ranking** | Cross-encoders to improve top-k | Fixes retriever mistakes |
+ | **Vector Databases** | Pinecone, Weaviate, Qdrant, Chroma | Each has different tradeoffs |
+ | **Query Transformation** | HyDE, step-back, multi-query | Improve query-document matching |
+ | **Agentic RAG** | Agent decides when/what to retrieve | Most flexible approach |
+
+ ### Add to Your Project
+
+ ```python
+ # Assumes vector_db, embed, bm25_search, merge_results, cross_encoder,
+ # and format_results already exist in your project.
+ @tool
+ def rag_search(query: str, top_k: int = 5) -> str:
+     """Search knowledge base with hybrid retrieval."""
+     # 1. Vector search (over-fetch so the re-ranker has candidates to choose from)
+     vector_results = vector_db.search(embed(query), top_k=top_k * 2)
+
+     # 2. Keyword search (BM25)
+     keyword_results = bm25_search(query, top_k=top_k * 2)
+
+     # 3. Merge and dedupe
+     combined = merge_results(vector_results, keyword_results)
+
+     # 4. Re-rank with cross-encoder
+     reranked = cross_encoder.rerank(query, combined, top_k=top_k)
+
+     return format_results(reranked)
+ ```
+
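The `merge_results` step in `rag_search` is left undefined above. One common way to merge two ranked lists is reciprocal rank fusion (RRF); the sketch below is a minimal version that assumes each retriever returns a ranked list of document IDs. The function signature, the `k = 60` constant, and the sample IDs are illustrative assumptions, not part of the framework.

```python
from collections import defaultdict
from typing import Dict, List


def merge_results(*ranked_lists: List[str], k: int = 60) -> List[str]:
    """Fuse ranked lists of document IDs with reciprocal rank fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            # Each list contributes 1 / (k + rank); duplicates accumulate.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical retriever outputs, best match first.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]
fused = merge_results(vector_hits, keyword_hits)
```

Documents found by both retrievers rise to the top, and each ID appears only once in the fused list, which covers the "merge and dedupe" step.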
+ ### Resources
+ - [LlamaIndex](https://docs.llamaindex.ai/) - RAG-focused framework
+ - [LangChain RAG Tutorial](https://python.langchain.com/docs/tutorials/rag/)
+ - Paper: "Retrieval-Augmented Generation for Large Language Models: A Survey"
+
+ ---
+
+ ## Priority 2: Multi-Agent Systems
+
+ Your framework is single-agent. Much of the industry is moving toward multi-agent architectures.
+
+ ### Patterns
+
+ | Pattern | Description | Use Case |
+ |---------|-------------|----------|
+ | **Supervisor** | One agent delegates to specialists | Complex tasks with clear subtasks |
+ | **Debate** | Agents argue, synthesize best answer | Reduce hallucination, improve reasoning |
+ | **Pipeline** | Agent A -> Agent B -> Agent C | Sequential processing |
+ | **Swarm** | Agents coordinate dynamically | Open-ended exploration |
+ | **Reflection** | Agent critiques own output | Self-improvement loop |
+
+ ### Example: Supervisor Pattern
+
+ ```python
+ from typing import List
+
+ class SupervisorAgent(Agent):
+     def __init__(self, specialists: List[Agent]):
+         # Index specialists by name so delegate() can look them up
+         self.specialists = {agent.name: agent for agent in specialists}
+         super().__init__(
+             instructions="""You are a supervisor.
+             Delegate tasks to specialists:
+             - researcher: for information gathering
+             - coder: for code tasks
+             - writer: for content creation
+             """
+         )
+
+     async def delegate(self, task: str, specialist_name: str):
+         specialist = self.specialists[specialist_name]
+         return await specialist.run(task)
+ ```
+
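The Pipeline pattern from the table above (Agent A -> Agent B -> Agent C) can be sketched in a few lines. This is a minimal illustration that assumes each stage is an async callable taking and returning a string. The `run_pipeline` name and the stand-in stages are hypothetical; in the framework each stage would more likely wrap a call such as `agent.run(...)`.

```python
import asyncio
from typing import Awaitable, Callable, List

# A stage is any async function from text to text (an assumption for this sketch).
Stage = Callable[[str], Awaitable[str]]


async def run_pipeline(stages: List[Stage], task: str) -> str:
    """Run stages in order; each stage's output becomes the next stage's input."""
    result = task
    for stage in stages:
        result = await stage(result)
    return result


# Stand-in stages used only to show the data flow.
async def research(text: str) -> str:
    return f"notes({text})"


async def write(text: str) -> str:
    return f"draft({text})"


output = asyncio.run(run_pipeline([research, write], "topic"))
```

Because each stage only sees the previous stage's output, a pipeline is easy to trace and test, but it cannot revisit an earlier step the way a supervisor can.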
+ ### Frameworks to Study
+ - **LangGraph** - Stateful multi-agent workflows
+ - **CrewAI** - Role-based agent teams
+ - **AutoGen** - Microsoft's multi-agent framework
+ - **Swarm** - OpenAI's experimental framework
+
+ ---
+
+ ## Priority 3: Observability & Tracing
+
+ You have `format_trace`, but production systems need more.
+
+ ### Tools
+
+ | Tool | Type | Best For |
+ |------|------|----------|
+ | **LangSmith** | SaaS | LangChain users, enterprise |
+ | **Langfuse** | Open Source | Self-hosted, privacy-focused |
+ | **Weights & Biases** | SaaS | Experiment tracking |
+ | **OpenTelemetry** | Standard | Distributed tracing |
+ | **Arize Phoenix** | Open Source | LLM observability |
+
+ ### Key Metrics to Track
+
+ ```python
+ from dataclasses import dataclass
+
+ @dataclass
+ class AgentMetrics:
+     # Latency
+     total_duration_ms: float
+     llm_call_duration_ms: float
+     tool_execution_duration_ms: float
+
+     # Token Usage
+     prompt_tokens: int
+     completion_tokens: int
+     total_tokens: int
+
+     # Cost
+     estimated_cost_usd: float
+
+     # Quality
+     steps_to_completion: int
+     tool_calls_count: int
+     errors_count: int
+ ```
+
+ ### Add to Your Project
+
+ ```python
+ # In agent.py
+ import time
+
+ class Agent:
+     async def run(self, *args, **kwargs):
+         # perf_counter is monotonic, so it is safer than time.time() for durations
+         start_time = time.perf_counter()
+
+         try:
+             result = await self._run_internal(*args, **kwargs)
+
+             # Log metrics
+             self.log_metrics(AgentMetrics(
+                 total_duration_ms=(time.perf_counter() - start_time) * 1000,
+                 steps_to_completion=result.context.current_step,
+                 # ... other metrics
+             ))
+
+             return result
+         except Exception as e:
+             self.log_error(e)
+             raise
+ ```
+
+ ---
+
+ ## Priority 4: Evaluation & Benchmarking
+
+ You have GAIA. Go deeper with systematic evaluation.
+
+ ### Evaluation Types
+
+ | Type | What It Measures | How |
+ |------|------------------|-----|
+ | **Task Completion** | Did the agent solve the problem? | Binary success/fail |
+ | **Accuracy** | Is the answer correct? | Compare to ground truth |
+ | **Faithfulness** | Is the answer grounded in retrieved context? | LLM-as-Judge |
+ | **Relevance** | Is the answer relevant to the question? | LLM-as-Judge |
+ | **Latency** | How fast is the agent? | Time measurement |
+ | **Cost** | How much did it cost? | Token tracking |
+
+ ### LLM-as-Judge Pattern
+
+ ```python
+ JUDGE_PROMPT = """
+ You are evaluating an AI agent's response.
+
+ Question: {question}
+ Agent's Answer: {answer}
+ Ground Truth: {ground_truth}
+
+ Rate the answer on a scale of 1-5:
+ 1 = Completely wrong
+ 2 = Partially wrong
+ 3 = Partially correct
+ 4 = Mostly correct
+ 5 = Completely correct
+
+ Provide your rating and reasoning.
+ """
+
+ async def evaluate_with_llm(question: str, answer: str, ground_truth: str) -> int:
+     # Fill the three placeholders defined in JUDGE_PROMPT
+     prompt = JUDGE_PROMPT.format(
+         question=question, answer=answer, ground_truth=ground_truth
+     )
+     response = await llm.generate(prompt)
+     return extract_rating(response)
+ ```
+
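`extract_rating` is not defined in the snippet above. A minimal sketch follows, assuming the judge prompt is extended to ask for a final line of the form `Rating: N`; that output format and the regular expression are assumptions, not something the original prompt guarantees. Parsing a fixed line avoids picking up the digits of the "1-5" scale if the judge repeats it.

```python
import re

# Matches a whole line such as "Rating: 4" (case-insensitive, digits 1-5 only).
RATING_LINE = re.compile(
    r"^\s*rating\s*[:=]\s*([1-5])\s*$", re.IGNORECASE | re.MULTILINE
)


def extract_rating(response: str) -> int:
    """Return the last 'Rating: N' value in the judge response."""
    matches = RATING_LINE.findall(response)
    if not matches:
        # Fail loudly rather than silently scoring an unparseable response.
        raise ValueError("judge response has no 'Rating: N' line with N in 1-5")
    return int(matches[-1])
```

Raising on an unparseable response keeps malformed judge outputs out of the averages; a caller could catch the error and retry the judge call instead.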
+ ### Frameworks
+ - **Ragas** - RAG evaluation
+ - **DeepEval** - LLM evaluation framework
+ - **Promptfoo** - Prompt testing
+ - **Evalica** - Comparative evaluation
+
+ ---
+
+ ## Priority 5: Safety & Guardrails
+
+ Production agents need safety layers.
+
+ ### Input Guardrails
+
+ ```python
+ import re
+
+ class InputGuardrails:
+     def __init__(self):
+         # Simple blocklist; it catches only obvious injection attempts
+         self.blocked_patterns = [
+             r"ignore previous instructions",
+             r"you are now",
+             r"pretend to be",
+         ]
+
+     def check(self, text: str) -> bool:
+         """Return False if the text matches a blocked pattern."""
+         for pattern in self.blocked_patterns:
+             if re.search(pattern, text, re.IGNORECASE):
+                 return False
+         return True
+ ```
+
+ ### Output Guardrails
+
+ ```python
+ class OutputGuardrails:
+     # contains_pii and is_harmful are left for you to implement
+     async def check(self, output: str) -> tuple[bool, str]:
+         # Check for PII
+         if self.contains_pii(output):
+             return False, "Response contains PII"
+
+         # Check for harmful content
+         if await self.is_harmful(output):
+             return False, "Response contains harmful content"
+
+         return True, ""
+ ```
+
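`OutputGuardrails.check` calls `contains_pii` without defining it. As a starting point, a regex-based check might look like the sketch below. The pattern set (email, US-style SSN, North American phone numbers) and the helper name `find_pii` are illustrative assumptions; regexes miss many PII forms, so a dedicated PII detection library or service is usually a better fit for production use.

```python
import re

# Illustrative patterns only; they are neither exhaustive nor locale-aware.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"(?<!\d)\(?\d{3}\)?[\s.-]?\d{3}[\s.-]\d{4}(?!\d)"),
}


def find_pii(text: str) -> list[str]:
    """Return the names of the PII patterns that match the text."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]


def contains_pii(text: str) -> bool:
    """True if any of the illustrative PII patterns matches."""
    return bool(find_pii(text))
```

Returning the matched pattern names from `find_pii` makes it possible to log why a response was blocked without logging the sensitive value itself.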
+ ### Integration with Your Framework
+
+ ```python
+ # Add as callbacks (adapt the check signatures to what your callbacks receive)
+ agent = Agent(
+     model=LlmClient(model="gpt-4o-mini"),
+     tools=[...],
+     before_llm_callback=input_guardrails.check,
+     after_llm_callback=output_guardrails.check,
+ )
+ ```
+
+ ### Tools
+ - **Guardrails AI** - Structured output validation
+ - **NeMo Guardrails** - NVIDIA's safety framework
+ - **Lakera Guard** - Prompt injection detection
+ - **Rebuff** - Self-hardening prompt injection detector
+
+ ---
+
+ ## Priority 6: LLM Routing & Optimization
+
+ ### Smart Model Selection
+
+ ```python
+ class ModelRouter:
+     def __init__(self):
+         self.models = {
+             "simple": "gpt-4o-mini",        # Fast, cheap
+             "complex": "gpt-4o",            # Powerful
+             "coding": "claude-sonnet-4-5",  # Best for code
+         }
+
+     async def route(self, query: str) -> str:
+         # Cheap keyword check first, so coding queries skip the classifier call
+         if "code" in query.lower():
+             return self.models["coding"]
+
+         # Classify query complexity
+         complexity = await self.classify_complexity(query)
+         if complexity == "high":
+             return self.models["complex"]
+         return self.models["simple"]
+ ```
+
+ ### Semantic Caching
+
+ ```python
+ class SemanticCache:
+     # Assumes embed() and cosine_similarity() are defined elsewhere
+     def __init__(self, similarity_threshold: float = 0.95):
+         self.cache = {}
+         self.embeddings = {}
+         self.threshold = similarity_threshold
+
+     async def get(self, query: str) -> str | None:
+         query_embedding = embed(query)
+
+         # Linear scan: fine for small caches, use a vector index for large ones
+         for cached_query, cached_response in self.cache.items():
+             similarity = cosine_similarity(
+                 query_embedding,
+                 self.embeddings[cached_query]
+             )
+             if similarity > self.threshold:
+                 return cached_response
+
+         return None
+
+     async def set(self, query: str, response: str):
+         self.cache[query] = response
+         self.embeddings[query] = embed(query)
+ ```
+
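`SemanticCache` relies on a `cosine_similarity` helper that is not shown. For reference, cosine similarity is the dot product of two vectors divided by the product of their norms. The dependency-free sketch below assumes embeddings are plain sequences of floats; with NumPy arrays you would normally use vectorized operations instead.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)
```

With the default threshold of 0.95, only near-duplicate queries hit the cache. Lowering the threshold raises the hit rate but also the risk of returning an answer to a different question.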
+ ---
+
+ ## Suggested Learning Path
+
+ ### Month 1: RAG Deep Dive
+ - [ ] Implement hybrid search (vector + BM25)
+ - [ ] Add re-ranking with cross-encoder
+ - [ ] Build RAGTool for your agent
+ - [ ] Experiment with different chunking strategies
+
+ ### Month 2: Multi-Agent Systems
+ - [ ] Study LangGraph architecture
+ - [ ] Implement supervisor pattern
+ - [ ] Build debate/reflection agents
+ - [ ] Add multi-agent orchestration layer
+
+ ### Month 3: Production Readiness
+ - [ ] Integrate Langfuse for observability
+ - [ ] Implement input/output guardrails
+ - [ ] Build evaluation suite with LLM-as-Judge
+ - [ ] Add cost tracking and alerts
+
+ ### Month 4: Advanced Topics
+ - [ ] Implement smart model routing
+ - [ ] Add semantic caching
+ - [ ] Experiment with fine-tuning
+ - [ ] Build monitoring dashboard
+
+ ---
+
+ ## Quick Wins to Add Now
+
+ Each of these can be added to your framework in a few hours:
+
+ ### 1. Semantic Caching
+ ```python
+ # In memory.py
+ class SemanticCache:
+     """Cache responses for similar queries."""
+     ...
+ ```
+
+ ### 2. Cost Tracker
+ ```python
+ # In agent.py
+ PRICING = {
+     "gpt-4o-mini": {"input": 0.15, "output": 0.60},  # USD per 1M tokens
+     "gpt-4o": {"input": 2.50, "output": 10.00},
+ }
+
+ def calculate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
+     # Unknown models fall back to zero cost; add them to PRICING to track them
+     prices = PRICING.get(model, {"input": 0, "output": 0})
+     return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000
+ ```
+
+ ### 3. Streaming Support
+ ```python
+ # In llm.py
+ async def generate_streaming(self, request: LlmRequest):
+     """Stream tokens as they're generated."""
+     ...
+ ```
+
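The streaming stub above is usually filled in with an async generator that yields tokens as they arrive. The sketch below shows only that shape. `fake_token_source` is a hypothetical stand-in for a provider's streaming response, and this `generate_streaming` takes a plain prompt string instead of the framework's `LlmRequest`.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_token_source(text: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for a provider's streaming response."""
    for token in text.split(" "):
        await asyncio.sleep(0)  # hand control back to the loop, as a network read would
        yield token


async def generate_streaming(prompt: str) -> AsyncIterator[str]:
    """Yield tokens one at a time instead of waiting for the full response."""
    async for token in fake_token_source(f"echo: {prompt}"):
        yield token


async def collect(prompt: str) -> List[str]:
    # A real caller would render each token immediately; here they are collected.
    return [token async for token in generate_streaming(prompt)]


tokens = asyncio.run(collect("hello world"))
```

Callers consume the stream with `async for`, so a UI can display partial output while the rest of the response is still being generated.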
+ ### 4. Simple Guardrails
+ ```python
+ # In callbacks.py
+ def prompt_injection_detector(context, request):
+     """Block obvious prompt injection attempts."""
+     ...
+ ```
+
+ ### 5. Retry with Exponential Backoff
+ ```python
+ # In llm.py
+ async def generate_with_retry(self, request: LlmRequest, max_retries: int = 3):
+     """Retry failed LLM calls with exponential backoff."""
+     ...
+ ```
+
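One way the retry stub above could be fleshed out is shown below as a standalone helper. It is a sketch under stated assumptions: `retry_with_backoff`, `base_delay`, `max_delay`, and the `make_flaky` test double are all hypothetical names, and catching every `Exception` is a simplification. Real code would more likely retry only transient errors such as timeouts or rate limits.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    call: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
) -> T:
    """Retry `call`, doubling the wait after each failure and adding jitter."""
    for attempt in range(max_retries + 1):
        try:
            return await call()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from many clients failing at once.
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("max_retries must be non-negative")


def make_flaky(failures: int) -> Callable[[], Awaitable[str]]:
    """Build a call that fails `failures` times, then succeeds (for demonstration)."""
    remaining = failures

    async def flaky() -> str:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            raise ConnectionError("transient failure")
        return "ok"

    return flaky


result = asyncio.run(retry_with_backoff(make_flaky(2), base_delay=0.001))
```

Here the call fails twice and succeeds on the third attempt, within the default budget of three retries. Capping the delay with `max_delay` keeps a long outage from producing very long sleeps.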
+ ---
+
+ ## Resources
+
+ ### Courses
+ - [DeepLearning.AI - Building Agentic RAG with LlamaIndex](https://www.deeplearning.ai/short-courses/building-agentic-rag-with-llamaindex/)
+ - [DeepLearning.AI - Multi AI Agent Systems with crewAI](https://www.deeplearning.ai/short-courses/multi-ai-agent-systems-with-crewai/)
+ - [LangChain Academy](https://academy.langchain.com/)
+
+ ### Papers
+ - "ReAct: Synergizing Reasoning and Acting in Language Models"
+ - "Toolformer: Language Models Can Teach Themselves to Use Tools"
+ - "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models"
+ - "Retrieval-Augmented Generation for Large Language Models: A Survey"
+
+ ### Communities
+ - [LangChain Discord](https://discord.gg/langchain)
+ - [LlamaIndex Discord](https://discord.gg/llamaindex)
+ - [Latent Space Podcast](https://www.latent.space/)
+ - [AI Engineer Newsletter](https://www.aiengineer.dev/)
+
+ ---
+
+ ## What Would Make Your Project Stand Out
+
+ 1. **RAG + Agents** - Agent that retrieves, reasons, and acts
+ 2. **Multi-Agent Orchestration** - Coordinator + specialists
+ 3. **Built-in Evaluation** - Self-testing agent framework
+ 4. **Safety Layer** - Production-grade guardrails
+ 5. **Observability Dashboard** - Visual trace explorer
+ 6. **Semantic Caching** - Cost optimization
+ 7. **Model Routing** - Smart model selection
+
+ ---
+
+ **Previous**: [Resume Guide](./RESUME_GUIDE.md)
+ **Back to**: [Tutorial Overview](./README.md)
+
misc/tutorials/README.md CHANGED
@@ -4,13 +4,18 @@ This directory contains all materials for the "Building an AI Agent Framework fr
 
  ---
 
- ## 📚 Documentation
+ ## Documentation
 
  ### Core Documentation
  - **[FEATURE_DOCUMENTATION.md](./FEATURE_DOCUMENTATION.md)**: Complete inventory of all framework features
  - **[ARCHITECTURE_DIAGRAMS.md](./ARCHITECTURE_DIAGRAMS.md)**: Visual diagrams using Mermaid syntax
  - **[GITHUB_STRUCTURE.md](./GITHUB_STRUCTURE.md)**: Repository organization and branch strategy
  - **[EXERCISES.md](./EXERCISES.md)**: Exercises and challenges for each episode
+ - **[ADDITIONAL_EXERCISES.md](./ADDITIONAL_EXERCISES.md)**: Cross-topic challenges and integration exercises
+
+ ### Career & Next Steps
+ - **[RESUME_GUIDE.md](./RESUME_GUIDE.md)**: How to market this project for AI engineering roles
+ - **[NEXT_STEPS.md](./NEXT_STEPS.md)**: Complementary skills & learning path after completion
 
  ---
 
@@ -114,15 +119,26 @@ Episode 10: Deployment
 
  ---
 
- ## 📁 File Structure
+ ## File Structure
 
  ```
  misc/tutorials/
  ├── README.md (this file)
+ │
+ ├── # Core Documentation
  ├── FEATURE_DOCUMENTATION.md
  ├── ARCHITECTURE_DIAGRAMS.md
  ├── GITHUB_STRUCTURE.md
+ │
+ ├── # Exercises
  ├── EXERCISES.md
+ ├── ADDITIONAL_EXERCISES.md
+ │
+ ├── # Career & Learning
+ ├── RESUME_GUIDE.md
+ ├── NEXT_STEPS.md
+ │
+ ├── # Episode Guides
  ├── EPISODE_01_INTRODUCTION.md
  ├── EPISODE_02_LLM_CALL.md
  ├── EPISODE_03_DATA_MODELS.md
@@ -226,18 +242,36 @@ Questions or issues?
 
  ---
 
- ## 🎉 Series Completion
+ ## Series Completion
 
  After completing all 10 episodes, you will have:
 
- ✅ Built a complete AI agent framework
- ✅ Understand every component
- ✅ Created production-ready code
- ✅ Deployed a web application
- ✅ Gained deep understanding of agent architecture
+ - Built a complete AI agent framework
+ - Understood every component
+ - Created production-ready code
+ - Deployed a web application
+ - Gained deep understanding of agent architecture
 
  **Congratulations on your learning journey!**
 
+ ### What's Next?
+
+ Check out **[NEXT_STEPS.md](./NEXT_STEPS.md)** for:
+ - RAG (Retrieval Augmented Generation)
+ - Multi-Agent Systems
+ - Observability & Tracing
+ - Evaluation & Benchmarking
+ - Safety & Guardrails
+ - LLM Routing & Optimization
+
+ ### Career Guidance
+
+ See **[RESUME_GUIDE.md](./RESUME_GUIDE.md)** for:
+ - How to market this project
+ - Resume bullet points (STAR method)
+ - Interview talking points
+ - Portfolio presentation tips
+
  ---
 
  *Last Updated: 2026*