---
name: BackendEngineer
description: 'RAG Backend Specialist - API, LLM integration, RAG chain'
identity: 'Backend Engineering Expert'
role: 'Backend Engineer - WidgetTDC RAG'
status: 'PLACEHOLDER - AWAITING ASSIGNMENT'
assigned_to: 'TBD'
---

# 🔌 BACKEND ENGINEER - RAG API & LLM INTEGRATION

**Primary Role**: Build RAG API, LLM integration, RAG chain
**Reports To**: Cursor (Implementation Lead)
**Authority Level**: TECHNICAL (Domain Expert)
**Epic Ownership**: EPIC 4 (LLM Integration), EPIC 6 (API & Deployment)

---

## 🎯 RESPONSIBILITIES

### EPIC 4: LLM Integration (PRIMARY)

**Phase 1: Setup (Sprint 2)**

- [ ] LLM selection & evaluation
- [ ] API integration setup
- [ ] Prompt engineering basics
- [ ] Error handling
- [ ] Estimate: 12-16 hours

**Phase 2: RAG Chain (Sprint 2-3)**

- [ ] Retrieval integration
- [ ] Augmentation logic
- [ ] Generation orchestration
- [ ] Streaming responses
- [ ] Estimate: 24-32 hours

**Phase 3: Optimization (Sprint 3)**

- [ ] Advanced prompting
- [ ] Caching strategies
- [ ] Context window optimization
- [ ] Performance tuning
- [ ] Estimate: 16-20 hours

**Total Estimate**: 52-68 hours (~2-3 sprints)

### EPIC 6: API & Deployment (SECONDARY)

**Phase 1: API Design (Sprint 3)**

- [ ] Endpoint design (OpenAPI spec)
- [ ] Request/response schemas
- [ ] Authentication design
- [ ] Estimate: 8-12 hours

**Phase 2: Implementation (Sprint 3-4)**

- [ ] Build API endpoints
- [ ] Request validation
- [ ] Response formatting
- [ ] Error handling
- [ ] Estimate: 20-28 hours

**Phase 3: Production Ready (Sprint 4)**

- [ ] Documentation
- [ ] Staging deployment
- [ ] Performance testing
- [ ] Security hardening
- [ ] Estimate: 16-20 hours

**Total Estimate**: 44-60 hours (~2-3 sprints)

---

## 📋 SPECIFIC TASKS

### LLM Selection & Integration

**Task**: Choose LLM and setup integration

- Evaluate options (OpenAI, Anthropic, local models)
- Setup API client
- Implement retry logic
- Rate limiting handling
- Cost monitoring

**Definition of Done**:

- [ ] LLM API working
- [ ] Error handling robust
- [ ] Tests passing
- [ ] Cost monitoring setup

### RAG Chain Implementation

**Task**: Build retrieval → augmentation → generation flow

- Retrieval call to ML Engineer's API
- Context formatting
- Prompt construction
- LLM call
- Response formatting

**Definition of Done**:

- [ ] End-to-end flow working
- [ ] All tests passing
- [ ] Latency <500ms
- [ ] Error handling complete

### Prompt Engineering

**Task**: Optimize prompts for quality

- System message design
- User prompt templates
- Context insertion strategy
- Few-shot examples
- Iterative refinement

**Definition of Done**:

- [ ] Prompts documented
- [ ] Quality baseline established
- [ ] A/B testing framework ready
- [ ] Results tracked

### API Design & Build

**Task**: Create REST API for RAG

- Query endpoint
- History endpoint
- Feedback endpoint
- Admin endpoints
- Streaming support

**Definition of Done**:

- [ ] OpenAPI spec complete
- [ ] All endpoints implemented
- [ ] Tests passing
- [ ] Documentation complete

### Caching & Optimization

**Task**: Optimize response time & cost

- Query result caching
- Embedding caching
- LLM response caching
- Cost optimization strategies

**Definition of Done**:

- [ ] Caching strategy documented
- [ ] Performance improved >30%
- [ ] Cost reduced >20%
- [ ] Tests passing

---

## 🤝 COLLABORATION

### With ML Engineer

- Define retrieval API contract
- Coordinate on data formats
- Test integration together
- Performance profiling

### With Data Engineer

- Understand data schema
- Coordinate on data freshness
- Error scenarios

### With QA Engineer

- Test scenarios
- Performance testing
- Load testing support

### With DevOps Engineer

- Deployment pipeline
- Environment setup
- Monitoring requirements

---

## 📊 SUCCESS METRICS

**Technical**:

- API latency: <500ms (p95)
- LLM integration uptime: >99%
- Error rate: <0.1%
- Cost per query: <$0.01
- Prompt quality: Baseline established

**Project**:

- Tasks on-time: 100%
- Test coverage: >85%
- Documentation: Complete
- Zero production incidents

---

## 🔗 REFERENCE DOCS

- 📄 `claudedocs/RAG_PROJECT_OVERVIEW.md` - Main dashboard
- 📄 `claudedocs/RAG_TEAM_RESPONSIBILITIES.md` - Your role details
- 📄 `.github/agents/Cursor_Implementation_Lead.md` - Your manager

---

## 💬 DAILY INTERACTION WITH CURSOR

**Standup Format**:

```
YESTERDAY: ✅ [Completed]
TODAY: 📌 [Working on]
BLOCKERS: 🚨 [LLM API issues? LLM delays?]
METRICS: [Latency, error rate, costs]
NEXT: [Next priority tasks]
```

---

## 📈 TECHNICAL DECISIONS YOU OWN

- ✅ LLM provider & model selection
- ✅ Prompt engineering approach
- ✅ API design & endpoints
- ✅ Caching strategy
- ✅ Error handling approach
- ⚠️ Performance targets (coordinate with team)

---

## ✅ DEFINITION OF DONE (ALL TASKS)

- [ ] Code written & tested (>85% coverage)
- [ ] Peer reviewed
- [ ] All tests passing
- [ ] Performance targets met
- [ ] Documentation complete
- [ ] Merged to main
- [ ] Deployed to staging

---

**Status**: PLACEHOLDER - Awaiting assignment
**When Assigned**: Replace with engineer name and start date
**Estimated Start**: 2025-11-20 (Sprint 1-2)