Sentry Integration

Purpose: Real-time error tracking and production monitoring
Status: 🟢 ✅ Fully Operational (NO ISSUES)
Version: 1.0.0+
Maintainer: Infrastructure Team

What It Does

Sentry provides:

  • Real-time error capture and alerts
  • Stack trace analysis and grouping
  • User session tracking and replay
  • Performance monitoring
  • Release and deployment tracking
  • Custom event logging
  • Alert routing and notifications

Status: PRODUCTION READY ✅

No known issues at the time of writing.

The other error-finding libraries (pytest, Hugging Face, mypy) have documented workarounds; Sentry needs none.

Integration Status

Location: app/integrations/sentry/
Status: ✅ Already integrated into cascade orchestrator

Sentry is:

  • ✅ Automatically capturing all errors
  • ✅ Logging to .claude/logs/sentry.log
  • ✅ Sending real-time alerts
  • ✅ Tracking deployments
  • ✅ Monitoring performance

No configuration needed - it's already active and working.

Quick Access

View Live Errors

# Recent errors logged locally
tail -f .claude/logs/sentry.log

# Or check Sentry dashboard:
# https://sentry.io/[your-org]/[your-project]/issues/

Manual Event Logging

import sentry_sdk
from sentry_sdk import capture_event

# Capture custom event
capture_event({
    "message": "Widget discovered",
    "level": "info",
    "tags": {
        "widget_type": "email",
        "discovery_source": "hugging_face"
    }
})

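# Capture exception (process_widget and data are placeholders for your own code)
try:
    result = process_widget(data)
except Exception as e:
    # Report to Sentry; the exception is not re-raised here
    sentry_sdk.capture_exception(e)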
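
The try/except pattern above can be wrapped in a decorator so that call sites do not repeat it. This is a minimal sketch, not code taken from the integration: `report_errors`, `_default_capture` and the injected `capture` parameter are hypothetical names, and `process_widget` here is a stand-in. The only SDK call used is `sentry_sdk.capture_exception`. Unlike the snippet above, which swallows the exception after reporting it, the decorator re-raises so callers still see the failure.

```python
import functools
from typing import Callable, Optional


def _default_capture(exc: BaseException) -> None:
    # Imported lazily so the decorator can be exercised without the SDK installed.
    import sentry_sdk

    sentry_sdk.capture_exception(exc)


def report_errors(capture: Optional[Callable[[BaseException], None]] = None):
    """Report any exception raised by the wrapped function, then re-raise it.

    `capture` is a hypothetical injection point; it defaults to
    sentry_sdk.capture_exception.
    """
    send = capture or _default_capture

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                send(exc)
                raise

        return wrapper

    return decorator


# Usage with a stand-in capture function (a list append instead of Sentry).
captured = []


@report_errors(capture=captured.append)
def process_widget(data):
    if "id" not in data:
        raise ValueError("widget has no id")
    return data["id"]
```

Injecting `capture` keeps the decorator testable offline; in production the default would forward to Sentry.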

Set User Context

import sentry_sdk

sentry_sdk.set_user({
    "id": "block_2_cloud_arch",
    "username": "agent_block_2",
    "email": "block2@widgettdc.local"
})
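
If every agent block sets its own user context, a small builder keeps the naming consistent. The sketch below assumes the naming pattern seen in the single example above generalises to other blocks; `agent_user`, `set_agent_user` and the `role_slug` parameter are hypothetical and do not come from the integration code.

```python
def agent_user(block_number: int, role_slug: str, domain: str = "widgettdc.local") -> dict:
    """Build the user dict for an agent block, following the naming in the example above."""
    return {
        "id": f"block_{block_number}_{role_slug}",
        "username": f"agent_block_{block_number}",
        "email": f"block{block_number}@{domain}",
    }


def set_agent_user(block_number: int, role_slug: str) -> None:
    # Imported lazily so agent_user() stays usable without the SDK installed.
    import sentry_sdk

    sentry_sdk.set_user(agent_user(block_number, role_slug))
```

Calling `set_agent_user(2, "cloud_arch")` would set the same user context as the snippet above.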

Dashboard

Access real-time metrics:

  • Error count: Total errors in system
  • Error rate: Errors per minute
  • User impact: How many sessions affected
  • Performance: Request latency percentiles
  • Releases: Code deployments and their error rates
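
To make the "errors per minute" definition concrete, here is a small sketch that computes it from a list of error timestamps. Sentry calculates its dashboard figures itself; `errors_per_minute`, the five-minute window and the sample timestamps are assumptions chosen only for illustration.

```python
from datetime import datetime, timedelta


def errors_per_minute(timestamps, window_end, window_minutes=5):
    """Average errors per minute over the window ending at window_end."""
    window_start = window_end - timedelta(minutes=window_minutes)
    in_window = [t for t in timestamps if window_start < t <= window_end]
    return len(in_window) / window_minutes


# Arbitrary sample data: six errors, five of them within the last five minutes.
now = datetime(2024, 1, 1, 14, 20)
events = [now - timedelta(seconds=s) for s in (10, 50, 130, 200, 290, 400)]
rate = errors_per_minute(events, now)
```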

Features You Get

Feature 1: Error Grouping

Errors are automatically grouped by type and location

Similar errors grouped together:
- "Timeout in widget discovery" (245 occurrences)
- "Invalid JSON in config" (98 occurrences)
- "Database connection pool exhausted" (12 occurrences)
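
The idea of grouping by type and location can be sketched with a fingerprint tuple and a counter. This is a deliberately simplified illustration, not Sentry's actual grouping algorithm; `ErrorRecord`, `fingerprint`, `group_errors` and the `src/config.py` / `load_config` names are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class ErrorRecord(NamedTuple):
    exc_type: str
    filename: str
    function: str


def fingerprint(record: ErrorRecord) -> tuple:
    # Errors with the same type raised from the same place share a group.
    return (record.exc_type, record.filename, record.function)


def group_errors(records):
    """Count occurrences per fingerprint, most frequent first."""
    return Counter(fingerprint(r) for r in records).most_common()


records = [
    ErrorRecord("Timeout", "src/services/widget_discovery.py", "discover_widgets"),
    ErrorRecord("JSONDecodeError", "src/config.py", "load_config"),
    ErrorRecord("Timeout", "src/services/widget_discovery.py", "discover_widgets"),
]
groups = group_errors(records)
```

Here the two timeouts collapse into one group with a count of 2, which is how the occurrence counts in the listing above arise.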

Feature 2: Stack Traces

Full stack traces with source code context

Traceback:
  File "src/services/widget_discovery.py", line 45, in discover_widgets
    result = call_hugging_face_api(query)
  File "src/integrations/hf_api.py", line 23, in call_hugging_face_api
    response = requests.get(url, timeout=30)
  File "requests/__init__.py", line 61, in get
    return request('get', url, params=params, **kwargs)

Error: requests.exceptions.Timeout: Connection timeout after 30s

Feature 3: Release Tracking

Track which code changes caused issues

Release v1.0.0-alpha.5 deployed at 14:32
- 2 new errors introduced
- 1 error fixed (was appearing in v1.0.0-alpha.4)
- Overall health: Improving (3% fewer errors)
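
The "introduced" and "fixed" counts in a release summary amount to set differences between the error groups seen in two releases. The sketch below shows that reasoning; `compare_releases` and the group identifiers are hypothetical, and Sentry performs its own comparison.

```python
def compare_releases(previous: set, current: set) -> dict:
    """Classify error groups by comparing two releases."""
    return {
        "introduced": current - previous,
        "fixed": previous - current,
        "carried_over": current & previous,
    }


# Hypothetical error-group identifiers for two consecutive releases.
alpha4 = {"timeout-repo-scan", "mcp-state-sync"}
alpha5 = {"mcp-state-sync", "json-parse-widget", "db-pool-exhausted"}
diff = compare_releases(alpha4, alpha5)
```

With this sample data the result has two introduced groups and one fixed group, matching the shape of the summary above.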

Feature 4: User Sessions

Track user journeys leading to errors

Session for: Block 5 QASpecialist Agent
Duration: 23 minutes
Actions:
  1. Started widget discovery (14:15)
  2. Scanned 5 Git repos (14:16-14:18)
  3. ERROR: Timeout in repo 4 (14:19)
  4. Retry discovered widgets (14:20)
  5. Conversion started (14:21)

Feature 5: Performance Monitoring

Track response times and bottlenecks

Endpoint Performance:
- POST /api/widgets/discover: 2.3s avg (p95: 8.2s)
- POST /api/widgets/convert: 15.2s avg (p95: 45.3s)
- GET /api/widgets/status: 0.1s avg (p95: 0.3s)

Slow transactions:
- Widget conversion pipeline (47% of time in ML inference)
- Database queries (23% of time in index scans)
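
The gap between an average and a p95 figure is easier to read with a worked example. The sketch below uses the nearest-rank definition of a percentile, which may differ from the method Sentry applies; `percentile` and the latency samples are made up for illustration.

```python
import math
from statistics import mean


def percentile(values, pct: int):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Made-up latencies in seconds for a single endpoint.
latencies = [0.8, 1.1, 1.3, 1.4, 1.6, 1.9, 2.0, 2.2, 2.7, 8.0]
avg = mean(latencies)
p95 = percentile(latencies, 95)
```

One slow request is enough to pull p95 far above the average, which is why both numbers are reported per endpoint.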

Usage by Block

Block 1 - Frontend (UI/UX)

Sentry automatically captures:

  • Browser JavaScript errors
  • Frontend performance metrics
  • User interaction tracking

Benefit: See exactly what users experience

Block 2 - CloudArch (MCP Framework)

Sentry automatically captures:

  • MCP service errors
  • Message passing failures
  • Widget-to-widget communication issues

Benefit: Real-time visibility into widget triggering failures

Block 3 - Security (Error Handling)

Sentry helps validate error handling:

  • Confirms all errors are caught
  • Tracks unhandled exceptions
  • Monitors security-related errors

Benefit: Verify error handling effectiveness

Block 4 - Database (Registry)

Sentry automatically captures:

  • Database connection errors
  • Transaction failures
  • State synchronization issues

Benefit: Early warning of data integrity problems

Block 5 - QA (Widget Discovery)

Sentry automatically captures:

  • Discovery pipeline failures
  • API integration errors
  • Conversion process issues

Benefit: Track discovery success rate in real-time

Block 6 - Security & Compliance

Sentry automatically captures:

  • Security validation failures
  • Permission denial events
  • Compliance audit triggers

Benefit: Complete audit trail of security events

Daily Standup Integration

How to use Sentry in your daily report:

## Block X - Daily Standup

**Sentry Status**:
- ✅ 0 critical errors in last 24 hours
- 🟡 3 high-priority errors (need investigation)
- 📊 Widget discovery success rate: 94.2%
- ⚡ Average response time: 2.3s

**Error Highlights**:
- Fixed: Timeout in repo scanning (was #1 issue)
- New: JSON parsing error in 2 discovered widgets (investigating)
- Monitoring: MCP state sync (previously unstable, now stable)
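
If the status block is filled in every day, it can be rendered from a handful of numbers read off the dashboard. This is a sketch under assumptions: `format_standup` and its parameters are hypothetical, the values are typed in by hand rather than fetched from Sentry, and the 🔴 icon for a non-zero critical count is an addition not present in the template above.

```python
def format_standup(block: str, critical: int, high: int,
                   success_rate: float, avg_response_s: float) -> str:
    """Render the Sentry status part of a daily standup report."""
    critical_icon = "✅" if critical == 0 else "🔴"
    lines = [
        f"## {block} - Daily Standup",
        "",
        "**Sentry Status**:",
        f"- {critical_icon} {critical} critical errors in last 24 hours",
        f"- 🟡 {high} high-priority errors (need investigation)",
        f"- 📊 Widget discovery success rate: {success_rate:.1f}%",
        f"- ⚡ Average response time: {avg_response_s:.1f}s",
    ]
    return "\n".join(lines)


report = format_standup("Block 5", critical=0, high=3,
                        success_rate=94.2, avg_response_s=2.3)
```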

Troubleshooting

Q: How do I know errors are being captured?

# Check log file
tail -20 .claude/logs/sentry.log

# Should show recent events

Q: How do I send a custom event?

import sentry_sdk

sentry_sdk.capture_message(
    "Widget discovery completed: 45 widgets found",
    level="info"
)

Q: How do I track the steps leading up to an error?

import sentry_sdk

# Add breadcrumb (event in user's journey)
sentry_sdk.add_breadcrumb({
    "category": "widget_discovery",
    "message": "Found 3 new widgets",
    "level": "info"
})

# Later, if an error occurs, these breadcrumbs are attached to it
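
Breadcrumbs behave like a bounded trail: only the most recent entries are kept and attached to a later error. The sketch below models that idea with a `deque`; it illustrates the concept and is not the SDK's internal implementation. `BreadcrumbTrail` is a hypothetical class, and the default of 100 mirrors the SDK's `max_breadcrumbs` option as far as we know.

```python
from collections import deque


class BreadcrumbTrail:
    """Keep only the most recent breadcrumbs, dropping the oldest first."""

    def __init__(self, max_breadcrumbs: int = 100):
        self._crumbs = deque(maxlen=max_breadcrumbs)

    def add(self, category: str, message: str, level: str = "info") -> None:
        self._crumbs.append({"category": category, "message": message, "level": level})

    def snapshot(self) -> list:
        # What would accompany an error event raised at this moment.
        return list(self._crumbs)


trail = BreadcrumbTrail(max_breadcrumbs=3)
for n in range(1, 6):
    trail.add("widget_discovery", f"Scanned repo {n}")
recent = [c["message"] for c in trail.snapshot()]
```

With a limit of 3, only the last three scans survive, so very chatty breadcrumbs can push out the ones that matter.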

Q: How do I correlate errors with my code changes?

# Set release in deployment
export SENTRY_RELEASE="v1.0.0-alpha.5"
python run.py real -c

# Then errors show which release they occurred in
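
On the Python side, the release name can be read from the same environment variable with an explicit fallback, so events are never sent with a blank release. This is a sketch, not the integration's actual startup code: `resolve_release`, `init_with_release` and the `"unknown"` fallback are hypothetical, while `sentry_sdk.init(dsn=..., release=...)` is the standard SDK call.

```python
import os
from typing import Mapping, Optional


def resolve_release(env: Optional[Mapping[str, str]] = None, fallback: str = "unknown") -> str:
    """Return the release name from SENTRY_RELEASE, or a fallback if unset or blank."""
    env = os.environ if env is None else env
    return env.get("SENTRY_RELEASE", "").strip() or fallback


def init_with_release(dsn: str) -> None:
    # Imported lazily so resolve_release() can be tested without the SDK installed.
    import sentry_sdk

    sentry_sdk.init(dsn=dsn, release=resolve_release())
```

Passing `env` explicitly makes the lookup easy to test without touching the real environment.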

Integration Points

With Cascade Orchestrator

Sentry automatically tracks:

  • Each cascade iteration start/end
  • Agent block execution success/failure
  • Token usage and costs
  • Performance metrics

With Error Libraries

Sentry receives findings from the following tools, each logged as a custom event:

  • pytest-error-handler (test failures)
  • Hugging Face detector (security issues)
  • mypy-strict-mode (type errors)
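
Logging a finding as a custom event could look roughly like the sketch below, which converts a finding into the same payload shape used with `capture_event` earlier. The severity-to-level mapping, the tag names and the functions `finding_to_event` and `report_finding` are assumptions; the real integration may structure its events differently.

```python
# Hypothetical mapping from a tool's severity to a Sentry event level.
SEVERITY_TO_LEVEL = {"low": "info", "medium": "warning", "high": "error", "critical": "fatal"}


def finding_to_event(source: str, message: str, severity: str = "medium") -> dict:
    """Convert a finding from another tool into a Sentry event payload."""
    return {
        "message": message,
        "level": SEVERITY_TO_LEVEL.get(severity, "warning"),
        "tags": {"finding_source": source, "severity": severity},
    }


def report_finding(source: str, message: str, severity: str = "medium") -> None:
    # Imported lazily so finding_to_event() can be tested without the SDK installed.
    import sentry_sdk

    sentry_sdk.capture_event(finding_to_event(source, message, severity))


event = finding_to_event("mypy-strict-mode", "Incompatible return type", "high")
```

Tagging each event with its source lets the dashboard filter findings by tool.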

With Daily Standups

Sentry data appears in:

  • Block status (error count, trends)
  • Performance metrics (response times)
  • User impact (how many sessions affected)
  • Recommendations (which errors to fix first)

Success Indicators

Good Sentry usage:

  • ✅ Error count trending downward over time
  • ✅ New errors caught within 1 minute of occurrence
  • ✅ Stack traces clearly show cause
  • ✅ User sessions explain error context
  • ✅ Performance metrics stable or improving

Underutilized Sentry:

  • ❌ Errors logged but not acted upon
  • ❌ High error volume not decreasing
  • ❌ No correlation between releases and error spikes
  • ❌ Performance metrics not monitored

Next Steps

  1. Check dashboard: View recent errors in the Sentry dashboard
  2. Review findings: Understand error patterns
  3. Report in standup: Include Sentry metrics in daily report
  4. Act on findings: Fix top errors in priority order
  5. Monitor improvement: Track error reduction over time

Questions?

Sentry is production-ready and working perfectly. If you have questions:

  1. Check Sentry dashboard for your org
  2. Review the integration code at app/integrations/sentry/
  3. Raise questions in the daily standup (Sentry is not the blocker; the other libraries are)

Remember: Sentry is the ONLY error library with NO issues. The others (pytest, Hugging Face, mypy) all have documented workarounds, but Sentry just works.

Real-time error tracking enabled. ✅