# Steel Browser Architecture

This document describes Steel Browser's architecture, its design decisions, and how the components work together.

## 🏗️ High-Level Architecture

Steel Browser follows a modular, plugin-based architecture designed for extensibility and maintainability:

```
┌─────────────────────────────────────────────────────────────┐
│                        Steel Browser                        │
├─────────────────────────────────────────────────────────────┤
│  Frontend (React UI)         │  Backend (Fastify API)       │
│  ├── Session Management      │  ├── CDP Service             │
│  ├── Real-time Viewing       │  ├── Session Management      │
│  ├── DevTools Integration    │  ├── File Storage            │
│  └── Configuration UI        │  └── Plugin System           │
├─────────────────────────────────────────────────────────────┤
│                   Chrome/Chromium Browser                   │
│  ├── Chrome DevTools Protocol (CDP)                         │
│  ├── Browser Extensions                                     │
│  └── Page Contexts                                          │
└─────────────────────────────────────────────────────────────┘
```

## 🔧 Core Components

### 1. CDP Service (`api/src/services/cdp/cdp.service.ts`)

The Chrome DevTools Protocol (CDP) Service is the heart of Steel Browser, managing all browser interactions:

**Responsibilities:**
- Browser lifecycle management (launch, close, restart)
- Page creation and navigation
- WebSocket proxy for CDP connections
- Plugin system coordination
- Session state management
- Context isolation and fingerprinting

**Key Features:**

```typescript
// Method signatures only; implementations omitted.
class CDPService extends EventEmitter {
  // Browser management
  async launch(options?: BrowserLauncherOptions): Promise<Browser>
  async shutdown(): Promise<void>
  async refreshPrimaryPage(): Promise<void>

  // Plugin system
  registerPlugin(plugin: BasePlugin): void
  unregisterPlugin(pluginName: string): boolean

  // Page management
  async createPage(): Promise<Page>
  async getPages(): Promise<Page[]>
}
```

### 2. Plugin System

Steel Browser's plugin architecture adds functionality without modifying core code.

#### Base Plugin (`api/src/services/cdp/plugins/core/base-plugin.ts`)

```typescript
// Method signatures only; implementations omitted.
abstract class BasePlugin {
  // Lifecycle hooks
  async onBrowserLaunch(browser: Browser): Promise<void>
  async onPageCreated(page: Page): Promise<void>
  async onPageNavigate(page: Page): Promise<void>
  async onPageUnload(page: Page): Promise<void>
  async onBrowserClose(browser: Browser): Promise<void>
  async onBeforePageClose(page: Page): Promise<void>
  async onShutdown(): Promise<void>
}
```

#### Plugin Manager (`api/src/services/cdp/plugins/core/plugin-manager.ts`)

Coordinates plugin lifecycle and ensures error isolation:

- **Registration**: Manages plugin registration and dependency injection
- **Event Distribution**: Notifies all plugins of browser events
- **Error Handling**: Isolates plugin errors to prevent system crashes
- **Lifecycle Management**: Coordinates plugin startup and shutdown

### 3. Session Management (`api/src/services/session.service.ts`)

Manages browser sessions with isolated contexts:

**Features:**
- Session creation with custom configurations
- Context isolation (cookies, localStorage, sessionStorage)
- Resource cleanup and garbage collection
- Session persistence and restoration
- Concurrent session management

```typescript
interface SessionConfig {
  proxy?: ProxyConfig;
  userAgent?: string;
  viewport?: { width: number; height: number };
  extensions?: string[];
  fingerprint?: FingerprintOptions;
}
```

### 4. File Storage Service (`api/src/services/file.service.ts`)

Handles file operations with session-scoped storage:

- **Upload Management**: Handles multipart file uploads
- **Download Coordination**: Manages browser downloads
- **Storage Isolation**: Session-scoped file storage
- **Cleanup**: Automatic file cleanup on session end

## 🔌 Plugin Architecture Deep Dive

### Plugin Lifecycle

1. **Registration**: Plugins register with the PluginManager
2. **Initialization**: Service dependency injection
3. **Event Handling**: Respond to browser lifecycle events
4. **Cleanup**: Graceful shutdown and resource cleanup

### Event Flow

```
Browser Launch → Plugin.onBrowserLaunch()
       ↓
Page Created → Plugin.onPageCreated()
       ↓
Page Navigate → Plugin.onPageNavigate()
       ↓
Page Unload → Plugin.onPageUnload()
       ↓
Page Close → Plugin.onBeforePageClose()
       ↓
Browser Close → Plugin.onBrowserClose()
       ↓
System Shutdown → Plugin.onShutdown()
```

### Example Plugin Implementation

```typescript
import { BasePlugin, PluginOptions } from '@steel-browser/api/cdp-plugin';
import { Browser, Page } from 'puppeteer-core';

export class AdBlockPlugin extends BasePlugin {
  private blockedDomains: Set<string>;

  constructor(options: PluginOptions & { blockedDomains?: string[] }) {
    super({ name: 'ad-blocker', ...options });
    this.blockedDomains = new Set(options.blockedDomains || []);
  }

  async onPageCreated(page: Page): Promise<void> {
    await page.setRequestInterception(true);

    // Abort requests to blocked hostnames; let everything else through.
    page.on('request', (request) => {
      const url = new URL(request.url());
      if (this.blockedDomains.has(url.hostname)) {
        request.abort();
      } else {
        request.continue();
      }
    });
  }
}
```

## 🌐 API Architecture

### Fastify Plugin System

Steel Browser uses Fastify's plugin architecture for modular API design:

```typescript
// Main plugin registration
await fastify.register(steelBrowserPlugin, {
  fileStorage: { maxSizePerSession: 100 * MB }
});

// Individual plugins
await fastify.register(browserInstancePlugin);
await fastify.register(sessionPlugin);
await fastify.register(fileStoragePlugin);
```

### Route Organization

Routes are organized by functionality:

- **Actions** (`/v1/*`): Browser automation actions (scrape, screenshot, PDF)
- **Sessions** (`/v1/sessions/*`): Session management
- **CDP** (`/v1/cdp/*`): Direct CDP access
- **Files** (`/v1/files/*`): File upload/download
- **Selenium** (`/selenium/*`): Selenium WebDriver compatibility

### Schema Validation

All API endpoints use Zod schemas for validation:

```typescript
const ScrapeRequestSchema = z.object({
  url: z.string().url().optional(),
  delay: z.number().optional(),
  format: z.array(z.enum(['html','cleaned_html','markdown','readability'])).optional(),
  screenshot: z.boolean().optional(),
  pdf: z.boolean().optional(),
});
```

## 🎨 Frontend Architecture

### React Component Structure

```
src/
├── components/         # Reusable UI components
│   ├── ui/             # Base UI components (buttons, inputs)
│   ├── badges/         # Status badges
│   ├── icons/          # Icon components
│   └── sessions/       # Session-specific components
├── containers/         # Page-level containers
├── contexts/           # React contexts for state management
├── hooks/              # Custom React hooks
└── steel-client/       # Auto-generated API client
```

### State Management

- **React Query**: Server state management and caching
- **React Context**: Global application state
- **Local State**: Component-specific state with hooks

### Real-time Updates

WebSocket connections provide real-time updates. Session lists are also kept current by polling with React Query:

```typescript
// Session monitoring
const { data: sessions } = useQuery({
  queryKey: ['sessions'],
  queryFn: () => steelClient.sessions.getSessions(),
  refetchInterval: 1000 // Re-fetch every second
});
```

## 🔒 Security Architecture

### Input Validation

- **API Level**: Zod schema validation for all inputs
- **Browser Level**: Content Security Policy (CSP) headers
- **File Level**: File type validation and size limits

### Context Isolation

Each session runs in an isolated browser context:

```typescript
const context = await browser.createIncognitoBrowserContext();
context.setDefaultNavigationTimeout(30000);
context.setDefaultTimeout(30000);
```

### Resource Limits

- **Memory**: Browser process memory limits
- **CPU**: Process CPU throttling
- **Storage**: Session-scoped file storage limits
- **Network**: Request rate limiting and proxy support

## 📊 Performance Considerations

### Browser Resource Management

- **Process Isolation**: Each session in separate browser context
- **Memory Cleanup**: Automatic page and context cleanup
- **Connection Pooling**: Reuse CDP connections where possible

### Caching Strategy

- **Static Assets**: Long-term caching for UI assets
- **API Responses**: Short-term caching for session data
- **Browser Cache**: Configurable per-session browser caching

### Scaling Considerations

Current architecture supports:

- **Vertical Scaling**: Multi-core CPU utilization
- **Session Concurrency**: Multiple simultaneous sessions
- **Resource Monitoring**: Memory and CPU usage tracking

Future scaling options:

- **Horizontal Scaling**: Multiple Steel instances
- **Load Balancing**: Session distribution
- **Distributed Storage**: Shared file storage

## 🧪 Testing Architecture

### Test Structure (Planned)

```
tests/
├── unit/               # Unit tests for individual components
├── integration/        # API endpoint integration tests
├── e2e/                # End-to-end browser automation tests
└── performance/        # Load and performance tests
```

### Testing Strategy

- **Unit Tests**: Core services and utilities
- **Integration Tests**: API endpoints and database interactions
- **E2E Tests**: Full browser automation workflows
- **Performance Tests**: Load testing and benchmarking

## 🔧 Configuration Management

### Environment Variables

Configuration is supplied through environment variables:

```typescript
const envSchema = z.object({
  NODE_ENV: z.enum(['development', 'production', 'test']),
  HOST: z.string().default('0.0.0.0'),
  PORT: z.string().default('3000'),
  CHROME_EXECUTABLE_PATH: z.string().optional(),
  CHROME_HEADLESS: z.boolean().default(true),
  // ... more configuration options
});
```

### Runtime Configuration

- **Browser Options**: Per-session browser configuration
- **Plugin Configuration**: Dynamic plugin options
- **Feature Flags**: Runtime feature toggling

## 🚀 Deployment Architecture

### Containerization

Multi-stage Docker builds for optimization:

```dockerfile
# Build stage
FROM node:22-slim AS build
# ... build steps

# Production stage
FROM node:22-slim AS production
# ... production setup
```

### Service Dependencies

- **Chrome/Chromium**: Browser engine
- **Node.js**: Runtime environment
- **Nginx**: Reverse proxy (in containers)
- **File System**: Session storage

## 🔄 Development Workflow

### Hot Reloading

The development environment supports hot reloading:

```bash
npm run dev  # Starts both API and UI with hot reload
```

### Debug Configuration

Built-in debugging support:

```bash
# API debugging
node --inspect ./api/build/index.js

# Enable verbose logging
ENABLE_VERBOSE_LOGGING=true npm run dev -w api
```

## 📈 Monitoring and Observability

### Logging

Structured logging with Pino:

```typescript
fastify.log.info({
  sessionId,
  action: 'page_created',
  url: page.url()
}, 'New page created');
```

### Metrics (Planned)

- **Session Metrics**: Creation, duration, success rates
- **Performance Metrics**: Response times, resource usage
- **Error Tracking**: Error rates and categorization

### Health Checks

Built-in health check endpoints:

```typescript
// Basic health check
GET /health

// Detailed readiness check
GET /ready
```

---

This architecture provides a solid foundation for browser automation while maintaining flexibility for future enhancements and scaling requirements.
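
As a closing illustration of the error isolation attributed to the Plugin Manager above, the following is a minimal, dependency-free sketch of how hook dispatch might contain a failing plugin. It is not Steel Browser's implementation: `SketchPlugin`, `SketchPluginManager`, `notify`, and `HookFailure` are hypothetical names, and the hooks take no `Browser` or `Page` argument so the example can run without Puppeteer.

```typescript
// Hypothetical stand-ins for illustration only; these are NOT the real
// BasePlugin or PluginManager classes from Steel Browser.
type HookName = "onBrowserLaunch" | "onPageCreated" | "onShutdown";

interface SketchPlugin {
  name: string;
  onBrowserLaunch?(): Promise<void>;
  onPageCreated?(): Promise<void>;
  onShutdown?(): Promise<void>;
}

interface HookFailure {
  plugin: string;
  error: unknown;
}

class SketchPluginManager {
  private plugins: SketchPlugin[] = [];

  // Adds a plugin to the end of the dispatch order.
  register(plugin: SketchPlugin): void {
    this.plugins.push(plugin);
  }

  // Removes every plugin with the given name; returns true if any was removed.
  unregister(name: string): boolean {
    const before = this.plugins.length;
    this.plugins = this.plugins.filter((p) => p.name !== name);
    return this.plugins.length < before;
  }

  // Calls the hook on each plugin that defines it, in registration order.
  // A plugin that throws is recorded and skipped, so later plugins still run.
  async notify(hook: HookName): Promise<HookFailure[]> {
    const failures: HookFailure[] = [];
    for (const plugin of this.plugins) {
      const handler = plugin[hook];
      if (!handler) continue;
      try {
        await handler.call(plugin);
      } catch (error) {
        failures.push({ plugin: plugin.name, error });
      }
    }
    return failures;
  }
}
```

The sketch runs hooks sequentially and returns failures to the caller rather than rethrowing, so one misbehaving plugin cannot abort a browser event for the others. Whether the real manager dispatches sequentially or in parallel is not stated in this document.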